Flash Memory Arrays Coming Out of the IPO Woodwork
Organization of Content
Benefits of Flash Memory
Categorization of Vendors
· Flash Memory Arrays
· PCIe SSD Cards
· Large Traditional Vendors
· Use Cases Where Flash Memory Array Start Ups are Currently Successful
· Broader Use Case List
· Use Cases Flash Memory Arrays Start Ups Not Successful
Requirements for Flash Memory Arrays as Primary Storage
Discussion of Flash Memory Array Vendors
Appendix 1 - Expansion of Addressable Market
Flash memory products are enjoying the limelight, and a number of flash memory based start-ups have either stated their intentions or executed their plans to IPO. The focus of this article will be upcoming IPOs for Flash Memory Arrays from Nimble Storage, Nimbus Data, and Pure Storage in 2014, and the Violin Memory (VMEM) IPO executed on Sept 27, 2013.
The purpose of this article is to help those who want to build their own financial models, as well as those who seek to understand the difference between the products in support of their own investment decisions. Finally, I will make my own recommendations as to which vendors have potential to move to the next level, as well as where their limitations to revenue exist. Violin's financials will not be reviewed here because they have already been reviewed by other authors.
Lastly, my intent is to sort out marketing claims from the practical realities of running storage in a high-end data center environment. Please note that there are many other Flash Memory Arrays competing in this space, but discussion of those vendors is beyond the scope of this article.
From a practical standpoint, my view is that Nimble Storage has reasonable prospects of being successful over the long run, with Pure Storage coming in second. I expect Nimbus Data to be successful at the low end, in non-mission critical applications, because of its low price point, commodity parts, and claims about being full featured. Violin Memory will be successful in those use cases needing the highest performance but will have difficulty expanding out of its current areas of success. Its current financial losses were just one of several factors that contributed to its underwhelming IPO of September 27, 2013.
All of the aforementioned vendors are currently successful in VDI (Virtual Desktop Infrastructure), Server Virtualization, and Analytics, as well as database and file sharing at the SMB (small medium business), workgroup, departmental, and divisional level.
Breaking out of these use cases and moving into more rigorous (and higher margin) use cases will be difficult, but Pure Storage and Nimble Storage have the potential of doing so in the medium term. See Appendix 1 for details.
Where they will not achieve success in the short and medium term is displacement of Primary Storage as currently dominated by EMC Symmetrix (EMC), HDS (Hitachi Data Systems) VSP (HTIHY.PK), or IBM DS 8000 products (IBM). EMC, HDS, and IBM have, or will soon have, their own products, which will form a moat that these flash based start-ups will have difficulty crossing. The term Primary Storage is used to denote the large systems that are responsible for key operations supporting Fortune 500 corporate wide operations and are often directly linked to revenue acquisition. Examples include retail banking CIF (Central Information File) and large consumer retail (databases supporting Point of Sale systems). In addition, Big Customers prefer to buy from Big Vendors.
From an investment standpoint, the takeaway is that this group of start-ups will face limits on how large their revenue opportunity can be. The context of this statement is that I have observed a wide variety of market sizing perspectives. They range from flash memory displacing all hard disk drives and driving Seagate and Western Digital out of business, to flash memory array start-ups putting EMC out of business. As with many ideas, the practical reality lies somewhere between the extremes. And that in-between is the somewhat fuzzy line (at least to those that have not actually deployed these products) between where they are successful now and Primary Storage.
The bottom line is that the price you forecast or pay for these stocks should reflect these limitations.
In summary, these flash memory start-ups will experience success, up to a point. And that point is Primary Storage, dominated by incumbent vendors such as EMC, IBM, and HDS. The big vendors will be slow to market; they are always slow to market with new products. But they are slow because they understand it is a big job to build all of the ancillary features required for the Primary Storage environment, along with the normal problems of developing a new product. It takes more than NAND chips and high performance to be successful in Primary Storage.
Benefits of Flash Memory
Simply stated, flash memory offers substantial performance improvement over hard disk drives (HDD) while consuming less power, space, and cooling. Flash memory also provides more reliability and an excellent ability to process the random reads and writes characteristic of OLTP (Online Transaction Processing) workloads. Note that the emphasis is on performance, not capacity, claims about deduplication and compression increasing effective capacity notwithstanding (more on this later). The key point is that HDD has superior cost-effective capacity in both the short and medium term.
From a simplified perspective, performance is measured in terms of IOPs (Input/Output Operations per second), latency (a measure of response time), and throughput (big files transferred, in terms of Gigabytes per second or GB/sec). A more detailed explanation is beyond the scope of this article. It is sufficient to say all vendors play games with their performance numbers; SNIA (Storage Networking Industry Association) felt it necessary to publish a multi-page white paper on the subject to help sort out issues in a rational manner.
Let's take a moment to think about IOPs: an 800 GB Intel PCIe SSD can generate 180,000 read IOPs, or 75,000 write IOPs, for about $4,000. A typical 600 GB Fibre Channel (FC) hard disk drive can do about 200 IOPs and costs about $600. So for a similar capacity, flash memory provides substantial performance advantages.
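To make the comparison concrete, the figures quoted above can be turned into IOPs per dollar. This is only a back-of-the-envelope sketch using the article's round numbers; the helper name `iops_per_dollar` is introduced here for illustration.

```python
def iops_per_dollar(iops: float, price_usd: float) -> float:
    """Return how many IOPs one dollar buys for a given device."""
    return iops / price_usd


# Round numbers quoted in the article.
ssd_read = iops_per_dollar(180_000, 4_000)   # 800 GB PCIe SSD, reads
ssd_write = iops_per_dollar(75_000, 4_000)   # 800 GB PCIe SSD, writes
fc_hdd = iops_per_dollar(200, 600)           # 600 GB FC hard disk drive

# Number of FC drives needed to match the SSD's read IOPs.
drives_to_match_reads = 180_000 // 200

print(f"SSD read : {ssd_read:.2f} IOPs per dollar")
print(f"SSD write: {ssd_write:.2f} IOPs per dollar")
print(f"FC HDD   : {fc_hdd:.2f} IOPs per dollar")
print(f"FC drives needed to match SSD reads: {drives_to_match_reads}")
```

On these numbers a dollar buys 45 read IOPs on the SSD versus roughly a third of one IOP on the FC drive, and it would take 900 such drives to match the SSD's read rate.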
In a high performance transactional environment, the FC disk drive will be short stroked, which means the data is placed only on the outer edges of the platter, leaving the rest of the disk empty. While this provides performance, it is also very inefficient, and these 15K RPM FC drives are the hard disk drives that flash memory will displace.
On the other hand, from a capacity standpoint, a 4 TB hard disk drive costs about $100, or in rough numbers, about 2.5 cents per GB. Flash memory at the low end commodity level costs about one dollar per GB. So for capacity applications, such as storing a lot of PowerPoint presentations, Word documents, Excel spreadsheets, or photos, hard disk drives are very cost effective. Both flash memory and hard disk drives can have deduplication and compression software applied to them to reduce the amount of space consumed by data. In addition, there are several technologies under development with long-term plans of record (POR) to achieve 50-60 TB per hard disk drive, further improving the capacity and cost benefits of hard disk drives. Round numbers are used here for simplicity.
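With the round figures above ($100 for a 4 TB drive, about one dollar per GB for commodity flash), the per-GB gap can be checked directly. The sketch below assumes decimal units (1 TB = 1,000 GB); the function and variable names are illustrative.

```python
GB_PER_TB = 1_000  # decimal units, as drive capacities are usually quoted


def dollars_per_gb(price_usd: float, capacity_gb: float) -> float:
    """Return the raw cost of one gigabyte of capacity."""
    return price_usd / capacity_gb


hdd_cost = dollars_per_gb(100, 4 * GB_PER_TB)  # 4 TB hard disk drive at $100
flash_cost = 1.00                              # commodity flash, dollars per GB
premium = flash_cost / hdd_cost                # how many times dearer flash is

print(f"HDD  : {hdd_cost * 100:.1f} cents per GB")
print(f"Flash: {flash_cost * 100:.0f} cents per GB")
print(f"Flash costs about {premium:.0f}x as much per GB")
```

On these inputs the hard disk drive works out to a few cents per GB, and flash carries a premium of roughly 40 times for raw capacity.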
Also, to put things in context, a banking or retail transaction is relatively small, about 50 bytes (about 50 characters), whereas a Word document can easily be 1000 characters (1000 bytes) and a PowerPoint can be 10 million bytes (10 MB).
So a highly active transactional system needs less capacity, and systems processing a lot of files need more capacity.
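A quick calculation shows how differently these object sizes consume capacity. The sketch uses the sizes quoted above and a decimal gigabyte, and ignores any database or file system overhead, which is a simplifying assumption.

```python
BYTES_PER_GB = 1_000_000_000  # decimal gigabyte

# Approximate object sizes quoted in the article, in bytes.
object_sizes_bytes = {
    "banking or retail transaction": 50,
    "Word document": 1_000,
    "PowerPoint presentation": 10_000_000,
}

# How many objects of each kind fit in one gigabyte, ignoring overhead.
objects_per_gb = {
    name: BYTES_PER_GB // size for name, size in object_sizes_bytes.items()
}

for name, count in objects_per_gb.items():
    print(f"{name}: {count:,} per GB")
```

One gigabyte holds about 20 million such transactions but only about 100 presentations, which is why transactional systems are sized for performance and file systems for capacity.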
In summary, both flash memory and hard disk drives have a useful role to play in storage systems. Flash memory is excellent where performance is needed to process active data, such as transaction processing, retrieving hot files, or as a fast read-only repository for static information. Hard disk drives are very useful when cost effective capacity is needed to store infrequently accessed data for a long time with reasonable retrieval times (compared to tape).
Categorization of Vendors
Flash memory products, of which Flash Memory Arrays are but one category, are categorized in this section. The purpose is to clarify which flash memory products do and do not compete.
A key point is that Flash Memory Array vendors in the majority of cases do not compete with PCIe SSD products.
Flash Memory Arrays
Violin Memory 3000 and 6000
PCIe SSD Cards
Fusion IO (FIO)
Virident (Acquired by Western Digital (WDC))
EMC VFCache Velocity
Violin Velocity product line
Large Traditional Vendors
EMC (purchased XtremIO)
IBM (purchased Texas Memory Systems)
HDS (Hitachi Data Systems is expected to come out with their own flash memory arrays)
HP (HPQ) (Modified 3PAR machines)
NetApp (NTAP) (retrofitted former LSI Engenio machines)
Use Cases Where Flash Memory Array Start Ups are currently successful
Virtual Desktop Infrastructure
In this use case, desktop images (operating systems and configuration) along with data are placed in flash memory.
Desktops can be updated and managed at a centralized location, and the flash memory handles a high volume of reads and writes from the desktops more efficiently than hard disk drives. VDI usually serves workgroups, departments, and divisions.
Server Virtualization
A flash memory array can serve as high performance storage for many virtual machines residing in a large number of physical servers. The problem this solves is that a large number of virtual machines generate such a large number of random reads and random writes that they can overwhelm hard disk drives. Flash memory, with its ability to handle randomness, is a good fit here.
Analytics
Analyze data quickly and create reports faster than on disk drives. Reports that once took hours to run can be processed far more quickly. In some (but not all) analytics applications, the only characteristic needed is performance.
High availability, data protection, and a host of other storage management software modules are not needed to be successful. The reason for this is that the initial data capture occurs on a Primary Storage system, which is extremely capable, and a copy is given to the analytics application for processing. If the analytics system corrupts or loses the data, that is acceptable, because there is another master copy stored safely on the really expensive Primary Storage system, and a copy can be reloaded into the analytics system for reprocessing. With that said, there are analytics applications which do require all the storage management software bells and whistles, and it is this class of analytics applications that will pose a barrier to flash memory array vendors that do not have complete or effective storage management software.
SMB (Small Medium Business) Database
A database for a small or medium business, often used for transaction processing. SMBs typically do not run 24/7, and for that reason product technical requirements are less rigorous. Products that are designed to run 24/7 with high availability and extensive storage management software stacks are typically far more expensive.
SMB (Small Medium Business) file caching
Hot data is kept in flash memory (the cache) and mathematical algorithms keep track of frequently used data and place frequently accessed data in cache and less frequently used data on hard disk drives. In practice, this is successfully used in smaller installations and the algorithms are less effective on very large data sets (mostly true, there are a few very large NAS vendors with specialized high performance features). File caching should not be compared to Tiered Storage Management software, despite the marketing claims of some vendors.
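As a rough illustration of the idea, the sketch below keeps the most frequently read blocks in a small "flash" tier and serves everything else from "disk". It is a toy frequency-based policy written for this article, not the algorithm of any vendor mentioned here; the class name `FrequencyCache` and the access trace are hypothetical.

```python
from collections import Counter


class FrequencyCache:
    """Toy cache that keeps the most frequently read blocks in 'flash'."""

    def __init__(self, flash_slots: int):
        self.flash_slots = flash_slots
        self.hits = Counter()   # access count per block id
        self.flash = set()      # block ids currently held in flash

    def read(self, block_id: str) -> str:
        """Record an access and report which tier served it."""
        served_from = "flash" if block_id in self.flash else "disk"
        self.hits[block_id] += 1
        self._promote(block_id)
        return served_from

    def _promote(self, block_id: str) -> None:
        """Place the block in flash if there is room or it is hotter."""
        if block_id in self.flash:
            return
        if len(self.flash) < self.flash_slots:
            self.flash.add(block_id)
            return
        # Evict the coldest cached block only if the newcomer is hotter.
        coldest = min(self.flash, key=lambda b: self.hits[b])
        if self.hits[block_id] > self.hits[coldest]:
            self.flash.remove(coldest)
            self.flash.add(block_id)


cache = FrequencyCache(flash_slots=2)
trace = ["a", "b", "a", "c", "a", "c", "c", "b"]
results = [cache.read(block) for block in trace]
print(results)
print(sorted(cache.flash))
```

Note that the cache reacts only to observed access counts. With many more distinct blocks than flash slots, a larger share of reads falls through to disk, which is one way to picture why such algorithms become less effective on very large data sets.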
Broader List of Use Cases
Online transaction processing
There are two levels here.
Primary Storage used for Tier 1 Online transaction processing (OLTP). This is what large Fortune 500 companies use to process revenue on a corporate wide basis. Examples include large retail banks and consumer retailers. Since money is being processed on a 24/7 global basis for an entire company (as opposed to a department or a division), systems must always be operational and protect the data from loss. RAS (Reliability, Availability, and Serviceability) is top priority and product requirements are very rigorous and expensive to implement.
SMB (Small Medium Business) database, sometimes referred to as Tier 1.5 OLTP, is characterized as smaller operations not running 24/7 and supporting activities at the workgroup, departmental, divisional, or small medium business level. Since there is less at stake, technical requirements of the product are less rigorous, and the product is less expensive. In other words, it is easier to enter this market space than Primary Storage.
File Sharing
Sharing files such as Word, Excel, PowerPoint, photo, and video files. Also referred to as NAS, or Network Attached Storage.
File Caching
Hot data is kept in flash memory (the cache) and mathematical algorithms keep track of frequently used data and place frequently accessed data in cache and less frequently used data on hard disk drives. In practice, this is successfully used in smaller installations and the algorithms are less effective on very large data sets (mostly true, there are a few very large NAS vendors with specialized high performance features). File caching should not be compared to Tiered Storage Management software, despite the marketing claims of some vendors.
Analytics
Analyze data quickly and create reports faster than processing data from hard disk drives. Reports that once took hours to run can be processed far more quickly. In some (but not all) analytics applications, the only characteristic you need is performance.
High availability, data protection, and a host of other storage management software are not necessary to be successful in this application. The reason for this is that the initial data capture occurs on a Primary Storage system, which is extremely capable, and a copy is given to the analytics application for processing. If the analytics system fails, corrupts, or loses the data, that is acceptable, because there is another master copy stored safely on the really expensive Primary Storage system, and a copy can be reloaded into the analytics system for reprocessing. With that said, there are analytics applications which do require all the storage management software bells and whistles, and it is this class of analytics applications that will pose a barrier to flash memory array vendors that do not have high availability or complete or effective storage management software.
Tiered Storage Management
Migrate data to the appropriate media based on performance, cost, and persistence. For example, move data from conventional DRAM to high performance flash memory, to hard disk, to tape. Data movement is based on policies. An example of a policy is "move all data in Volume 1 from flash memory to hard disk drives after 30 days." This is just one very simple example. Tiered Storage Management software is not the same as caching software; the two software packages fulfill different missions and should not be positioned as one being better or worse than the other. It really depends on the customer's business, operational, and technical requirements.
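The example policy can be sketched in a few lines. This is a simplified illustration, not how any commercial Tiered Storage Management product is implemented; the `Extent` and `Policy` types, the sample dates, and the choice of "older than 30 days" as the trigger are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Extent:
    """A chunk of data living on one tier (hypothetical model)."""
    volume: str
    tier: str          # "flash", "disk", or "tape"
    written_on: date


@dataclass
class Policy:
    """Move data in `volume` from one tier to another once it is old enough."""
    volume: str
    from_tier: str
    to_tier: str
    min_age: timedelta


def apply_policy(extents: list, policy: Policy, today: date) -> int:
    """Move every extent that matches the policy; return how many moved."""
    moved = 0
    for ext in extents:
        age = today - ext.written_on
        if (ext.volume == policy.volume
                and ext.tier == policy.from_tier
                and age > policy.min_age):
            ext.tier = policy.to_tier
            moved += 1
    return moved


today = date(2013, 10, 31)
extents = [
    Extent("Volume 1", "flash", date(2013, 9, 1)),    # 60 days old: moves
    Extent("Volume 1", "flash", date(2013, 10, 20)),  # 11 days old: stays
    Extent("Volume 2", "flash", date(2013, 9, 1)),    # other volume: stays
]
policy = Policy("Volume 1", "flash", "disk", timedelta(days=30))
moved = apply_policy(extents, policy, today)
print(moved, [ext.tier for ext in extents])
```

The decision here is driven by an explicit rule about volume and age rather than by observed access frequency, which is the practical difference from caching.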
Server Side Virtualization
A flash memory array can serve as high performance storage shared across many virtual machines residing in a large number of physical servers. The problem this solves is that a large number of virtual machines generate such a large number of random reads and random writes that they can overwhelm hard disk drives. Flash memory, with its ability to handle randomness very efficiently, is a good fit here.
Hyperscale
This is an Amazon (AMZN), Google (GOOG), or Facebook (FB) type of architecture. They use white box Intel-architecture servers and run inexpensive open source software and configuration tools which allow them to manage thousands of boxes and software in an automated manner. The Flash Memory Arrays start-up products are typically not used here; these customers usually use PCIe SSD cards, such as those provided by Fusion IO, LSI, Intel, and others.
Where Flash Memory Arrays are Not Successful at This Time
Tier 1 OLTP (this encompasses big databases)
Big File Serving (Multi-Petabyte scale)
Requirements for Flash Memory Arrays as Primary Storage
Highly available controllers
Non-disruptive firmware upgrade
Data Protection (snapshots, clones, consistency groups)
Data reduction (Compression, Deduplication)
Disaster Recovery (Asynchronous Communications)
Hot Site (Synchronous communications)
Tiered Storage Management
Systems Monitoring and Management
The basic benefit of Flash Memory is performance, and all of these start-ups provide excellent performance relative to hard disk drives. Amongst the flash memory array start-ups it is harder to determine who has the highest performance due to non-standard benchmarks being used by these vendors. But they are all fast, and Violin apparently has the reputation for performance based on their custom (and expensive) built flash memory modules.
But it takes more than performance to meet the needs of the commercial enterprise. Requirements for Flash Memory Arrays moving to the next level include items listed above.
However, the ability of these vendors to displace EMC, IBM, and HDS in Primary Storage is limited by the moats the incumbents are building against these start-ups.
As mentioned earlier, the Flash Memory Array vendors are successful in a specific set of use cases. The question from an investment perspective is "Which of these vendors have the ability to break out and expand their use cases?"
Nimble Storage possesses several elements needed to be successful. These include highly available controllers, non-disruptive firmware upgrades, data protection, data reduction software, disaster recovery, and other storage management software. Nimble markets the use of its products differently than the other vendors, emphasizing read only use. The golden copy of the data is held on disk, and can be reloaded if the read-only copy is lost or corrupted. While not as sexy as promoting performance, this illustrates very practical thinking.
The key distinguishing point is that Nimble Storage is one of the few flash memory array vendors to combine both flash memory and hard disk drives. From my experience, I think this demonstrates practical thinking and understanding of the storage business by Nimble Storage. Holding Nimble Storage back will be the lack of Tiered Storage Management software, which can use policies to move data from flash to disk as data ages. This is just one example of a simple policy. Nimble Storage uses caching, and it has been my experience that the algorithms used to pull data off of the slower media and place it in faster flash memory or main memory (DRAM) may become less effective in placing hot data in either flash memory or conventional DRAM as datasets get bigger. Figuring this out in field deployments is more art than science. The fact that Nimble Storage is not fully featured at this point in time should not be viewed as a negative; it has done quite a bit on high availability and storage management software. As a small company, I think it will need more time, money, and people to improve its capability over time.
Pure Storage also possesses several elements that can make it successful. It deployed active/active symmetric highly available controllers (a very high end form of controller high availability), non-disruptive firmware upgrades, and data protection in the early part of its product lifecycle. Pure Storage places substantial emphasis on deduplication and compression technologies to support its contention that flash memory is just as cost effective as disk. Its deployment of symmetric active/active highly available controllers shows that Pure Storage understands and is serious about high availability. High availability is a way of saying that the systems can recover quickly from failure and keep processing. Its CEO, Scott Dietzen, is very serious about storage management software, and this is key to breaking out and serving the needs of very large Fortune 500 customers. I see Pure Storage being successful at the department and divisional level. Limiting its market expansion is a lack of hard disk drive support for capacity, and of asynchronous communications for disaster recovery. I also disagree with its viewpoint that deduplication and compression make flash memory as cost effective as hard disk drives; hard disk drives use these data reduction technologies also.
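The data reduction argument can be checked with simple arithmetic. The sketch below uses the article's round numbers (about one dollar per GB for commodity flash, $100 for a 4 TB hard disk drive) and a 5:1 reduction ratio. That ratio is purely an illustrative assumption, not a figure from any vendor, and actual ratios vary with the data.

```python
def effective_cost_per_gb(raw_cost_per_gb: float, reduction_ratio: float) -> float:
    """Cost per GB of stored data after deduplication and compression."""
    return raw_cost_per_gb / reduction_ratio


FLASH_RAW = 1.00         # dollars per GB, commodity flash (article's round number)
HDD_RAW = 100 / 4_000    # dollars per GB, $100 for a 4 TB hard disk drive
ASSUMED_RATIO = 5.0      # hypothetical 5:1 data reduction, for illustration only

flash_reduced = effective_cost_per_gb(FLASH_RAW, ASSUMED_RATIO)
hdd_reduced = effective_cost_per_gb(HDD_RAW, ASSUMED_RATIO)

# Case 1: both media use the same data reduction.
gap_both_reduced = flash_reduced / hdd_reduced
# Case 2: only flash is reduced, disk stores data as-is.
gap_flash_only = flash_reduced / HDD_RAW

print(f"Flash, reduced: ${flash_reduced:.3f} per GB")
print(f"HDD, reduced  : ${hdd_reduced:.3f} per GB")
print(f"Gap when both are reduced : {gap_both_reduced:.0f}x")
print(f"Gap when only flash is    : {gap_flash_only:.0f}x")
```

Under these assumptions, applying the same ratio to both media leaves the gap unchanged at about 40 to 1, and even reduced flash compared against unreduced disk remains about 8 times more expensive per GB.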
The fact that Pure Storage is not fully featured at this point of time should not be viewed as a negative; they've done quite a bit of work on high availability and storage management software. As a small company, I think they will need more time, money, and people to improve their capability over time.
The third company in this list is Nimbus Data. I view this company as very practical from an expense control standpoint. Generally speaking, it is successful in the same areas as the other flash memory vendors. It uses off the shelf SSD cards rather than investing in, or taking the risk of building, its own SSDs, and it outsources its software development to India. It deployed dual controller failover, non-disruptive firmware upgrades, data protection, data reduction, and disaster recovery. This company will be successful doing blocking and tackling use cases, and because of its emphasis on managing expenses, it will be profitable and will address (and be competitive at) the lower end of the market. Sometimes good enough is, well, good enough.
From a product standpoint, Violin products are high performance and highly available, and they are successful in the same areas as the other Flash Memory Array vendors. Violin is in the early stages of deploying its storage software to complement its hardware. One element of this deployment is data protection based on Symantec Storage Foundation Suite. Another is the Maestro Memory Suite. Since these are relatively new retrofits on to the Violin hardware products, it would be wise to wait and listen for reports from actual users, marketing claims notwithstanding. The ability of Violin to break out and substantially expand its market presence will be substantially impacted by the success (or lack thereof) of these recently added features.
The Symantec Storage Foundation Suite was originally designed to work with disk drives, and has been qualified to work with the Violin storage controllers. The Maestro Memory Suite software is provided by attaching external hardware appliances to the Violin Memory Arrays. While Violin marketing has compared Maestro to EMC FAST (a true Tiered Storage Management package), Maestro is really a caching software package and is not directly comparable to EMC FAST in terms of function. Caching algorithms, which are designed to bring frequently used data into flash memory from other media (such as hard disk drives), are typically successful in smaller installations, and are usually used for file sharing as opposed to the dedicated Tier 1 OLTP capabilities of an EMC Symmetrix, HDS VSP, or IBM DS 8000. Because of these variables, it would be wise to hear from actual customers as to the effectiveness of these software packages.
Maestro has also been positioned as a way to deploy flash memory systems into legacy hard disk environments. However, there are typically issues when mixing different vendors' equipment, and the field level systems engineering work to do this may be considered an issue by customers. In addition, large customers typically (but not always) look for a complete solution from their vendors rather than mix and match.
Note also that adding external hardware appliances (Maestro) adds more devices that increase rack space and consume more power and cooling. While not a big issue when supporting a small installation (workgroup, department, division, or small business), it becomes a big issue in high-end data centers, which purchase 50 or 100 flash memory arrays and would have to add many more physical devices. At large scale, this contradicts the 15-year trend of consolidating into fewer devices, not more. More devices drive up systems monitoring and management operating expenses.
In addition, competitors offer similar functionalities designed into their systems in the beginning rather than retrofitting capability.
In both cases (Symantec and Maestro), it would be wise to wait for published use cases to evaluate the success of these retrofitted software capabilities. Care should be exercised when reviewing published use cases though; often small customers with non-mission critical requirements are represented as high-end data use cases by vendors.
In another initiative, Violin has committed to a PCIe SSD product, the Velocity product line. This is the same market space as Fusion IO, which is forecasting declining margins. This area of the market is being entered by major commodity manufacturers: LSI (LSI), Intel, Samsung (OTC:SSNLF), Seagate, and Western Digital, to name just a few.
The hyperscale market's requirements are more about low cost, less capable products, and represent a challenging opportunity. There is no clear data that Violin can take substantial market share in a market that is rapidly approaching commoditization by big manufacturers with tremendous economies of scale. This is another case of waiting for clear results in what appears to be a risky endeavor.
In conclusion, especially in the context of the recent underwhelming IPO, Violin may have some difficulty expanding its revenues in this difficult competitive environment. I would watch for clear positive financial results rather than making a speculative commitment of funds to this stock.
Current revenue in the flash memory space comes from VDI, server side virtualization, analytics, and SMB (Small/Medium business) databases. The key investment issue going forward is "Which vendors will be able to expand their share of this market, as well as expand into new revenue opportunities?"
New opportunities can be characterized as "moving up the food chain." Examples include:
1. Primary Storage which includes Tier 1 OLTP
2. Large File Serving use cases
3. Corporate wide use cases with both Hot Sites and Disaster Recovery sites processing millions of dollars of revenue.