Sometimes, long after I listen to a vendor’s discussion, I come away wondering why they do what they do. Oftentimes that feeling passes, but after a recent session with Pure Storage at SFD10, it lingered.
Why engineer storage hardware?
In the last week or so, executives at Hitachi mentioned that they plan to reduce hardware R&D activities for their high end storage. There was much confusion about what it all meant, but from what I hear, they are ahead now, and maybe it makes more sense to do less hardware and more software for their next generation high end storage. We have talked about hardware vs. software innovation a lot (see recent post: TPU and hardware vs. software innovation [round 3]).
In contrast, Pure Storage just released their FlashBlade system: a home grown, server-flash storage blade integrating raw NAND with their own flash controller logic, in a blade server card form factor inside a blade chassis they designed. That’s a lot of new and unique hardware. They use SDN to connect the blades and FlashBlade chassis together into one scale-out configuration. And all this just for file, unstructured data?
There’s a market for unstructured flash storage
In the SFD10 session, Par Botes, VP Products at Pure, talked about an emerging market that incorporates rich media, data analytics and tech computing. Sounds like HPC/big data customers. He mentioned log analysis, genomics, reservoir simulations, financial models, risk models, etc., and then talked about technical computing with software and hardware development, design verification, deep learning and simulation compute farms. Ok, the market for unstructured data exists and is, if anything, getting larger.
So why all the new hardware?
Here are some of the tradeoffs they made during development of FlashBlade:
- How much compute per flash storage – early on they had a server/flash mini-card-deck that fit in a drive bay, which they could tweak with more or less compute or flash. It turned out that the SSD Flash Translation Layer (FTL) has become a bottleneck in most SSDs, reducing IOPS/throughput. By moving the FTL to software and hardware running on the server, they could parallelize this activity. But knowing how much compute, hardware and software to use for a given amount of flash was not a given, and the feeling was that all of this had to scale with the amount of flash.
- How much NVRAM for concurrent write IO – it turns out that NVRAM can also be a bottleneck for system write IO activity. You can put more NVRAM into a system, but eventually it needs to be destaged, and increasing the number of concurrent write streams always requires more NVRAM. So, NVRAM was another resource that they wanted to scale with the amount of flash storage.
- How much metadata activity was needed for large file systems – given the need to support millions of files open at the same time, metadata access was sure to become a critical path for IO throughput. Metadata is both compute and IO intensive. So, being able to scale metadata IO with the storage became another requirement.
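To make the first tradeoff concrete, here is a minimal sketch of what a page-mapped flash translation layer does, and why hosting it on the server lets it be parallelized and scaled with the flash. This is an illustration of the general technique only; `PageMappedFTL` and its structures are hypothetical and not Pure’s actual implementation.

```python
# Minimal sketch of a page-mapped flash translation layer (FTL).
# Illustrative only, not Pure's design. Flash pages cannot be
# overwritten in place, so every write goes to a fresh physical page
# and the logical-to-physical map is updated; the old page becomes
# garbage awaiting collection. This map maintenance is the compute
# that, on the server, can be sharded and parallelized per flash unit.

class PageMappedFTL:
    def __init__(self, num_pages):
        self.l2p = {}                       # logical page -> physical page
        self.free = list(range(num_pages))  # free physical pages
        self.stale = set()                  # pages holding superseded data

    def write(self, logical_page, data, flash):
        phys = self.free.pop(0)             # out-of-place write: new page
        flash[phys] = data
        old = self.l2p.get(logical_page)
        if old is not None:
            self.stale.add(old)             # old copy is now garbage
        self.l2p[logical_page] = phys

    def read(self, logical_page, flash):
        return flash[self.l2p[logical_page]]

# usage: a rewrite lands on a new physical page, leaving one stale page
flash = {}
ftl = PageMappedFTL(num_pages=8)
ftl.write(0, b"v1", flash)
ftl.write(0, b"v2", flash)
assert ftl.read(0, flash) == b"v2"
assert len(ftl.stale) == 1
```

In an SSD, one embedded controller runs this logic for all the NAND behind it; moving it to server software means the mapping work can grow in proportion to the flash it manages.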
They made use of FPGAs for their hardware design, making it flexible, amenable to change, and quicker to develop.
For metadata, they partitioned the namespace across a gaggle of controller blades, so that a portion of the servers had control over a portion of the metadata. Metadata state was maintained in NVRAM and flash. Each metadata partition manager had global access to its reserved portion of NVRAM throughout the FlashBlades in a chassis. Partition management was distributed across multiple server blades for high availability.
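The general shape of this kind of namespace partitioning can be sketched as follows. The partition count, the `OWNERS` map, and the primary/standby layout are all hypothetical assumptions for illustration, not Pure’s actual scheme.

```python
# Hedged sketch of hashing a file namespace across metadata partitions,
# each owned by a primary blade with a standby for availability.
# Illustrative of the general technique only, not Pure's implementation.
import hashlib

NUM_PARTITIONS = 16   # hypothetical fixed partition count
NUM_BLADES = 4        # hypothetical blade count

# hypothetical ownership map: partition -> (primary blade, standby blade)
OWNERS = {p: (p % NUM_BLADES, (p + 1) % NUM_BLADES)
          for p in range(NUM_PARTITIONS)}

def partition_for(path: str) -> int:
    """Deterministically map a file path to a metadata partition."""
    digest = hashlib.sha1(path.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_PARTITIONS

def blade_for(path: str, primary_up: bool = True) -> int:
    """Route a metadata request to the blade owning its partition."""
    primary, standby = OWNERS[partition_for(path)]
    return primary if primary_up else standby
```

The point of the sketch is that every client can compute the owner of any path independently, so metadata traffic spreads across all blades without a central directory, and losing a blade only fails over the partitions it owned.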
Does compute scalability make sense?
All this compute scalability provides much more concurrent access to storage. This was evident as Rob Lee, Chief Architect, demonstrated in a slide showing how they were able to triple the number of concurrent builds (for a major car company’s software) while cutting build time by a factor of 6 or more.
Compute scalability will become even more important when storage class memories (SCM, 3D XPoint, 3DX) start coming out (Intel says by the end of 2016), which deliver even more IOPS at a faster clip. With such speed, storage controller metadata compute requirements will go up significantly to keep up, not down. By having separate storage controller servers attached to each storage element, Pure is well positioned for when the time comes to implement SCM. 3DX will no doubt require new hardware, with new control logic, maybe a different FTL and probably different ECC as well, but all of this is currently in FPGAs and can be changed relatively easily. So what I see here is one company positioning themselves to take maximum advantage of the next big challenge in storage technology: supporting SCM/3DX as storage.
Some other commentary on Pure Storage SFD10 sessions:
- Pure Storage really aren’t a one-trick pony by @PenguinPunk (Dan Firth)
- SFD10 Preview: Pure Storage by @ChrisMEvans
- Flash needs a highway by @NetworkingNerd (Tom Hollingsworth)