QoM1610: Will NVMe over Fabric GA in enterprise AFA by Oct’2017

NVMeNVMe over fabric (NVMeoF) was a hot topic at Flash Memory Summit last August. Facebook and others were showing off their JBOF (see my Facebook moving to JBOF post) but there were plenty of other NVMeoF offerings at the show.

NVMeoF hardware availability

When Brocade announced their Gen6 Switches they made a point of saying that both their Gen5 and Gen6 switches currently support NVMeoF protocols. In addition to Brocade’s support, in Dec 2015 Qlogic announced support for NVMeoF for select HBAs. Also, as of  July 2016, Emulex announced support for NVMeoF in their HBAs.

From an Ethernet perspective, Qlogic has a NVMe Direct NIC which supports NVMe protocol offload for iSCSI. But even without NVMe Direct, Ethernet 40GbE & 100GbE with  iWARP, RoCEv1-v2, iSCSI SER, or iSCSI RDMA all could readily support NVMeoF on Ethernet. The nice thing about NVMeoF for Ethernet is not only do you get support for iSCSI & FCoE, but CIFS/SMB and NFS as well.

InfiniBand and Omni-Path Architecture already support native RDMA, so they should already support NVMeoF.

So hardware/firmware is already available for any enterprise AFA customer to want NVMeoF for their data center storage.

NVMeoF Software

Intel claims that ~90% of the software driver functionality of NVMe is the same for NVMeoF. The primary differences between the two seem to be the NVMeoY discovery and queueing mechanisms.

There are two fabric methods that can be used to implement NVMeoF data and command transfers: capsule mode where NVMe commands and data are encapsulated in normal fabric packets or fabric dependent mode where drivers make use of native fabric memory transfer mechanisms (RDMA, …) to transfer commands and data.

12679485_245179519150700_14553389_nA (Linux) host driver for NVMeoF is currently available from Seagate. And as a result, support for NVMeoF for Linux is currently under development, and  not far from release in the next Kernel (I think). (Mellanox has a tutorial on how to compile a Linux kernel with NVMeoF driver support).

With Linux coming out, Microsoft Windows and VMware can’t be far behind. However, I could find nothing online, aside from base NVMe support, for either platform.

NVMeoF target support is another matter but with NICs/HBAs & switch hardware/firmware and drivers presently available, proprietary storage system target drivers are just a matter of time.

Boot support is a major concern. I could find no information on BIOS support for booting off of a NVMeoF AFA. Arguably, one may not need boot support for NVMeoF AFAs as they are probably not a viable target for storing App code or OS software.

From what I could tell, normal fabric multi-pathing support should work fine with NVMeoF. This should allow for HA NVMeoF storage, a critical requirement for enterprise AFA storage systems these days.

NVMeoF advantages/disadvantages

Chelsio and others have shown that NVMeoF adds ~8μsec of additional overhead beyond native NVMe SSDs, which if true would warrant implementation on all NVMe AFAs. This may or may not impact max IOPS depending on scale-ability of NVMeoF.

For instance, servers (PCIe bus hardware) typically limit the number of private NVMe SSDs to 255 or less. With an NVMeoF, one could potentially have 1000s of shared NVMe SSDs accessible to a single server. With this scale, one could have a single server attached to a scale-out NVMeoF AFA (cluster) that could supply ~4X the IOPS that a single server could perform using private NVMe storage.

Base level NVMe SSD support and protocol stacks are starting to be available for most flash vendors and operating systems such as, Linux, FreeBSD, VMware, Windows, and Solaris. If Intel’s claim of 90% common software between NVMe and NVMeoF drivers is true, then it should be a relatively easy development project to provide host NVMeoF drivers.

The need for special Ethernet hardware that supports RDMA may delay some storage vendors from implementing NVMeoF AFAs quickly. The lack of BIOS boot support may be a minor irritant in comparison.

NVMeoF forecast

AFA storage systems, as far as I can tell, are all about selling high IOPS and very-low latency IOs. It would seem that NVMeoF would offer early adopter AFA storage vendors a significant performance advantage over slower paced competition.

In previous QoM/QoW posts we have established that there are about 13 new enterprise storage systems that come out each year. Probably 80% of these will be AFA, given the current market environment.

Of the 10.4 AFA systems coming out over the next year, ~20% of these systems pride themselves on being the lowest latency solutions in the market, and thus command high margins. One would think these systems would be the first to adopt NVMeoF. But, most of these systems have their own, proprietary flash modules and do not use standard (NVMe) SSDs and can use their own proprietary interface to their proprietary flash storage. This will delay any implementation for them until they can convert their flash storage to NVMe which may take some time.

On the other hand, most (70%) of the other AFA systems, that currently use SAS/SATA SSDs, could boost their IOP counts and drastically reduce their IO  response times, by implementing NVMe SSDs and NVMeoF. But converting SAS/SATA backends to NVMe will take time and effort.

But, there are a select few (~10%) of AFA systems, that already use NVMe SSDs in their AFAs, and for these few, they would seem to have a fast track towards implementing NVMeoF. The fact that NVMeoF is supported over all fabrics and all storage interface protocols make it even easier.

Moreover, NVMeoF has been under discussion since the summer of 2015, which tells me that astute AFA vendors have already had 18+ months to develop it. With NVMeoF host drivers & hardware available since Dec. 2015, means hardware and software exist to test and validate against.

I believe that NVMeoF will be GA’d within the next 12 months by at least one enterprise AFA system. So my QoM1610 forecast for NVMeoF is YES, with a 0.83 probability.





Who’s the next winner in data storage?

Strange Clouds by michaelroper (cc) (from Flickr)
Strange Clouds by michaelroper (cc) (from Flickr)

“The future is already here – just not evenly distributed”, W. Gibson

It starts as it always does outside the enterprise data center. In the line of businesses, in the development teams, in the small business organizations that don’t know any better but still have an unquenchable need for data storage.

It’s essentially an Innovator’s Dillemma situation. The upstarts are coming into the market at the lower end, lower margin side of the business that the major vendors don’t seem to care about, don’t service very well and are ignoring to their peril.

Yes, it doesn’t offer all the data services that the big guns (EMC, Dell, HDS, IBM, and NetApp) have. It doesn’t offer the data availability and reliability that enterprise data centers have come to demand from their storage. require. And it doesn’t have the performance of major enterprise data storage systems.

But what it does offer, is lower CapEx, unlimited scaleability, and much easier to manage and adopt data storage, albeit using a new protocol. It does have some inherent, hard to get around problems not the least of which is speed of data ingest/egress, highly variable latency and eventual consistency. There are other problems which are more easily solvable, with work, but the three listed above are intrinsic to the solution and need to be dealt with systematically.

And the winner is …

It has to be cloud storage providers and the big elephant in the room has to be Amazon. I know there’s a lot of hype surrounding AWS S3 and EC2 but you must admit that they are growing, doubling year over year. Yes it is starting from a much lower capacity point and yes, they are essentially providing “rentable” data storage space with limited or even non-existant storage services. But they are opening up whole new ways to consume storage that never existed before. And therein lies their advantage and threat to the major storage players today, unless they act to counter this upstart.

On AWS’s EC2 website there must be 4 dozen different applications that can be fired up in the matter of a click or two. When I checked out S3 you only need to signup and identify a bucket name to start depositing data (files, objects). After that, you are charged for the storage used, data transfer out (data in is free), and the number of HTTP GETs, PUTs, and other requests that are done on a per month basis. The first 5GB is free and comes with a judicious amount of gets, puts, and out data transfer bandwidth.

… but how can they attack the enterprise?

Aside from the three systemic weaknesses identified above, for enterprise customers they seem to lack enterprise security, advanced data services and high availability storage. Yes, NetApp’s Amazon Direct addresses some of the issues by placing enterprise owned, secured and highly available storage to be accessed by EC2 applications. But to really take over and make a dent in enterprise storage sales, Amazon needs something with enterprise class data services, availability and security with an on premises storage gateway that uses and consumes cloud storage, i.e., a cloud storage gateway. That way they can meet or exceed enterprise latency and services requirements at something that approximates S3 storage costs.

We have talked about cloud storage gateways before but none offer this level of storage service. An enterprise class S3 gateway would need to support all storage protocols, especially block (FC, FCoE, & iSCSI) and file (NFS & CIFS/SMB). It would need enterprise data services, such as read-writeable snapshots, thin provisioning, data deduplication/compression, and data mirroring/replication (synch and asynch). It would need to support standard management configuration capabilities, like VMware vCenter, Microsoft System Center, and SMI-S. It would need to mask the inherent variable latency of cloud storage through memory, SSD and hard disk data caching/tiering.. It would need to conceal the eventual consistency nature of cloud storage (see link above). And it would need to provide iron-clad, data security for cloud storage.

It would also need to be enterprise hardened, highly available and highly reliable. That means dually redundant, highly serviceable hardware FRUs, concurrent code load, multiple controllers with multiple, independent, high speed links to the internet. Todays, highly-available data storage requires multi-path storage networks, multiple-independent power sources and resilient cooling so adding multiple-independent, high-speed internet links to use Amazon S3 in the enterprise is not out of the question. In addition to the highly available and serviceable storage gateway capabilities described above it would need to supply high data integrity and reliability.

Who could build such a gateway?

I would say any of the major and some of the minor data storage players could easily do an S3 gateway if they desired. There are a couple of gateway startups (see link above) that have made a stab at it but none have it quite down pat or to the extent needed by the enterprise.

However, the problem with standalone gateways from other, non-Amazon vendors is that they could easily support other cloud storage platforms and most do. This is great for gateway suppliers but bad for Amazon’s market share.

So, I believe Amazon has to invest in it’s own storage gateway if they want to go after the enterprise. Of course, when they create an enterprise cloud storage gateway they will piss off all the other gateway providers and will signal their intention to target the enterprise storage market.

So who is the next winner in data storage – I have to believe its going to be and already is Amazon. Even if they don’t go after the enterprise which I feel is the major prize, they have already carved out an unbreachable market share in a new way to implement and use storage. But when (not if) they go after the enterprise, they will threaten every major storage player.

Yes but what about others?

Arguably, Microsoft Azure is in a better position than Amazon to go after the enterprise. Since their acquisition of StorSimple last year, they already have a gateway that with help, could be just what they need to provide enterprise class storage services using Azure. And they already have access to the enterprise, already have the services, distribution and goto market capabilities that addresses enterprise needs and requirements. Maybe they have it all but they are not yet at the scale of Amazon. Could they go after this – certainly, but will they?

Google is the other major unknown. They certainly have the capability to go after enterprise cloud storage if they want. They already have Google Cloud Storage, which is priced under Amazon’s S3 and provides similar services as far as I can tell. But they have even farther to go to get to the scale of Amazon. And they have less of the marketing, selling and service capabilities that are required to be an enterprise player. So I think they are the least likely of the big three cloud providers to be successful here.

There are many other players in cloud services that could make a play for enterprise cloud storage and emerge out of the pack, namely Rackspace, Savvis, Terremark and others. I suppose DropBox, Box and the other file sharing/collaboration providers might also be able to take a shot at it, if they wanted. But I am not sure any of them have enterprise storage on their radar just yet.

And I wouldn’t leave out the current major storage, networking and server players as they all could potentially go after enterprise cloud storage if they wanted to. And some are partly there already.



Enhanced by Zemanta