There’s a new cluster filesystem on the block, Elastifile

At SFD12 last month we talked with the team from Elastifile. They are a new startup out of Israel working on a better cluster file system.

Elastifile was designed to support 1000s of nodes, 100,000 of users/client and 1000s of data containers (file systems/mount points), together with an infinite (64 bit) number of files and directories and up to Exabytes (10**18) in capacity. They also offer a 100% SSD file store capability. I encourage you to view the videos of their presentations at SFD12 to learn more.

Elastifile features

Elastifile supports data compression and optionally deduplication with NAND/Flash (e. g., low-/high-endurance) storage tiering, cloud storage tiering and multi-site storage. They also provide NFSv3/v4, SMB, AWS S3 and HDFS as native access protocols for their file storage.

They also offer non-disruptive hardware/software upgrades, n-way (2- or 3-way) data and metadata redundancy, self-healing capabilities, snapshots, and synchronous/asynchronous data replication or mirroring. Further, they provide multi-tenancy and QoS support.

Elastifile can be used in hyper converged mode as well as a dedicated storage server mode. For backend storage, they support heterogeneous, physical (block, I think?) storage systems as well as direct access storage in cluster nodes

Internals matter

Elastifile’s architecture supports accessor, owner and data nodes. But these can all be colocated on the same server or segregated across different servers.

Owner nodes, own all the metadata objects for a file or directory and caches the metadata working set in i’s memory. Ownership file or directory metadata may change in the case of hardware failures.

Elastifile supports a dynamic write data path, which means they determine, in real time, where to write file data rather than having the data locations identified before hand. They call this distributed write anywhere semantics.

Notably they don’t do data caching (with NVMe it doesn’t make sense) however, as noted above, they do use metadata caching

Internally, Elastifile uses variable length objects for both file data and metadata.

  • File data is composed of three object types: a file metadata (FileMD) object, mapping data objects, and file data objects. FileMD’s hold the normal file metadata (name, file size, create, access & modify ToDs, etc.) as well as pointing to all the Mapping Object (OIDs). Mapping objects exist for each 0.5MB of file data and consist of a 128 element table, each element mapping 4KB of file address space to a data object (OID). Each  data object holds the 4KB of compressed file data and journal log entries.
  • Director metadata is composed of directory metadata (DirMD) object and Directory listing objects. Directory listing objects maps file/directory names to FileMD or DirMD OIDs. Directory listing objects are accessed via an extensible hash table and contain a list of filenames/directory names within the directory

The Elastifile software architecture consists of three layers:

  • A protocol layer which terminates file system access protocols and translates requests into internal requests. The hashing and data compression of file data occur at this level.
  • A metadata layer which provides file system/directory name mapping to objects for owned files/directories and maintains file/directory metadata updates/journals/checkpoints.
  • A data layer which provides transaction consistency and a n-way redundant persistent data storage for (file or metadata) objects.

Metadata operations are persisted via journaled transactions and which are distributed across the cluster. For instance the journal entries for a mapping data object updates are written to the same file data object (OID) as the actual file data, the 4KB compressed data object.

There’s plenty of discussion on how they manage consistency for their metadata across cluster nodes. Elastifile invented and use Bizur, a key-value consensus based DB. Their chief architect Ezra Hoch (@EzraHoch) did a blog post and paper on Bizur for more information


New file systems generally take many years to mature and get out into the market, cluster file systems even longer. Elastifile started in 2013, by some very smart engineers, is already on the market, just 4 years later. That’s impressive enough, but with their list of advanced functionality plus cloud storage tiering and multi-site operations all shipping in the current product is mind-blowing.

One lingering question is, does a market exist for another cluster file system? All flash is interesting but most of the current CFS’s do this and ship this today. Cloud storage tiering is interesting and a long term need but some CFSs already have this and others are no doubt implementing it as we speak. CFS’s use of objects for internal data and metadata management is not new and may make internals cleaner but don’t really provide a lot of customer benefit.

Exascale raw capacity, support for 100K users, 1000s of nodes, 1000s of file systems and an infinite # of files/directories is interesting. But most CFSs claim this level of support already, although this is more aspirational for some. And proving support at this scale is difficult, if not impossible.

On the other hand, Bizur is really neat. Its primary benefit is during recovery from hardware failures. For a CFS with 1000s of nodes, failures likely occur quite often. So Bizur’s advantage here may pay significant customer dividends.

Is that enough to to market a new CFS?

To see what other SFD12 bloggers have written on Elastifile, please see:

Hardware vs. software innovation – round 4

We, the industry and I, have had a long running debate on whether hardware innovation still makes sense anymore (see my Hardware vs. software innovation – rounds 1, 2, & 3 posts).

The news within the last week or so is that Dell-EMC cancelled their multi-million$, DSSD project, which was a new hardware innovation intensive, Tier 0 flash storage solution, offering 10 million of IO/sec at 100µsec response times to a rack of servers.

DSSD required specialized hardware and software in the client or host server, specialized cabling between the client and the DSSD storage device and specialized hardware and flash storage in the storage device.

What ultimately did DSSD in, was the emergence of NVMe protocols, NVMe SSDs and RoCE (RDMA over Converged Ethernet) NICs.

Last weeks post on Excelero (see my 4.5M IO/sec@227µsec … post) was just one example of what can be done with such “commodity” hardware. We just finished a GreyBeardsOnStorage podcast (GreyBeards podcast with Zivan Ori, CEO & Co-founder, E8 storage) with E8 Storage which is yet another approach to using NVMe-RoCE “commodity” hardware and providing amazing performance.

Both Excelero and E8 Storage offer over 4 million IO/sec with ~120 to ~230µsec response times to multiple racks of servers. All this with off the shelf, commodity hardware and lots of software magic.

Lessons for future hardware innovation

What can be learned from the DSSD to NVMe(SSDs & protocol)-RoCE technological transition for future hardware innovation:

  1. Closely track all commodity hardware innovations, especially ones that offer similar functionality and/or performance to what you are doing with your hardware.
  2. Intensely focus any specialized hardware innovation to a small subset of functionality that gives you the most bang, most benefits at minimum cost and avoid unnecessary changes to other hardware.
  3. Speedup hardware design-validation-prototype-production cycle as much as possible to get your solution to the market faster and try to outrun and get ahead of commodity hardware innovation for as long as possible.
  4. When (and not if) commodity hardware innovation emerges that provides  similar functionality/performance, abandon your hardware approach as quick as possible and adopt commodity hardware.

Of all the above, I believe the main problem is hardware innovation cycle times. Yes, hardware innovation costs too much (not discussed above) but I believe that these costs are a concern only if the product doesn’t succeed in the market.

When a storage (or any systems) company can startup and in 18-24 months produce a competitive product with only software development and aggressive hardware sourcing/validation/testing, having specialized hardware innovation that takes 18 months to start and another 1-2 years to get to GA ready is way too long.

What’s the solution?

I think FPGA’s have to be a part of any solution to making hardware innovation faster. With FPGA’s hardware innovation can occur in days weeks rather than months to years. Yes ASICs cost much less but cycle time is THE problem from my perspective.

I’d like to think that ASIC development cycle times of design, validation, prototype and production could also be reduced. But I don’t see how. Maybe AI can help to reduce time for design-validation. But independent FABs can only speed the prototype and production phases for new ASICs, so much.

ASIC failures also happen on a regular basis. There’s got to be a way to more quickly fix ASIC and other hardware errors. Yes some hardware fixes can be done in software but occasionally the fix requires hardware changes. A quicker hardware fix approach should help.

Finally, there must be an expectation that commodity hardware will catch up eventually, especially if the market is large enough. So an eventual changeover to commodity hardware should be baked in, from the start.


In the end, project failures like this happen. Hardware innovation needs to learn from them and move on. I commend Dell-EMC for making the hard decision to kill the project.

There will be a next time for specialized hardware innovation and it will be better. There are just too many problems that remain in the storage (and systems) industry and a select few of these can only be solved with specialized hardware.


Picture credit(s): Gravestones by Sherry NelsonMotherboard 1 by Gareth Palidwor; Copy of a DSSD slide photo taken from EMC presentation by Author (c) Dell-EMC

4.5M IO/sec@227µsec 4KB Read on 100GBE with 24 NVMe cards #SFD12

At Storage Field Day 12 (SFD12) this week we talked with Excelero, which is a startup out of Israel. They support a software defined block storage for Linux.

Excelero depends on NVMe SSDs in servers (hyper converged or as a storage system), 100GBE and RDMA NICs. (At the time I wrote this post, videos from the presentation were not available, but the TFD team assures me they will be up on their website soon).

I know, yet another software defined storage startup.

Well yesterday they demoed a single storage system that generated 2.5 M IO/sec random 4KB random writes or 4.5 M IO/Sec random 4KB reads. I didn’t record the random write average response time but it was less than 350µsec and the random read average response time was 227µsec. They only did these 30 second test runs a couple of times, but the IO performance was staggering.

But they used lots of hardware, right?

No. The target storage system used during their demo consisted of:

  • 1-Supermicro 2028U-TN24RT+, a 2U dual socket server with up to 24 NVMe 2.5″ drive slots;
  • 2-2 x 100Gbs Mellanox ConnectX-5 100Gbs Ethernet (R[DMA]-NICs); and
  • 24-Intel 2.5″ 400GB NVMe SSDs.

They also had a Dell Z9100-ON Switch  supporting 32 X 100Gbs QSFP28 ports and I think they were using 4 hosts but all this was not part of the storage target system.

I don’t recall the CPU processor used on the target but it was a relatively lowend, cheap ($300 or so) dual core, Intel standard CPU. I think they said the total target hardware cost $13K or so.

I priced out an equivalent system. 24 400GB 2.5″ NVMe Intel 750 SSDs would cost around $7.8K (Newegg); the 2 Mellanox ConnectX-5 cards $4K (Neutron USA); and the SuperMicro plus an Intel Cpu around $1.5K. So the total system is close to the ~$13K.

But it burned out the target CPU, didn’t it?

During the 4.5M IO/sec random read benchmark, the storage target CPU was at 0.3% busy and the highest consuming process on the target CPU was the Linux “Top” command used to display the PS status.

Excelero claims that the storage target system consumes absolutely no CPU processing to service an 4K read or write IO request. All of IO processing is done by hardware (the R(DMA)-NICs, the NVMe drives and PCIe bus) which bypasses the storage target CPU altogether.

We didn’t look at the host cpu utilization but driving 4.5M IO/sec would take a high level of CPU power even if their client software did most of this via RDMA messaging magic.

How is this possible?

Their client software running in the Linux host is roughly equivalent to an iSCSI initiator but talks a special RDMA protocol (patent pending by Excelero, RDDA protocol) that adds an IO request to the NVMe device submission queue and then rings the doorbell on the target system device and the SSD then takes it off the queue and executes it. In addition to the submission queue IO request they preprogram the PCIe MSI interrupt request message to somehow program (?) the target system R-NIC to send the read data/write status data back to the client host.

So there’s really no target CPU processing for any NVMe message handling or interrupt processing, it’s all done by the client SW and is handled between the NVMe drive and the target and client R-NICs.

The result is that the data is sent back to the requesting host automatically from the drive to the target R-NIC over the target’s PCIe bus and then from the target system to the client system via RDMA across 100GBE and the R-NICS and then from the client R-NIC to the client IO memory data buffer over the client’s PCIe bus.

Writes are a bit simpler as the 4KB write data can be encapsulated into the submission queue command for the write operation that’s sent to the NVMe device and the write IO status is relatively small amount of data that needs to be sent back to the client.

NVMe optimized for 4KB IO

Of course the NVMe protocol is set up to transfer up to 4KB of data with a (write command) submission queue element. And the PCIe MSI interrupt return message can be programmed to (I think) write a command in the R-NIC to cause the data transfer back for a read command directly into the client’s memory using RDMA with no CPU activity whatsoever in either operation. As long as your IO request is less than 4KB, this all works fine.

There is some minor CPU processing on the target to configure a LUN and set up the client to target connection. They essentially only support replicated RAID 10 protection across the NVMe SSDs.

They also showed another demo which used the same drive both across the 100Gbs Ethernet network and in local mode or direct as a local NVMe storage. The response times shown for both local and remote were within  5µsec of each other. This means that the overhead for going over the Ethernet link rather than going local cost you an additional 5µsec of response time.

Disaggregated vs. aggregated configuration

In addition to their standalone (disaggregated) storage target solution they also showed an (aggregated) Linux based, hyper converged client-target configuration with a smaller number of NVMe drives in them. This could be used in configurations where VMs operated and both client and target Excelero software was running on the same hardware.

Simply amazing

The product has no advanced data services. no high availability, snapshots, erasure coding, dedupe, compression replication, thin provisioning, etc. advanced data services are all lacking. But if I can clone a LUN at lets say 2.5M IO/sec I can get by with no snapshotting. And with hardware that’s this cheap I’m not sure I care about thin provisioning, dedupe and compression.  Remote site replication is never going to happen at these speeds. Ok HA is an important consideration but I think they can make that happen and they do support RAID 10 (data mirroring) so data mirroring is there for an NVMe device failure.

But if you want 4.5M 4K random reads or 2.5M 4K random writes on <$15K of hardware and happen to be running Linux, I think they have a solution for you. They showed some volume provisioning software but I was too overwhelmed trying to make sense of their performance to notice.

Yes it really screams for 4KB IO. But that covers a lot of IO activity these days. And if you can do Millions of them a second splitting up bigger IOs into 4K should not be a problem.

As far as I could tell they are selling Excelero software as a standalone product and offering it to OEMs. They already have a few customers using Excelero’s standalone software and will be announcing  OEMs soon.

I really want one for my Mac office environment, although what I’d do with a millions of IO/sec is another question.


Intel’s Optane (3D Xpoint) SSD specs in the wild

Read an article the other day in Ars Technica (Specs for 1st Intel 3DX SSD…) about a preview of the Intel Octane specs for their 375GB 3D Xpoint (3DX) flash card. The device is NVMe compliant, PCIe Gen3 add in card, that’s in a half height, half length, low profile form factor.

Intel’s Optane SSD vs. the competition

A couple of items from the Intel Optane spec sheet of interest to me as a storage guru:

  • 30 Drive writes per day/12.3 PBW (written) – 3DX, at launch, had advertised that it would have 1000 times the endurance of (2D-MLC?) NAND. Current flash cards (see Samsung SSD PRO NVMe 256GB Flash card specs) offer about 200TBW (for 256GB card) or 400TBW (for 512GB card). The Samsung PRO is based on 3D (V-)NAND, so its endurance is much better than  2D-MLC at these densities. That being said, the Octane drive is still ~40X the write endurance of the PRO 950. Not quite 1000 but certainly significantly better.
  • Sequential (bandwidth) performance (R/W) of 2400/2000 MB/sec – 3DX advertised 1000 times the performance of (2D-MLC,  non-NVMe?) NAND. Current 3D (V-)NAND cards (see Samsung SSD PRO above) above offers (R/W) 2200/900 MB/sec for an NVMe device. The Optane’s read bandwidth is a slight improvement but the write bandwidth is a 2.2X improvement over current competitive devices.
  • Random 4KB IOPs performance (R/W) of 550K/500K – Similar to the previous bulleted item, 3DX advertised 1000 times the performance of (2D-MLC,  non-NVMe?) NAND. Current 3D (V-)NAND cards like the Samsung SSD PRO offer Random 4KB IOPs performance  (R/W) of 270K/85K IOPS (@4 threads). Optane’s read random 4KB IOPs performance is 2X the PRO 950 but its write performance is ~5.9X better.
  • IO latency of <10 µsec. – 3DX advertised 10X better latency than the current (2D-MLC, non-NVMe) flash drives. According to storage review (Samsung 950 Pro M.2), the Samsung PRO 950 had a latency of ~22 µsec. Optane has at least 2X better latency than the current competition.
  • Density 375GB/HH-HL-LP – 3DX advertised 1000X the density of (then current DRAM). Today Micron offers a 4GiB DDR4/288 pin DIMM which is probably 1/2 the size of the HH flash drive. So maybe in the same space this could be 8GiB. This says that the Optane is about 100X denser than today’s DRAM.

Please note, when 3DX was launched, ~2 years ago, the then current NAND technology was 2D-MLC and NVMe was just a dream. So comparing launch claims against today’s current 3D-NAND, NVMe drives is not a fair comparison.

Nevertheless, the Optane SSD performs considerably better than current competitive NVMe drives and has significantly better endurance than current 3D (V-)NAND flash drives. All of which is a great step in the right direction.

What about DRAM replacement?

At launch, 3DX was also touted as a higher density, potential replacement for DRAM. But so far we haven’t seen any specs for what 3DX NVM looks like on a memory bus. It has much better density than DRAM, but we would need to see 3DX memory access times under 50ns to have a future as a DRAM replacement. Optane’s NVMe SSD at 10 µsec. is about 200X too slow, but then again it’s not a memory device configuration nor is it attached to a memory bus.


Photo Credit(s):  Intel Optane Spec sheet from Ars Technica Article,  DDR4 DRAM from Wikimedia user:Dsimic

QoM1610: Will NVMe over Fabric GA in enterprise AFA by Oct’2017

NVMeNVMe over fabric (NVMeoF) was a hot topic at Flash Memory Summit last August. Facebook and others were showing off their JBOF (see my Facebook moving to JBOF post) but there were plenty of other NVMeoF offerings at the show.

NVMeoF hardware availability

When Brocade announced their Gen6 Switches they made a point of saying that both their Gen5 and Gen6 switches currently support NVMeoF protocols. In addition to Brocade’s support, in Dec 2015 Qlogic announced support for NVMeoF for select HBAs. Also, as of  July 2016, Emulex announced support for NVMeoF in their HBAs.

From an Ethernet perspective, Qlogic has a NVMe Direct NIC which supports NVMe protocol offload for iSCSI. But even without NVMe Direct, Ethernet 40GbE & 100GbE with  iWARP, RoCEv1-v2, iSCSI SER, or iSCSI RDMA all could readily support NVMeoF on Ethernet. The nice thing about NVMeoF for Ethernet is not only do you get support for iSCSI & FCoE, but CIFS/SMB and NFS as well.

InfiniBand and Omni-Path Architecture already support native RDMA, so they should already support NVMeoF.

So hardware/firmware is already available for any enterprise AFA customer to want NVMeoF for their data center storage.

NVMeoF Software

Intel claims that ~90% of the software driver functionality of NVMe is the same for NVMeoF. The primary differences between the two seem to be the NVMeoY discovery and queueing mechanisms.

There are two fabric methods that can be used to implement NVMeoF data and command transfers: capsule mode where NVMe commands and data are encapsulated in normal fabric packets or fabric dependent mode where drivers make use of native fabric memory transfer mechanisms (RDMA, …) to transfer commands and data.

12679485_245179519150700_14553389_nA (Linux) host driver for NVMeoF is currently available from Seagate. And as a result, support for NVMeoF for Linux is currently under development, and  not far from release in the next Kernel (I think). (Mellanox has a tutorial on how to compile a Linux kernel with NVMeoF driver support).

With Linux coming out, Microsoft Windows and VMware can’t be far behind. However, I could find nothing online, aside from base NVMe support, for either platform.

NVMeoF target support is another matter but with NICs/HBAs & switch hardware/firmware and drivers presently available, proprietary storage system target drivers are just a matter of time.

Boot support is a major concern. I could find no information on BIOS support for booting off of a NVMeoF AFA. Arguably, one may not need boot support for NVMeoF AFAs as they are probably not a viable target for storing App code or OS software.

From what I could tell, normal fabric multi-pathing support should work fine with NVMeoF. This should allow for HA NVMeoF storage, a critical requirement for enterprise AFA storage systems these days.

NVMeoF advantages/disadvantages

Chelsio and others have shown that NVMeoF adds ~8μsec of additional overhead beyond native NVMe SSDs, which if true would warrant implementation on all NVMe AFAs. This may or may not impact max IOPS depending on scale-ability of NVMeoF.

For instance, servers (PCIe bus hardware) typically limit the number of private NVMe SSDs to 255 or less. With an NVMeoF, one could potentially have 1000s of shared NVMe SSDs accessible to a single server. With this scale, one could have a single server attached to a scale-out NVMeoF AFA (cluster) that could supply ~4X the IOPS that a single server could perform using private NVMe storage.

Base level NVMe SSD support and protocol stacks are starting to be available for most flash vendors and operating systems such as, Linux, FreeBSD, VMware, Windows, and Solaris. If Intel’s claim of 90% common software between NVMe and NVMeoF drivers is true, then it should be a relatively easy development project to provide host NVMeoF drivers.

The need for special Ethernet hardware that supports RDMA may delay some storage vendors from implementing NVMeoF AFAs quickly. The lack of BIOS boot support may be a minor irritant in comparison.

NVMeoF forecast

AFA storage systems, as far as I can tell, are all about selling high IOPS and very-low latency IOs. It would seem that NVMeoF would offer early adopter AFA storage vendors a significant performance advantage over slower paced competition.

In previous QoM/QoW posts we have established that there are about 13 new enterprise storage systems that come out each year. Probably 80% of these will be AFA, given the current market environment.

Of the 10.4 AFA systems coming out over the next year, ~20% of these systems pride themselves on being the lowest latency solutions in the market, and thus command high margins. One would think these systems would be the first to adopt NVMeoF. But, most of these systems have their own, proprietary flash modules and do not use standard (NVMe) SSDs and can use their own proprietary interface to their proprietary flash storage. This will delay any implementation for them until they can convert their flash storage to NVMe which may take some time.

On the other hand, most (70%) of the other AFA systems, that currently use SAS/SATA SSDs, could boost their IOP counts and drastically reduce their IO  response times, by implementing NVMe SSDs and NVMeoF. But converting SAS/SATA backends to NVMe will take time and effort.

But, there are a select few (~10%) of AFA systems, that already use NVMe SSDs in their AFAs, and for these few, they would seem to have a fast track towards implementing NVMeoF. The fact that NVMeoF is supported over all fabrics and all storage interface protocols make it even easier.

Moreover, NVMeoF has been under discussion since the summer of 2015, which tells me that astute AFA vendors have already had 18+ months to develop it. With NVMeoF host drivers & hardware available since Dec. 2015, means hardware and software exist to test and validate against.

I believe that NVMeoF will be GA’d within the next 12 months by at least one enterprise AFA system. So my QoM1610 forecast for NVMeoF is YES, with a 0.83 probability.





Facebook moving to JBOF (just a bunch of flash)

At Flash Memory Summit (FMS 2016) this past week, Vijay Rao, Director of Technology Strategy at Facebook gave a keynote session on some of the areas that Facebook is focused on for flash storage. One thing that stood out as a significant change of direction was a move to JBOFs in their datacenters.

As you may recall, Facebook was an early adopter of (FusionIO’s) server flash cards to accelerate their applications. But they are moving away from that technology now.

Insane growth at Facebook

Why? Vijay started his talk about some of the growth they have seen over the years in photos, videos, messages, comments, likes, etc. Each was depicted as a animated bubble chart, with a timeline on the horizontal axis and a growth measurement in % on the vertical axis, with the size of the bubble being the actual quantity of each element.

Although the user activity growth rates all started out small at different times and grew at different rates during their individual timelines, by the end of each video, they were all almost at 90-100% growth, in 4Q15 (assume this is yearly growth rate but could be wrong).

Vijay had similar slides showing the growth of their infrastructure, i.e.,  compute, storage and networking. But although infrastructure grew less quickly than user activity (messages/videos/photos/etc.), they all showed similar trends and ended up (as far as I could tell) at ~70% growth.
Continue reading “Facebook moving to JBOF (just a bunch of flash)”

Intel Cloud Day 2016 news and views

 A couple of weeks back I was at Intel Cloud Day 2016 with the rest of the TFD team. We listened to a number of presentations from Intel Management team mostly about how the IT world was changing and how they planned to help lead the transition to the new cloud world.

The view from Intel is that any organization with 1200 to 1500 servers has enough scale to do a private cloud deployment that would be more economical than using public cloud services. Intel’s new goal is to facilitate (private) 10,000 clouds, being deployed across the world.

In order to facilitate the next 10,000, Intel is working hard to introduce a number of new technologies and programs that they feel can make it happen. One that was discussed at the show was the new OpenStack scheduler based on Google’s open sourced, Kubernetes technologies which provides container management for Google’s own infrastructure but now supports the OpenStack framework.

Another way Intel is helping is by building a new 1000 (500 now) server cloud test lab in San Antonio, TX. Of course the servers will be use the latest Xeon chips from Intel (see below for more info on the latest chips). The other enabling technology discussed a lot at the show was software defined infrastructure (SDI) which applies across the data center, networking and storage.

According to Intel, security isn’t the number 1 concern holding back cloud deployments anymore. Nowadays it’s more the lack of skills that’s governing how quickly the enterprise moves to the cloud.

At the event, Intel talked about a couple of verticals that seemed to be ahead of the pack in adopting cloud services, namely, education and healthcare.  They also spent a lot of time talking about the new technologies they were introducing today.
Continue reading “Intel Cloud Day 2016 news and views”

(Storage QoM 16-001): Will we see NVM Express (NVMe) drives GA’d in enterprise storage over the next year

NVMeFirst, let me state that QoM stands for Question of the Month. Doing these forecast can be a lot of work, and rather than focusing my whole blog on weekly forecast questions and answers, I would like to do something else as well. So, from now on we are doing only one new forecast a month.

So for the first question of 2016, we will forecast whether NVMe SSDs will be GA’d in enterprise storage over the next year.

NVM Express (NVMe) means the new PCIe interface for SSD storage. Wikipedia has a nice description of NVMe. As discussed there, NVMe was designed for higher performance and enhanced parallelism which comes with the PCI Express (PCIe) bus. The current version of the NVMe spec is 1.2a (available here).

GA means generally available for purchase by any customer.

Enterprise storage systems refers to mid-range and enterprise class storage systems from major AND non-major storage vendors, which includes startups.

Over the next year means by 19 January 2017.

Special thanks to Kacey Lai (@mrdedupe), Primary Data for suggesting this months question.

Current and updates to previous forecasts


Update on QoW 15-001 (3DX) forecast:

News out today indicates that 3DX (3D XPoint non-volatile memory) samples may be available soon but it could take another 12 to 18 months to get it into production. 3DX manufacturing is more challenging than current planar NAND technology and uses about 100 new materials, many of which are currently single sourced. We already built into our 3DX forecast potential delays in reaching production in 6 months. The news above says this could be worse than  expected. As such, I feel even stronger that there is less of a possibility of 3DX shipping in storage systems by next December. So I would update my forecast for QoW 15-001 to NO with an 0.75 probability at this time.

So current forecasts for QoW 15-001 are:

A) YES with 0.85 probability; and

B) NO with 0.75 probability

Current QoW 15-002 (3D TLC) forecast

We have 3 active participants, current forecasts are:

A) Yes with 0.95 probability;

B) No with 0.53 probability; and

C) Yes with 1.0 probability

Current QoW 15-003 (SMR disk) forecast

We have 1 active participant, current forecast is:

A) Yes with 0.85 probability