MCS, UltraDIMMs and memory IO, the new path ahead – part 2

In part 1 (see previous post here), we discussed the underlying technology for SanDisk’s UltraDIMMs based on Diablo Technologies MCS hardware and software. IBM will be shipping UltraDIMMs in their high end servers later this year as their new eXFlash.

In this segment we will discuss what SanDisk has put on top of Diablo Technologies’ MCS to supply SSD storage.

SanDisk UltraDIMM SSD storage

In the UltraDIMM package, SanDisk supports 200 or 400GB of 19nm MLC NAND SSD storage that is accessed via SATA [corrected after this went out, Ed.] internally, but the main interface is the 1600MHz DDR3 connection to the UltraDIMMs. As each UltraDIMM card plugs into any DDR3 memory slot, you can potentially support multiples of these cards in a single server, with the maximum dependent on the number of memory slots in your server [corrected after this went out, Ed.]. IBM on their x3850 and x3950 can support up to 32 UltraDIMMs per server.

SanDisk uses their Guardian Technology to enhance NAND endurance beyond what’s possible with native NAND controllers. One of the things that Guardian Technology does is to vary the voltage used to program the NAND bits over the life of the bit cells/pages. So early on, when the cell is fresh, they can use less voltage, and as it ages they increase the voltage to ensure that the bits are properly programmed. Other NAND controllers use the same voltage across the whole NAND lifetime, which unduly stresses the NAND bits early on; later, as the bits age, that voltage is no longer enough to program them properly and they need to be flagged as bad. The NAND chips/bits are characterized so that SanDisk Guardian Technology can use an optimum voltage curve over the chip’s lifetime.
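To make the idea of a voltage curve concrete, here is a minimal sketch of a programming voltage that rises with wear. This is not SanDisk's algorithm: the function name, the linear ramp, the voltages, and the rated cycle count are all assumptions for illustration, and a real characterized curve would come from measurements of the actual NAND.

```python
def program_voltage(pe_cycles, v_fresh=15.0, v_worn=20.0, rated_cycles=10_000):
    """Return a programming voltage that ramps up as the cell wears.

    All defaults are hypothetical values, not SanDisk specifications.
    """
    # Fraction of rated life consumed, clamped to the range [0, 1].
    wear = min(max(pe_cycles / rated_cycles, 0.0), 1.0)
    # Fresh cells get the gentle voltage, worn cells the stronger one.
    return v_fresh + wear * (v_worn - v_fresh)


if __name__ == "__main__":
    for cycles in (0, 2_500, 5_000, 10_000):
        print(cycles, program_voltage(cycles))
```

A fixed-voltage controller would correspond to setting `v_fresh` equal to `v_worn`, which is the case the paragraph above argues against.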

The UltraDIMMs also have power loss protection. This means that any write to an UltraDIMM that’s been acknowledged to the server is guaranteed to have sufficient power to make it all the way to the SSD storage.

Another thing that MCS memory interface brings to the picture is Error Correction Circuitry (ECC). Data written to UltraDIMMs has ECC protection throughout the data path up from the server DRAM memory, through the DIMM socket, all the way to the SSD flash.

As discussed extensively in Part 1 of this post, access times for UltraDIMM storage are on the order of 7µsec, which is ~7X faster than best of class PCIe flash storage, and a single UltraDIMM card is capable of sustaining 20GB/second of data throughput. I know of enterprise class storage systems that can’t do half that in throughput.

On the other hand, one problem with UltraDIMMs is that they are not hot swappable. This is primarily a memory interface problem and not an UltraDIMM issue, but nonetheless you can’t swap an UltraDIMM module until the server is powered down. And who would want to do such a thing when the server is powered anyway?

SanDisk’s long history in NAND

As you can see from the three photos at right, SanDisk seems to have been involved in flash/NAND technology innovation since the early 1990’s.  At the time NOR and NAND were competing for almost the same market.

But sometime in the mid to late 1990’s NAND found a niche in consumer cameras and never looked back. Not sure where the NOR market is today, but it’s a drop in the bucket compared to the NAND market.

UltraDIMMs are just the latest platform to support NAND storage access.  It happens to be one with blazingly fast access times and high IO parallelism, but in the end it just represents another way to obtain the benefits of NAND for IT customers.

Also, SanDisk’s commercial NAND (Memory Card) business seems to be very healthy. What with higher resolution photos/video/audio coming online over the next decade or so it doesn’t seem to be going away anytime soon.

SanDisk is in a new joint venture (JV) with Toshiba to produce 3D NAND flash. But in the meantime they are still using 2D flash for their current SSD storage. Toshiba and SanDisk in their current JV together manufacture about 1/2 the NAND bits in the world today.

The rest of SanDisk’s NAND business also seems to be doing well. And the aforementioned JV with Toshiba on 3D NAND looks positioned to take all of this NAND to the next level of density as well, which should make all of us happy.

SanDisk acquiring FusionIO

SanDisk was in the news lately as they have recently filed to acquire FusionIO, a prominent and early PCIe flash supplier that in recent years has broadened their portfolio to include enterprise storage with their acquisition of NexGen storage (renamed IO Control).

When FusionIO IPO’d, the stock sold at ~$19/share, and SanDisk is purchasing the company in an all cash deal for $11.25/share, roughly a 40% reduction in share price in 3 years (June ’11 IPO). Ouch.  At IPO the company was valued at ~$2B (some pundits said ~$1.5B, so there’s some debate on the original valuation). SanDisk is buying the company for ~$1.1B in cash. Any way you look at it, they paid significantly less than what the company was worth at IPO. Granted, it was valued at 41X earnings then, and its recent stock price of $11.59 represents a P/E of 3.3 (ttm).

Not exactly certain what happened. Analysts seem to indicate that Apple and Facebook, FusionIO’s biggest customers, were buying less FusionIO product. I also happen to think that the PCIe flash space has gotten pretty crowded over the last 3 years with entrants from Micron Technology, Intel, LSI, Virident/Western Digital, and others.

In addition, for PCIe flash to broaden its market there’s a serious need to surround it with sophisticated caching software to enable a more general purpose IO solution (see Pernix Data, Proximal Data, and others). These general purpose, caching solutions have finally reached high levels of sophistication and just now are becoming more widely available.

~~~~

Originally, part 3 of this series was going to be on IBM’s release of the UltraDIMM technology as their new eXFlash. However, I am somewhat surprised not to see other vendors taking up the MCS/UltraDIMM technology, but IBM may have a limited exclusivity to it.

The only other thing that’s this interesting happening in solid state storage is HP’s Memristor Machine, which is still a ways off.

Nonetheless, a new much faster memory card based SSD is hitting the market and if history is any indication, it won’t be long until the data storage world will sit up and take notice.

Comments?

STEC’s MLC enterprise SSD

So Many Choices by Robert S. Donovan

I haven’t seen much of a specification on STEC’s new enterprise MLC SSD but it should be interesting.  So far everything I have seen seems to indicate that it’s a pure MLC drive with no SLC  NAND.  This is difficult for me to believe but could easily be cleared up by STEC or their specifications.  Most likely it’s a hybrid SLC-MLC drive similar, at least from the NAND technology perspective, to FusionIO’s SSD drive.

MLC write endurance issue

My difficulty with a pure MLC enterprise drive is the write endurance factor.  MLC NAND can only endure around 10,000 erase/program passes before it starts losing data.  With a hybrid SLC-MLC design one could have the heavy write data go to SLC NAND which has a 100,000 erase/program pass lifecycle and have the less heavy write data go to MLC.  Sort of like a storage subsystem “fast write” which writes to cache first and then destages to disk but in this case the destage may never happen if the data is written often enough.
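The "fast write" analogy above can be sketched in a few lines. This is only an illustration of the idea, not how STEC or FusionIO actually build their drives: the class name `HybridDrive`, the `slc_blocks` parameter, and the least-recently-written destage policy are all assumptions.

```python
from collections import OrderedDict


class HybridDrive:
    """Toy model: every write lands in SLC; cold data destages to MLC."""

    def __init__(self, slc_blocks):
        self.slc_blocks = slc_blocks  # hypothetical SLC capacity in blocks
        self.slc = OrderedDict()      # lba -> data, oldest write first
        self.mlc = {}                 # lba -> data
        self.mlc_writes = 0           # MLC program operations performed

    def write(self, lba, data):
        # Any older copy sitting in MLC is now stale.
        self.mlc.pop(lba, None)
        # New data always lands in SLC and becomes the most recent entry.
        self.slc[lba] = data
        self.slc.move_to_end(lba)
        # When SLC overflows, destage the least recently written block.
        if len(self.slc) > self.slc_blocks:
            cold_lba, cold_data = self.slc.popitem(last=False)
            self.mlc[cold_lba] = cold_data
            self.mlc_writes += 1


if __name__ == "__main__":
    drive = HybridDrive(slc_blocks=2)
    for lba in [0, 1, 0, 2, 0, 3, 0]:
        drive.write(lba, b"payload")
    print(sorted(drive.slc), sorted(drive.mlc), drive.mlc_writes)
```

In this run block 0 is rewritten often enough that it never leaves SLC, which is the "destage may never happen" case; only the colder blocks cost MLC program cycles.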

The only flaw in this argument is that as SSD drives get bigger (STEC’s drive is available supporting up to 800GB) this becomes less of an issue, because with more raw storage the small portion of data that is very actively written gets swamped by the sheer amount of NAND available to hold it.  As such, when one NAND cell gets close to its lifetime, another, younger cell can be used.  This process is called wear leveling. STEC’s current SLC Zeus drive already has sophisticated wear leveling to deal with this sort of problem with SLC SSDs, and doing this for MLC just means having larger tables to work with.
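As a rough illustration of wear leveling, the sketch below always hands out the least-worn free block, so repeated rewrites of the same data get spread across the whole pool. It is a deliberately simplified model under my own assumptions (the `WearLeveler` class and `simulate_rewrites` helper are hypothetical names); real drives such as the Zeus also handle static data, bad blocks, and mapping tables, none of which appear here.

```python
import heapq


class WearLeveler:
    """Hand out the free block with the fewest erases so far."""

    def __init__(self, num_blocks):
        # Min-heap of (erase_count, block_id); every block starts fresh.
        self.free = [(0, block) for block in range(num_blocks)]
        heapq.heapify(self.free)

    def allocate(self):
        """Pop the least-worn free block."""
        erase_count, block = heapq.heappop(self.free)
        return block, erase_count

    def release(self, block, erase_count):
        """Erase a block and return it to the pool with one more cycle."""
        heapq.heappush(self.free, (erase_count + 1, block))


def simulate_rewrites(num_blocks, rewrites):
    """Rewrite one hot piece of data and return the worst erase count."""
    leveler = WearLeveler(num_blocks)
    for _ in range(rewrites):
        block, count = leveler.allocate()
        leveler.release(block, count)
    return max(count for count, _ in leveler.free)


if __name__ == "__main__":
    print(simulate_rewrites(num_blocks=1, rewrites=8))
    print(simulate_rewrites(num_blocks=4, rewrites=8))
```

With a single block every rewrite hits the same cells; with four blocks the same eight rewrites leave each block erased only twice. That is the sense in which more raw capacity dilutes the hot data.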

I guess at some point, with multi-TB drives, the fact that MLC cannot sustain more than 10,000 erase/write passes becomes moot, because there just isn’t that much actively written data out there in an enterprise shop. When you amortize the highly written data over a whole drive, the more drive capacity there is, the smaller the active data percentage becomes. As such, as SSD drive capacities get larger this becomes less of an issue.  I figure with 800GB drives, the active data proportion might still be high enough to cause a problem, but it might not be an issue at all.

Of course with MLC it’s also cheaper to over provision NAND storage to help with write endurance. For an 800GB MLC SSD, you could easily add another 160GB (20% over provisioning) fairly cheaply. As such, over provisioning will also allow you to sustain an overall drive write endurance that is much higher than the individual NAND write endurance.
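A back-of-the-envelope calculation shows how capacity and over provisioning translate into drive-level endurance. The formula below is a common approximation rather than anything from STEC's specifications, and it assumes perfect wear leveling; the function names and the `write_amplification` parameter are my own additions.

```python
def drive_write_endurance_tb(user_gb, over_provision, pe_cycles,
                             write_amplification=1.0):
    """Approximate total host writes (in TB) a drive can absorb.

    Assumes ideal wear leveling, so every raw GB can be written
    pe_cycles times before wearing out.
    """
    raw_gb = user_gb * (1 + over_provision)
    return raw_gb * pe_cycles / write_amplification / 1000


def years_of_life(endurance_tb, tb_written_per_day):
    """Convert total endurance into years at a steady daily write rate."""
    return endurance_tb / tb_written_per_day / 365


if __name__ == "__main__":
    # 800GB drive, 20% over provisioning, 10,000 erase/program cycles.
    total_tb = drive_write_endurance_tb(800, 0.20, 10_000)
    print(total_tb)
    print(years_of_life(total_tb, tb_written_per_day=5))
```

Under these idealized assumptions the 800GB drive with 160GB of extra NAND can take about 9,600TB of writes. Real write amplification is above 1, so the practical figure would be lower.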

Another solution to the write endurance problem is to increase the power of ECC to handle write failures. This would probably take some additional engineering and may or may not be in the latest STEC MLC drive but it would make sense.

MLC performance

The other issue about MLC NAND is that it has slower read and erase/program cycle times.  Now these are still orders of magnitude faster than standard disk but slower than SLC NAND.  For enterprise applications SLC SSDs are blisteringly fast and are often performance limited by the subsystem they are attached to. So, the fact that MLC SSDs are somewhat slower than SLC SSDs may not even be perceived by enterprise shops.

MLC performance is slower because it takes longer to read a cell with multiple bits in it than it takes with just one. MLC, in one technology I am aware of, encodes 2 bits in the voltage that is programmed in or read out from a cell, e.g., VoltageA = “00”, VoltageB = “01”, VoltageC = “10”, and VoltageD = “11”. This gets more complex with 3 or more bits per cell but the logic holds.  With multiple voltages, determining which voltage level is present is more complex for MLC and hence, takes longer to perform.
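The voltage-to-bits mapping described above can be sketched as a series of comparisons against reference voltages. The threshold values here are invented for illustration, and counting comparisons is only a stand-in for read time; actual NAND sensing circuitry works differently.

```python
# Hypothetical read reference voltages (not real device values).
SLC_THRESHOLDS = [2.0]            # one boundary: 2 levels, 1 bit
MLC_THRESHOLDS = [1.0, 2.0, 3.0]  # three boundaries: 4 levels, 2 bits


def read_cell(voltage, thresholds):
    """Return (bit_string, comparisons_made) for a sensed cell voltage."""
    level = 0
    comparisons = 0
    for reference in thresholds:
        comparisons += 1
        if voltage < reference:
            break
        level += 1
    # 1 threshold gives 1 bit, 3 thresholds give 2 bits, 7 give 3 bits.
    bits_per_cell = len(thresholds).bit_length()
    return format(level, f"0{bits_per_cell}b"), comparisons


if __name__ == "__main__":
    print(read_cell(2.5, SLC_THRESHOLDS))
    for volts in (0.5, 1.5, 2.5, 3.5):
        print(read_cell(volts, MLC_THRESHOLDS))
```

An SLC read always finishes after one comparison, while an MLC read can need up to three in this model, which is the extra work the paragraph above refers to.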

In the end I would expect STEC’s latest drive to be some sort of SLC-MLC hybrid but I could be wrong. It’s certainly possible that STEC have gone with just an MLC drive and beefed up the capacity, over provisioning, ECC, and wear leveling algorithms to handle its lack of write endurance.

MLC takes over the world

But the major issue with using MLC in SSDs is that MLC technology is driving the NAND market. All those items in the photo above are most probably using MLC NAND, if not today then certainly tomorrow. As such, the consumer market will be driving MLC NAND manufacturing volumes way above anything the SLC market requires. Such volumes will ultimately make it unaffordable to manufacture/use any other type of NAND, namely SLC in most applications, including SSDs.

So sooner or later all SSDs will be using only MLC NAND technology. I guess the sooner we all learn to live with that the better for all of us.

Toshiba’s New MLC NAND Flash SSDs

Toshiba has recently announced a new series of SSDs based on MLC NAND (Yahoo Biz story). This is only the latest in a series of MLC SSDs which Toshiba has released.

Historically, MLC (multi-level cell) NAND has supported higher capacity but has been slower and less reliable than SLC (single-level cell) NAND. The capacity points supplied for the new drive (64, 128, 256, & 512GB) reflect the higher density NAND. Toshiba’s performance numbers for the new drives also look appealing but are probably overkill for most desktop/notebook/netbook users.

Toshiba’s reliability specifications were not listed in the Yahoo story and probably would be hard to find elsewhere (I looked on the Toshiba America website and couldn’t locate any). However, the duty cycle for a desktop/notebook data drive is not that severe. So the fact that MLC can only endure ~1/10th the writes that SLC can endure is probably not much of an issue.

SNIA is working on SSD (or SSS as SNIA calls it, see SNIA SSSI forum website) reliability but has yet to publish anything externally. Unsure whether they will break out MLC vs SLC drives but it’s certainly worthy of discussion.

But the advantage of MLC NAND SSDs is that they should be 2 to 4X cheaper than SLC SSDs, depending on the number (2, 3 or 4) of bits/cell, and as such, more affordable. This advantage can be reduced by the need to over-provision the device and add more parallelism in order to improve MLC reliability and performance. But both of these facilities are becoming more commonplace and so should be relatively straightforward to support in an SSD.
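The cost trade-off above is simple arithmetic, sketched here under the assumption that cost per GB scales inversely with bits per cell and that over-provisioned NAND costs the same as user-visible NAND. The function name and the dollar figure are hypothetical, not vendor pricing.

```python
def mlc_cost_per_usable_gb(slc_cost_per_gb, bits_per_cell, over_provision=0.0):
    """Estimate cost per usable GB for multi-bit NAND.

    Assumes raw cost falls in proportion to bits per cell and that
    spare (over-provisioned) capacity is paid for but not usable.
    """
    raw_cost_per_gb = slc_cost_per_gb / bits_per_cell
    return raw_cost_per_gb * (1 + over_provision)


if __name__ == "__main__":
    slc_cost = 8.0  # hypothetical $/GB for SLC
    print(mlc_cost_per_usable_gb(slc_cost, bits_per_cell=2))
    print(mlc_cost_per_usable_gb(slc_cost, bits_per_cell=2, over_provision=0.25))
```

In this example 25% over-provisioning shrinks a 2X price advantage to 1.6X, which shows how the extra NAND eats into the saving without removing it.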

The question remains, given the reliability differences, when and if MLC NAND will ever become reliable enough for enterprise class SSDs. Although many vendors make MLC NAND SSDs for the notebook/desktop market (Intel, SanDisk, Samsung, etc.), FusionIO is probably one of the few using a combination of SLC and MLC NAND for enterprise class storage (see FusionIO press release), although calling the FusionIO device an SSD is probably a misnomer. What FusionIO does to moderate MLC endurance issues is not clear, but buffering write data to SLC NAND must certainly play some part.