Intel’s Optane (3D Xpoint) SSD specs in the wild

I read an article the other day in Ars Technica (Specs for 1st Intel 3DX SSD…) previewing the Intel Optane specs for their 375GB 3D Xpoint (3DX) flash card. The device is an NVMe compliant, PCIe Gen3 add-in card in a half height, half length, low profile form factor.

Intel’s Optane SSD vs. the competition

A couple of items from the Intel Optane spec sheet of interest to me as a storage guru:

  • 30 Drive writes per day/12.3 PBW (written) – 3DX, at launch, was advertised as having 1000 times the endurance of (2D-MLC?) NAND. Current flash cards (see Samsung SSD PRO NVMe 256GB Flash card specs) offer about 200TBW (for 256GB card) or 400TBW (for 512GB card). The Samsung PRO is based on 3D (V-)NAND, so its endurance is much better than 2D-MLC at these densities. That being said, the Optane drive still has ~40X the write endurance of the PRO 950. Not quite 1000 but certainly significantly better.
  • Sequential (bandwidth) performance (R/W) of 2400/2000 MB/sec – 3DX advertised 1000 times the performance of (2D-MLC, non-NVMe?) NAND. Current 3D (V-)NAND cards (see Samsung SSD PRO above) offer (R/W) 2200/900 MB/sec for an NVMe device. The Optane’s read bandwidth is a slight improvement but the write bandwidth is a 2.2X improvement over current competitive devices.
  • Random 4KB IOPs performance (R/W) of 550K/500K – Similar to the previous bulleted item, 3DX advertised 1000 times the performance of (2D-MLC,  non-NVMe?) NAND. Current 3D (V-)NAND cards like the Samsung SSD PRO offer Random 4KB IOPs performance  (R/W) of 270K/85K IOPS (@4 threads). Optane’s read random 4KB IOPs performance is 2X the PRO 950 but its write performance is ~5.9X better.
  • IO latency of <10 µsec. – 3DX advertised 10X better latency than the current (2D-MLC, non-NVMe) flash drives. According to storage review (Samsung 950 Pro M.2), the Samsung PRO 950 had a latency of ~22 µsec. Optane has at least 2X better latency than the current competition.
  • Density 375GB/HH-HL-LP – 3DX advertised 10X the density of (then current) DRAM. Today Micron offers a 4GiB DDR4/288 pin DIMM which is probably 1/2 the size of the HH flash drive. So maybe in the same space this could be 8GiB. This says that the Optane is roughly 47X denser (375GB vs. 8GiB) than today’s DRAM.
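The bandwidth and IOPS comparisons in the bullets above are simple ratios. Here is a minimal sketch in Python that recomputes them from the figures quoted in this post; the dictionary keys and the `ratios` helper are my own names, not anything from either spec sheet.

```python
# Spec-sheet figures as quoted in the bullets above.
optane = {"seq_read_MBps": 2400, "seq_write_MBps": 2000,
          "rand_read_iops": 550_000, "rand_write_iops": 500_000}
samsung = {"seq_read_MBps": 2200, "seq_write_MBps": 900,
           "rand_read_iops": 270_000, "rand_write_iops": 85_000}

def ratios(new, old):
    """Return new/old for every metric the two spec sheets share."""
    return {k: new[k] / old[k] for k in new if k in old}

for metric, r in ratios(optane, samsung).items():
    print(f"{metric}: {r:.1f}X")
```

Keeping the raw numbers in one place makes it easy to rerun the comparison when a newer competing drive comes out.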

Please note, when 3DX was launched, ~2 years ago, the then current NAND technology was 2D-MLC and NVMe was just a dream. So comparing launch claims against today’s current 3D-NAND, NVMe drives is not a fair comparison.

Nevertheless, the Optane SSD performs considerably better than current competitive NVMe drives and has significantly better endurance than current 3D (V-)NAND flash drives. All of which is a great step in the right direction.

What about DRAM replacement?

At launch, 3DX was also touted as a higher density, potential replacement for DRAM. But so far we haven’t seen any specs for what 3DX NVM looks like on a memory bus. It has much better density than DRAM, but we would need to see 3DX memory access times under 50ns to have a future as a DRAM replacement. Optane’s NVMe SSD at 10 µsec. is about 200X too slow, but then again it’s not a memory device configuration nor is it attached to a memory bus.

Comments?

Photo Credit(s):  Intel Optane Spec sheet from Ars Technica Article,  DDR4 DRAM from Wikimedia user:Dsimic

(Storage QoM 16-001): Will we see NVM Express (NVMe) drives GA’d in enterprise storage over the next year

First, let me state that QoM stands for Question of the Month. Doing these forecasts can be a lot of work, and rather than focusing my whole blog on weekly forecast questions and answers, I would like to do something else as well. So, from now on we are doing only one new forecast a month.

So for the first question of 2016, we will forecast whether NVMe SSDs will be GA’d in enterprise storage over the next year.

NVM Express (NVMe) means the new PCIe interface for SSD storage. Wikipedia has a nice description of NVMe. As discussed there, NVMe was designed for higher performance and enhanced parallelism which comes with the PCI Express (PCIe) bus. The current version of the NVMe spec is 1.2a (available here).

GA means generally available for purchase by any customer.

Enterprise storage systems refers to mid-range and enterprise class storage systems from major AND non-major storage vendors, which includes startups.

Over the next year means by 19 January 2017.

Special thanks to Kacey Lai (@mrdedupe), Primary Data for suggesting this month’s question.

Current and updates to previous forecasts

 

Update on QoW 15-001 (3DX) forecast:

News out today indicates that 3DX (3D XPoint non-volatile memory) samples may be available soon but it could take another 12 to 18 months to get it into production. 3DX manufacturing is more challenging than current planar NAND technology and uses about 100 new materials, many of which are currently single sourced. We already built into our 3DX forecast potential delays in reaching production in 6 months. The news above says this could be worse than expected. As such, I feel even more strongly that there is less of a possibility of 3DX shipping in storage systems by next December. So I would update my forecast for QoW 15-001 to NO with a 0.75 probability at this time.

So current forecasts for QoW 15-001 are:

A) YES with 0.85 probability; and

B) NO with 0.75 probability

Current QoW 15-002 (3D TLC) forecast

We have 3 active participants, current forecasts are:

A) Yes with 0.95 probability;

B) No with 0.53 probability; and

C) Yes with 1.0 probability

Current QoW 15-003 (SMR disk) forecast

We have 1 active participant, current forecast is:

A) Yes with 0.85 probability

 

(Storage-QoW 15-002) 3D TLC NAND GA’d in major vendor storage next year – NO 0.53

Latest forecast question is: Will 3D TLC NAND be GA’d in major storage products in 12 months?

Splitting up the QoW into more answerable questions:

A) Will any vendor be shipping 3D TLC NAND SSDs/PCIe cards over the next 9 months?

Samsung is reportedly already shipping 3D TLC NAND SSDs and PCIe cards as of August 13, 2015 and will be producing 48 layer 256Gb 3D TLC NAND memory soon.  It’s unclear what 3D TLC NAND technology will be shipping in the next generation drives due out soon but they are all spoken of as read-intensive/write-light storage.

One consideration is that major storage vendors typically will not introduce new storage technologies unless they’re available from multiple suppliers. This is not always the case and certainly not for internally developed storage but has been a critical criterion for most major vendors. But in the above reference, it was reported that SK Hynix and Toshiba are gearing up for 2016 shipments of 48 layer 3D TLC NAND as well. How long it takes to get these into SSD/PCIe cards is another question.

A number of startups are rumored to be using 3D TLC and Kaminario has publicly announced that their systems already use 3D TLC.

My probability of a second source for 3D TLC storage coming out within the first 9 months of next year is 0.75 

B) What changes will be required for storage vendors to utilize 3D TLC NAND storage?

The important changes will be SSD endurance and IO performance.

NAND endurance is rated at DWPD (drive writes per day). Current Samsung 3D TLC SSDs are reportedly rated anywhere from 1.3 to 3.5 DWPD for a 5 year warranty period and newer 3D TLC SSDs are rated at 5 DWPD (unknown warranty period). Current enterprise (800GB) MLC drives are reportedly rated at 10-25 DWPD (for 5 years). So if we use 3.5 DWPD for 3D TLC and 17.5 DWPD for MLC, 3D TLC NAND has a ~5X reduction in endurance.

As for performance, compare the Samsung reported performance of 160K random reads and 18K random writes with an HGST 800GB MLC SSD that has 145K random read and 100K random write performance. That is a reduction of ~5.6X in write performance.  Read performance is actually better with 3D TLC NAND.
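The endurance and performance gaps above reduce to a few divisions. A minimal sketch in Python, using only the figures quoted in this post (the variable names are mine):

```python
# Figures as quoted above; 17.5 DWPD is the midpoint of the 10-25 DWPD MLC range.
tlc = {"dwpd": 3.5, "rand_read_iops": 160_000, "rand_write_iops": 18_000}
mlc = {"dwpd": 17.5, "rand_read_iops": 145_000, "rand_write_iops": 100_000}

endurance_drop = mlc["dwpd"] / tlc["dwpd"]
write_drop = mlc["rand_write_iops"] / tlc["rand_write_iops"]
read_gain = tlc["rand_read_iops"] / mlc["rand_read_iops"]

print(f"endurance: {endurance_drop:.1f}X lower")
print(f"random writes: {write_drop:.1f}X lower")
print(f"random reads: {read_gain:.2f}X higher")
```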

In order for major vendors to handle a reduction in 3D TLC endurance, they will need to limit the amount of data written to these devices. Conveniently, in order for major vendors to deal with the reduction in 3D TLC write performance, they will also have to limit the amount of data written to these devices.

Hence, one potential solution is a multi-tiering, all flash array which uses standard MLC SSD/PCIe cards to absorb the heavy write activity and data from this tier, that is relatively unused, could be archived (?) over time to a 2nd tier of storage consisting of 3D TLC SSD/PCIe cards.

This is not that unusual and it’s being done today for hybrid (disk-SSD) systems with automated storage tiering. Only in this case, data is moved to SSD only if it’s accessed frequently. For 3D TLC the tiering policy should be changed from access frequency to time since last access. Doing so in a hybrid array with disk, MLC SSD and TLC SSD, would require the creation of an additional pool of storage and could be accomplished with software changes alone. There are current major vendor storage systems that already support 3 tiers of storage. And some which already support archiving to cloud storage, so these sorts of changes are present in current shipping product.

So yes there’s a reduction in endurance and yes it has worse write performance but it’s still much faster than disk and most major vendors already have software to be able to handle diverse performance storage. So accommodating the new 3D TLC storage shouldn’t be much of a problem.

New storage technology like this usually doesn’t require a hardware change to use. So the only thing that needs to be changed to accommodate the new 3D TLC is software functionality.

So if the 3D TLC 2nd source was available there’s a 0.9 probability that some major storage vendor would adopt the technology over the next year.

C) What are the advantages of 3D TLC storage?

Price should be cheaper than MLC storage and the density (GB/volume) should be better. So in this case, it’s a reduction in cost/GB and an increase in GB/volume. So for these reasons alone it should probably be adopted.

The advantages are good and would certainly give a major vendor an edge in capacity density and in $/GB or at least get them to parity (barring any  functionality differential) with startups adopting the technology.

So given the advantages present in the technology, I would say there should be a 0.7 probability of adoption within the next 12 months.  

Forecast for QoW 15-002 is:

0.75*0.90*0.70 = 0.47 probability of YES adoption or .53 probability of NO adoption of 3D TLC NAND in major storage vendor products over the next 12 months
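The forecast above chains the three sub-question probabilities together by multiplying them, which treats each step as conditional on the one before it. A minimal sketch of that arithmetic in Python (the step labels are my paraphrase of questions A to C):

```python
from math import prod

# Probabilities assigned to each sub-question in the analysis above.
steps = {
    "second source for 3D TLC within 9 months": 0.75,
    "major vendor adopts if a second source exists": 0.90,
    "advantages justify adoption within 12 months": 0.70,
}

p_yes = prod(steps.values())
p_no = 1 - p_yes
print(f"YES {p_yes:.2f} / NO {p_no:.2f}")
```

Changing any single step shows how sensitive the final answer is to that one estimate.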

Update on QoW 15-001 forecast:

I have an update to my post that forecast QoW 15-001 as a No with 0.62 probability. This question was on the adoption of 3D XPoint (3DX) technology in any enterprise storage vendor product within a year.

It has been brought to my attention that Intel mentioned the cost of producing 3DX was somewhere between 1/2 and 1/4 the cost of DRAM. Also, recent information has come to light that Intel-Micron will price 3DX between 3D NAND and DRAM. So my analysis as to the cost differential for caching technologies is way off (20X). So there would be a significant cost advantage in using the technology for volatile and non-volatile cache. Even if the chips cost nothing, it might be on the order of $3-5K cheaper with 3DX than battery/superCap backed up DRAM and volatile DRAM caching. So it exists but less than a significant cost saver.  So this being the case, I would have to adjust my 0.35 probability of adoption in this use up to 0.65.  I failed to incorporate this parameter in my final forecast, so all that analysis was for nothing. 

Another potential use is as a non-volatile write buffer for SSDs and even more important for 3D TLC NAND (see above). As this is in an SSD, software and hardware integration is commonplace so there’s a higher probability of adoption there as well. And as there are more SSDs than DRAM caching the cost differential could be more significant. Then again, it would depend on two technologies being adopted (TLC and 3DX) so it’s less likely than any one alone.

The other news (to me) was that Intel announced they would incorporate proprietary changes in DIMM bus to support 3DX as one approach. This does not lend credence to widespread adoption.  But probably only applies to server support for the technology, so I would reduce my probability there to 0.55

Updated forecast for QoW 15-001 is now:

  1. chip in production stays at .85, so there’s still 2.6 potential systems that could adopt the technology directly
  2. 0.85 probability that chips are in production * 0.55 probability of servers with the technology * 0.65 probability that a storage vendor would adopt the technology to replace caching, so (=) ~0.30 probability of server adoption in storage, and with 18 potential vendors that’s another 5.5 systems potentially adopting the technology.
  3. Add in the two-three startups that likely will emerge, with similar probability of adoption, or 0.30, which is another 0.9 systems

For a total of 2.6+5.5+0.9=9 systems out of ~24 or 0.38 probability of adoption.

So my updated forecast still stands at No with a .62 probability.
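The updated tally can be recomputed as follows. This is a sketch under the post's own assumptions (3 systems able to use chips directly, 18 server-based systems, 3 startups, ~24 systems in all); the variable names are mine.

```python
p_chips = 0.85           # chips reach production
p_servers = 0.55         # servers ship with the technology
p_cache_adoption = 0.65  # a storage vendor adopts it to replace caching

direct = p_chips * 3                                  # systems using chips directly
p_server_path = p_chips * p_servers * p_cache_adoption
via_servers = p_server_path * 18                      # server-based systems
startups = p_server_path * 3                          # emerging startups

total = direct + via_servers + startups
p_yes = total / 24
print(f"{total:.1f} systems, YES at {p_yes:.2f}")
```

Carrying unrounded intermediate values gives roughly 8.9 systems, or about 0.37, which is within rounding of the 0.38 figure above.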

SCI’s (Storage QoW 15-001) 3D XPoint in next years storage, forecast=NO with 0.62 probability

So as to my forecast for the first question of the week: (#Storage-QoW 2015-001) – Will 3D XPoint be GA’d in  enterprise storage systems within 12 months?

I believe the answer will be Yes with a 0.38 probability or conversely, No with a 0.62 probability.

We need to decompose the question to come up with a reasonable answer.

1. How much of an advantage will 3D XPoint provide storage systems?

The claim is 1000X faster than NAND, 1000X endurance of NAND, & 10X density of DRAM. But, I believe the relative advantage of the new technology depends mostly on its price. So now the question is what would 3D XPoint technology cost ($/GB).

It’s probably going to be way more expensive than NAND $/GB (@$2.44/64Gb-MLC or ~$0.31/GB). But how will it be priced relative to DRAM (@$2.23/4Gb DDR4 or ~$4.46/GB) and (asynch) SRAM (@$7.80/16Mb or $3900.00/GB)?

More than likely, it’s going to cost more than DRAM because it’s non-volatile and almost as fast to access. As for how it relates to SRAM, the pricing gulf between DRAM and asynch SRAM is so huge, I think pricing it even at 1/10th SRAM costs would seriously reduce the market. And I don’t think it’s going to be too close to DRAM, so maybe ~10X the cost of DRAM, or $44.60/GB.  [Probably more like a range of prices with $44.60 at 0.5 probable, $22.30 at 0.25 and $66.90 at 0.1. Unclear how I incorporate such pricing variability into a forecast.]
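The per-GB prices above come from dividing a chip price by its capacity in bytes. The sketch below reproduces them and then shows one possible way to fold the bracketed price range into a single number, a probability-weighted average. That weighting is my assumption, not a method from this post, and since the quoted weights only sum to 0.85 the sketch renormalizes them.

```python
def dollars_per_gb(chip_price_usd, chip_gbit):
    """Convert a per-chip price to $/GB, using 8 Gbit = 1 GB."""
    return chip_price_usd / (chip_gbit / 8)

nand = dollars_per_gb(2.44, 64)      # 64Gb MLC NAND
dram = dollars_per_gb(2.23, 4)       # 4Gb DDR4
sram = dollars_per_gb(7.80, 0.016)   # 16Mb asynch SRAM, taking 1Gb = 1000Mb

# Price scenarios as multiples of the DRAM price, with the weights quoted above.
scenarios = [(10 * dram, 0.5), (5 * dram, 0.25), (15 * dram, 0.1)]
total_weight = sum(w for _, w in scenarios)   # 0.85, so renormalize
expected_price = sum(p * w for p, w in scenarios) / total_weight

print(f"NAND ${nand:.2f}/GB, DRAM ${dram:.2f}/GB, SRAM ${sram:.0f}/GB")
print(f"weighted 3D XPoint price: ${expected_price:.2f}/GB")
```

A fuller treatment would rerun the whole adoption estimate once per price scenario and weight the resulting probabilities, instead of collapsing the prices first.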

At $44.60/GB, what could 3D XPoint NVM replace in a storage system: 1) non-volatile cache; 2) DRAM caches, 3) Flash caches; 4) PCIe flash storage or 5) SSD storage in storage control units.

Non-volatile caching uses battery backed DRAM (with or without SSD offload) and SuperCap backed DRAM with SSD offload. Non-volatile caches can be anywhere from 1/16 to 1/2 total system cache size. The average enterprise class storage has ~412GB of cache, so non-volatile caching could be anywhere from 26 to 206GB or let’s say ~150GB of 3D XPoint, which at ~$45/GB, would cost $6.8K in chips alone; add in $1K of circuitry and it’s $7.8K.

  • For battery backed DRAM – 150GB of DRAM would cost ~$670 in chips, plus an SSD (~300GB) at ~$90, and 2 batteries (8hr lithium battery costs $32) so $64. Add charging/discharging circuitry, battery FRU enclosures, (probably missing something else) but maybe all the extras come to another $500 or ~$1.3K total. So at $45/GB the 3D XPoint non-volatile cache would run ~6.0X the cost of battery backed up DRAM.
  • For SuperCAP backed DRAM – similarly, a SuperCAP cache would have the same DRAM and SSD costs ($670 & $90 respectively). The costs for SuperCAPs in equivalent (Wh) configurations run 20X the price of batteries, so $1.3K. Charging/discharging circuitry and FRU enclosures would be simpler than batteries, maybe 1/2 as much, so add $250 for all the extras, which means a total SuperCAP backed DRAM cost of ~$2.3K, which puts 3D XPoint at 3.4X the cost of SuperCAP backed DRAM.
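The two cost comparisons above can be laid out as a small bill of materials. This sketch uses the prices quoted in this post; the constant and variable names are mine, and the $45/GB 3D XPoint price is the post's assumption, not a published figure.

```python
DRAM_PER_GB = 4.46     # $/GB, from the DDR4 chip price above
XPOINT_PER_GB = 45.0   # $/GB, assumed 3D XPoint price
CACHE_GB = 150

xpoint_cache = CACHE_GB * XPOINT_PER_GB + 1000   # chips + circuitry

dram_chips = CACHE_GB * DRAM_PER_GB
ssd = 90                                         # ~300GB offload SSD
batteries = 2 * 32                               # two 8hr lithium batteries

battery_backed = dram_chips + ssd + batteries + 500
supercap_backed = dram_chips + ssd + 20 * batteries + 250

print(f"3D XPoint cache:      ${xpoint_cache:,.0f}")
print(f"battery backed DRAM:  ${battery_backed:,.0f} "
      f"({xpoint_cache / battery_backed:.1f}X cheaper)")
print(f"SuperCAP backed DRAM: ${supercap_backed:,.0f} "
      f"({xpoint_cache / supercap_backed:.1f}X cheaper)")
```

With unrounded inputs the ratios come out at about 5.9X and 3.4X, in line with the rounded figures in the bullets.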

In these configurations a 3D XPoint non-volatile memory would replace lots of circuitry (battery-charging/discharging & other circuitry or SuperCAP-charging/discharging & other circuitry) and the SSD. So, 3D XPoint non-volatile cache could drastically simplify hardware logic and also software coding for power outages/failures. Fewer parts and less coding have some intrinsic value beyond pure cost, difficult to quantify, but substantive, nonetheless.

As for using 3D XPoint to replace volatile DRAM cache, another advantage is you wouldn’t need to have a non-volatile cache and systems wouldn’t have to copy data between caches. But at $45/GB, costs would be significant. A 412GB DRAM cache would cost $1.8K in DRAM chips and maybe another $1K in circuitry, so ~$2.8K. Doing one in 3D XPoint would run $18K in chips and the same $1K in circuitry, so $19K.  But we eliminate the non-volatile cache. Factoring that in, the all 3D XPoint cache would run ~$19K vs. DRAM volatile and (SuperCAP backed) non-volatile cache at $2.8K+$2.3K=$5.1K, or ~3.7X higher costs.

Again, the parts cost differential is not the whole story. But replacing volatile cache AND non-volatile cache would probably require more coding not less.

As for using 3D XPoint as a replacement for Flash cache, I don’t think it’s likely because the cost differential at $45/GB is ~100X Flash costs (not counting PCIe controller and other logic). Ditto for PCIe Flash and SSD storage.

Being 10X denser than DRAM is great, but board footprint is not a significant storage system cost factor today.

So at a $45/GB price maybe there’s a 0.35 likelihood that storage systems would adopt the technology.

2. How many vendors are likely to GA new enterprise storage hardware in the next 12 months?

We can use major vendors to help estimate this. I used IBM, EMC, HDS, HP and NetApp as representing the major vendors for this analysis.

IBM (2 for 4) 

  • They just released a new DS8880 last fall and their prior version DS8870 came out in Oct. 2013, so the DS8K seems to be on a 24 month development cycle. So, it’s very unlikely we will see a new DS8K released in the next 12 months.
  • SVC engine hardware DH8 was introduced in May 2014. SVC CG8 engine was introduced in May 2011. So SVC hardware seems to be on a 36 month cycle. So, it’s very unlikely we will see a new SVC hardware engine released in the next 12 months.
  • FlashSystem 900 hardware was just rolled out 1Q 2015  and FlashSystem 840 was introduced in January of 2014. So FlashSystem hardware is on a ~15 month hardware cycle. So, it is very likely that a new FlashSystem hardware will be released in the next 12 months. 
  • XIV Gen 3 hardware was introduced in July of 2011. Unclear when Gen2 was rolled out but IBM acquired XIV in Jan of 2008 and released an IBM version in August, 2008. So XIV’s on a ~36 month cycle. So, it is very likely that a new generation of XIV will be released in the next 12 months. 

EMC ([4] 3 for 4) 

  • VMAX3 was GA’d in 3Q (Sep) 2014. VMAX2 was available Sep 2012, which puts VMAX on 24 month cycle. So, it’s very likely that a new VMAX will be released in the next 12 months.
  • VNX2 was announced May, 2013 and GA’d Sep 2013. VNX 1 was announced Jan, 2011 and GA’d by May 2011. So that puts VNX on a ~28 month cycle. Which means we should have already seen a new one, so it’s very likely we will see a new version of VNX in the next 12 months.
  • XtremIO hardware was introduced in Mar, 2013 with no new significant hardware changes since. With a lack of history to guide us let’s assume a 24 month cycle. So, it’s very likely we will see a new version of XtremIO hardware in the next 12 months.
  • Isilon S200/X200 was introduced April, 2011 and X400 was released in May, 2012. Which put Isilon on a 13 month cycle then but nothing since.  So, it’s very likely we will see a new version of Isilon hardware in the next 12 months. 

However, EMC is unlikely to update all their storage hardware in the same 12 months. That being said, XtremIO could use a HW boost as IBM and the startups are pushing AFA technology pretty hard here. Isilon is getting long in the tooth, so that’s another likely changeover. Since VNX is more overdue than VMAX, I’d have to say it’s likely new VNX, XtremIO & Isilon hardware will be seen over the next year.

HDS (1 of 3) 

  • Hitachi VSP G1000 came out in Apr of 2014. HDS VSP came out in Sep of 2010. So HDS VSP is on a 43 month cycle. So it’s very unlikely we will see a new VSP in 12 months. 
  • Hitachi HUS VM came out in Sep 2012.  As far as I can tell there were no prior generation systems. But HDS just came out with the G200-G800 series, leaving the HUS VM as the last one not updated so, it’s very likely we will see a new version of HUS VM in the next 12 months.
  • Hitachi VSP G800, G600, G400, G200 series came out in Nov of 2015. Hitachi AMS 2500 series came out in April, 2012. So the mid-range systems seem to be on a 43 month cycle. So it’s very unlikely we will see a new version of HDS G200-G800 series in the next 12 months.

HP (1 of 2) 

  • HP 3PAR 20000 was introduced August, 2015 and the previous generation system, 3PAR 10000 was introduced in June, 2012. This puts the 3PAR on a 38 month cycle. So it’s very unlikely we will see a new version of 3PAR in the next 12 months. 
  • MSA 1040 was introduced in Mar 2014. MSA 2040 was introduced in May 2013. This puts the MSA on ~10 month cycle. So it’s very likely we will see a new version of MSA in the next 12 months. 

NetApp (2 of 2)

  • FAS8080 EX was introduced June, 2014. FAS6200 was introduced in Feb, 2013. Which puts the high-end FAS systems on a 16 month cycle. So it’s very likely we will see a new version of high-end FAS in the next 12 months.
  • NetApp FAS8040-8060 series scale out systems were introduced in Feb 2014. FAS3200 series was introduced in Nov of 2012. Which puts the FAS systems on a 15 month cycle. A new midrange release seems overdue, so it’s very likely we will see a new version of mid-range FAS in the next 12 months.

Overall the likelihood of new hardware being released by major vendors is 2+3+1+1+2=9/15 or ~0.60 probability of new hardware in the next 12 months.
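The release-cycle reasoning above can be sketched in Python. The rule encoded in `new_hw_likely` (a refresh is likely when the last release plus the observed cycle falls inside the forecast window) is my reading of the argument, not something stated as a formula in this post. The helper names are hypothetical, and the horizon is the 12/8/2016 deadline from the question definition.

```python
from datetime import date

def months_between(earlier, later):
    """Whole calendar months from one release date to another (days ignored)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)

def new_hw_likely(prev_release, last_release, horizon_end):
    """True if last_release plus the observed cycle lands on or before horizon_end."""
    cycle = months_between(prev_release, last_release)
    return months_between(last_release, horizon_end) >= cycle

HORIZON = date(2016, 12, 8)  # end of the 12 month forecast window

# (previous generation, latest generation) release months quoted above
products = {
    "HP 3PAR": (date(2012, 6, 1), date(2015, 8, 1)),
    "HP MSA": (date(2013, 5, 1), date(2014, 3, 1)),
    "HDS VSP": (date(2010, 9, 1), date(2014, 4, 1)),
}
for name, (prev, last) in products.items():
    cycle = months_between(prev, last)
    verdict = "likely" if new_hw_likely(prev, last, HORIZON) else "unlikely"
    print(f"{name}: {cycle} month cycle, new hardware {verdict}")

# Tally from the post: 2 + 3 + 1 + 1 + 2 likely refreshes out of 15 product lines.
p_new_hw = (2 + 3 + 1 + 1 + 2) / 15
print(f"probability of new hardware: {p_new_hw:.2f}")
```

A two-point history is a thin basis for a cycle length, so the verdicts are best read as a first pass that judgement can override, as was done for EMC above.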

We can apply the same 0.60 to non-major storage vendors, which typically only have one storage system GA’d at a time. These include Coho Data, DataCore, Data Gravity, Dell, DDN, Fujitsu, Infinidat, NEC, Nexenta, NexGen Storage, Nimble, Pure, Qumulo, Quantum, SolidFire, Tegile, Tintri, Violin Memory, X-IO, and I am probably missing a couple more. So of these ~21 non-major/startup vendors, we are likely to see ~13 new (non-major) hardware systems in the next 12 months.

Some of these non-major systems are based on standard off-the-shelf, Intel server hardware and some vendors (Infinidat, Violin Memory & X-IO) have their own hardware designed systems. Of the 9 major vendor products identified above, six (IBM XIV, EMC VNX, EMC Isilon, EMC XtremIO, HP MSA and NetApp mid-range) use off the shelf, server hardware.

So all told my best guess is we should see (9+13=)22 new enterprise storage systems introduced in next 12 months from major and non-major storage vendors. 

3. How likely is it that Intel-Micron will come out with GA chip products in the next 6 months?

They claimed they were sampling products to vendors back at Flash Summit in August 2015. So it’s very likely (0.85 probability) that Intel-Micron will produce 3D XPoint chips in the next 6 months.

Some systems (IBM FlashSystems, NetApp high-end, and HUS VM) could make use of raw chips or even a new level of storage connected to a memory bus. But all of them could easily take advantage of a 3D XPoint device that was an NVMe PCIe connected storage.

But to be useable for most vendor storage systems being GA’d over the next year, any new chip technology has to be available for use in 6 months at the latest.

4. How likely is it that Intel-Micron will produce servers with 3D XPoint in the next 6 months?

Listening in at Flash Summit, this seems to be their preferred technological approach to market. And as most storage vendors use standard Intel servers this would seem to be the easiest way to adopt it. If the chips are available, I deem it 0.65 probability that Intel will GA server hardware in the next 6 months with 3D XPoint technology.

Not sure any of the major or non-major vendors above could possibly use server hardware introduced later than 6 months but Qumulo uses Agile development and releases GA code every 2 weeks, so they could take this on later than most.

But given the chip pricing, lack of significant advantage, and coding update requirements, I deem it 0.33 probability that vendors will adopt the technology even if it’s in a new server that they can use.

Summary

So there’s a 0.85 probability of chips available within 6 months for 3 potential major systems, which leaves us with 2.6 systems using 3D XPoint chip technology directly.

With a 0.65 probability of servers coming out in 6 months using 3D XPoint and a 0.45 probability of new storage systems adopting the technology for caching, there’s a 0.29 probability of adoption via servers. With 18 new systems coming out, that says 5.2 systems could potentially adopt the server technology.

For a total of 7.8 systems out of a potential 22 new systems or a 0.35 probability. 

That’s just the known GA non-major vendors and storage startups. What about the stealth(ier) startups without GA storage, like Primary Data? There’s probably 2 or 3 non-GA storage startups. And if we assume the same 0.6 of vendors will have GA hardware next year, that is an additional 1.8 systems. More than likely these will depend on standard servers, so the 0.65 probability of Intel servers applies. So it’s likely we will see an additional 1.2 systems here or a total of 9.0 new systems that will adopt 3D XPoint tech in the next 12 months.

So it’s 9 systems out of 23.8 or ~0.38 probable. So my forecast is Yes at 0.38 probable.
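For anyone checking the math, here is the summary tally as a Python sketch. It uses the probabilities that appear in the summary paragraphs (including the 0.45 caching adoption figure used there); the variable names are mine.

```python
p_chips = 0.85     # chips available within 6 months
p_servers = 0.65   # Intel servers with 3D XPoint within 6 months
p_adopt = 0.45     # storage system adopts it for caching (value used in the summary)
p_new_hw = 0.60    # vendor ships new hardware in the next 12 months

direct = p_chips * 3                      # systems that could use raw chips
via_servers = p_servers * p_adopt * 18    # server-based new systems
stealth_new = 3 * p_new_hw                # stealth startups shipping hardware
stealth_adopters = stealth_new * p_servers

adopters = direct + via_servers + stealth_adopters
candidates = 22 + stealth_new
p_yes = adopters / candidates
print(f"{adopters:.1f} of {candidates:.1f} systems, YES at {p_yes:.2f}")
```

Each line is an expected count, not a whole number of systems, which is why fractional systems appear in the totals.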

Pricing is a key factor here. I assumed a single price but it’s more likely a range of possibilities and factoring in a pricing range would be more accurate but I don’t know how, yet.

~~~~

I could go on for another 1000 words and still be no closer to an estimate. Somebody please check my math.

Comments?

Photo Credit(s): (iTech Androidi) 3D XPoint – Intel’s new Storage chip is 1000 faster than flash memory

An analyst forecasting contest ala SuperForecasting & 1st #Storage-QoW

I recently read the book SuperForecasting: the art and science of prediction by P. E. Tetlock & D. Gardner. Their Good Judgement Project has been running for years now and the book is the result of their experiments.  I thought it was a great book.

But it also got me to thinking, how can industry analysts do a better job at forecasting storage trends and events?

Impossible to judge most analyst forecasts

One thing the book mentioned was that typically analyst/pundit forecasts are too infrequent, vague and time independent to be judge-able as to their accuracy. I have committed this fault as much as anyone in this blog and on our GreyBeards on Storage podcast (e.g. see our Yearend podcast videos…).

What do we need to do differently?

The experiments documented in the book show us the way. One suggestion is to start putting time durations/limits on all forecasts so that we can better assess analyst accuracy. The other is to start estimating a probability for a forecast and updating your estimate periodically when new information becomes available. Another is to document your rationale for making your forecast. Also, do post mortems on both correct and incorrect forecasts to learn how to forecast better.

Finally, make more frequent forecasts so that accuracy can be assessed statistically. The book discusses Brier scores as a way of scoring the accuracy of forecasters.
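As a rough illustration of how such scoring works, here is a minimal Brier score sketch in Python. It uses the common binary form, the mean squared difference between the forecast probability of YES and the outcome (1 or 0), where 0 is perfect and lower is better. The book may use a variant that sums over both outcomes and so ranges up to 2. The outcomes in the example are made up purely for illustration.

```python
def brier_score(forecasts):
    """Mean squared error between P(YES) and the 0/1 outcome; lower is better."""
    return sum((p - outcome) ** 2 for p, outcome in forecasts) / len(forecasts)

# (probability of YES, what happened) -- hypothetical outcomes for illustration
example = [(0.38, 0), (0.47, 1), (0.85, 1)]
print(f"Brier score: {brier_score(example):.3f}")

# A forecaster who always says 0.5 scores 0.25 no matter what happens.
print(f"Always 0.5: {brier_score([(0.5, 0), (0.5, 1)]):.2f}")
```

The always-0.5 baseline is a useful yardstick: a forecaster scoring worse than 0.25 would have done better by hedging every question.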

How to be better forecasters?

In the back of the book the authors publish a list of helpful hints or guidelines to better forecasting which I will summarize here (read the book for more information):

  1. Triage – focus on questions where your work will pay off.  For example, try not to forecast anything that’s beyond say 5 years out, because there’s just too much randomness that can impact results.
  2. Split intractable problems into tractable ones – the authors call this Fermizing, after the physicist Enrico Fermi, who loved to ballpark answers to hard questions by breaking them down into easier questions to answer. So decompose problems into simpler (answerable) problems.
  3. Balance inside and outside views – search for comparisons (outside) that can be made to help estimate unique events and balance this against your own knowledge/opinions (inside) on the question.
  4. Balance over- and under-reacting to new evidence – as forecasts are updated periodically, new evidence should impact your forecasts. But a balance has to be struck as to how much new evidence should change forecasts.
  5. Search for clashing forces at work – in storage there are many ways to store data and perform faster IO. Search out all the alternatives, especially ones that can critically impact your forecast.
  6. Distinguish all degrees of uncertainty – there are many degrees of knowability, try to be as nuanced as you can and properly aggregate your uncertainty(ies) across aspects of the question to create a better overall forecast.
  7. Balance under/over confidence, prudence/decisiveness – rushing to judgement can be as bad as dawdling too long. You must get better at both calibration (how accurate multiple forecasts are) and resolution (decisiveness in forecasts). For calibration think weather rain forecasts: if rain tomorrow is 80% probable, then over time rain probability estimates should be on average correct. Resolution is no guts no glory: if all your estimates are between 0.4 and 0.6 probable, you’re probably being too conservative to really be effective.
  8. During post mortems, beware of hindsight bias – e.g., of course we were going to have flash in storage because the price was coming down, controllers were becoming more sophisticated, reliability became good enough, etc., represents hindsight bias. What was known before SSDs came to enterprise storage was much less than this.

There are a few more hints than the above.  In the Good Judgement Project, forecasters were put in teams and there’s one guideline that deals with how to be better forecasters on teams. Then, there’s another that says don’t treat these guidelines as gospel. And a third, on trying to balance between over and under compensating for recent errors (which sounds like #4 above).

Again, I would suggest reading the book if you want to learn more.

Storage analysts forecast contest

I think we all want to be better forecasters. At least I think so. So I propose a multi-year long contest, where someone provides a storage question of the week and analysts, such as myself, provide forecasts. Over time we can score the forecasts by creating a Brier score for each analyst’s set of forecasts.

I suggest we run the contest for 1 year to see if there’s any improvements in forecasting and decide again next year to see if we want to continue.

Question(s) of the week

But the first step in better forecasting is to have more frequent and better questions to forecast against.

I suggest that the analyst community come up with a question of the week. Then, everyone would get one week from publication to record their forecast. Over time as the forecasts come out we can then score analysts on their forecasting ability.

I would propose we use some sort of hash tag to track new questions, “#storage-QoW” might suffice and would stand for Question of the week for storage.

Not sure if one question a week is sufficient but that seems reasonable.

(#Storage-QoW 2015-001): Will 3D XPoint be GA’d in  enterprise storage systems within 12 months?

3D XPoint NVM was announced last July by Intel-Micron (wrote a post about it here). By enterprise storage I mean enterprise and mid-range class, shared storage systems that are accessed as block storage via Ethernet or Fibre Channel using SCSI device protocols, or as file storage using SMB or NFS file access protocols. By 12 months I mean by EoD 12/8/2016. By GA’d, I mean announced as generally available and sellable in any of the major IT regions of the world (USA, Europe, Asia, or Middle East).

I hope to have my prediction in by next Monday with the next QoW as well.

Anyone interested in participating please email me at Ray [at] SilvertonConsulting <dot> com and put QoW somewhere in the title. I will keep actual names anonymous unless told otherwise. Brier scores will be calculated starting after the 12th forecast.

Please email me your forecasts. Initial forecasts need to be in by one week after the QoW goes live.  You can update your forecasts at any time.

Forecasts should be of the form “[YES|NO] Probability [0.00 to 0.99]”.
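
For anyone who wants to check a forecast before emailing it, here is a small Python sketch that validates the form above. It assumes a line such as "YES 0.75" or "NO Probability 0.80" (the word "Probability" is treated as optional, which is my reading of the format), and it assumes the probability is the confidence in the stated answer, so a NO at 0.80 implies a 0.20 chance of YES. The function name is hypothetical.

```python
import re

# YES or NO, an optional literal "Probability", then 0.0 through 0.99
_FORECAST = re.compile(r"^\s*(YES|NO)\s+(?:Probability\s+)?(0\.\d{1,2})\s*$", re.IGNORECASE)

def parse_forecast(text):
    """Return (answer, probability, p_yes) or raise ValueError if malformed."""
    match = _FORECAST.match(text)
    if match is None:
        raise ValueError(f"not a valid forecast: {text!r}")
    answer = match.group(1).upper()
    probability = float(match.group(2))
    # Convert confidence in the stated answer into a probability of YES
    p_yes = probability if answer == "YES" else 1.0 - probability
    return answer, probability, p_yes

print(parse_forecast("YES 0.75"))
```

The p_yes value is what a scoring routine would compare against the eventual outcome.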

Better forecasting demands some documentation of your rationale for your forecasts. You don’t have to send me your rationale, but I suggest you document it someplace you can refer back to during post mortems.

Let me know if you have any questions and I will try to answer them here.

I could use more storage questions…

Comments?

Photo Credits: Renato Guerreiro, Crystalballer

Next generation NVM, 3D XPoint from Intel + Micron

Earlier this week Intel-Micron announced (see webcast here and here) a new, transistor-less NVM with 1000 times the speed of NAND [~10ns (nano-second) access times vs. ~10µsec access time for NAND] and 10X the density of DRAM (currently 16Gb/DRAM chip). They call the new technology 3D XPoint™ (cross-point) NVM (non-volatile memory).

In addition to the speed and density advantages, 3D XPoint NVM also doesn’t have the endurance problems associated with today’s NAND. Intel and Micron say that it has 1000 times the endurance of today’s NAND (MLC NAND endurance is ~3000 write (P/E) cycles).
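
To see what those multipliers imply, here is a back-of-the-envelope Python sketch using only the figures quoted above. The results are what the announced claims would work out to if taken at face value, not measured specs, and the variable names are mine.

```python
# Baseline figures quoted in this post
NAND_ACCESS_NS = 10_000       # ~10 µsec NAND access time, in nanoseconds
DRAM_GBIT_PER_CHIP = 16       # current DRAM density per chip
MLC_NAND_PE_CYCLES = 3_000    # ~3000 write (P/E) cycles for MLC NAND

# Multipliers claimed in the announcement
SPEEDUP = 1_000
DENSITY_GAIN = 10
ENDURANCE_GAIN = 1_000

xpoint_access_ns = NAND_ACCESS_NS // SPEEDUP               # 10 ns
xpoint_gbit_per_chip = DRAM_GBIT_PER_CHIP * DENSITY_GAIN   # 160 Gb
xpoint_pe_cycles = MLC_NAND_PE_CYCLES * ENDURANCE_GAIN     # 3,000,000 cycles

print(f"implied access time: ~{xpoint_access_ns} ns")
print(f"implied density:     ~{xpoint_gbit_per_chip} Gb/chip")
print(f"implied endurance:   ~{xpoint_pe_cycles:,} P/E cycles")
```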

At 10X current DRAM density, it’s roughly equivalent to today’s MLC/TLC NAND capacities/chip. And at 1000 times the speed of NAND, it’s roughly equivalent in performance to DDR4 DRAM. Of course, because it’s non-volatile it should take much less power to use than current DRAM technology, as there’s no need for refresh power.

We have talked about the end of NAND before (see The end of NAND is here, maybe). If this is truly more scalable than NAND, it seems to me that it does signal the end of NAND. It’s just a matter of time before endurance and/or density growth of NAND hits a wall and then 3D XPoint can do everything NAND can do but better, faster and more reliably.

3D XPoint technology

The technology comes from a dual layer design which is divided into columns. At the top and bottom of the columns are accessor connections in an orthogonal pattern that together form a grid to access a single bit of memory. This also means that 3D XPoint NVM can be read and written a bit at a time (rather than a “page” at a time as with NAND) and doesn’t have to be erased before it can be written, as NAND does.
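
The bit-at-a-time access described above can be pictured with a toy Python model. This is purely a conceptual sketch of grid addressing (layer, row wire, column wire), not how the actual silicon or its controller works; the class and method names are hypothetical.

```python
class CrossPointArray:
    """Toy model: each cell sits where one row wire crosses one column wire."""

    def __init__(self, layers, rows, cols):
        self.layers, self.rows, self.cols = layers, rows, cols
        self._cells = [[[0] * cols for _ in range(rows)] for _ in range(layers)]

    def _check(self, layer, row, col):
        if not (0 <= layer < self.layers and 0 <= row < self.rows and 0 <= col < self.cols):
            raise IndexError("address outside the array")

    def write_bit(self, layer, row, col, value):
        # A single bit is overwritten in place; no page erase step is modeled
        self._check(layer, row, col)
        self._cells[layer][row][col] = 1 if value else 0

    def read_bit(self, layer, row, col):
        self._check(layer, row, col)
        return self._cells[layer][row][col]

array = CrossPointArray(layers=2, rows=4, cols=4)
array.write_bit(1, 2, 3, 1)      # set one bit
array.write_bit(1, 2, 3, 0)      # flip it back without touching its neighbors
print(array.read_bit(1, 2, 3))
```

By contrast, a NAND model would have to erase a whole block before rewriting any page in it.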

The 3D nature of the new NVM comes from the fact that you can build up as many layers as you want of these structures to create more and more NVM cells. The microscopic pillars between the two layers of wiring each include a memory cell and a switch component, which allows a bit of data to be accessed (via the switch) and stored/read (memory cell). In the photo above the yellow material is a switch and the green material is a memory cell.

A memory cell operates by using a bulk property change of the material, unlike DRAM (capacitors to hold memory values) or NAND (floating gates of electrons). As such it uses all of the material to hold a memory value, which should allow 3D XPoint memory cells to scale downwards much better than NAND or DRAM.

Intel and Micron are calling the new 3D XPoint NVM storage AND memory. That is, it is suitable for fast access, non-volatile data storage and for non-volatile processor memory.

3D XPoint NVM chips in manufacturing today

First chips with the new technology are being manufactured today at Intel-Micron’s joint manufacturing fab in Idaho. The first chips will supply 128Gb of NVM and use just two layers of 3D XPoint memory.

Intel and Micron will independently produce system products (read SSDs or NVM memory devices) with the new technology during 2016. They mentioned during the webcast that the technology is expected to be attached (as SSDs) to a PCIe bus and use NVMe as an interface to read and write it. Although if it’s used in a memory application, it might be better attached to the processor memory bus.

The expectation is that the 3D XPoint cost/bit will be somewhere in between NAND and DRAM, i.e. more expensive than NAND but less expensive than DRAM. It’s nice to be the only companies in the world with a new, better storage AND memory technology.

~~~~

Over the last 10 years or so, SSDs (solid state devices) all used NAND technologies of one form or another, but after today SSDs can be made from NAND or 3D XPoint technology.

Some expected uses for the new NVM are in gaming applications (currently storage speed and memory constrained) and for in-memory databases (which are memory size constrained). There was mention on the webcast of edge analytics as well.

Welcome to the dawn of a new age of computer storage AND memory.

Photo Credits: (c) 2015 Intel and Micron, from Intel’s 3D XPoint website