Research reveals ~liquid nitrogen temperature molecular magnets with 100X denser storage

Must be on a materials science binge these days. I read another article this week, “Major leap towards data storage at the molecular level”, reporting on a Nature article, “Molecular magnetic hysteresis at 60K”, where researchers from the University of Manchester, led by Dr David Mills and Dr Nicholas Chilton from the School of Chemistry, have come up with a new material that provides molecular-level magnetics at almost liquid nitrogen temperatures.

Previously, molecular magnets only operated at 4 to 14K (kelvin), based on research done over the last 25 years or so, but this new research shows similar effects operating at ~60K, or close to liquid nitrogen temperatures. Nitrogen freezes at 63K and boils at ~77K, so it is liquid between those temperatures.

What new material

The new material, “hexa-tert-butyldysprosocenium complex—[Dy(Cpttt)2][B(C6F5)4], with Cpttt = {C5H2tBu3-1,2,4} and tBu = C(CH3)3“, dysprosocenium for short was designed (?) by the researchers at Manchester and was shown to exhibit magnetism at the molecular level at 60K.

The storage effect is hysteresis, which is a material’s ability to remember the last (magnetic/electrical/?) field it was exposed to. The magnetic field is measured in oersteds.

The researchers claim the new material provides magnetic hysteresis at a sweep level of 22 oersteds. Not sure what “sweep level of 22 oersteds” means but I assume a molecule of the material is magnetized with a field strength of 22 oersteds and retains this magnetic field over time.

Reports of disk’s death have been greatly exaggerated

While there seems to be no end in sight for the densities of flash storage these days with 3D NAND (see my 3D NAND, how high can it go post or listen to our GBoS FMS2017 wrap-up with Jim Handy podcast), the disk industry lives on.

Disk industry researchers have been investigating HAMR, ([laser] heat assisted magnetic recording, see my Disk density hits new record … post) for some time now to increase disk storage density. But to my knowledge HAMR has not come out in any generally available disk device on the market yet. HAMR was supposed to provide the next big increase in disk storage densities.

Maybe they should be looking at CAMMR, or cold assisted magnetic molecular recording (heard it here, 1st).

According to Dr Chilton, using the new material at 60K in a disk device would increase capacity by 100X. Western Digital just announced a 20TB MyBook Duo disk system for desktop storage and backup. With this new material, at 100X current densities, we could have a 2PB MyBook Duo storage system on the desktop.

That should keep my ever increasing video-photo-music library in fine shape and everything else backed up for a little while longer.


Photo Credit(s): Molecular magnetic hysteresis at 60K, Nature article


Intel’s Optane (3D Xpoint) SSD specs in the wild

Read an article the other day in Ars Technica (Specs for 1st Intel 3DX SSD…) about a preview of the Intel Optane specs for their 375GB 3D Xpoint (3DX) flash card. The device is an NVMe compliant, PCIe Gen3 add-in card in a half height, half length, low profile form factor.

Intel’s Optane SSD vs. the competition

A couple of items from the Intel Optane spec sheet of interest to me as a storage guru:

  • 30 Drive writes per day/12.3 PBW (written) – 3DX, at launch, had advertised that it would have 1000 times the endurance of (2D-MLC?) NAND. Current flash cards (see Samsung SSD PRO NVMe 256GB Flash card specs) offer about 200TBW (for 256GB card) or 400TBW (for 512GB card). The Samsung PRO is based on 3D (V-)NAND, so its endurance is much better than 2D-MLC at these densities. That being said, the Optane drive still has roughly 30X to 60X the write endurance of the PRO 950, depending on which capacity you compare against. Not quite 1000 but certainly significantly better.
  • Sequential (bandwidth) performance (R/W) of 2400/2000 MB/sec – 3DX advertised 1000 times the performance of (2D-MLC, non-NVMe?) NAND. Current 3D (V-)NAND cards (see Samsung SSD PRO above) offer (R/W) 2200/900 MB/sec for an NVMe device. The Optane’s read bandwidth is a slight improvement but the write bandwidth is a 2.2X improvement over current competitive devices.
  • Random 4KB IOPs performance (R/W) of 550K/500K – Similar to the previous bulleted item, 3DX advertised 1000 times the performance of (2D-MLC, non-NVMe?) NAND. Current 3D (V-)NAND cards like the Samsung SSD PRO offer Random 4KB IOPs performance (R/W) of 270K/85K IOPS (@4 threads). Optane’s random 4KB read IOPs performance is 2X the PRO 950 but its write performance is ~5.9X better.
  • IO latency of <10 µsec. – 3DX advertised 10X better latency than the current (2D-MLC, non-NVMe) flash drives. According to storage review (Samsung 950 Pro M.2), the Samsung PRO 950 had a latency of ~22 µsec. Optane has at least 2X better latency than the current competition.
  • Density 375GB/HH-HL-LP – 3DX advertised 10X the density of (then current) DRAM. Today Micron offers a 4GiB DDR4/288 pin DIMM which is probably 1/2 the size of the HH flash drive. So maybe in the same space this could be 8GiB. This says that the Optane is roughly 47X (375GB vs. 8GiB) denser than today’s DRAM.
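The ratios in the bullets above are simple divisions, so here is a minimal Python sketch that recomputes them from the quoted spec figures. The dictionary keys and the `ratios` helper are made up for illustration, the endurance figure uses the 512GB Samsung card (400TBW), and the Optane latency is taken at its 10 µsec upper bound, so treat the output as a rough check rather than a benchmark.

```python
# Figures quoted in the bullets above; key names are illustrative only.
optane = {
    "seq_read_mbs": 2400, "seq_write_mbs": 2000,
    "rand_read_iops": 550_000, "rand_write_iops": 500_000,
    "latency_us": 10,          # spec says <10 usec, so this is an upper bound
    "endurance_tbw": 12_300,   # 12.3 PBW
}
samsung_950_pro = {
    "seq_read_mbs": 2200, "seq_write_mbs": 900,
    "rand_read_iops": 270_000, "rand_write_iops": 85_000,
    "latency_us": 22,
    "endurance_tbw": 400,      # assumption: compare against the 512GB card
}


def ratios(new, old):
    """Return the improvement factor per metric (latency is inverted: lower is better)."""
    out = {}
    for key in new:
        if key == "latency_us":
            out[key] = old[key] / new[key]
        else:
            out[key] = new[key] / old[key]
    return out


optane_vs_samsung = ratios(optane, samsung_950_pro)
for metric, factor in optane_vs_samsung.items():
    print(f"{metric}: {factor:.1f}X")
```

Swapping in the 256GB card's 200TBW doubles the endurance ratio, which is why that comparison is best read as a range.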

Please note, when 3DX was launched, ~2 years ago, the then current NAND technology was 2D-MLC and NVMe was just a dream. So comparing launch claims against today’s current 3D-NAND, NVMe drives is not a fair comparison.

Nevertheless, the Optane SSD performs considerably better than current competitive NVMe drives and has significantly better endurance than current 3D (V-)NAND flash drives. All of which is a great step in the right direction.

What about DRAM replacement?

At launch, 3DX was also touted as a higher density, potential replacement for DRAM. But so far we haven’t seen any specs for what 3DX NVM looks like on a memory bus. It has much better density than DRAM, but we would need to see 3DX memory access times under 50ns to have a future as a DRAM replacement. Optane’s NVMe SSD at 10 µsec. is about 200X too slow, but then again it’s not a memory device configuration nor is it attached to a memory bus.


Photo Credit(s):  Intel Optane Spec sheet from Ars Technica Article,  DDR4 DRAM from Wikimedia user:Dsimic

5D storage for humanity’s archive

A group of researchers at the University of Southampton in the UK have invented a new type of optical recording, based on femto-second laser pulses and silica/quartz media, that can store up to 300TB per (1″ diameter) disc platter with thermal stability at up to 1000°C or a media life of up to 13.8B years at room temperature (190°C?). The claim is that the memory device could outlive humanity and maybe the universe.

The new media/recording technique was used recently to create copies of text files (Holy Bible, pictured above). Other significant humanitarian, political and scientific treatises have also been stored on the new media. The new device has been nicknamed “Superman Memory Crystal”, due to the likeness of the memory glass (quartz) to Superman’s memory crystals.

We have written before on long term archives (see Super Long Term Archive and Today’s data and the 1000 year archive posts) but this one beats them all by many orders of magnitude.

(Storage QoW 15-003): SMR disks in GA enterprise storage in 12 months? Yes@.85 probability

Hard Disk by Jeff Kubina (cc) (from Flickr)

(Storage QoW 15-003): Will we see SMR (shingled magnetic recording) disks in GA enterprise storage systems over the next 12 months?

Are there two vendors of SMR?

Yes, both Seagate and HGST have announced and are currently shipping (?) SMR drives. HGST has a 10TB drive and Seagate has had an 8TB drive on the market since last summer.

One other interesting fact is that SMR will be the common format for all future disk head technologies including HAMR, MAMR, & BPMR (see presentation).

What would storage vendors have to do to support SMR drives?

Because of the nature of SMR disks, writes overlap other tracks so they must be written, at least in part, sequentially (see our original post on Sequential only disks). Another post I did reported on recent work by Garth Gibson at CMU (Shingled Magnetic Recording disks), which showed how multiple bands or zones on an SMR disk could be used, some of which could be written randomly and others which could only be written sequentially, while all could be read randomly. With such an approach you could have a reasonable file system on an SMR device with a metadata partition (randomly writeable) and a data partition (sequentially writeable).

In order to support SMR devices, changes have been requested for the T10 SCSI  & T13 ATA command protocols. Such changes would include:

  • SMR devices support a new write cursor for each SMR sequential band.
  • SMR devices support sequential writes within SMR sequential bands at the write cursor.
  • SMR band write cursors can be read, statused and reset to 0. SMR sequential band LBA writes only occur at the band cursor and for each LBA written, the SMR device increments the band cursor by one.
  • SMR devices can report their band map layout.
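To make the write cursor idea concrete, here is a toy Python model of a single sequential band. It is only a sketch of the behavior listed above: the `SequentialBand` class and its method names are hypothetical and do not correspond to actual T10/T13 commands.

```python
class SequentialBand:
    """Toy model of one SMR sequential band with a write cursor (hypothetical API)."""

    def __init__(self, start_lba, length):
        self.start_lba = start_lba
        self.length = length
        self.cursor = 0      # offset of the next writable LBA within the band
        self.blocks = {}     # offset within band -> data

    def write(self, lba, data):
        """Accept a write only at the cursor, then advance the cursor by one LBA."""
        offset = lba - self.start_lba
        if offset != self.cursor:
            raise ValueError(
                f"write at LBA {lba} rejected; cursor is at LBA {self.start_lba + self.cursor}"
            )
        if self.cursor >= self.length:
            raise ValueError("band is full; reset the cursor to rewrite it")
        self.blocks[offset] = data
        self.cursor += 1

    def read(self, lba):
        """Reads may be random, as long as the LBA has been written."""
        return self.blocks[lba - self.start_lba]

    def reset(self):
        """Reset the cursor to 0, discarding the band's contents."""
        self.cursor = 0
        self.blocks.clear()


band = SequentialBand(start_lba=1000, length=4)
band.write(1000, b"a")
band.write(1001, b"b")
print(band.cursor)       # cursor has advanced past the two written LBAs
print(band.read(1000))   # random read of an already written LBA
```

A write to LBA 1003 at this point would raise an error, because the cursor sits at LBA 1002.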

The presentation refers to multiple approaches to SMR support or SMR drive modes:

  • Restricted SMR devices – where the device will not accept any random writes, all writes occur at a band cursor, random writes are rejected by the device. But performance would be predictable. 
  • Host Aware SMR devices – where the host using the SMR devices is aware of SMR characteristics and actively manages the device using write cursors and band maps to write the most data to the device. However, the device will accept random writes and will perform them for the host. This will result in sub-optimal and non-predictable drive performance.
  • Drive managed SMR devices – where the SMR device acts like a randomly accessed disk device but maps random writes to sequential writes internally using virtualization of the drive LBA map, not unlike what SSDs do today. These devices would be backward compatible with today’s disk devices, but drive performance would be bad and non-predictable.
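The difference between the three drive modes comes down to what happens when a write does not land on a band cursor. A rough Python sketch, with a made-up `Mode` enum and `handle_write` function standing in for real device behavior:

```python
from enum import Enum


class Mode(Enum):
    RESTRICTED = "restricted"
    HOST_AWARE = "host aware"
    DRIVE_MANAGED = "drive managed"


def handle_write(mode, lba, cursor_lba):
    """Return how a toy SMR device in the given mode treats a write (illustrative only)."""
    if mode is Mode.DRIVE_MANAGED:
        # Host never sees cursors; the drive remaps every write internally.
        return "remapped"
    if lba == cursor_lba:
        return "sequential"
    # Random write: restricted devices reject it, host aware devices accept it slowly.
    return "rejected" if mode is Mode.RESTRICTED else "accepted_slow"


for mode in Mode:
    print(mode.value, "->", handle_write(mode, lba=500, cursor_lba=1002))
```

In all three cases the media is still written sequentially; the modes only differ in who absorbs the cost of a random write.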

Unclear which of these drive modes are currently shipping, but I believe Restricted SMR device modes are already available and drive manufacturers would be working on Host Aware and Drive managed to help adoption.

So assuming Restricted SMR device mode availability and prototypes of T10/T13 changes are available, then there are significant but known changes for enterprise storage systems to support SMR devices.

Nevertheless, a number of hybrid storage systems already implement Log Structured File (LSF) systems on their backends, which mostly write sequentially to backend devices, so moving to SMR restricted device mode would be easier for these systems.

Unclear how many storage systems have such a back end, but NetApp uses it for WAFL and just about every other hybrid startup has a LSF format for their backend layout. So being conservative let’s say 50% of enterprise hybrid storage vendors use LSF.

The other 50% would have more of a problem implementing SMR restricted mode devices, but it’s only a matter of time before all will need to go that way. That is assuming they still use disks. So, we are primarily talking about hybrid storage systems.

All major storage vendors support hybrid storage and about 60% of startups support hybrid storage, so adding these together, maybe about 75% of enterprise storage vendors have hybrid.

Using analysis on QoW 15-001, about 60% of enterprise storage vendors will probably ship new hardware versions of their systems over the next 12 months. So of the 13 likely new hardware systems over the next 12 months, 75% have hybrid solutions and 50% have LSF, or ~4.9 new hardware systems will be released over the next 12 months that are hybrid and have LSF backends already.

What are the advantages of SMR?

SMR devices will have higher storage densities and lower cost. Today disk drives are running 6-8TB and the SMR devices run 8-10TB so a 25-30% step up in storage capacity is possible with SMR devices.

New drive support has in the past been relatively easy because command sets/formats haven’t changed much over the past 7 years or so, but SMR is different and will take more effort to support. The fact that all new drives will be SMR over time gives more incentive to get on the bandwagon as soon as feasible. So, I would give a storage vendor an 80% likelihood of implementing SMR, assuming they have new systems coming out, are already hybrid and are already using LSF.

So the ~4.9 systems that are LSF/hybrid/being released, times 0.8, says ~3.9 systems will introduce SMR devices over the next 12 months.

For non-LSF hybrid systems, the effort seems much harder, so I would give the likelihood of implementing SMR about a 40% chance. So of the ~8.1 systems left that will be introduced in next year, 75% are hybrid or ~6.1 systems and they have a 40% likelihood of implementing SMR so ~2.4 of these non-LSF systems will probably introduce SMR devices.

There’s one other category that we need to consider and that would be startups in stealth. These could have been designing their hybrid storage for SMR from the get go. In the QoW 15-001 analysis I assumed another ~1.8 startup vendors would emerge to GA over the next 12 months. If we assume that 75% of these are hybrid, then there are ~1.4 (1.8 × 0.75) startup vendors that could be using SMR technology in their hybrid storage, for a total of 4.9 + 2.4 + 1.4 = 8.7 systems with a high probability of SMR implementation over the next 12 months in GA enterprise storage products.
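For anyone who wants to rerun the estimate with different assumptions, the chain of multiplications in this analysis can be sketched in a few lines of Python. All inputs are the post's own figures and the variable names are mine. The posted total adds the ~4.9 LSF hybrid systems directly; the sketch also shows the total if the 80%-weighted ~3.9 figure is used instead.

```python
# Inputs taken from the analysis above.
new_systems = 13         # likely new hardware systems over the next 12 months
hybrid_share = 0.75      # share of vendors with hybrid storage
lsf_share = 0.50         # share of hybrid vendors with an LSF backend
p_smr_lsf = 0.80         # likelihood an LSF hybrid system adds SMR
p_smr_non_lsf = 0.40     # likelihood a non-LSF hybrid system adds SMR
stealth_startups = 1.8   # startups expected to reach GA

lsf_hybrid = new_systems * hybrid_share * lsf_share    # ~4.9
lsf_with_smr = lsf_hybrid * p_smr_lsf                  # ~3.9
remaining = new_systems - lsf_hybrid                   # ~8.1
non_lsf_hybrid = remaining * hybrid_share              # ~6.1
non_lsf_with_smr = non_lsf_hybrid * p_smr_non_lsf      # ~2.4
stealth_hybrid = stealth_startups * hybrid_share       # ~1.4

total_as_posted = lsf_hybrid + non_lsf_with_smr + stealth_hybrid
total_weighted = lsf_with_smr + non_lsf_with_smr + stealth_hybrid

print(f"as posted: {total_as_posted:.1f} systems")
print(f"using the 80%-weighted LSF figure: {total_weighted:.1f} systems")
```

Either way, several systems are expected, which is what drives the high probability in the forecast.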


So my forecast of SMR adoption by enterprise storage is Yes for .85 probability (unclear what the probability should be, but it’s highly probable).



Microsoft Exchange database backup performance – chart of the month

Microsoft Exchange 1001-5000 mailboxes, top 10 database backup per server
In last month’s Storage Intelligence newsletter we discussed the latest Exchange storage system performance for 1001 to 5000 mailboxes. One of the charts we updated was the above Exchange database backup on a per server basis. There were two new submissions for this quarter, and both the Dell PowerEdge R730xd (#2 above) and the HP D3600 drive shelf with P441 storage controller (#10) ranked well on this metric.

This ESRP reported metric only measures backup throughput at a server level. However, because these two new submissions only had one server, it’s not as much of a problem here.

The Dell system had a SAS connected JBOD with 14-4TB 7200RPM disks and the HP system had a SAS connected JBOD with 11-6TB 7200RPM disks. The other major difference is that the HP system had 4GB of “flash backed write cache” and the Dell system only had 2GB of  “flash backed cache”.

As far as I can tell, the fact that the Dell storage managed ~2.3GB/sec. and the HP storage only managed ~1.1GB/sec. is probably due more to their respective drive configurations than anything else.

RAID 0 vs. RAID 1

One surprising characteristic of the HP setup is that they used RAID 0 while the Dell system used RAID 1. This would offer a significant benefit to the Dell system during heavy read activity, but as I understand it, the database backup activity is run with a standard email stress environment. So in this case, there is a healthy mix of reads/writes going on at the time of the backup activity. The Dell system would have an advantage for reads and a penalty for writes (writing two copies of all data), so Dell’s RAID advantage is probably a wash.

Whether RAID 0 vs. RAID 1 would have made any difference to other ESRP metrics (database transfers per second, read/write/log access latencies, log processing, etc.) is subject for another post.

Of course, with Exchange DAGs there’s built in database redundancy so maybe RAID 0 is an OK configuration for some customers. Software based redundancy does seem to be Microsoft’s direction, at least since Exchange 2010, so maybe I’m the one that’s out of touch.

Still for such a small configuration I’m not sure I would have gone with RAID 0…


Facebook down to 1.08 PUE and counting for cold storage

Read a recent article in ArsTechnica about Facebook’s cold storage archive and their sustainable data centers (How Facebook puts petabytes of old cat pix on ice in the name of sustainability). In the article there was a statement that Facebook had achieved a 1.08 PUE (Power Usage Effectiveness) for one of these data centers. This means for every 100 Watts used to power up racks, Facebook needed to add 8 Watts for other overhead.

Just last year I wrote a paper for a client where I interviewed the CEO of an outsourced data center provider (DuPont Fabros Technology) whose state of the art new data centers were achieving a PUE of from 1.14 to 1.18. For Facebook to run their cold storage data centers at 1.08 PUE is even better.
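PUE is just total facility power divided by the power delivered to the IT equipment, so the overhead numbers are easy to check. A small Python sketch (the function names are mine) using the PUE figures quoted above:

```python
def pue(it_power_watts, overhead_watts):
    """PUE = total facility power / IT equipment power."""
    return (it_power_watts + overhead_watts) / it_power_watts


def overhead_per_100w(pue_value):
    """Watts of overhead needed for every 100 Watts delivered to the racks."""
    return (pue_value - 1.0) * 100.0


print(f"{pue(100, 8):.2f}")   # 100W of rack power plus 8W of overhead
for value in (1.08, 1.14, 1.18):
    print(f"PUE {value}: {overhead_per_100w(value):.0f}W overhead per 100W of rack power")
```

So moving from a 1.14 to 1.18 PUE down to 1.08 roughly halves the overhead per Watt of IT load.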

At the moment, Facebook has two cold storage data centers, one at Prineville, OR and the other at Forest City, NC (Forest City achieved the 1.08 PUE). The two cold data storage sites add to the other Facebook data centers that handle everything else in the Facebook universe.

MAID to the rescue

First off these are just cold storage data centers, over an EB of data, but still archive storage, racks and racks of it. How they decide something is cold or hot seems to depend on last use. For example, if a picture has been referenced recently then it’s warm, if not then it’s cold.

Second, they have taken MAID (massive array of idle disks) to a whole new data center level. Each 1U shelf (Knox storage tray) has 30 4TB drives and a rack has 16 of these storage trays, holding 1.92PB of data. At any one time, only one drive in each storage tray is powered up. The racks have dual servers and only one power shelf (due to the reduced power requirements).
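The rack numbers above can be checked with a few lines of Python. This is just the arithmetic from the paragraph (variable names are mine), plus the fraction of drives spinning at any one time under the one-drive-per-tray rule:

```python
drives_per_tray = 30
drive_capacity_tb = 4
trays_per_rack = 16
active_drives_per_tray = 1   # only one drive per tray is powered up at a time

total_drives = drives_per_tray * trays_per_rack
rack_capacity_pb = total_drives * drive_capacity_tb / 1000   # decimal TB -> PB
active_drives = active_drives_per_tray * trays_per_rack
active_fraction = active_drives / total_drives

print(f"{total_drives} drives per rack, {rack_capacity_pb:.2f}PB raw")
print(f"{active_drives} drives powered up ({active_fraction:.1%} of the rack)")
```

With only about 3% of the drives spinning, it is easier to see why a single power shelf per rack is enough.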

They also use pre-fetch hints provided by the Facebook application to cache user data. This means they will fetch some images ahead of time, when users are paging through photos in a stream, in order to have them in cache when needed. After the user looks at or passes up a photo, it is jettisoned from cache and the next photo is pre-fetched. When the disks are no longer busy, they are powered down.

Fewer power conversions lower PUE

Another thing Facebook is doing is reducing the number of power conversions that need to happen to power racks. In a typical data center power comes in at 480 Volts AC,  flows through the data center UPS and then is dropped down to 208 Volts AC at the PDU which flows to the rack power supply which is then converted to 12 Volts DC.  Each conversion of electricity generally sucks up power and in the end only 85% of the energy coming in reaches the rack’s servers and storage.

In Facebook’s data centers, 480 Volts AC is channeled directly to the racks, which have an in rack battery backup/UPS, and the rack’s power bus converts the 480 Volt AC to 12 Volt DC or AC directly as needed. By cutting out the data center level UPS and the PDU energy conversion they save lots of energy overhead which can be used to better power the racks.

Free air cooling helps

Facebook data centers like Prineville also make use of “fresh air cooling” that mixes data center air with outside air, which flows through “wetted media” to cool and is then sent down to cool the racks by convection. This process keeps the rack servers and storage within the proper temperature range, but they probably run hotter than in most data centers this way. How much fresh air is brought in depends on outside temperature, but during most months, it works very well.

This is in contrast to standard data centers that use chillers, fans and pumps to keep the data center air moving, conditioned and cold enough to chill the equipment. All those fans, pumps and chillers can consume a lot of energy.

Renewable energy, too

Lately, Facebook has made obtaining renewable energy to power their data centers a high priority. One new data center close to the Arctic Circle was built there because of hydro-power, another in Iowa and one in Texas were built in locations with wind power.

All of this technology, open sourced

Facebook has open sourced all of its hardware and data center systems. That is, the specifications for all the hardware discussed above and more are available from the Open Compute Organization, including the storage specification(s), open rack specification(s) and data center specification(s) for these data centers.

So if you want to build your own cold storage archive that can achieve 1.08 PUE, just pick up their specs and have at it.


Picture Credits: DataCenterKnowledge.Com


3D NAND, how high can it go?

I was at the Flash Memory Summit a couple of weeks ago and a presenter (from Hynix, I think) got up and talked about how 3D NAND was going to be the way forward for all NAND technology. I always thought we were talking about a handful of layers. But on the slide he had what looked to be a skyscraper block with 20-40 layers of NAND.

Currently shipping 3D NAND

It seems all the major NAND fabs are shipping 30+ layer 3D NAND. Samsung last year said they were shipping 32-layer 3D (V-)NAND. Toshiba announced earlier this year that they had 48-layer 3D NAND. Hynix is shipping 36-layer 3D NAND. Micron-Intel is also shipping 32-layer 3D NAND. Am I missing anyone?

Samsung also said that they will be shipping a 32GB, 48-layer V-NAND chip later this year. Apparently, Samsung is also working on 64-layer V-NAND in their labs and are getting good results.  In an article on Samsung’s website they mentioned the possibility of 100 layers of NAND in a 3D stack.

The other NAND fabs are also probably looking at adding layers to their 3D NAND but aren’t talking as much about it.

Earlier this year on a GreyBeards on Storage Podcast we talked with Jim Handy, Director at Objective Analysis on what was going on in NAND fabrication. Talking with Jim was fascinating but one thing he said was that with 3D NAND, building a hole with the right depth, width and straight enough was a key challenge. At the time I was thinking a couple of layers deep. Boy was I wrong.

How high/deep can 3D NAND go?

On the podcast, Jim said he thought that 3D NAND would run out of gas around 2023. Given current press releases, it seems NAND fabs are adding ~16 layers a year to their 3D-NAND.

So if 32 to 48 layers is today’s 3D-NAND and we can keep adding 16 layers/year through 2023, that’s 8 years × 16 layers, or an additional 128 layers on top of the 32 to 48 layers currently shipping. At that rate we should get to 160 to 176 layer 3D NAND chips. And if 48 layers is 32GB, then maybe we could see 100+GB 3D NAND chips.

This of course assumes that there is no loss in capacity per layer as we increase layers, and that the industry can continue to add 16 layers/year to 3D-NAND chips.

I suppose there’s one other proviso, that nothing else comes along that is less expensive to fabricate while still providing ever increasing capacity of lightning fast, non-volatile storage (see a recent post on 3D XPoint NVM technology).
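Here is the projection as a small Python sketch, under the same assumptions stated above: a steady 16 layers/year for 8 years and capacity that scales linearly with layer count from the 48-layer, 32GB chip. The function names are mine.

```python
def project_layers(current_layers, layers_per_year=16, years=8):
    """Layer count after adding a fixed number of layers each year."""
    return current_layers + layers_per_year * years


def project_capacity_gb(layers, ref_layers=48, ref_gb=32):
    """Scale chip capacity linearly with layers (assumes no per-layer capacity loss)."""
    return ref_gb * layers / ref_layers


low_layers = project_layers(32)
high_layers = project_layers(48)
print(f"{low_layers} to {high_layers} layers")
print(f"{project_capacity_gb(low_layers):.0f}GB to {project_capacity_gb(high_layers):.0f}GB per chip")
```

Change `layers_per_year` or `years` to see how sensitive the end point is to the pace the fabs can sustain.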

Photo Credit(s):

  1. Micron’s press release on 3D NAND, (c) 2015 Micron
  2. Toshiba’s press release as reported by AnandTech, (c) 2015 Toshiba

Next generation NVM, 3D XPoint from Intel + Micron

Earlier this week Intel-Micron announced (see webcast here and here) a new, transistor-less NVM with 1000 times the speed of NAND [~10ns (nanosecond) access times vs. ~10µsec for NAND] and 10X the density of DRAM (currently 16Gb/DRAM chip). They call the new technology 3D XPoint™ (cross-point) NVM (non-volatile memory).

In addition to the speed and density advantages, 3D XPoint NVM also doesn’t have the endurance problems associated with today’s NAND. Intel and Micron say that it has 1000 times the endurance of today’s NAND (MLC NAND endurance is ~3000 write (P/E) cycles).

At 10X current DRAM density it’s roughly equivalent to today’s MLC/TLC NAND capacities/chip. And at 1000 times the speed of NAND, it’s roughly equivalent in performance to DDR4 DRAM. Of course, because it’s non-volatile it should take much less power to use than current DRAM technology, as there is no need for power refresh.

We have talked about the end of NAND before (see The end of NAND is here, maybe). If this is truly more scaleable than NAND, it seems to me that it does signal the end of NAND. It’s just a matter of time before endurance and/or density growth of NAND hits a wall and then 3D XPoint can do everything NAND can do but better, faster and more reliably.

3D XPoint technology

The technology comes from a dual layer design which is divided into columns and at the top and bottom of the columns are accessor connections in an orthogonal pattern that together form a grid to access a single bit of memory.  This also means that 3D Xpoint NVM can be read and written a bit at a time (rather than a “page” at a time with NAND) and doesn’t have to be initialized to 0 to be written like NAND.

The 3D nature of the new NVM comes from the fact that you can build up as many layers as you want of these structures to create more and more NVM cells. The microscopic pillar between the two layers of wiring includes a memory cell and a switch component which allows a bit of data to be accessed (via the switch) and stored/read (memory cell). In the photo above the yellow material is a switch and the green material is a memory cell.

A memory cell operates by using a bulk property change of the material, unlike NAND (floating gates of electrons) or DRAM (capacitors to hold memory values). As such it uses all of the material to hold a memory value, which should allow 3D XPoint memory cells to scale downwards much better than NAND or DRAM.

Intel and Micron are calling the new 3D XPoint NVM storage AND memory. That is, it is suitable for fast access, non-volatile data storage and non-volatile processor memory.

3D XPoint NVM chips in manufacturing today

First chips with the new technology are being manufactured today at Intel-Micron’s joint manufacturing fab in Idaho. The first chips will supply 128Gb of NVM and use just two layers of 3D XPoint memory.

Intel and Micron will independently produce system products (read SSDs or NVM memory devices) with the new technology during 2016. They mentioned during the webcast that the technology is expected to be attached (as SSDs) to a PCIe bus and use NVMe as an interface to read and write it. Although if it’s used in a memory application, it might be better attached to the processor memory bus.

The expectation is that the 3D XPoint cost/bit will be somewhere in between NAND and DRAM, i.e. more expensive than NAND but less expensive than DRAM. It’s nice to be the only companies in the world with a new, better storage AND memory technology.


Over the last 10 years or so, SSDs (solid state devices) all used NAND technologies of one form or another, but after today SSDs can be made from NAND or 3D XPoint technology.

Some expected uses for the new NVM are in gaming applications (currently storage speed and memory constrained) and in-memory databases (which are memory size constrained). There was mention on the webcast of edge analytics as well.

Welcome to the dawn of a new age of computer storage AND memory.

Photo Credits: (c) 2015 Intel and Micron, from Intel’s 3D XPoint website