Shingled magnetic recording disks

A couple of weeks ago I attended a day of the SNIA Storage Developers Conference (SDC) where Garth Gibson of Carnegie Mellon University's Parallel Data Lab (CMU PDL) and Panasas gave a talk about what they are up to at CMU's storage lab.  His talk at the conference was on shingled magnetic recording (SMR) disks. We have discussed this topic before in our posts Sequential only disk?! and Disk trends revisited.  SMR may require a re-thinking of how we currently access disk storage.

Recall that shingled magnetic recording uses a write head that overwrites multiple tracks at a time (see graphic above), with one track being properly written and the adjacent (inward) tracks being overwritten. As the head moves to the next track, that track can be properly written but more adjacent (inward) tracks are overwritten, etc. In this fashion data can be written sequentially, on overlapping write passes.  In contrast, read heads can be much narrower and are able to read a single track.

In my post, I assumed this would mean that the new shingled magnetic recording disks would need to be accessed sequentially, not unlike tape. Such a change would require a massive rewrite of storage software to write data only sequentially.  I had suggested this could potentially work if one were to add some SSD or other NVRAM to the device to help manage the mapping of the data to the disk.  That, plus a very sophisticated drive controller, not unlike SSD wear-leveling controllers today, could possibly handle mapping a physically sequentially accessed disk to a virtually randomly accessed storage protocol.

Garth’s approach to the SMR dilemma

Garth and his team of researchers are taking another tack at the problem. In his view an SMR disk consists of multiple groups of tracks (zones or bands).  Each band can be written either sequentially or randomly, but all bands can be read randomly.  One can break up the disk into a number of shingled bands that are written sequentially and fewer, non-shingled bands that can be written randomly. Of course there would be a gap between the shingled bands so as not to overwrite adjacent bands. And there would also be gaps between the randomly written tracks in a non-shingled partition to allow for the wider track that the SMR write head lays down.

His pitch at the conference dealt with some characteristics of such a multi-band disk device, such as:

  • How to determine the density for a device that has multiple bands of both shingled write data and randomly written data.
  • How big or small a shingled band should be in order to support “normal” small block and randomly accessed file IO.
  • How many randomly written tracks, or how much capacity, the non-shingled bands would need in order to support “normal” file IO activity.

For maximum areal density one would want large shingled bands.  There are other, less obvious considerations as well, but I won't go into them here.

SCSI protocol changes for SMR disks

The other, more interesting section of Garth's talk covered recently proposed T10 and T13 changes that would allow SMR disks to expose both shingled and non-shingled partitions, and what else needs to be done to support SMR devices.

The SCSI protocol changes being considered to support SMR devices include:

  • A new write cursor for shingled write bands that indicates the next LBA to be written.  The write cursor starts out at a relative band address of 0 and, as each LBA is written consecutively in the band, it is incremented by one.
  • A write cursor can be reset (to zero), indicating that the band has been erased.
  • Each drive maintains the band map and the current cursor position within each band, and this information can be requested by SCSI drivers to understand the configuration of the drive.

Probably other changes are required as well but these seem sufficient to flesh out the problem.
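
To make the proposal a bit more concrete, here's a minimal sketch of the band-map and write-cursor bookkeeping those bullets describe. The class and method names are mine, purely illustrative, and not taken from the actual T10/T13 drafts.

```python
class ShingledBand:
    def __init__(self, start_lba, length, shingled=True):
        self.start_lba = start_lba   # first LBA of the band
        self.length = length         # number of LBAs in the band
        self.shingled = shingled     # non-shingled bands can be written randomly
        self.cursor = 0              # next relative LBA to write (shingled bands only)

    def write(self, relative_lba):
        # Shingled bands only accept the LBA at the cursor, then advance it by one
        if self.shingled:
            if relative_lba != self.cursor:
                raise ValueError("out-of-order write to a shingled band")
            self.cursor += 1

    def reset(self):
        # Resetting the write cursor to zero marks the band as erased
        self.cursor = 0


class SMRDrive:
    def __init__(self, bands):
        self.bands = bands           # the drive-maintained band map

    def report_bands(self):
        # What a SCSI driver might query to learn the band layout and cursor positions
        return [(b.start_lba, b.length, b.shingled, b.cursor) for b in self.bands]
```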

SMR device software support

Garth and his team implemented an SMR device, emulated in software on top of real randomly accessed devices.  They then implemented an SMR device driver that used the proposed standards changes and finally, implemented a ShingledFS file system on top of this emulated SMR disk to see how it would work.  (See their report on Shingled Magnetic Recording for Big Data Applications for more information.)

The CMU team implemented ShingledFS as a log structured file system that only wrote data to the emulated SMR disk's shingled partition sequentially, except for mapping and metadata information, which was written and updated randomly in a non-shingled partition.

You may recall that a log structured file system is essentially written as a sequential stream of data (not unlike a log).  But additional mapping is required to indicate where file data is located in the log, which allows the file data to be accessed randomly (see the sketch below).
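
A minimal sketch of that idea, with my own (hypothetical) names: every write is appended to the sequential log, while a small map remembers where each file block landed so reads can still be random.

```python
class LogStructuredStore:
    def __init__(self):
        self.log = []          # stands in for the sequentially written shingled partition
        self.block_map = {}    # (file_id, block_no) -> position in the log

    def write_block(self, file_id, block_no, data):
        self.block_map[(file_id, block_no)] = len(self.log)   # remember where it lands
        self.log.append(data)                                  # append-only, sequential

    def read_block(self, file_id, block_no):
        return self.log[self.block_map[(file_id, block_no)]]   # random read via the map


store = LogStructuredStore()
store.write_block("fileA", 0, b"hello")
store.write_block("fileA", 0, b"hello v2")   # an update just appends; the map moves
assert store.read_block("fileA", 0) == b"hello v2"
```

Garbage collecting the stale copies that updates leave behind in the log is the hard part of any real log structured file system, which this sketch ignores.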

In their report and at the conference, Garth presented some benchmark results for a big data application called Terasort (essentially Teragen, Terasort and Teravalidate), which uses Hadoop to sort a large body of data.   I'm not sure I can replicate this information here, but suffice it to say that, at the moment, the emulated SMR device with ShingledFS did not beat a base EXT3 or FUSE file system on the same hardware for these applications.

Now the CMU project was done by a bunch of smart researchers, but it's still relatively new and not necessarily that optimized.  Thus, there's probably room for improvement in ShingledFS and maybe even in the emulated SMR device and/or the SMR device driver.

At the moment Garth and his team seem to believe that SMR devices are certainly feasible and would take only modest changes to the SCSI protocols to support.  As for file system support, there is plenty of history surrounding log structured file systems, so these are certainly doable, but implementing them in various OSs to support an SMR device would probably require extensive development.  The device driver changes don't seem to be as significant.

~~~~

It certainly looks like there are going to be SMR devices in our future.  It's just a question of whether they will ever be as widely supported as the randomly accessed disk devices we know and love today.  Possibly this could all sit behind a storage subsystem that makes the technology available as networked storage capacity, and over time maybe SMR devices could be supported by more standard OS device drivers and file systems.  Nevertheless, to keep capacity and areal density on their current growth trajectory, SMR disks are coming; it's just a matter of time.

Comments?

Image: (c) 2012 Hitachi Global Storage Technologies, from IEEE SCV Magnetics Society presentation by Roger Wood

 

Disk density hits new record, 1Tb/sqin with HAMR

Seagate has achieved 1Tb/sqin recording (source: http://www.gizmag.com)

Well I thought 36TB on my Mac was going to be enough.  Then along comes Seagate with this week's announcement of reaching 1Tb/sqin (1 trillion bits per square inch) using their new HAMR (heat assisted magnetic recording) technology.

Current LFF drive technology runs at about 620Gb/sqin, providing a 3.5″ drive capacity of around 3TB, while 2.5″ drives run at about 500Gb/sqin, supporting ~750GB.  The new 1Tb/sqin drives will easily double these capacities.

But the exciting part is that with the new HAMR or TAR (thermally assisted recording) heads and media, the long-term potential is even brighter.  This new technology should be capable of 5 to 10Tb/sqin, which means 3.5″ drives of 30 to 60TB and 2.5″ drives of 10 to 20TB.

HAMR explained

HAMR uses both lasers and magnetic heads to record data in even smaller spaces than current PMR (perpendicular magnetic recording) or vertical recording heads do today.   You may recall that PMR was introduced in 2006 and now, just 6 years later, we are already seeing the next-generation head and media technologies in labs.

Denser disks require smaller bits, and with smaller bits disk technology runs into three problems: readability, writeability and stability, AKA the magnetic recording trilemma.  Smaller bits require better stability, but better stability makes it much harder to write or change a bit's magnetic orientation.  Enter the laser in HAMR: with laser heating, the bits become much more malleable.  These warmed bits can be more easily written, bypassing the stability-writeability problem, at least for now.

However, just as in any big technology transition there are other competing ideas with the potential to win out.  One possibility we have discussed previously is shingled writes using bit patterned media (see my Sequential only disk post) but this requires a rethinking/re-architecting of disk storage.  As such, at best it’s an offshoot of today’s disk technology and at worst, it’s a slight detour on the overall technology roadmap.

Of course PMR is not going away any time soon. Other vendors (and probably Seagate) will continue to push PMR technology as far as it can go.  After all, it's a proven technology, inside millions of spinning disks today.  But, according to Seagate, PMR can reach 1Tb/sqin and go no further.

So when can I get HAMR disks?

There was no mention in the press release as to when HAMR disks would be made available to the general public, but typically the drive industry has been doubling densities every 18 to 24 months.  Assuming they continue this trend across a head/media technology transition like HAMR, we should have those 6TB hard disk drives sometime around 2014, if not sooner.
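
As a back-of-the-envelope check, assuming that 18-24 month doubling cadence holds across the HAMR transition and starting from roughly a 3TB 3.5″ drive today:

```python
capacity_tb, year = 3, 2012       # assumed starting point: ~3TB 3.5" drive today
for _ in range(4):
    capacity_tb *= 2              # one doubling every ~2 years
    year += 2
    print(f"{year}: ~{capacity_tb}TB")
# prints 2014: ~6TB, 2016: ~12TB, 2018: ~24TB, 2020: ~48TB
```

Which lines up with the 6TB/12TB/24TB guesses at the end of this post.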

HAMR technology will likely make its first appearance in 7200rpm drives.  Bigger capacities always seem to come out first in slower-performing disks (see my Disk trends, revisited post).

HAMR performance wasn’t discussed in the Seagate press release, but with 2Mb per linear track inch and 15Krpm disk drives, the transfer rates would seem to need to be on the order of at least 850MB/sec at the OD (outer diameter) for read data transfers.

How quickly HAMR heads can write data is another matter. The fact that the laser heats the media before the magnetic head can write it seems to call for a magnetic-plus-optical head contraption where the laser is in front of the magnetics (see picture above).

How long it takes to heat the media enough to enable magnetization is one critical question in write performance. But this could potentially be mitigated by the strength of the laser pulse and how far in front of the recording head the laser has to be.

With all this talk of writing, there hasn’t been lots of discussion on read heads. I guess everyone’s assuming the current PMR read heads will do the trick, with a significant speed up of course, to handle the higher linear densities.

What’s next?

As for what comes after HAMR, check out another post I did on using lasers alone to magnetize (write) data (see Magnetic storage using lasers alone).  The advantage of this new "laser-only" technology was a significant speed-up in transfer rates.  It seems to me that HAMR could easily be an intermediate step on the path to laser-only recording, having both laser optics and magnetic recording/reading heads in one assembly.

~~~~

Let's see: 6TB in 2014, 12TB in 2016 and 24TB in 2018.  Maybe I won't need that WD Thunderbolt drive string as quickly as I thought.

Comments?

 

 

Sequential only disk?!

St Sadurni d'Anoia - Cordoniu Grid - Shoes on Wires by Shoes on Wires (from flickr) (cc)

Was at a Rocky Mountain Magnetics (IEEE) seminar a couple of weeks ago where a fellow from Hitachi GST was discussing recent advances in bit patterned media (BPM).  They had shown some success at 45nm by 45nm bit cells, which corresponds to about 380Gb/sqin, a little less than current technology is capable of without BPM.  The session was on some of the methodology used to create BPM, some of the magnetic characteristics and parameters that BPM is capable of, and some other aspects of the "challenges" inherent in moving to BPM.   I have written before on some of the challenges inherent in the coming hard drive capacity wall.

But one thing that caught my interest was that even at the 45x45nm spacing, they were forced to use shingled writes to modify the bit cells.  Apparently today's read-write heads are bigger than 45x45nm, at least in the width dimension.  Thus, they were forced to write two tracks at a time and then go back and re-write the 2nd (and 3rd) tracks on the next pass, then the 3rd and 4th tracks, etc.  In this fashion they shingle-wrote the whole media sample.

This seems to imply that the only way BPM can be written with today's head technology is sequentially.  What would this mean to the world of data processing?  There are already other media today that only support sequential writes, i.e., tape and optical.  And yet one significant advantage of disks, at least in the past, was that they could support random writes.

Today's disks, at least high-capacity SATA disks, are already taking over from tape as the first tier of backup solutions. Any sequential-only disk with even higher capacity would be a likely future replacement for the current SATA disks in this application.

However there is more to data processing than purely backup.

How would we use a sequential only disk device?

Perhaps this would be an opening for a hybrid disk-like device, one that could support a limited amount of randomly written data alongside a vast sequentially written address space.  This sounds like a new device architecture which would take some time to support, but it's not that different from database and file system structures that exist today.

For file systems, file data is written sequentially through a contiguous sequence of blocks.  File metadata, e.g., directory entries with file name, date, location, etc., is written and updated randomly.

Database systems are a bit more complex.  Yes, there are indexes, similar to the file metadata above, and tables are typically created sequentially.  But table data can also be updated randomly.  It might take some effort to change this to be purely sequentially updated, but that's what would be needed to support such a sequential-only disk.
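
A tiny sketch of the write-placement policy such a hybrid device (or the file system/database above it) would need: small, randomly updated items like metadata and indexes go to the randomly writable region, while bulk file or table data gets appended to the sequential region.  All names here are hypothetical.

```python
class HybridPlacement:
    def __init__(self):
        self.random_zone = {}      # randomly re-writable region (metadata, indexes)
        self.sequential_zone = []  # append-only region (file and table data)

    def write(self, key, data, is_metadata):
        if is_metadata:
            self.random_zone[key] = data              # updated in place
            return ("random", key)
        self.sequential_zone.append((key, data))      # appended sequentially
        return ("sequential", len(self.sequential_zone) - 1)
```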

Time for hybrid disks to re-appear

A couple of years back, when SSDs were expensive and relatively unknown, there was a version of disk for PCs and laptops that combined a relatively small amount of SSD with a large disk in a single 3.5″ form factor.  This was known as a hybrid disk and offered some of the performance of pure SSD with the economics of disk.

Now BPM combined with SSDs could be configured as a similar device, with the SSD portion supporting the randomly written data and the BPM disk supporting the sequentially written data.  But one difference between the old hybrid disk and this one is that the random data would (probably) exist in the SSD storage alone.

However, with BPM it's possible that a portion (maybe a zone or two) of the disk surface could be BPM and a portion could use current (non-BPM) technology.  This could also be done on a surface-by-surface basis if mixing the two on the same platter were too complex.  Such a device would also support hybrid random and sequential write operations without the need for NAND flash.

In any event, this is all relatively new and depends on the relative sizes of write heads and BPM bit cells.  But in order to get to 4Tb/sqin or higher, technologists are talking about BPM bit cells of 12nm by 12nm.  At that size, shingled writes with today's head size would span 8 or 9 tracks at a time.  Even taking current write head dimensions down by a factor of 5 would still leave one with a dual-track-width head.
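
The track-span arithmetic, with the current write-head width treated as an assumption of roughly 100nm (wide enough to cover 8 or 9 tracks of 12nm cells):

```python
import math

bit_cell_nm = 12
head_width_nm = 100                                   # assumed current write-head width
print(math.ceil(head_width_nm / bit_cell_nm))         # ~9 tracks spanned per write today
print(math.ceil(head_width_nm / 5 / bit_cell_nm))     # still 2 tracks at 1/5 the width
```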

One technique to reduce write size is to use thermally assisted recording (TAR) heads. This involves using a focused laser to heat up a single bit cell for writing.  The laser beam can be focused much smaller than the write head and could be used to isolate the writing to a single track.  Of course TAR heads are yet another new technology that would then have to be integrated into the new disk package.  But maybe this is the way to get back to a truly randomly written disk device.

Who knows?  This is all new technology, and what's published may not be a true representation of what's available in the labs. But to get beyond today's capacity limitations there may be a new storage technology architecture on our horizon…