Disk trends, revisited

A head assembly on a Seagate disk drive by Robert Scoble (cc) (from flickr)

An interesting guest post on Claus’s Blog (Claus Mikkelsen of HDS) by Ian Vogelesang of HGST provided some technical/economic insights on why specific disk drives are more economically feasible than others.

It’s a bit hard going and more technical than a typical blog post, but it certainly makes a number of interesting points.

  1. There is an interaction between recording density, performance and $/GB when introducing a new, smaller form factor.  Drive vendors usually try to maximize GB per drive first, with IO performance a lesser concern, so they tend to lead with the densest drive they can build in any new form factor.  I think this is what we have seen with the SFF disks today, i.e., most vendors came out with 10Krpm drives, leaving their faster drives to the LFF.  As recording density in the new form factor continues to improve, GB/drive stops being the driving factor and performance rises to the top.  At that point we see the introduction of higher speed drives in that form factor.
  2. Enterprise SATA drives perform worse than equivalent capacity SAS drives. In HDS’s case there are two reasons for this: 1) enterprise storage appends ECC plus other LBA integrity checks to each 512 byte block, but SATA doesn’t support anything but 2**n block sizes, so multiple IOs are required to read/validate a block; and 2) SAS hardware supports a larger tagged command queue than SATA and can thus better optimize the IO queue across multiple IO requests.
  3. Global access density requirements are 600 IOPS/TB of storage. This is stated as a matter of fact in the post without any background information, but it is another key factor driving disk changes.
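
The first reason in point 2 is easy to see with a little arithmetic. The sketch below is only an illustration: the post does not say how many bytes of check data are appended, so the 8 bytes used here (giving 520 byte logical blocks) is an assumption, as is the idea that the blocks are simply packed back to back onto 512 byte sectors. The function names are made up for the example.

```python
SECTOR_BYTES = 512          # native SATA sector size (a power of two)
LOGICAL_BYTES = 512 + 8     # assumption: 8 bytes of check data per block

def sectors_touched(block_index, logical_bytes=LOGICAL_BYTES,
                    sector_bytes=SECTOR_BYTES):
    """Return the (first, last) physical sectors one logical block occupies."""
    start = block_index * logical_bytes
    end = start + logical_bytes - 1          # last byte of the block
    return start // sector_bytes, end // sector_bytes

def sector_count(block_index, **kw):
    """Number of physical sectors that must be read to fetch one block."""
    first, last = sectors_touched(block_index, **kw)
    return last - first + 1

print(sector_count(0))                       # block with check data appended
print(sector_count(0, logical_bytes=512))    # plain 512 byte block
```

Under these assumptions every logical block straddles a sector boundary, so reading or validating one block always touches two sectors, whereas a plain 512 byte block touches one.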

I would love to know more about that last point, 600 IOPS/TB, but there wasn’t much else there.  (It seems to me this should have changed over time. It’s certainly worthy of a research study if anybody’s listening out there.)
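
To get a feel for what 600 IOPS/TB asks of a single drive, here is a rough sketch. The 600 figure is from the post; the drive capacities in the loop are illustrative values I picked, and `required_iops` is a made-up helper name.

```python
ACCESS_DENSITY_IOPS_PER_TB = 600   # figure quoted in the post

def required_iops(capacity_gb, density=ACCESS_DENSITY_IOPS_PER_TB):
    """IOPS a drive must deliver to meet the access density when full."""
    return capacity_gb * density / 1000

# Illustrative capacities only; none of these come from the post.
for capacity_gb in (146, 300, 600, 2000):
    print(f"{capacity_gb:>5} GB -> {required_iops(capacity_gb):.0f} IOPS")
```

The requirement scales linearly with capacity, so every capacity doubling also doubles the IOPS a drive would have to sustain to hold the same access density.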

Shingled writes

One other thing I found interesting is a few statements at the end regarding emerging disk recording technology.  It seems thermally assisted recording (TAR) is not coming along as fast as everyone in the industry thought it would.  As such, the disk industry is considering moving to shingled writes (see my post Sequential Only Disk) which may cause them to abandon random writes.

But there is another solution to non-random writes besides a sequential only disk, and that is implementing a log structured file for blocks on the disk, similar to NetApp’s Data ONTAP, where the system supports random writes but actually writes data to the disk drives sequentially.

This requires more smarts in the drive controller, but it’s nothing like what’s in SSDs today for wear leveling, and it is a viable alternative.  The nice thing about a log structured file on disk is that there is no need to change any IO drivers: the disk drive continues to support random writes (from the server/storage system perspective) but writes sequentially on the platter.
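
A minimal sketch of the idea follows. It is a toy model, not how Data ONTAP or any drive firmware actually works, and the class and method names are invented for illustration. Space reclamation (garbage collection of stale blocks) is left out entirely.

```python
class LogStructuredDisk:
    """Toy model: random-write interface on top, append-only writes below."""

    def __init__(self):
        self.log = []        # the medium: (lba, data) records in write order
        self.lba_map = {}    # logical block address -> index of newest copy

    def write(self, lba, data):
        # Every write, whatever its LBA, lands at the current end of the log.
        self.lba_map[lba] = len(self.log)
        self.log.append((lba, data))

    def read(self, lba):
        # The map redirects the read to wherever the newest copy was appended.
        return self.log[self.lba_map[lba]][1]

    def live_fraction(self):
        # Share of log records still referenced; the rest are stale copies.
        return len(self.lba_map) / len(self.log) if self.log else 1.0


disk = LogStructuredDisk()
disk.write(7, "a")       # host writes LBAs in random order...
disk.write(3, "b")
disk.write(7, "c")       # ...and overwrites LBA 7
print(disk.read(7), [lba for lba, _ in disk.log])
```

The host sees ordinary random reads and writes, while the log only ever grows at its tail. The price is the mapping table and the stale records left behind by overwrites, which a real design would have to reclaim.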

I would suspect most drive vendors considering shingled writes are busily working on something similar to this, and it wouldn’t surprise me to see the next generation of disks support shingled writes using an onboard log structured file.

What this will do for sequential read IO is another question entirely.

Luckily, data that is read sequentially is often written sequentially and, even with a log structured file layout on disk, will more than likely be positioned close together on a disk platter.

—-

Comments?

 

Sequential only disk?!

St Sadurni d'Anoia - Cordoniu Grid - Shoes on Wires by Shoes on Wires (from flickr) (cc)

I was at a Rocky Mountain Magnetics (IEEE) seminar a couple of weeks ago where a fellow from HGST was discussing recent advances in bit patterned media (BPM).  They had shown some success with 45nm by 45nm bit cells, which corresponds to about 380Gb/sqin, a little less than current technology is capable of without BPM.  The session covered some of the methodology used to create BPM, some of the magnetic characteristics and parameters that BPM is capable of and some other aspects of the “challenges” inherent in moving to BPM.   I have written before on some of the challenges inherent in the coming hard drive capacity wall.

But one thing that caught my interest was that even at the 45x45nm spacing, they were forced to use shingled writes to modify the bit cells.  Apparently today’s read-write heads are bigger than 45x45nm in at least the width dimension.  Thus, they were forced to write two tracks at a time, then go back and re-write the 2nd (and 3rd) track on the next pass, then the 3rd and 4th track, etc.  In this fashion they shingle wrote the whole media sample.
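
The shingled passes described above can be mimicked with a toy model. This is only a sketch: it assumes the head covers exactly the target track plus the next one, and the extra "guard" track at the end is my own addition to give the last pass somewhere to spill.

```python
HEAD_WIDTH_TRACKS = 2   # assumption: head covers the target track and the next

def write_track(media, track, data, head_width=HEAD_WIDTH_TRACKS):
    """Write `data` at `track`; the wide head also overwrites the tracks after it."""
    for t in range(track, min(track + head_width, len(media))):
        media[t] = data

def shingle_write(media, payloads):
    """Write tracks 0, 1, 2, ... in order; each pass trims the previous one."""
    for track, data in enumerate(payloads):
        write_track(media, track, data)

media = [None] * 5                       # 4 data tracks plus one guard track
shingle_write(media, ["A", "B", "C", "D"])
sequential_result = list(media)
print(sequential_result)                 # every data track holds its own payload

write_track(media, 1, "X")               # in-place update of track 1 only
random_result = list(media)
print(random_result)                     # track 2 has been clobbered as well
```

Written in order, each pass repairs the overspill of the one before it. Updating a single track out of order destroys its neighbor, which is why shingling pushes the device toward sequential only writes.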

This seems to imply that the only way BPM can be written with today’s head technology is sequentially.  What would this mean for the world of data processing?  There are already other media today that only support sequential writes, i.e., tape and optical.  And yet one significant advantage of disk, at least in the past, was that it could support random writes.

Today’s disk, at least high capacity SATA disk, is already taking over from tape as the first tier backup solution. Any sequential only disk with even higher capacities would be a likely future successor to the current SATA disks in this application.

However there is more to data processing than purely backup.

How would we use a sequential only disk device?

Perhaps this would be an opening to support a hybrid disk-like device, one that could support a limited amount of randomly written data while supporting a vast sequential address space.  This sounds like a new device architecture which would take some time to support, but it’s not that different from database and file system structures that exist today.

For file systems, file data is written sequentially through a contiguous sequence of blocks.  File meta-data, e.g., directory entries with file name, date, location, etc., is written randomly.

Database systems are a bit more complex.  Yes, there are indexes similar to the file meta-data above, and tables are typically created sequentially.  But table data can also be updated randomly.  It might take some effort to change this to be purely sequentially updated, but that’s what would be needed to support such a sequential only disk.

Time for hybrid disks to re-appear

A couple of years back, when SSDs were expensive and relatively unknown, there was a version of disks for PCs and laptops that combined a relatively small amount of SSD and a large disk in a single 3.5″ form factor.  This was known as a hybrid disk and had some of the performance of pure SSD with the economics of disk.

Now BPM combined with SSD could be configured as a similar device, with the SSD portion supporting the randomly written data and the BPM disk supporting the sequentially written data.  But one difference between the old hybrid disk and this one is that the random data would (maybe) exist only in the SSD storage.

However, with BPM it’s possible that a portion (maybe a zone or two) of the disk surface could be BPM and a portion could use current (non-BPM) technology.  This could also be done on a platter surface by surface basis if doing so on the same platter was too complex.  But such a device would also support hybrid random and sequential write operations without the need for NAND flash.
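
From the host’s side, either flavor of hybrid device might look something like the sketch below. This is purely hypothetical: the class, its methods and the zone sizes are invented for illustration and do not describe any real product or interface.

```python
class HybridDisk:
    """Toy device: a small random-write zone plus a large append-only zone."""

    def __init__(self, random_blocks, sequential_blocks):
        self.random_zone = [None] * random_blocks   # update-in-place region
        self.sequential_zone = []                   # shingled/BPM, append only
        self.sequential_capacity = sequential_blocks

    def write_random(self, block, data):
        self.random_zone[block] = data              # any block, in any order

    def append(self, data):
        if len(self.sequential_zone) >= self.sequential_capacity:
            raise OSError("sequential zone full")
        self.sequential_zone.append(data)
        return len(self.sequential_zone) - 1        # address of the new block


# File-system style usage: data is appended, the directory entry that
# points at it goes in the randomly writable zone.
disk = HybridDisk(random_blocks=4, sequential_blocks=100)
location = disk.append("file contents")
disk.write_random(0, {"name": "report.txt", "location": location})
print(disk.random_zone[0])
```

This mirrors the file system split described earlier: bulk data goes to the sequential space, while the small, frequently updated meta-data stays where it can be rewritten in place.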

In any event, this is all relatively new and depends on the relative sizes of write heads and BPM bit cells.  But in order to get to 4Tb/sqin or higher, technologists are talking about BPM bit cells of 12nm by 12nm.  At that size, shingled writes with today’s head size would span 8 or 9 tracks at a time.  Even taking current write head dimensions down by a factor of 5 would still leave one with a dual track width head.
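
These numbers can be roughly sanity checked. The head width is not given anywhere above, so the 90nm used here is an assumption, inferred from the head covering two 45nm tracks; a slightly wider head would give 9 tracks rather than 8. The density function assumes ideal square bit cells with no overhead, so treat its output as a ballpark only.

```python
import math

NM_PER_INCH = 25.4e6
ASSUMED_HEAD_WIDTH_NM = 90   # assumption: two 45 nm tracks wide

def tracks_spanned(head_width_nm, track_pitch_nm):
    """Whole tracks a write head of the given width overlaps."""
    return math.ceil(head_width_nm / track_pitch_nm)

def areal_density_tbit_per_sqin(cell_nm):
    """Ideal density for square bit cells of side `cell_nm`, no overhead."""
    cells_per_sqin = (NM_PER_INCH / cell_nm) ** 2
    return cells_per_sqin / 1e12

print(tracks_spanned(ASSUMED_HEAD_WIDTH_NM, 45))       # today's sample
print(tracks_spanned(ASSUMED_HEAD_WIDTH_NM, 12))       # 12 nm cells, same head
print(tracks_spanned(ASSUMED_HEAD_WIDTH_NM / 5, 12))   # head shrunk 5x
print(round(areal_density_tbit_per_sqin(12), 2))       # Tb/sqin at 12 nm
```

With the assumed head, the 12nm case spans 8 tracks and the shrunken head still spans 2, and ideal 12nm cells come out a little above 4Tb/sqin, all in line with the figures quoted above.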

One technique to reduce write size is to use thermally assisted recording (TAR) heads. This involves using a focused laser to heat up a single bit cell for writing.  The laser beam can be focused much smaller than the write head and could be used to isolate the writing to a single track.  Of course TAR heads are yet another new technology that would then have to be integrated into the new disk package.  But maybe this is the way to get back to a truly randomly written disk device.

Who knows?  This is all new technology, and what’s published may not be a true representation of what’s available in the labs. But to get beyond today’s capacity limitations there may be a new storage technology architecture on our horizon…