Will Hybrid drives conquer enterprise storage?

Toyota Hybrid Synergy Drive Decal: RAC Future Car Challenge by Dominic's pics (cc) (from Flickr)

I saw that Seagate announced the next generation of their Momentus XT Hybrid (SSD & Disk) drive this week.  We haven’t discussed Hybrid drives much on this blog but they have become a viable product family.

I am not planning on describing the new drive specs here as there was an excellent review by Greg Schulz at StorageIOblog.

However, the question some in the storage industry have had is whether Hybrid drives can supplant data center storage.  I believe the answer to that is no and I will tell you why.

Hybrid drive secrets

The secret to Seagate’s Hybrid drive lies in its FAST technology.  It provides a sort of automated disk caching that moves frequently accessed OS or boot data to NAND/SSD providing quicker access times.

Caching logic has been around in storage subsystems for decades now, ever since the IBM 3880 Mod 11 & 13 storage control systems came out last century.  However, these algorithms have gotten much more sophisticated over time and today can make a significant difference in storage system performance.  This can be easily witnessed by the wide variance in storage system performance on a per disk drive basis (e.g., see my post on Latest SPC-2 results – chart of the month).

Enterprise storage use of Hybrid drives?

The problem with using Hybrid drives in enterprise storage is that caching algorithms are based on some predictability of access/reference patterns.  When you have a Hybrid drive directly connected to a server or a PC it can view a significant portion of server IO (at least to the boot/OS volume) but more importantly, that boot/OS data is statically allocated, i.e., doesn’t move around all that much.   This means that one PC session looks pretty much like the next PC session and as such, the hybrid drive can learn an awful lot about the next IO session just by remembering the last one.
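To make the "remember the last session" idea concrete, here is a minimal sketch, not Seagate's actual algorithm. It assumes a drive that simply counts reads per block address in one session and keeps the most frequently read blocks in NAND for the next. The function names, the workloads and the `flash_slots` parameter are all made up for illustration.

```python
from collections import Counter


def learn_hot_blocks(session, flash_slots):
    """Return the most frequently read block addresses seen in a session."""
    counts = Counter(session)
    return {lba for lba, _ in counts.most_common(flash_slots)}


def flash_hit_rate(session, hot_blocks):
    """Fraction of reads in a session that land on flash-resident blocks."""
    if not session:
        return 0.0
    hits = sum(1 for lba in session if lba in hot_blocks)
    return hits / len(session)


# Hypothetical workloads: a PC boot sequence that barely changes from day
# to day, and enterprise I/O whose block addresses have moved since yesterday.
boot_monday = [10, 11, 12, 10, 11, 13, 10, 12]
boot_tuesday = [10, 11, 12, 10, 13, 11, 10, 12]
enterprise_tuesday = [510, 742, 98, 311, 10, 865, 407, 623]

hot = learn_hot_blocks(boot_monday, flash_slots=3)
pc_rate = flash_hit_rate(boot_tuesday, hot)
enterprise_rate = flash_hit_rate(enterprise_tuesday, hot)

print(f"PC boot hit rate: {pc_rate:.3f}")
print(f"Enterprise hit rate: {enterprise_rate:.3f}")
```

With these toy numbers the repeated boot session finds most of its reads already in flash, while the enterprise session finds almost none. That is the predictability gap in miniature.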

However, enterprise storage IO changes significantly from one storage session (day?) to another.  Not only are the end-user generated database transactions moving around the data, but the data itself is much more dynamically allocated, i.e., moves around a lot.

Backend data movement is especially common with automated storage tiering in subsystems that contain both SSDs and disk drives. But it also happens in systems that map data placement using log structured file systems.  NetApp’s Write Anywhere File Layout (WAFL) is a prominent user of this approach, but other storage systems do this as well.

In addition, any fixed, permanent mapping of a user data block to a physical disk location is becoming less useful over time as advanced storage features make dynamic or virtualized mapping a necessity.  Just consider snapshots based on copy-on-write technology: all it takes is a write to have a snapshot block moved to a different location.
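A toy model shows how a single write relocates a snapshot block under copy-on-write. This is only a sketch under simple assumptions (one snapshot, whole-block writes, a hypothetical `CowVolume` class), not any vendor's implementation.

```python
class CowVolume:
    """Toy volume with a single copy-on-write snapshot (hypothetical model)."""

    def __init__(self, blocks):
        # Physical location -> data.
        self.physical = dict(enumerate(blocks))
        # Logical block address -> physical location for the live volume.
        self.live_map = {lba: lba for lba in self.physical}
        self.snap_map = None
        self.next_free = len(blocks)

    def snapshot(self):
        # The snapshot starts out sharing every physical block with the live volume.
        self.snap_map = dict(self.live_map)

    def write(self, lba, data):
        loc = self.live_map[lba]
        if self.snap_map is not None and self.snap_map[lba] == loc:
            # First write since the snapshot: copy the old data to a new
            # physical location and repoint the snapshot at the copy.
            self.physical[self.next_free] = self.physical[loc]
            self.snap_map[lba] = self.next_free
            self.next_free += 1
        self.physical[loc] = data

    def read_snapshot(self, lba):
        return self.physical[self.snap_map[lba]]


vol = CowVolume(["a", "b", "c"])
vol.snapshot()
location_before = vol.snap_map[1]
vol.write(1, "B")
location_after = vol.snap_map[1]
print(location_before, location_after, vol.read_snapshot(1))
```

The snapshot still reads the old data, but from a different physical location than before the write. Any per-drive record of where that block used to live is now stale.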

Nonetheless, the main problem is that all the smarts about what is happening to data on backend storage lie primarily at the controller level, not at the drive level.  This applies not only to data mapping but also to end-user/application data access, as cache hits are never even seen by a drive.  As such, Hybrid drives alone don’t make much sense in enterprise storage.
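The point about cache hits can be sketched with a tiny LRU cache sitting in front of a drive. The `ControllerCache` class and the read pattern below are assumptions for illustration; real controller caches are far more elaborate.

```python
from collections import OrderedDict


class ControllerCache:
    """Toy LRU read cache in a storage controller (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.cache = OrderedDict()
        self.drive_reads = []  # the only I/O the backend drive ever sees

    def read(self, lba):
        if lba in self.cache:
            # Cache hit: served by the controller, invisible to the drive.
            self.cache.move_to_end(lba)
            return "hit"
        # Cache miss: the request goes down to the drive.
        self.drive_reads.append(lba)
        self.cache[lba] = True
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)  # evict the least recently used block
        return "miss"


host_reads = [5, 5, 7, 5, 5, 7, 5, 5]
controller = ControllerCache(capacity=2)
for lba in host_reads:
    controller.read(lba)

print("host view :", host_reads)
print("drive view:", controller.drive_reads)
```

The host hammers block 5, yet the drive sees one read each for blocks 5 and 7. From the drive's side there is nothing to suggest that block 5 is hot, so its own caching logic has little to learn from.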

Maybe, if they were intricately tied to the subsystem

I guess one way this could all work better is if the Hybrid drive caching logic were somehow controlled by the storage subsystem.  In this way, the controller could provide hints as to which disk blocks to move into NAND.  Perhaps this is a way to distribute storage tiering activity to the backend devices, without the subsystem having to do any of the heavy lifting, i.e., the hybrid drives would do all the data movement under the guidance of the controller.

I don’t think this is likely because it would take industry standardization to define any new “hint” commands and they would be specific to Hybrid drives.  Barring standards, it’s an interface between one storage vendor and one drive vendor.  That is probably ok if you make both storage subsystems and hybrid drives, but there aren’t any vendors left that make both drives and storage controllers.

~~~~

So, given the state of enterprise storage today and its continuing proclivity to move data around across its backend storage, I believe Hybrid drives won’t be used in enterprise storage anytime soon.

Comments?

 

Sequential only disk?!

St Sadurni d'Anoia - Cordoniu Grid - Shoes on Wires by Shoes on Wires (from flickr) (cc)

I was at a Rocky Mountain Magnetics (IEEE) seminar a couple of weeks ago and a fellow from Hitachi GST was discussing recent advances in bit patterned media (BPM).  They had shown some success at 45nm by 45nm bit cells, which corresponded to about 380Gb/sqin, a little less than current technology is capable of without BPM.  The session covered some of the methodology used to create BPM, some of the magnetic characteristics and parameters that BPM is capable of, and some other aspects of the “challenges” inherent in moving to BPM.   I have written before on some of the challenges inherent in the coming hard drive capacity wall.

But one thing that caught my interest was that even at the 45x45nm spacing, they were forced to use shingled writes to modify the bit cells.  Apparently today’s read-write heads are bigger than 45x45nm in at least the width dimension.  Thus, they were forced to write two tracks at a time and then go back and re-write the 2nd (and 3rd) track on the next pass, and then the 3rd and 4th track, etc.  In this fashion they shingle wrote the whole media sample.
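A small simulation makes the shingling behaviour easier to see. It is a sketch under a simplifying assumption: a head that writes the target track and wipes out the next `head_tracks - 1` tracks beside it. The function names are mine, not anything from the talk.

```python
def write_track(media, track, value, head_tracks=2):
    """Write one track; the wide head also clobbers the following tracks it overlaps."""
    last = min(track + head_tracks, len(media))
    for t in range(track, last):
        # Only the first track under the head keeps valid data;
        # None marks a neighbouring track that was overwritten.
        media[t] = value if t == track else None


def shingled_fill(values, head_tracks=2):
    """Write tracks strictly in order, so each pass repairs the previous overlap."""
    media = [None] * len(values)
    for track, value in enumerate(values):
        write_track(media, track, value, head_tracks)
    return media


def update_in_place(media, track, value, head_tracks=2):
    """Attempt a random rewrite of one track on already shingled media."""
    damaged = list(media)
    write_track(damaged, track, value, head_tracks)
    return damaged


media = shingled_fill(["A", "B", "C", "D"])
after_update = update_in_place(media, 1, "X")
print(media)
print(after_update)
```

Writing in order leaves every track intact, while rewriting track 1 alone destroys track 2. To keep the data, everything from the updated track onwards has to be rewritten in sequence.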

This seems to imply to me that the only way BPM can be written with today’s head technology is sequentially.  What would this mean to the world of data processing?  There are already other media today that only support sequential writes, i.e., tape and optical.  And yet one significant advantage of disks, at least in the past, was that they could support random writes.

Today’s high capacity SATA disk is already taking over from tape as the first tier in backup solutions. Any sequential only disk with even higher capacities would be a likely successor to current SATA disks in this application.

However, there is more to data processing than purely backup.

How would we use a sequential only disk device?

Perhaps this would be an opening to support a hybrid disk like device, one that could support a limited amount of randomly written data while supporting a vast sequential address space.  This sounds like a new device architecture which would take some time to support but it’s not that different from data base and file system structures that exist today.

For file systems, file data is written sequentially through a contiguous sequence of blocks.  File meta-data, e.g., directory entries with file name, date, location, etc., is written randomly.
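As a rough sketch of how a file system might sit on such a device, the code below assumes a hypothetical `HybridSequentialDevice` with a small randomly writable region for meta-data and a large append-only region for file data. No such device interface exists that I know of; the names are purely illustrative.

```python
class HybridSequentialDevice:
    """Toy device: small random-write region plus a large append-only region."""

    def __init__(self, random_slots):
        self.random_slots = random_slots
        self.random_region = {}  # randomly writable, limited capacity
        self.log = []            # sequential only: blocks can only be appended

    def append(self, block):
        """Append a data block and return its offset in the sequential region."""
        self.log.append(block)
        return len(self.log) - 1

    def write_random(self, key, value):
        """Update an entry in the limited random-write region."""
        is_new = key not in self.random_region
        if is_new and len(self.random_region) >= self.random_slots:
            raise ValueError("random region is full")
        self.random_region[key] = value


def write_file(device, name, blocks):
    # File data goes to the sequential region; the directory entry
    # (name -> block offsets) goes to the random region.
    offsets = [device.append(block) for block in blocks]
    device.write_random(name, offsets)


def read_file(device, name):
    return [device.log[offset] for offset in device.random_region[name]]


dev = HybridSequentialDevice(random_slots=4)
write_file(dev, "report.txt", ["r1", "r2"])
write_file(dev, "notes.txt", ["n1"])
write_file(dev, "report.txt", ["r1", "r2-edited"])  # an update is a new append
print(read_file(dev, "report.txt"), len(dev.log))
```

Note that updating a file never touches old data in place. The new blocks are appended and only the small directory entry is rewritten, which leaves stale blocks behind that something would eventually have to reclaim.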

Database systems are a bit more complex.  Yes, there are indexes similar to the file meta-data above, and tables are typically created sequentially.  But table data can also be updated randomly.  It might take some effort to change this to be purely sequentially updated, but that’s what would be needed to support such a sequential only disk.

Time for hybrid disks to re-appear

A couple of years back, when SSDs were expensive and relatively unknown, there was a version of disks for PCs and laptops that combined a relatively small amount of SSD and a large disk in a single 3.5″ form factor.  This was known as a hybrid disk and had some of the performance of pure SSD with the economics of disk.

Now BPM combined with SSD could be configured as a similar device, with the SSD portion supporting the randomly written data and the BPM disk supporting the sequentially written data.  But one difference between the old hybrid disk and this one is that the random data would (maybe) exist only in the SSD storage.

However, with BPM it’s possible that a portion (maybe a zone or two) of the disk surface could be BPM and a portion could use current (non-BPM) technology.  This could also be done on a platter surface by surface basis if doing so on the same platter was too complex.  Such a device would also support hybrid random and sequential write operations without the need for NAND flash.

In any event, this is all relatively new and depends on the relative sizes of write heads and BPM bit cells.  But in order to get to 4Tb/sqin or higher, technologists are talking about BPM bit cells of 12nm by 12nm.  At that size, shingled writes with today’s head size would span 8 or 9 tracks at a time.  Even taking current write head dimensions down by a factor of 5 would still leave one with a dual track width head.
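The arithmetic behind these numbers can be checked with a back-of-the-envelope sketch. It assumes one bit per square cell and a write head roughly 100nm wide. The head width is my assumption, picked because it is consistent with the track counts above; it was not a figure given at the seminar.

```python
import math

NM_PER_INCH = 25_400_000  # 25.4 mm per inch, expressed in nanometres


def areal_density_bits_per_sqin(cell_nm):
    """Bits per square inch if every square cell of side cell_nm holds one bit."""
    return NM_PER_INCH ** 2 / cell_nm ** 2


def tracks_spanned(head_width_nm, track_pitch_nm):
    """Number of tracks a write head of the given width touches in one pass."""
    return math.ceil(head_width_nm / track_pitch_nm)


ASSUMED_HEAD_WIDTH_NM = 100  # hypothetical write head width

density_12nm = areal_density_bits_per_sqin(12)
density_45nm = areal_density_bits_per_sqin(45)

print(f"12nm cells: {density_12nm / 1e12:.2f} Tb/sqin")
print(f"45nm cells: {density_45nm / 1e9:.0f} Gb/sqin")
print("tracks at 12nm pitch:", tracks_spanned(ASSUMED_HEAD_WIDTH_NM, 12))
print("tracks with head / 5:", tracks_spanned(ASSUMED_HEAD_WIDTH_NM / 5, 12))
```

Under these assumptions 12nm cells come out a bit above 4Tb/sqin, and a head shrunk by a factor of 5 still covers two tracks. The same simple model gives roughly 319Gb/sqin for 45nm cells, somewhat below the ~380Gb/sqin quoted in the talk, so the presenter's figure presumably rests on a slightly different cell geometry than my square-cell guess.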

One technique to reduce write size is to use thermally assisted recording (TAR) heads. This involves using a focused laser to heat up a single bit cell for writing.  The laser beam can be focused much smaller than the write head and could be used to isolate the writing to a single track.  Of course TAR heads are yet another new technology that would then have to be integrated into the new disk package.  But maybe this is the way to get back to a truly randomly written disk device.

Who knows?  This is all new technology, and what’s published may not be a true representation of what’s available in the labs. But to get beyond today’s capacity limitations there may be a new storage technology architecture on our horizon…