A couple of weeks ago I attended a day of the SNIA Storage Developers Conference (SDC), where Garth Gibson of Carnegie Mellon University's Parallel Data Lab (CMU PDL) and Panasas gave a talk on what they are up to at CMU’s storage lab. His talk at the conference was on shingled magnetic recording (SMR) disks. We have discussed this topic before in our posts on Sequential only disks?! and in Disk trends revisited. SMR may require a re-thinking of how we currently access disk storage.
Recall that shingled magnetic recording uses a write head that overwrites multiple tracks at a time (see graphic above), with one track being properly written and the adjacent (inward) tracks being overwritten. As the head moves to the next track, that track can be properly written but more adjacent (inward) tracks are overwritten, etc. In this fashion data can be written sequentially, on overlapping write passes. In contrast, read heads can be much narrower and are able to read a single track.
In my earlier post, I assumed that this would mean that the new shingled magnetic recording disks would need to be accessed sequentially, not unlike tape. Such a change would require a massive rewrite of existing storage software so that it only writes data sequentially. I had suggested this could potentially work if one were to add some SSD or other NVRAM to the device to help manage the mapping of the data to the disk. Possibly that, plus a very sophisticated drive controller, not unlike SSD wear leveling today, could handle mapping a physically sequentially accessed disk to a virtually randomly accessed storage protocol.
Garth’s approach to the SMR dilemma
Garth and his team of researchers are taking another tack on the problem. In his view there are multiple groups of tracks on an SMR disk (zones or bands). Each band can be written either sequentially or randomly, but all bands can be read randomly. One can break up the disk to include many shingled bands that are sequentially written and fewer non-shingled bands that can be randomly written. Of course there would be a gap between the shingled bands in order not to overwrite adjacent bands. And there would also be gaps between the randomly written tracks in a non-shingled partition to allow for the wider track writing that occurs with the SMR write head.
His pitch at the conference dealt with some characteristics of such a multi-band disk device, such as:
- How to determine the density for a device that has multiple bands of both shingled write data and randomly written data.
- How big or small a shingled band should be in order to support “normal” small block and randomly accessed file IO.
- How many randomly written tracks or what the capacity of the non-shingled bands would need to be to support “normal” file IO activity.
For maximum areal density one would want large shingled bands. There are other interesting considerations that were not as obvious, but I won’t go into them here.
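To see why larger bands help density, here is a back-of-the-envelope sketch in Python. It is my own simplification, not a model from Garth's talk: it assumes every shingled band gives up a fixed number of guard tracks (the gap between bands), and the function name and parameters are hypothetical.

```python
def usable_fraction(tracks_per_band: int, guard_tracks: int = 1) -> float:
    """Fraction of the tracks occupied by a band that actually hold data.

    Assumption (not from the talk): each band of `tracks_per_band` data
    tracks is followed by `guard_tracks` unusable tracks that form the gap
    to the next band.
    """
    if tracks_per_band <= 0 or guard_tracks < 0:
        raise ValueError("need a positive band size and non-negative gap")
    return tracks_per_band / (tracks_per_band + guard_tracks)


if __name__ == "__main__":
    # Larger bands amortize the fixed gap over more data tracks.
    for n in (4, 16, 64, 256):
        print(n, round(usable_fraction(n), 3))
```

The flip side, which this sketch ignores, is that a bigger band means more data to rewrite whenever anything in the middle of the band has to change.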
SCSI protocol changes for SMR disks
The other, more interesting section of Garth’s talk was on recently proposed T10 and T13 changes to support SMR disks with both shingled and non-shingled partitions, and on what else needs to be done to support SMR devices.
The SCSI protocol changes being considered to support SMR devices include:
- A new write cursor for shingled write bands that indicates the next LBA to be written. The write cursor starts out at a relative band address of 0 and as each LBA is written consecutively in the band it’s incremented by one.
- A write cursor can be reset (to zero), indicating that the band has been erased.
- Each drive maintains the band map and current cursor position within each band and this can be requested by SCSI drivers to understand the configuration of the drive.
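The write cursor behavior described above can be sketched in a few lines of Python. This is only a toy in-memory model to illustrate the semantics as I understood them; the class and method names (`ShingledBand`, `EmulatedSMRDrive`, `report_bands`) are my own inventions and are not taken from the T10/T13 proposals or from the CMU code.

```python
class ShingledBand:
    """Toy model of one shingled band with a write cursor."""

    def __init__(self, size_lbas: int):
        self.size = size_lbas
        self.cursor = 0      # next band-relative LBA that may be written
        self.blocks = {}     # band-relative LBA -> data

    def write(self, rel_lba: int, data: bytes) -> None:
        if self.cursor >= self.size:
            raise ValueError("band is full; reset it before writing")
        if rel_lba != self.cursor:
            raise ValueError("shingled bands accept writes only at the cursor")
        self.blocks[rel_lba] = data
        self.cursor += 1     # cursor advances by one per LBA written

    def read(self, rel_lba: int) -> bytes:
        # Reads are random, but only already-written LBAs hold valid data.
        if not 0 <= rel_lba < self.cursor:
            raise ValueError("LBA has not been written")
        return self.blocks[rel_lba]

    def reset(self) -> None:
        # Resetting the cursor to zero marks the whole band as erased.
        self.blocks.clear()
        self.cursor = 0


class EmulatedSMRDrive:
    """Holds the band map and per-band cursors, as the drive would."""

    def __init__(self, band_sizes):
        self.bands = [ShingledBand(size) for size in band_sizes]

    def report_bands(self):
        # What a driver might ask for: (band number, size, cursor) per band.
        return [(i, b.size, b.cursor) for i, b in enumerate(self.bands)]
```

A driver using something like this would query `report_bands()` at startup, then append to whichever band has room and reset bands whose contents are no longer needed.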
Probably other changes are required as well but these seem sufficient to flesh out the problem.
SMR device software support
Garth and his team implemented an SMR device, emulated in software using real random-access devices. They then implemented an SMR device driver that used the proposed standards changes and, finally, implemented a ShingledFS file system to use this emulated SMR disk to see how it would work. (See their report on Shingled Magnetic Recording for Big Data Applications for more information.)
The CMU team implemented a log structured file system for the ShingledFS that only wrote data to the emulated SMR disk shingled partition sequentially, except for mapping and meta-data information which was written and updated randomly in a non-shingled partition.
You may recall that a log structured file system is essentially written as a sequential stream of data (not unlike a log). But there is additional mapping required that indicates where file data is located in the log which allows for randomly accessing the file data.
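For those who like to see it in code, here is a minimal Python sketch of the log-plus-mapping idea. It is not ShingledFS, just an illustration under my own assumptions: the `LogStore` name is hypothetical, a byte array stands in for the sequentially written shingled partition, and a dictionary stands in for the randomly updated mapping information.

```python
class LogStore:
    """Toy log-structured store: append-only data plus a random-access index."""

    def __init__(self):
        self.log = bytearray()   # only ever appended to (the "shingled" part)
        self.index = {}          # file name -> (offset, length) in the log

    def put(self, name: str, data: bytes) -> None:
        offset = len(self.log)
        self.log += data                        # sequential write at the tail
        self.index[name] = (offset, len(data))  # mapping is updated in place

    def get(self, name: str) -> bytes:
        offset, length = self.index[name]       # random read via the mapping
        return bytes(self.log[offset:offset + length])
```

Note that rewriting a file just appends a new copy and repoints the mapping, so the old copy becomes dead space in the log. A real log structured file system needs some form of cleaning to reclaim that space, which this sketch leaves out.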
In their report and at the conference, Garth presented some benchmark results for a big data application called Terasort (essentially Teragen, Terasort and Teravalidate) which seems to use Hadoop to sort a large body of data. Not sure I can replicate this information here, but suffice it to say that, at the moment, the emulated SMR device with ShingledFS did not beat a base EXT3 or FUSE using the same hardware for these applications.
Now the CMU project was done by a bunch of smart researchers but it’s still relatively new and not necessarily that optimized. Thus, there’s probably some room for improvement in the ShingledFS and maybe even the emulated SMR device and/or the SMR device driver.
At the moment Garth and his team seem to believe that SMR devices are certainly feasible and would take only modest changes to the SCSI protocols to support. As for file system support, there is plenty of history surrounding log structured file systems, so these are certainly doable, but they would probably require extensive development to be implemented in the various OSs to support an SMR device. The device driver changes don’t seem to be as significant.
It certainly looks like there are going to be SMR devices in our future. It’s just a question of whether they will ever be as widely supported as the randomly accessed disk device we know and love today. Possibly, this could all be behind a storage subsystem that makes the technology available as networked storage capacity, and over time maybe SMR devices could be implemented in more standard OS device drivers and file systems. Nevertheless, to keep capacity and areal density on their current growth trajectory, SMR disks are coming. It’s just a matter of time.