(Storage QoW 15-003): SMR disks in GA enterprise storage in 12 months? Yes@.85 probability

Hard Disk by Jeff Kubina (cc) (from Flickr)
Hard Disk by Jeff Kubina (cc) (from Flickr)

(Storage QoW 15-003): Will we see SMR (shingled magnetic recording) disks in GA enterprise storage systems over the next 12 months?

Are there two vendors of SMR?

Yes, both Seagate and HGST have announced and currently shipping (?) SMR drives, HGST has a 10TB drive and Seagate has an 8TB drive on the market since last summer.

One other interesting fact is that SMR will be the common format for all future disk head technologies including HAMR, MAMR, & BPMR (see presentation).

What would storage vendors have to do to support SMR drives?

Because of the nature of SMR disks, writes overlap other tracks so they must be written, at least in part, sequentially (see our original post on Sequential only disks). Another post I did reported on recent work by Garth Gibson at CMU (Shingled Magnetic Recording disks) which showed how multiple bands or zones on an SMR disk could be used some of which could be written randomly and others which could be written sequentially but all could be read randomly. With such an approach you could have a reasonable file system on an SMR device with a metadata partition (randomly writeable) and a data partition (sequentially writeable).

In order to support SMR devices, changes have been requested for the T10 SCSI  & T13 ATA command protocols. Such changes would include:

  • SMR devices support a new write cursor for each SMR sequential band.
  • SMR devices support sequential writes within SMR sequential bands at the write cursor.
  • SMR band write cursors can be read, statused and reset to 0. SMR sequential band LBA writes only occur at the band cursor and for each LBA written, the SMR device increments the band cursor by one.
  • SMR devices can report their band map layout.

The presentation refers to multiple approaches to SMR support or SMR drive modes:

  • Restricted SMR devices – where the device will not accept any random writes, all writes occur at a band cursor, random writes are rejected by the device. But performance would be predictable. 
  • Host Aware SMR devices – where the host using the SMR devices is aware of SMR characteristics and actively manages the device using write cursors and band maps to write the most data to the device. However, the device will accept random writes and will perform them for the host. This will result in sub-optimal and non-predictable drive performance.
  • Drive managed SMR devices – where the SMR devices acts like a randomly accessed disk device but maps random writes to sequential writes internally using virtualization of the drive LBA map, not unlike SSDs do today. These devices would be backward compatible to todays disk devices, but drive performance would be bad and non-predictable.

Unclear which of these drive modes are currently shipping, but I believe Restricted SMR device modes are already available and drive manufacturers would be working on Host Aware and Drive managed to help adoption.

So assuming Restricted SMR device mode availability and prototypes of T10/T13 changes are available, then there are significant but known changes for enterprise storage systems to support SMR devices.

Nevertheless, a number of hybrid storage systems already implement Log Structured File (LSF) systems on their backends, which mostly write sequentially to backend devices, so moving to a SMR restricted device modes would be easier for these systems.

Unclear how many storage systems have such a back end, but NetApp uses it for WAFL and just about every other hybrid startup has a LSF format for their backend layout. So being conservative lets say 50% of enterprise hybrid storage vendors use LSF.

The other 60% would have more of a problem implementing SMR restricted mode devices, but it’s only a matter of time before  all will need to go that way. That is assuming they still use disks. So, we are primarily talking about hybrid storage systems.

All major storage vendors support hybrid storage and about 60% of startups support hybrid storage, so adding these to together, maybe about 75% of enterprise storage vendors have hybrid.

Using analysis on QoW 15-001, about 60% of enterprise storage vendors will probably ship new hardware versions of their systems over the next 12 months. So of the 13 likely new hardware systems over the next 12 months, 75% have hybrid solutions and 50% have LSF, or ~4.9 new hardware systems will be released over the next 12 months that are hybrid and have LSF backends already.

What are the advantages of SMR?

SMR devices will have higher storage densities and lower cost. Today disk drives are running 6-8TB and the SMR devices run 8-10TB so a 25-30% step up in storage capacity is possible with SMR devices.

New drive support has in the past been relatively easy because command sets/formats haven’t changed much over the past 7 years or so, but SMR is different and will take more effort to support. The fact that all new drives will be SMR over time gives more emphasis to get on the band wagon as soon as feasible. So, I would give a storage vendor a 80% likelihood of implementing SMR, assuming they have new systems coming out, are already hybrid and are already using LSF.

So of the ~4.9 systems that are LSF/Hybrid/being released *.8, says ~3.9 systems will introduce SMR devices over the next 12 months.

For non-LSF hybrid systems, the effort seems much harder, so I would give the likelihood of implementing SMR about a 40% chance. So of the ~8.1 systems left that will be introduced in next year, 75% are hybrid or ~6.1 systems and they have a 40% likelihood of implementing SMR so ~2.4 of these non-LSF systems will probably introduce SMR devices.

There’s one other category that we need to consider and that would be startups in stealth. These could have been designing their hybrid storage for SMR from the get go. In QoW 15-001 analysis I assumed another ~1.8 startup vendors would emerge to GA over the next 12 months. And if we assume that 0.75% of these were hybrid then there’s ~1.4 startups vendors that could be using SMR technology in their hybrid storage for a (4.9+2.4+1.4(1.8*.75)= 8.7 systems have a high probability of SMR implementation over the next 12 months in GA enterprise storage products.

Forecast

So my forecast of SMR adoption by enterprise storage is Yes for .85 probability (unclear what the probability should be, but it’s highly probable).

~~~~

Comments?

DS3, the BlackPearl and the way forward for … tape

Spectra Logic Summit 2013, Nathan Thompson, CEO talking about  Spectra Logic's historyJust got back from an analyst summit with Spectra Logic.  They announced a new interface to tape called, Deep Simple Storage Service (DS3) and an appliance that implements this interface named the BlackPearl.  The intent is to broaden the use of tape to include, todays more web services, application environments.

The main problems addressed by the new interface is how do you map an essentially sequential, high throughput but long latency access to first byte, removable media device to an essentially small file, get and put environment.  And is there a market for such services. I think Spectra Logic has answered the first set of questions and is about to embark on a journey to answer the second set of questions.

The new interface – it’s all about simplifying tape

The DS3 interface answers the first set of questions. With DS3 Specra Logic has extended Amazon’s S3 interface to expose some of the sequentiality and removability of tape to the object storage world.

As you should recall, Amazon S3 is a RESTful, web interface that uses HTTP type GET and PUT commands to move data to and from the S3 storage service.  The data you are moving is considered an object and the object name or identifier is unique across the storage service. When you “PUT” an object you get to add key-value pairs of information called meta-data to the object. When you “GET” an object you retrieve the data from the storage service. The other thing one needs to be aware of is that you get and put objects into “BUCKET”s.

With DS3, Spectra Logic has added essentially 4 new commands to S3 protocol, which are:

  • Bulk Put – this provides a list of objects that one wants to “PUT” into a DS3 storage service and the response from the DS3 storage service is an ordered list of which objects to PUT in sequence and which DS3 storage server node (essentially an IP address) to send the data.
  • Bulk Get – this supplies a list of objects that one wants to GET from a DS3 storage service and the response is an ordered list of the sequence to get those objects and the node address to use for those object gets
  • Export Bucket – this identifies a BUCKET that you wish to remove from a DS3 storage service.  Presumably the response would be where the bucket can be found,  the number of pieces of media to expect, and some identification of the media serial numbers that constitute a bucket on the DS3 storage service.
  • Import Bucket – this identifies a new bucket which will be imported into a DS3 storage service and will supply some necessary information such as how many pieces of media to expect and the serial numbers of the media.  Presumably the response will be a location which can be used to import the media.

With these four simple commands and an appropriate DS3 client, DS3 server and DS3 storage backend one now has everything they need to support a removable media object store. I could see real value for export/import like this on the “rare occasion” when a  cloud service provider goes out of business.

The DS3 interface will be publicly available and the intent is to both supply Spectra Logic developed clients as well a ISV/partner developed DS3 clients so as to provide removable media object stores for all sorts of other applications.

Spectra’s is providing developer tools and documentation so that anyone can write a DS3 client. To that end, the DS3 developer portal is up (couldn’t find a link this AM but will update this post when I find it) and available free of charge to anyone today (believe you need to register to gain access to the doc.). They have a DS3 server simulator that DS3 client developers can use to test out and validate their client software. They also have a try & buy service for client developers.

Essentially, the combination of DS3 clients, DS3 servers and DS3 backend storage create a really deep archive for object data. It’s not intended for primary or secondary storage access but it’s big, cheap, and power/space efficient storage that can be very effective if used for archive data.

BlackPearl, the first DS3 Server

Their second announcement is the first implementation of a DS3 server, Spectra Logic calls BlackPearl(™). The BlackPearl connects to one or more Spectra Logic tape libraries as a backend store which together essentially provides a DS3 object storage archive. The DS3 server talks to DS3 clients on the front end. BlackPearl uses SAS or FC connected tape transports, which can be any transport currently supported by SpectraLogic tape libraries, including IBM TS1140, LTO-4, -5 and -6.

In addition to BlackPearl, Spectra Logic is releasing the first DS3 client for Hadoop. In this case, the DS3 client implements a new version of the Hadoop DistCp (distributed copy) command which can be used to create a copy of an HDFS directory tree onto a DS3 storage service.

Current BlackPearl hardware is a standard 2U server with 4-400GB SSDs inside which act as sort of a speed matching buffer for the Object interface to SAS/FC tape interface.

We only saw a configuration with one BlackPearl in operation (GA of BlackPearl is expected this December). But the plan is to support multiple BlackPearl appliances to talk with the same DS3 backend storage. In that case, there will be a shared database and (tape) resource scheduler across all the appliances in the cluster.

Yes, but what about the market?

It’s a gutsy move for someone like Spectra Logic to define a new open interface to deep storage. The fact that the appliance exists outside the tape library itself and could potentially support any removable media offers interesting architectural capabilities. The current (beta) implementation lacked some sophistication but the expectation is that much of this will be resolved by GA or over time through incremental enhancements.

Pricing is appealing. When you add BlackPearl appliance(s), with a T950 Spectra Logic tape library using LTO drives which supports uncompressed data store of ~2.4PB of archive data, the purchase price is ~$0.10/GB. This compares especially well with current Amazon Glacier pricing of $0.01/GB/Month, so that for the price of 10 months of Glacier storage you could own your own DS3 storage service.

At larger capacities, such as BlackPearl using T950 with TS1140 tape drives supporting 6.4PB is even cheaper, at $0.09/GB. Other configurations are available and in general bigger congfigurations are cheaper on $/GB and smaller ones more expensive.  The configurations are speced by Spectra Logic to have all the media, tape drives and BlackPearl systems be needed to support an archives object store.

As for markets, Spectra Logic already has beta interest from a large well known web services customer and a number of media & entertainment customers.

In the long run, Spectra Logic believes that if they can simplify access to tape for an application where it’s well qualified to support (deep archive), that this will enable new applications to take advantage of tape, that weren’t even dreamed of before.  By opening up a Object Store interface to tape, anyone currently using S3 is a potential customer.

Amazon announced earlier this year that they have over 2 trillion objects is their S3. And as far as I can tell (see my post Who’s the next winner in storage?) they are growing with no end in sight.

~~~~

Comments?

 

Database appliances!?

The Sun Oracle Database Machine by Oracle OpenWorld San Francisco 2009 (cc) (from Flickr)
The Sun Oracle Database Machine by Oracle OpenWorld San Francisco 2009 (cc) (from Flickr)

Was talking with Oracle the other day and discussing their Exadata database system.  They have achieved a lot of success with this product.  All of which got me to wondering whether database specific storage ever makes sense.  I suppose the ultimate arbiter of “making sense” is commercial viability and Oracle and others have certainly proven this, but from a technologist perspective I still wonder.

In my view, the Exadate system combines database servers and storage servers in one rack (with extensions to other racks).  They use an Infiniband bus between the database and storage servers and have a proprietary storage access protocol between the two.

With their proprietary protocol they can provide hints to the storage servers as to what’s coming next and how to manage the database data which make the Exadata system a screamer of a database machine.  Such hints can speed up database query processing, more efficiently store database structures, and overall speed up Oracle database activity.  Given all that it makes sense to a lot of customers.

Now, there are other systems which compete with Exadata like Teradata and Netezza (am I missing anyone?) that also support onboard database servers and storage servers.  I don’t know much about these products but they all seem targeted at data warehousing and analytics applications similar to Exadata but perhaps more specialized.

  • As far as I can tell Teradata has been around for years since they were spun out of NCR (or AT&T) and have enjoyed tremendous success.  The last annual report I can find for them shows their ’09 revenue around $772M with net income $254M.
  • Netezza started in 2000 and seems to be doing OK in the database appliance market given their youth.  Their last annual report for ’10 showed revenue of ~$191M and net income of $4.2M.  Perhaps not doing as well as Teradata but certainly commercially viable.

The only reason database appliances or machines exist is to speed up database processing.  If they can do that then they seem able to build a market place for themselves.

Database to storage interface standards

The key question from a storage analyst perspective is shouldn’t there be some sort of standards committee, like SNIA or others, that work to define a standard protocol between database servers and storage that can be adopted by other storage vendors.  I understand the advantage that proprietary interfaces can supply to an enlightened vendor’s equipment but there are more database vendors out there than just Oracle, Teradata and Netezza and there are (at least for the moment) many more storage vendors out there as well.

A decade or so ago, when I was with another storage company we created a proprietary interface for backup activity and it sold ok but in the end it didn’t sell enough to be worthwhile for either the backup or storage company to continue the approach.  At the time we were looking to support another proprietary interface for sorting but couldn’t seem to justify it.

Proprietary interfaces tend to lock customers in and most customers will only accept lockin if there is a significant advantage to your functionality.  But customer lock-in can lull vendors into not investing R&D funding in the latest technology and over time this affect will cause the vendor to lose any advantage they previously enjoyed.

It seems to me that the more successful companies (with the possible exception of Apple) tend to focus on opening up their interfaces rather than closing them down.  By doing so they introduce more competition which serves their customers better, in the long run.

I am not saying that if Oracle would standardize/publicize their database server to storage server interface that there would be a lot of storage vendors going after that market.  But the high revenues in this market, as evident from Teradata and Netezza, would certainly interest a few select storage vendors.  Now not all of Teradata’s or Netezza’s revenues derive from pure storage sales but I would wager a significant part do.

Nevertheless, a standard database storage protocol could readily be defined by existing database vendors in conjunction with SNIA.  Once defined, I believe some storage vendors would adopt this protocol along with every other storage protocol (iSCSI, FCoE, FC, FCIP, CIFS, NFS, etc.). Once that occurs, customers across the board would benefit from the increased competition and break away from the current customer lock-in with today’s database appliances.

Any significant success in the market from early storage vendor adopters of this protocol would certainly interest other  vendors inducing a cycle of increased adoption, higher competition, and better functionality.  In the end, database customers world wide will benefit from the increased price performance available in the open market.  And in the end that makes a lot more sense to me than the database appliances of today.

As to why Apple has excelled within a closed system environment, that will need to wait for a future post.

SNIA Tech Center Grand Opening – Mystery Storage Contest

SNIA Technology Center Grand Opening Event - mingling before the show
SNIA Technology Center Grand Opening Event - mingling before the show

Yesterday in Colorado Springs SNIA held a grand opening for their new Technology Center.  They have moved their tech center about a half mile closer to Pikes Peak.

The new center has less data center floor space than the old one but according to Wayne Adams/EMC and Chairman of SNIA Board, this is a better fit for what SNIA is doing today.  These days SNIA doesn’t do as many plugfests requiring equipment to all be co-located on the same data center floor.  Most of SNIA’s plugfests today are done over the web, remotely, across continent wide distances.

SNIA’s new tech center is being leased from LSI which occupies the other half of the building.  If you were familiar with the old tech center it was leased from HP which resided in the other half of the old tech center.  This seems to work well for SNIA in the past, providing access to a large number of technical experts which can be called on to help out when needed.

A couple of things I didn’t realize about SNIA:

  • They have been in Colorado Springs since 2001
  • They have only 14 USA employees but over 4000 volunteers
  • They host a world wide IO trace library which member companies contribute and can access
  • All the storage equipment in their data center is provided for free by vendor/member companies.
  • SNIA certification training is one of the top 10 certifications as judged by an independent agency

Took a tour of their technology display area highlighting SNIA initiatives:

SNIA Green Storage Initiative display station
SNIA Green Storage Initiative display station
  • Green Storage Initiative display had a technician working on getting the power meter working properly.  One of only two analyzers I saw in their data centers.  The green storage initiative is all about the energy consumption of storage.
  • Solid State Storage Initiative display had a presentation on the SSSI activities and white papers which they have produced on this technology.
  • XAM initiative section had a couple of people talking about the importance of XAM to compliance activities and storage archives.
  • FCIA section had a talk on FC and its impact on storage present and futures.

Other initiatives were on display as well but I spent less time studying them.  In the conference room and 2 training rooms, SNIA had presentations on their Certification activity and storage training opportunities.  Howie Goldstein (HGAI Associates) was in one of the training rooms talking about education he provides through SNIA.

SNIA Tech Center Computer Lab 1
SNIA Tech Center Computer Lab 1

The new tech center has two computer labs.  Lab 1 seemed to have just about every vendors storage hardware.  As you can see from the photo, each storage subsystem was dedicated to SNIA initiative activities.  Didn’t see a lot of servers using this storage but they were probably located in computer lab 2.  In the picture one can see EMC, HDS, APC, and at the end, 3PAR storage.  On the other side of the aisle (not shown) was HP, NetApp, PillarData, and more HDS storage (and probably missed one or two more).

Don’t recall the SAN switch hardware but it wouldn’t surprise me to include a representative selection of all the vendors.  There was more switch hardware in Lab 2 and there we could make easily make out (McData or now) Brocade switch hardware.

SNIA Tech Center Computer Lab 2 switching hw
SNIA Tech Center Computer Lab 2 switching hw

Computer Lab 2 seemed to have most of the server hardware and more storage.  But both labs looked pretty clean from my perspective, probably due to all the press and grand opening celebration.  It ought to look more lived in/worked in over time.  I always like labs to be a bit more chaotic (see my price of quality post for a look at a busy HP EVA lab).

You would think with all SNIAs focus on plugfests there would be a lot more hardware analyzers floating around the two labs.  But outside of the power meter in the Green Storage Initiative display the only other analyzer like equipment I saw was a lone workstation behind some of the storage in lab 2.  It was way too clean to actually be in use. There ought to be post-it notes all over it, cables hanging all around it, with manuals and other documentation underneath it (but maybe I am a little old school docs should all be online nowadays).  Also, I didn’t see one white board in either of the labs, also a clear sign of early life.

SNIA Tech Center Lab 2 Lone portable workstation
SNIA Tech Center Lab 2 Lone portable workstation

We didn’t get to see much of the office space but it looked like plenty of windows and decent sized offices. Not sure how many people would be assigned to each but they have to put the volunteers and employees someplace.

Mystery Storage Contest – 1

And now for a new contest. See if you can determine the storage I am showing in the photo below.  Please submit your choice via comment(s) to this post and be sure to supply a valid email address.

Contest participant(s) will all receive a subscription to my (free) monthly Storage Intelligence email newsletter.  One winner will be chosen at random from all correct entries and will earn a free coupon code for 30% off any Silverton Consulting Briefings purchased through the web (once I figure out how to do this).

Correct entries must supply a valid email and identify the storage vendor and product model depicted in the picture.  Bonus points will be awarded for anyone who can tell the raw capacity of the subsystem in the picture.

SNIA volunteers, SNIA employees, SNIA member company employees and family members of any of these are not allowed to submit answers.  The contest will be closed 90 days after this post is published.  (And would someone from SNIA Technology Center please call me at 720-221-7270 and provide the identification and raw capacity of the storage subsystem depicted below in Computer Lab 2 on the NorthWest Wall.)

SNIA Technology Center - Mystery Storage 1
SNIA Technology Center - Mystery Storage 1

Remember our first Mystery Storage Contest closes in 90 days.

Also if you would like to submit an entry picture for future mystery storage contests please indicate so in your comment and I will be happy to contact you directly.