DS3, the BlackPearl and the way forward for … tape

Spectra Logic Summit 2013, Nathan Thompson, CEO talking about  Spectra Logic's historyJust got back from an analyst summit with Spectra Logic.  They announced a new interface to tape called, Deep Simple Storage Service (DS3) and an appliance that implements this interface named the BlackPearl.  The intent is to broaden the use of tape to include, todays more web services, application environments.

The main problems addressed by the new interface is how do you map an essentially sequential, high throughput but long latency access to first byte, removable media device to an essentially small file, get and put environment.  And is there a market for such services. I think Spectra Logic has answered the first set of questions and is about to embark on a journey to answer the second set of questions.

The new interface – it’s all about simplifying tape

The DS3 interface answers the first set of questions. With DS3 Specra Logic has extended Amazon’s S3 interface to expose some of the sequentiality and removability of tape to the object storage world.

As you should recall, Amazon S3 is a RESTful, web interface that uses HTTP type GET and PUT commands to move data to and from the S3 storage service.  The data you are moving is considered an object and the object name or identifier is unique across the storage service. When you “PUT” an object you get to add key-value pairs of information called meta-data to the object. When you “GET” an object you retrieve the data from the storage service. The other thing one needs to be aware of is that you get and put objects into “BUCKET”s.

With DS3, Spectra Logic has added essentially 4 new commands to S3 protocol, which are:

  • Bulk Put – this provides a list of objects that one wants to “PUT” into a DS3 storage service and the response from the DS3 storage service is an ordered list of which objects to PUT in sequence and which DS3 storage server node (essentially an IP address) to send the data.
  • Bulk Get – this supplies a list of objects that one wants to GET from a DS3 storage service and the response is an ordered list of the sequence to get those objects and the node address to use for those object gets
  • Export Bucket – this identifies a BUCKET that you wish to remove from a DS3 storage service.  Presumably the response would be where the bucket can be found,  the number of pieces of media to expect, and some identification of the media serial numbers that constitute a bucket on the DS3 storage service.
  • Import Bucket – this identifies a new bucket which will be imported into a DS3 storage service and will supply some necessary information such as how many pieces of media to expect and the serial numbers of the media.  Presumably the response will be a location which can be used to import the media.

With these four simple commands and an appropriate DS3 client, DS3 server and DS3 storage backend one now has everything they need to support a removable media object store. I could see real value for export/import like this on the “rare occasion” when a  cloud service provider goes out of business.

The DS3 interface will be publicly available and the intent is to both supply Spectra Logic developed clients as well a ISV/partner developed DS3 clients so as to provide removable media object stores for all sorts of other applications.

Spectra’s is providing developer tools and documentation so that anyone can write a DS3 client. To that end, the DS3 developer portal is up (couldn’t find a link this AM but will update this post when I find it) and available free of charge to anyone today (believe you need to register to gain access to the doc.). They have a DS3 server simulator that DS3 client developers can use to test out and validate their client software. They also have a try & buy service for client developers.

Essentially, the combination of DS3 clients, DS3 servers and DS3 backend storage create a really deep archive for object data. It’s not intended for primary or secondary storage access but it’s big, cheap, and power/space efficient storage that can be very effective if used for archive data.

BlackPearl, the first DS3 Server

Their second announcement is the first implementation of a DS3 server, Spectra Logic calls BlackPearl(™). The BlackPearl connects to one or more Spectra Logic tape libraries as a backend store which together essentially provides a DS3 object storage archive. The DS3 server talks to DS3 clients on the front end. BlackPearl uses SAS or FC connected tape transports, which can be any transport currently supported by SpectraLogic tape libraries, including IBM TS1140, LTO-4, -5 and -6.

In addition to BlackPearl, Spectra Logic is releasing the first DS3 client for Hadoop. In this case, the DS3 client implements a new version of the Hadoop DistCp (distributed copy) command which can be used to create a copy of an HDFS directory tree onto a DS3 storage service.

Current BlackPearl hardware is a standard 2U server with 4-400GB SSDs inside which act as sort of a speed matching buffer for the Object interface to SAS/FC tape interface.

We only saw a configuration with one BlackPearl in operation (GA of BlackPearl is expected this December). But the plan is to support multiple BlackPearl appliances to talk with the same DS3 backend storage. In that case, there will be a shared database and (tape) resource scheduler across all the appliances in the cluster.

Yes, but what about the market?

It’s a gutsy move for someone like Spectra Logic to define a new open interface to deep storage. The fact that the appliance exists outside the tape library itself and could potentially support any removable media offers interesting architectural capabilities. The current (beta) implementation lacked some sophistication but the expectation is that much of this will be resolved by GA or over time through incremental enhancements.

Pricing is appealing. When you add BlackPearl appliance(s), with a T950 Spectra Logic tape library using LTO drives which supports uncompressed data store of ~2.4PB of archive data, the purchase price is ~$0.10/GB. This compares especially well with current Amazon Glacier pricing of $0.01/GB/Month, so that for the price of 10 months of Glacier storage you could own your own DS3 storage service.

At larger capacities, such as BlackPearl using T950 with TS1140 tape drives supporting 6.4PB is even cheaper, at $0.09/GB. Other configurations are available and in general bigger congfigurations are cheaper on $/GB and smaller ones more expensive.  The configurations are speced by Spectra Logic to have all the media, tape drives and BlackPearl systems be needed to support an archives object store.

As for markets, Spectra Logic already has beta interest from a large well known web services customer and a number of media & entertainment customers.

In the long run, Spectra Logic believes that if they can simplify access to tape for an application where it’s well qualified to support (deep archive), that this will enable new applications to take advantage of tape, that weren’t even dreamed of before.  By opening up a Object Store interface to tape, anyone currently using S3 is a potential customer.

Amazon announced earlier this year that they have over 2 trillion objects is their S3. And as far as I can tell (see my post Who’s the next winner in storage?) they are growing with no end in sight.

~~~~

Comments?

 

Tape still alive, well and growing at Spectra Logic

T-Finity library at SpectraLogic's test facility (c) 2011 Silverton Consulting, All Rights Reserved
T-Finity library at SpectraLogic's test facility (c) 2011 Silverton Consulting, All Rights Reserved

Today I met with Spectra Logic execs and some of their Media and Entertainment (M&E) customers, and toured their manufacturing, test labs and briefing center.  The tour was a blast and the customers Kyle Knack from National Geographic (Nat Geo) Global Media, Toni Perez from Medcom (Panama based entertainment company) and Lee Coleman from Entertainment Tonight (ET) all talked about their use of the T-950 Spectra Logic tape libraries in the media ingest, editing and production processes.

Mr. Collins from ET spoke almost reverently about their T-950 and how it has enabled ET to access over 30 years of video interviews, movie segments and other media they can now use to put together clips on just about any entertainment subject imaginable.

He  talked specifically about the obit they did for Michael Jackson and how they were able to grab footage from an interview they did years ago and splice it together with more recent media to show a more complete story.  He also showed a piece on some early Eddie Murphy film footage and interviews they had done at the time which they used in a recent segment about his new movie.

All this was made possible by moving to digital file formats and placing digital media in their T-950 tape libraries.

Spectra Logic T-950 (I think) with TeraPack loaded in robot (c) 2011 Silverton Consulting, All Rights Reserved
Spectra Logic T-950 (I think) with TeraPack loaded in robot (c) 2011 Silverton Consulting, All Rights Reserved

Mr. Knack from Nat Geo Media said every bit of media they get anymore, automatically goes into the library archive and becomes the “original copy” of the media used in case other copies are corrupted or lost.  Nat Geo started out only putting important media in the library but found it just cost so much less to just store it in the tape archive that they decided it made more sense to just move all media to the tape library.

Typically they keep two copies in their tape library and important media is also copied to tape and shipped offsite (3 copies for this data).  They have a 4-frame T-950 with around 4000 slots and 14 drives (combination of LTO-4 and -5).  They use FC and FCoE storage for their primary storage and depend on 1000s of SATA drives for primary storage access.

He said they only use SSDs for some metadata support for their web site. He found that SATA drives can handle their big block sequential and provide consistent throughput and especially important to M&E companies consistent latency.

3D printer at Spectra Logic (for mechanical parts fabrication) (c) 2011 Silverton Consulting, All Rights Reserved
3D printer at Spectra Logic (for mechanical parts fabrication) (c) 2011 Silverton Consulting, All Rights Reserved

Mr. Perez from MedCom had much the same story. They were in the process of moving off of proprietary video tape format (Sony Betacam) to LTO media and digital files. The process is still ongoing although they are more than halfway there for current production.

They still have a lot of old media in Betacam format which will take them years to convert to digital files but they are at least starting this activity.  He said a recent move from one site to another revealed that much of the Betacam tapes were no longer readable.  Digital files on LTO tape should solve that problem for them when they finally get there.

Matt Starr Spectra Logic CTO talked about the history of tape libraries at Spectra Logic which was founded in 1998 and has been laser focused on tape data protection and tape libraries.

I find it pleasantly surprising that a company today can just supply tape libraries with software and make a ongoing concern of it. Spectra Logic must be doing something right, revenue grew 30% YoY last year and they are outgrowing their current (88K sq ft) office, lab, and manufacturing building they just moved into earlier this year and have just signed to occupy another building providing 55K sq ft of more space.

T-Series robot returning TeraPack to shelf (c) 2011 Silverton Consulting, All Rights Reserved
T-Series robot returning TeraPack to shelf (c) 2011 Silverton Consulting, All Rights Reserved

Molly Rector Spectra Logic CMO talked about the shift in the market from peta-scale (10**15 bytes) storage repositories to exa-scale (10**18 bytes) ones.  Ms. Rector believed that today’s cloud storage environments can take advantage of these large tape based, archives to provide much more economical storage for their users without suffering any performance penalty.

At lunch with Matt Starr, Fred Moore (Horison Information Strategies)Mark Peters (Enterprise Strategy Group) and I were talking about HPSS (High Performance Storage System) developed in conjunction with IBM and 5 US national labs that supports vast amounts of data residing across primary disk and tape libraries.

Matt said that there are about a dozen large HPSS sites (HPSS website shows at least 30 sites using it) that store a significant portion of the worlds 1ZB (10**21 bytes) of digital data created this past year (see my 3.3 exabytes of data a day!? post).  Later that day talking with Nathan Thompson Spectra Logic CEO, he said these large HPSS sites probably store ~10% of the worlds data, or 100EB.  I find that difficult to comprehend that much data at only ~12 sites but the national labs do have lots of data on hand.

Nowadays you can get a Spectra Logic T-Finity tape complex with 122K slot, using LTO-4/-5 or IBM TS1140 (enterprise class) tape drives.  This large a T-Finity has 4 rows of tape libraries which uses the ‘Skyway’ to transport a terapack of tape cartridges between one library row to the another.   All Spectra Logic libraries are built around a tape cartridge package they call the TeraPack which contains 10 LTO cartridges or (I think) 9-TS1140 tape cartridges (they are bigger than LTO tapes).  The TeraPack is used to import or export tapes from the library and all the tape slots in the library.

The software used to control all this is called BlueScale and is used in their T50e, a small, 50 slot library all the way up to the 122K T-Finity tape complex.  There are some changes for configuration, robotics and other personalization for each library type but the UI looks exactly the same across any of their libraries. Moreover, BlueScale offers the same enterprise level of functionality (e.g., drive and media life management) services for all Spectra Logic tape libraries.

Day 1 for SpectraPRDay closed with the lab tour and dinner.  Day 2 will start discussing futures and will be under NDA so there won’t be much to talk about right away. But from what I can see, Spectra Logic seems to be breaking down the barriers inhibiting tape use and providing tape library systems, that people almost revere.

I haven’t seen that sort of reaction about a tape library since the STK 4400 first came out last century.

—-

Comments?

Storage strategic inflection points

EMC vs S&P 500 Stock price chart
EMC vs S&P 500 Stock price chart - 20 yrs from Yahoo Finance

Both EMC and Spectra Logic celebrated their 30 years in business this month and it got me to thinking. Both companies started the same time but one is a ~$14B revenue (’09 projected) behemoth and the other a relatively successful, but relatively mid-size storage company (Spectra Logic is private and does not report revenues). What’s the big difference between these two. As far as I can tell both companies have been adequately run for some time now by very smart people. Why is one two or more orders of magnitude bigger than the other – recognizing strategic inflection points is key.

So what is a strategic inflection point? Andy Grove may have coined the term and calls a strategic inflection point a point “… where the old strategic picture dissolves and gives way to the new.” In my view EMC has been more successful at recognizing storage strategic inflection points than Spectra Logic and this explains a major part of their success.

EMC’s history in brief

In listening this week to Joe Tucci’s talk at EMC Analyst Days he talked about the rather humble beginnings of EMC. It started out selling furniture and memory for mainframes (I think) but Joe said it really took off in 1991, almost 12 years after it was founded. It seems they latched onto some DRAM based SSD like storage technology and converted it to use disk as a RAID storage device in the mainframe and later open systems arena. RAID killed off the big (14″ platter) disk devices that had dominated storage at that time and once started could not be stopped. Whether by luck or smarts EMC’s push into RAID storage made them what they are today – probably a little of both.

It was interesting to see how this played out in the storage market space. RAID used smaller disks, first 8″, then 5.25″ and now 3.5″. When first introduced, manufacturing costs for the RAID storage were so low that one couldn’t help but make a profit selling against big disk devices that held 14″ platters. The more successful RAID became, the more available and reliable the smaller disks became which led to a virtuous cycle culminating in the highly reliable 3.5″ disk devices available today. Not sure Joe was at EMC at the time but if he was he would probably have called that transition between big platter disks and RAID a “strategic inflection point” in the storage industry at the time.

Most of EMC’s competitors and customers would probably say that aggressive marketing also helped propel EMC to be the top of the storage heap. I am not sure which came first, the recognition of a strategic inflection like RAID or the EMC marketing machine but, together, they gave EMC a decided advantage that re-constructed the storage industry.

Spectra Logic’s history in brief

As far as I can tell Spectra Logic has been in the backup software for a long time and later started supporting tape technology where they are well known today. Spectra Logic has disk storage systems as well but they seem better known for their tape and backup technology.

The big changes in tape technology over the past 30 years have been tape cartridges and robotics. Although tape cartridges were introduced by IBM (for the IBM 3480 in 1985), the first true tape automation was introduced by Storage Technology Corp. (with the STK 4400 in 1987). Storage Technology rode the wave of the robotics revolution throughout the late 80’s into the mid 90’s and was very successful for a time. Spectra Logic’s entry into tape robotics was sometime later (1995) but by the time they got onboard it was a very successful and mature technology.

Nonetheless, the revolution in tape technology and operations brought on by these two advances, probably held off the decline in tape for a decade or two, and yet it could not ultimately stem the tide in tape use apparent today (see my post on Repositioning of tape). Spectra Logic has recently introduced a new tape library.

Another strategic inflection point that helped EMC

Proprietary “Open” Unix systems had started to emerge in the late 80’s and early 90’s and by the mid 90’s were beginning to host most new and sophisticated applications. The FC interface also emerged in the early to mid 90’s as a replacement to HPC-HPPI technology and for awhile battled it out against SSA technology from IBM but by 1997 emerged victorious. Once FC and the follow-on higher level protocols (resulting in SAN) were available, proprietary Unix systems had the IO architecture to support any application needed by the enterprise and they both took off feeding on each other. This was yet another strategic inflection point and I am not sure if EMC was the first entry into this market but they sure were the biggest and as such, quickly emerged to dominate it. In my mind EMC’s real accelerated growth can be tied to this timeframe.

EMC’s future bets today

Again, today, EMC seems to be in the fray for the next inflection. Their latest bets are on virtualization technology in VMware, NAND-SSD storage and cloud storage. They bet large on the VMware acquisition and it’s working well for them. They were the largest company and earliest to market with NAND-SSD technology in the broad market space and seem to enjoy a commanding lead. Atmos is not the first cloud storage service out there, but once again EMC was one of the largest companies to go after this market.

One can’t help but admire a company that swings for the bleachers every time they get a chance at bat. Not every one is going out of the park but when they get ahold of one, sometimes they can change whole industries.