Storage strategic inflection points

EMC vs S&P 500 Stock price chart - 20 yrs from Yahoo Finance

Both EMC and Spectra Logic celebrated their 30th year in business this month, and it got me to thinking. Both companies started at about the same time, but one is a ~$14B revenue (’09 projected) behemoth and the other a successful but mid-size storage company (Spectra Logic is private and does not report revenues). What’s the big difference between these two? As far as I can tell, both companies have been adequately run for some time now by very smart people. Why is one two or more orders of magnitude bigger than the other? Recognizing strategic inflection points is key.

So what is a strategic inflection point? Andy Grove may have coined the term and calls a strategic inflection point a point “… where the old strategic picture dissolves and gives way to the new.” In my view EMC has been more successful at recognizing storage strategic inflection points than Spectra Logic and this explains a major part of their success.

EMC’s history in brief

In listening this week to Joe Tucci’s talk at EMC Analyst Days, he recounted the rather humble beginnings of EMC. It started out selling furniture and memory for mainframes (I think), but Joe said it really took off in 1991, almost 12 years after it was founded. It seems they latched onto some DRAM-based, SSD-like storage technology and converted it to use disk, as a RAID storage device, first in the mainframe and later the open systems arena. RAID killed off the big (14″ platter) disk devices that had dominated storage at the time and, once started, could not be stopped. Whether by luck or smarts, EMC’s push into RAID storage made them what they are today – probably a little of both.

It was interesting to see how this played out in the storage market space. RAID used smaller disks – first 8″, then 5.25″, and now 3.5″. When RAID was first introduced, its manufacturing costs were so low that one couldn’t help but make a profit selling against big disk devices that held 14″ platters. The more successful RAID became, the more available and reliable the smaller disks became, a virtuous cycle culminating in the highly reliable 3.5″ disk devices available today. Not sure Joe was at EMC at the time, but if he was, he would probably have called that transition from big-platter disks to RAID a “strategic inflection point” in the storage industry.

Most of EMC’s competitors and customers would probably say that aggressive marketing also helped propel EMC to the top of the storage heap. I am not sure which came first, the recognition of a strategic inflection point like RAID or the EMC marketing machine, but together they gave EMC a decided advantage that re-constructed the storage industry.

Spectra Logic’s history in brief

As far as I can tell, Spectra Logic has been in the backup software business for a long time and later moved into tape technology, for which they are well known today. Spectra Logic sells disk storage systems as well, but they seem better known for their tape and backup technology.

The big changes in tape technology over the past 30 years have been tape cartridges and robotics. Although tape cartridges were introduced by IBM (for the IBM 3480 in 1985), the first true tape automation was introduced by Storage Technology Corp. (with the STK 4400 in 1987). Storage Technology rode the wave of the robotics revolution throughout the late 80’s into the mid 90’s and was very successful for a time. Spectra Logic’s entry into tape robotics was sometime later (1995) but by the time they got onboard it was a very successful and mature technology.

Nonetheless, the revolution in tape technology and operations brought on by these two advances, probably held off the decline in tape for a decade or two, and yet it could not ultimately stem the tide in tape use apparent today (see my post on Repositioning of tape). Spectra Logic has recently introduced a new tape library.

Another strategic inflection point that helped EMC

Proprietary “Open” Unix systems had started to emerge in the late 80’s and early 90’s, and by the mid 90’s were beginning to host most new and sophisticated applications. The FC interface also emerged in the early-to-mid 90’s as a replacement for HIPPI technology, and for awhile battled it out against SSA technology from IBM, but by 1997 emerged victorious. Once FC and the follow-on higher-level protocols (resulting in SAN) were available, proprietary Unix systems had the IO architecture to support any application the enterprise needed, and the two took off feeding on each other. This was yet another strategic inflection point. I am not sure EMC was the first entrant into this market, but they sure were the biggest and, as such, quickly emerged to dominate it. In my mind EMC’s real accelerated growth can be tied to this timeframe.

EMC’s future bets today

Again, today, EMC seems to be in the fray for the next inflection. Their latest bets are on virtualization technology in VMware, NAND-SSD storage and cloud storage. They bet large on the VMware acquisition and it’s working well for them. They were the largest company and earliest to market with NAND-SSD technology in the broad market space and seem to enjoy a commanding lead. Atmos is not the first cloud storage service out there, but once again EMC was one of the largest companies to go after this market.

One can’t help but admire a company that swings for the bleachers every time they get a chance at bat. Not every swing goes out of the park, but when they get ahold of one, sometimes they can change whole industries.

Protecting the Yottabyte archive

blinkenlights by habi (cc) (from flickr)

In a previous post I discussed what it would take to store 1YB of data in 2015 for the National Security Agency (NSA). Due to length, that post did not discuss many other aspects of the 1YB archive such as ingest, index, data protection, etc. Thus, I will attempt to cover each of these in turn and as such, this post will cover some of the data protection aspects of the 1YB archive and its catalog/index.

RAID protecting 1YB of data

Protecting the 1YB archive will require some sort of parity protection. RAID data protection could certainly be used, and may need to be extended to removable media (RAID for tape), but that would require somewhere in the neighborhood of 10-20% additional storage (single-parity RAID5 across a 10- down to a 5-wide tape drive stripe). With Reed-Solomon encoding and RAID6 we could take this down to 5-10% additional storage (dual-parity RAID6 across a 40- down to a 20-wide tape drive stripe). Possibly other forms of ECC (such as turbo codes) might be usable in a RAID-like configuration, giving even better reliability with less additional storage.
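As a quick sanity check, the parity overhead for these schemes is just parity drives divided by data drives; a minimal sketch using the stripe widths quoted above:

```python
# Parity overhead for a RAID-like stripe: extra storage consumed by
# parity, as a fraction of the data storage protected.
def parity_overhead(data_drives: int, parity_drives: int) -> float:
    return parity_drives / data_drives

# Single parity (RAID5-like) across 10 down to 5 tape drives: 10-20%
raid5_wide   = parity_overhead(10, 1)   # 0.10
raid5_narrow = parity_overhead(5, 1)    # 0.20

# Dual parity (RAID6-like) across 40 down to 20 tape drives: 5-10%
raid6_wide   = parity_overhead(40, 2)   # 0.05
raid6_narrow = parity_overhead(20, 2)   # 0.10
```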

But RAID-like protection also applies to the data catalog and indexes required to access the 1YB archive of data. Ditto for the online data itself while it’s being ingested, indexed, or read back. For the remainder of this post I ignore the RAID overhead; suffice it to say that an additional 10% of storage for parity will not change this discussion much.

Also, in the original post I envisioned a multi-tier storage hierarchy in which the lowest tier always held a copy of any files residing in the upper tiers. This would provide some RAID1-like redundancy for any online data. This might be pretty useful: if a file is of high interest, it has likely been accessed recently and therefore resides in the upper storage tiers. As such, multiple copies of interesting files could exist.

Catalog and indexes backups for 1YB archive

IMHO, RAID or other parity protection is different than data backup. Data backup is generally used as a last line of defense for hardware failure, software failure or user error (deleting the wrong data). It’s certainly possible that the lowest tier data is stored on some sort of WORM (write once read many times) media meaning it cannot be overwritten, eliminating one class of user error.

But this presumes the catalog is available and the media is locatable, which means the catalog has to be preserved/protected from user error and HW and SW failures. I wrote about whether cloud storage needs backup in a prior post and feel strongly that the 1YB archive would require backups as well.

In general, backup today is done by copying the data to some other storage and keeping that storage offsite from the original data center. At this scale, most likely the 2.1×10**21 bytes of catalog (see original post) and index data would be copied to some form of removable media. The catalog is most important, as the other two indexes could potentially be rebuilt from the catalog and the original data. Assuming we are unwilling to reindex the data, the catalog and index backups would take ~1.3×10**9 LTO-6 tape cartridges (at 1.6×10**12 bytes/cartridge).

To back up this amount of data once per month would take a gaggle of tape drives. There are ~2.6×10**6 seconds/month, and each LTO-6 drive can transfer 5.4×10**8 bytes/sec, or ~1.4×10**15 bytes/drive-month; since we need to back up 2.1×10**21 bytes of data, we need ~1.5×10**6 tape transports. Now, tape drives do not operate 100% of the time, because when a cartridge becomes full it has to be exchanged for an empty one, but this amounts to a rounding error at these numbers.

To figure out the tape robotics needed to service 1.5×10**6 transports, we could use the latest T-Finity tape library just announced by Spectra Logic. The T-Finity supports 500 tape drives and 122,000 tape cartridges, so we would need ~3.0×10**3 libraries to handle the drive workload but ~1.1×10**4 libraries to store the cartridge set, so 11,000 T-Finity libraries would suffice. Presumably, using LTO-7 these numbers could be cut in half: ~5,500 libraries, ~7.5×10**5 transports, and ~6.6×10**8 cartridges.
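This back-of-envelope sizing is easy to replay; a sketch using the LTO-6 and T-Finity figures quoted in the text:

```python
# Monthly catalog/index backup sizing, using the figures from the text.
catalog_bytes  = 2.1e21     # catalog + index data to back up
lto6_capacity  = 1.6e12     # bytes per LTO-6 cartridge
lto6_rate      = 5.4e8      # bytes/sec per LTO-6 drive
secs_per_month = 2.6e6

cartridges = catalog_bytes / lto6_capacity                 # ~1.3e9 cartridges
transports = catalog_bytes / (lto6_rate * secs_per_month)  # ~1.5e6 drives

# T-Finity library: 500 drives and 122,000 cartridge slots per library
libs_for_drives = transports / 500                  # ~3.0e3 libraries
libs_for_carts  = cartridges / 122_000              # ~1.1e4 libraries
libraries = max(libs_for_drives, libs_for_carts)    # ~11,000 libraries
```

Note the library count is cartridge-bound rather than drive-bound, which is why ~11,000 libraries are needed even though ~3,000 would cover the transports.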

Other removable media exist, most notably the ProStor RDX. However, RDX roadmap information beyond the next generation is not readily available, and high-end robotics do not currently support RDX. So for the moment, tape seems the only viable removable backup medium for the catalog and index of the 1YB archive.

Mirroring the data

Another approach to protecting the data is to mirror the catalog and index data. This involves taking the data and copying it to another online storage repository. This doubles the storage required (to 4.2×10**21 bytes of storage). Replication doesn’t easily protect from user error but is an option worthy of consideration.

Networking infrastructure needed

Whether mirroring or backing up to tape, moving this amount of data will require substantial networking infrastructure. Assume that in 2015 we have 32GFC (32 Gb/sec Fibre Channel) interfaces; each interface could potentially transfer 3.2GB/s, or 3.2×10**9 bytes/sec. Mirroring or backing up 2.1×10**21 bytes over one month would then take ~2.5×10**5 32GFC interfaces. We should probably have twice this amount of networking so that no one link becomes a bottleneck, so 5×10**5 32GFC interfaces should work.

As for switches, the current Brocade DCX supports 768 8GFC ports, and presumably similar port counts will be available in 2015 to support 32GFC. That works out to ~650 fully populated DCX switches. This doesn’t account for multi-layer switches and other sophisticated switch topologies, but those could be accommodated with another factor of 2, or ~1,300 switches.
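The link and switch counts follow directly from the sustained transfer rate required; a quick sketch (the 32GFC rate and the DCX-class port count are the assumptions stated above, not shipping 2015 products):

```python
# Network sizing for moving the monthly backup/mirror traffic.
data_bytes     = 2.1e21     # catalog + index data moved per month
secs_per_month = 2.6e6
link_rate      = 3.2e9      # bytes/sec per assumed 32GFC interface
dcx_ports      = 768        # ports per fully populated DCX-class switch

required_rate = data_bytes / secs_per_month   # ~8.1e14 bytes/sec sustained
interfaces    = required_rate / link_rate     # ~2.5e5 32GFC interfaces
interfaces_2x = 2 * interfaces                # ~5e5 with bottleneck headroom

switches    = interfaces_2x / dcx_ports       # ~650 switches
switches_2x = 2 * switches                    # ~1,300 with multi-layer topologies
```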

Hot backups require journals

This all assumes we can do catalog and index backups once per month and take the whole month to do them. Storage today normally has to be quiesced (via snapshot or some other mechanism) to be backed up in a consistent state. While it’s not impossible to back up data that is concurrently being updated, it is more difficult: one needs to maintain a journal file of the updates that occur while the data is being backed up, and then be able to apply the journaled changes to the backup copy.
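A minimal sketch of the idea, with the catalog modeled as a simple in-memory dict (all names here are illustrative, not any real backup API):

```python
# Hot backup: copy the catalog while updates continue to arrive,
# journal the concurrent updates, then replay the journal so the
# backup reflects a consistent, up-to-date image.
def hot_backup(catalog, incoming_updates):
    backup, journal = {}, []
    for key in list(catalog):          # the (long-running) copy pass
        backup[key] = catalog[key]
        if incoming_updates:           # an update arrives mid-backup
            k, v = incoming_updates.pop(0)
            catalog[k] = v             # applied to the live catalog...
            journal.append((k, v))     # ...and recorded in the journal
    for k, v in journal:               # replay journal onto the copy
        backup[k] = v
    return backup
```

An update to an already-copied entry would otherwise be missing from the copy; replaying the journal catches it.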

For the moment I am not going to determine the storage requirements for the journal file needed to cover a month of catalog transactions, as this depends on the change rate of the catalog data. It will necessarily be a function of the index and ingest rates of the 1YB archive, to be covered in a future post.

Stay tuned, I am just having too much fun to stop.

Repositioning of tape

HP LTO 4 Tape Media
In my past life, I worked for a dominant tape vendor. Over the years, we had heard a number of times that tape was dead. But it never happened. BTW, it’s also not happening today.

Just a couple of weeks ago, I was at SNW and a vendor friend of mine asked if I knew anyone with tape library expertise because they were bidding on more and more tape archive opportunities. Tape seems alive and kicking from what I can see.

However, the fact is that tape use is being repositioned. Tape is no longer the direct target for backups that it once was. Most backup packages nowadays backup to disk and then later, if at all, migrate this data to tape (D2D2T). Tape is being relegated to a third tier of storage, a long-term archive and/or a long term backup repository.

The economics of tape are not hard to understand. You pay for robotics, media, and drives. Tape, just like any removable media, requires no additional power once it’s removed from the transport/drive used to write it. Removable media can be transported to an offsite repository or across the continent. There it can await recall with nary an ounce (volt) of power consumed.

Problems with tape

So what’s wrong with tape? Why aren’t more shops using it? Let me count the problems:

  1. Tape, without robotics, requires manual intervention
  2. Tape, because of its transportability, can be lost or stolen, leading to data security breaches
  3. Tape processing, in general, is more error prone than disk. Tape can have media and drive errors which cause data transfer operations to fail
  4. Tape is accessed sequentially, it cannot be randomly accessed (quickly) and only one stream of data can be accepted per drive
  5. Much of a tape volume is wasted, never written space
  6. Tape technology doesn’t stay around forever, eventually causing data obsolescence
  7. Tape media doesn’t last forever, causing media loss and potentially data loss

Likely I have missed some other issues with tape here, but these seem the major ones from my perspective.

It’s no surprise that most of these problems are addressed or mitigated in one form or another by the major tape vendors, software suppliers and others interested in continuing tape technology.

Robotics can answer the manual intervention problem, if you can afford it. Tape encryption deals effectively with stolen tapes, but requires key management somewhere. Many applications exist today to help predict when media will go bad or transports need servicing. Tape data is, and always will be, accessed sequentially, but then so is lots of other data in today’s IT shops. Tape transports are most definitely single-threaded, but sophisticated applications can intersperse multiple streams of data onto a single tape. Tape volume stacking is old technology, not necessarily easy to deploy outside of some sort of VTL front-end, but it is available. Drive and media technology obsolescence will never go away, but this indicates a healthy tape marketplace.

Future of tape

Say what you will about Ultrium, or Linear Tape-Open (LTO) technology, made up of the HP, IBM, and Quantum research partners, but it has solidified/consolidated mid-range tape technology. Is it as advanced as it could be, or pushing to open new markets? Probably not. But they are advancing tape technology, providing higher capacity, higher performance, and more functionality with each recent generation. And they have not stopped: Ultrium’s roadmap shows LTO-6 right after LTO-5, and delivery of LTO-5, at 1.6TB uncompressed capacity per tape, is right around the corner.

Also IBM and Sun continue to advance their own proprietary tape technology. Yes, some groups have moved away from their own tape formats but that’s alright and reflects the repositioning that’s happening in the tape marketplace.

As for the future, I was at an IEEE magnetics meeting a couple of years back and the leader said that tape technology was always a decade behind disk technology. So the disk recording heads/media in use today will likely see some application to tape technology in about 10 years. As such, as long as disk technology advances, tape will come out with similar capabilities sometime later.

Still, it’s somewhat surprising that tape is able to provide so much volumetric density with decade old disk technology, but that’s the way tape works. Packing a ribbon of media around a hub, can provide a lot more volumetric storage density than a platter of media using similar recording technology.
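To see why, compare the recording surface packed into roughly similar volumes; a rough sketch with approximate, illustrative dimensions (an LTO-4 cartridge holds roughly 820 m of 12.65 mm wide tape, and the platter dimensions are estimates):

```python
from math import pi

# Recording surface in one LTO-4 cartridge: a long, narrow ribbon.
tape_area = 820 * 0.01265                  # ~10.4 m^2 of media surface

# Recording surface in a 3.5" disk drive: ~4 platters, both sides,
# usable annulus from ~15 mm inner to ~46 mm outer radius (approximate).
platter_side = pi * (0.046**2 - 0.015**2)  # m^2 per platter surface
disk_area = 4 * 2 * platter_side           # ~0.05 m^2 of media surface

ratio = tape_area / disk_area              # ribbon packs ~200x the surface
```

Even with areal density a decade behind, roughly two orders of magnitude more media surface in a comparable volume leaves tape well ahead on volumetric density.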

In the end, tape has a future to exploit if vendors continue to push its technology. As a long term archive storage, it’s hard to beat its economics. As a backup target it may be less viable. Nonetheless, it still has a significant install base which turns over very slowly, given the sunk costs in media, drives and robotics.

Full disclosure: I have no active contracts with LTO or any of the other tape groups mentioned in this post.