The future of data storage is MRAM

Core Memory by teclasorg
Core Memory by teclasorg

We have been discussing NAND technology for quite awhile now but this month I ran across an article in IEEE Spectrum titled “a SPIN to REMEMBER – Spintronic memories to revolutionize data storage“. The article discussed a form of magneto-resistive random access memory or MRAM that uses quantum mechanical spin effects or spintronics to record data. We have talked about MRAM technology before and progress has been made since then.

Many in the industry will recall that current GMR (Giant Magneto-resistance) heads and TMR (Tunnel magneto-resistance) next generation disk read heads already make use of spintronics to detect magnetized bit values in disk media. GMR heads detect bit values on media by changing its electrical resistance.

Spintronics however can also be used to record data as well as read it. These capabilities are being exploited in MRAM technology which uses a ferro-magnetic material to record data in magnetic spin alignment – spin UP, means 0; spin down, means 1 (or vice versa).

The technologists claim that when MRAM reaches its full potential it could conceivably replace DRAM, SRAM, NAND, and hard disk drives or all current electrical and magnetic data storage. Some of MRAM’s advantages include unlimited write passes, fast reads and writes and data non-volatilility.

MRAM reminds me of old fashioned magnetic core memory (in photo above) which used magnetic polarity to record non-volatile data bits. Core was a memory mainstay in the early years of computing before the advent of semi-conductor devices like DRAM.

Back to future – MRAM

However, the problems with MRAM today are that it is low-density, takes lots of power and is very expensive. But technologists are working on all these problems with the view that the future of data storage will be MRAM. In fact, researchers at the North Carolina State University (NCSU) Electrical Engineering department have been having some success with reducing power requirements and increasing density.

As for data density NCSU researchers now believe they can record data in cells approximating 20 nm across, better than current bit patterned media which is the next generation disk recording media. However reading data out of such a small cell will prove to be difficult and may require a separate read head on top of each cell. The fact that all of this is created with normal silicon fabrication methods make doing so at least feasible but the added chip costs may be hard to justify.

Regarding high power, their most recent design records data by electronically controlling the magnetism of a cell. They are using dilute magnetic semiconductor material doped with gallium maganese which can hold spin value alignment (see the article for more information). They are also using a semiconductor p-n junction on top of the MRAM cell. Apparently at the p-n junction they can control the magnetization of the MRAM cells by applying -5 volts or removing this. Today the magnetization is temporary but they are also working on solutions for this as well.

NCSU researchers would be the first to admit that none of this is ready for prime time and they have yet to demonstrate in the lab a MRAM memory device with 20nm cells, but the feeling is it’s all just a matter of time and lot’s of research.

Fortunately, NCSU has lots of help. It seems Freescale, Honeywell, IBM, Toshiba and Micron are also looking into MRAM technology and its applications.

—–

Let’s see, using electron spin alignment in a magnetic medium to record data bits, needs a read head to read out the spin values – couldn’t something like this be used in some sort of next generation disk drive that uses the ferromagnetic material as a recording medium. Hey, aren’t disks already using a ferromagnetic material for recording media? Could MRAM be fabricated/layed down as a form of magnetic disk media?? Maybe there’s life in disks yet….

What do you think?

Latest ESRPv3 (Exchange 2010) results analysis for 1K-to-5Kmailboxes – chart of the month

(c) 2010 Silverton Consulting, Inc., All Rights Reserved
(c) 2010 Silverton Consulting, Inc., All Rights Reserved

The chart is from SCI’s October newsletter/performance dispatch on Exchange 2010 Solution Reviewed Program (ESRP v3.0) and shows the mailbox database access latencies for read, write and log write.  For this report we are covering solutions supporting from 1001 up to 5000 mailboxes (1K-to-5Kmbx), larger and (a few) smaller configurations have been covered in previous performance dispatches.  On latency charts like this – lower is better.

We like this chart because in our view this represents a reasonable measure of email user experience.  As users read and create new emails they are actually reading Exchange databases and writing database and logs.  Database and log latencies should show up as longer or shorter delays in these activities.  (Ok, not exactly true, email client and Exchange server IO aren’t the same thing.  But ultimately every email sent has to be written to an Exchange database and log sometime and every new email read-in has to come from an Exchange database as well).

A couple of caveats are in order for this chart.

  • Xiotech’s top run (#1) did not use database redundancy or DAGs (Database Availability Groups) in their ESRPv3 run. Their feeling is that this technology is fairly new and it will take some time before it’s widely adopted.
  • There is quite the mix of SAS (#2,3,6,7,9&10), FC (#1,5&8) and iSCSI (#4) connected storage in this mailbox range.  Some would say that SAS connected storage should have an advantage here but that’s not obvious from the rankings.
  • Vendors get to select the workload intensity for any ESRPv3/Jetstress run, e.g. the solutions shown here used between 0.15 IO/sec/mailbox (#9&10) and 0.36 IO/sec/mailbox (#1).  IO intensity is just one of the myriad of Jetstress tweakable parameters that make analyzing ESRP so challenging.  Normally this would only matter with database and log access counts but heavier workloads can also impact latencies as well.

Wide variance between read and write latencies

The other thing of interest in this chart is the interesting span between read latencies and write (database and log) latencies for the same solution. Take the #10 Dell PowerEdge system for example.  It showed a database read latency of ~18msec. but a database write latency of ~0.4msec.  Why?

It turns out this Dell system had only 6 disk drives (2TB/7200 RPM).  So few disk drives don’t seem adequate to support the read workload and as a result, show up poorly in database read latencies.  However, write activity can mostly be masked with cache until it fills up, forcing write delays.  With only 1100 mailboxes and 0.15 IOs/sec/mailbox, the write workload apparent fits in cache well enough to be destaged over time, without delaying ongoing write activity.  Similar results appear for the other Dell PowerEdge (#6) and the HP Smart Array (#7) which had 12-2TB/7200 RPM and 24-932GB/7200 RPM drives respectively.

On the other hand, Xiotech’s #1 position had 20-360GB/15Krpm drives and EMC’s Celerra #4 run had 15-400GB/10Krpm drives, both of which were able to sustain a more balanced performance across reads and writes (database and logs).  For Xiotech’s #5 run they used 40-500GB/10Krpm drives.

It seems there is a direct correlation between drive speed and read database latencies.  Most of the systems in the bottom half of this chart have 7200 RPM drives (except for #8, HP StorageWorks MSA) and the top 3 all had 15Krpm drives.  However, write latencies don’t seem to be as affected by drive speed and have more to do with the balance between workload, cache size and effective destaging.

The other thing that’s apparent from this chart is that SAS connected storage continues to be an effective solution for this range of Exchange configurations, following a trend first shown in ESRP v2 (Exchange 2007) results.  We reported on this in our  January ESRPv2 analysis dispatch for this year .

The full dispatch will be up on our website in a couple of weeks but if you are interested in seeing it sooner just sign up for our free newsletter (see upper right) or subscribe by email and we will send you the current issue with download instructions for this and other reports.

As mentioned previously ESRP/Jetstress results are difficult to compare/analyze and we continue to welcome any constructive suggestions on how to improve.

Data compression lives on

Macroblocking: demolish the eerie ▼oid by Rosa Menkman (cc) (from Flickr)
Macroblocking: demolish the eerie ▼oid by Rosa Menkman (cc) (from Flickr)

Last week NetApp announced the availability of data compression on many of their unified storage platforms, which includes block and file storage.  Earlier this year EMC announced data compression for LUNs on CLARiion and Celerra.  I must commend both of them for re-integrating data compression back into primary storage systems, missing since IBM and Sun stopped marketing RVA and SVA.

Data compression algorithms

Essentially data compression is an algorithm that eliminates redundancy in data streams.  Data compression can be “lossy” or “loss-less”.  Data compression in storage subsystems is typically loss-less which means that the original data can be reconstructed without any loss of information.  One sees lossy algorithms in video/audio data compression which doesn’t impact video/audio fidelity unless it results in significant loss of data.

One simple example of loss-less data compression is Run-Length Encoding which substitutes a trigger, count, and character string for any character repeated more than 4 times in a block of data.  This compresses well any text strings with lots of blanks, numerical data with lot’s of 0’s and initial format data written with a repeating character.

There are other, more sophisticated compression algorithms like Huffman Coding, which identify the most frequent bytes in a block of data and replace these bytes with shorter bit patterns. For example if ~50% of the characters in a text file are the letters “a”, “e”, “i”, “o”, “t”, and “n” (see Wikipedia, Frequency Analysis) then these characters can take up much less space if we encode them 4 or less bits rather than the 8-bits in a byte.

I am certain that both EMC and NetApp are using much more sophisticated algorithms than either of these and it wouldn’t surprise me to know they are using something like the open source algorithm like Zlib (gzip) or Bzip2 (see my Poor deduplication with Oracle RMAN compressed backups post for an explanation) which uses Huffman Coding and adds even more sophistication.  Data compression algorithms like these could offer something like 50% compression, i.e., your data could be stored in 50% less space.

Data compression is often confused with Data Deduplication but it’s not the same. Deduplication looks for duplicate data across different data blocks and files while data compression is strictly only examining the data stream within a block or file and doesn’t depend on any other data.

Storage system data compression

In the past, data compression was relegated to a separate appliance, tape storage systems, and/or host software.  By integrating these algorithms into their main storage engines, both NetApp and EMC are taking advantage of the recent processor speed increases being embedding into their systems to offer offline functionality for online data.

Historically, compression algorithms such as these were implemented in hardware but nowadays they can easily be done in software by being relegated to operate during off-peak IO time or execute as the lowest priority task in the storage system. As such, there can be no guarantee when your data will finally be compressed but it will be compressed eventually.

Data compression like this is great for data that isn’t modified frequently.  It takes some processing time to compress data and as such, would need to be repeated after every modification of a compressed block or file.  So if the data isn’t modified that much, compression’s processing cost could be amortized over longer data lifetimes.

Further, data compression must be undone at read time, i.e., the data needs to be de-compressed and handed off to the IO requesting it.  De-compression is a much faster algorithm than compression because in the case of something like Huffman Coding the character dictionary is already known and as such, it’s just a matter of table lookup and bit field isolation. It would be convenient if this data were sitting in the system DRAM someplace but lacking that, moving it from cache to DRAM could be done quickly enough, processed there, and then moved back before final transfer to the requesting IO.

As such, data compression may impact response time for compressed data reads.  How much of an impact is yet TBD.

Data writes will not be impacted at all because the compression activity is done much later. Whether the data stays in cache until compressed or is brought back in at some later time is another algorithm question which may impact cache hit rates/compression performance but this doesn’t have to be a serious impediment.

NetApp is able to offer this capability for both block and file storage because of it’s WAFL backend data structure which essentially allows it to create variable length blocks for file and block data.  EMC only offers this for LUN data (block storage) as of yet but it’s probably just a matter of time before it’s available for other data as well.

Any questions?

EMC to buy Isilon Systems

Isilon X series nodes (c) 2010 Isilon from Isilon's website
Isilon X series nodes (c) 2010 Isilon from Isilon's website

I understand the rationale behind EMC’s purchase of Isilon scale out NAS technology for big data applications.  More and more data is being created every day and most of that unstructured.  How can one begin to support multiple PBs of file data that’s coming online in the next couple of years without scale out NAS.  Scale out NAS has the advantage that within the same architecture one can scale from TBs to PBs of file storage by just adding storage and/or accessor nodes.  Sounds great.

Isilon for backup storage?

But what’s surprising to me is the use of Isilon NL-Series storage in more mundane applications like Database backup.  A couple of weeks ago I wrote a post on how Oracle RMAN compressed backups don’t dedupe very well.  The impetus for that post was that a very large enterprise customer I was talking with had just started deploying Isilon NAS systems in their backup environment to handle non-dedupable data.  The customer was backing up PB of storage, a good portion of which was non-dedupable, and as such, they planed to use Isilon Systems to store this data.

I had never seen scale out NAS systems used for backup storage so I was intrigued to find out why.  Essentially, this customer was in the throws of replacing tape and between deduplication appliances and Isilon storage they believed they had the solutions to eliminate tape forever from their backup systems.

All this begs the question where does EMC put Isilon –  with Celerra and other storage platforms, with Atmos and other cloud services, or with Data Domain and other backup systems?  It seems one could almost break out the three Isilon storage systems and split them into these three business groups but given Isilon’s flexibility it probably belongs in storage platforms.

However, I would think that BRS would have an immediate market requirement for Isilon’s NL-Series storage to complement it’s other backup systems.  I guess we will know shortly where EMC puts it – until then it’s anyone’s guess.

Why Bus-Tech, why now – Mainframe/System z data growth

Z10 by Roberto Berlim (cc) (from Flickr)
Z10 by Roberto Berlim (cc) (from Flickr)

Yesterday, EMC announced the purchase of Bus-Tech, their partner in mainframe or System z attachment for the Disk Library Mainframe (DLm) product line.

The success of open systems mainframe attach products based on Bus-Tech or competitive technology is subject to some debate but it’s the only inexpensive way to bring such functionality into mainframes.  The other, more expensive approach is to build in System z attach directly into the hardware/software for the storage system.

Most mainframer’s know that FC and FICON (System z storage interface) utilize the same underlying transport technology.  However, FICON has a few crucial differences when it comes to data integrity, device commands and other nuances which make easy interoperability more of a challenge.

But all that just talks about the underlying hardware when you factor in disk layout (CKD), tape format, disk and tape commands (CCWs), System z interoperability can become quite an undertaking.

Bus-Tech’s virtual tape library maps mainframe tape/tape library commands and FICON protocols into standard FC and tape SCSI command sets. This way one could theoretically attach anybody’s open system tape or virtual tape system onto System z.  Looking at Bus-Tech’s partner list, there were quite a few organizations including Hitachi, NetApp, HP and others aside from EMC using them to do so.

Surprise – Mainframe data growth

Why is there such high interest in mainframes? Mainframe data is big and growing, in some markets almost at open systems/distributed systems growth rates.  I always thought mainframes made better use of data storage, had better utilization, and controlled data growth better.  However, this can only delay growth, it can’t stop it.

Although I have no hard numbers to back up my mainframe data market or growth rates, I do have anecdotal evidence.  I was talking with an admin at one big financial firm a while back and he casually mentioned they had 1.5PB of mainframe data storage under management!  I didn’t think this was possible – he replied not only was this possible, he was certain they weren’t the largest in their vertical/East coast area by any means .

Ok so mainframe data is big and needs lot’s of storage but this also means that mainframe backup needs storage as well.

Surprise 2 – dedupe works great on mainframes

Which brings us back to EMC DLm and their deduplication option.  Recently, EMC announced a deduplication storage target for disk library data used as an alternative to their previous CLARiion target.  This just happens to be a Data Domain 880 appliance behind a DLm engine.

Another surprise, data deduplication works great for mainframe backup data.  It turns out that z/OS users have been doing incremental and full backups for decades.  Obviously, anytime some system uses full backups, dedupe technology can reduce storage requirements substantially.

I talked recently with Tom Meehan at Innovation Data Processing, creators of FDR, one of only two remaining mainframe backup packages (the other being IBM DFSMShsm).  He re-iterated that deduplication works just fine on mainframes assuming you can separate the meta-data from actual backup data.

System z and distributed systems

In the mean time, this past July, IBM recently announced the zBX, System z Blade eXtension hardware system which incorporates Power7 blade servers running AIX into and under System z management and control.  As such, the zBX brings some of the reliability and availability of System z to the AIX open systems environment.

IBM had already supported Linux on System z but that was just a software port.  With zBX, System z could now support open systems hardware as well.  Where this goes from here is anybody’s guess but it’s not a far stretch to talk about running x86 servers under System z’s umbrella.

—-

So there you have it, Bus-Tech is the front-end of EMC DLm system.  As such, it made logical sense if EMC was going to focus more resources in the mainframe dedupe market space to lock up Bus-Tech, a critical technology partner.  Also, given market valuations these days, perhaps the opportunity was too good to pass up.

However, this now leaves Luminex as the last standing independent vendor to provide mainframe attach for open systems.  Luminex and EMC Data Domain already have a “meet-in-the-channel” model to sell low-end deduplication appliances to the mainframe market.  But with the Bus-Tech acquisition we see this slowly moving away and current non-EMC Bus-Tech partners migrating to Luminex or abandoning the mainframe attach market altogether.

[I almost spun up a whole section on CCWs, CKD and other mainframe I/O oddities but it would have detracted from this post’s main topic.  Perhaps, another post will cover mainframe IO oddities, stay tuned.]

What’s wrong with the iPad?

Apple iPad (wi-fi) (from apple.com)
Apple iPad (wi-fi) (from apple.com)

We have been using the wi-fi iPad for just under 6 months now and I have a few suggestions to make it even easier to use.

Safari

Aside from the problem with lack of Flash support there are a few things that would make websurfing easier on the iPad:

  • Tabbed windows option – I use tabbed windows on my desktop/laptop all the time but for some reason on the iPad Apple chose to use a grid of distinct windows accessible via a Safari special purpose icon.  While this approach probably makes a lot of sense for the iPhone/iPod, there is little reason to only do this on the iPad.  There is ample screen real-estate to show tabs selectable with the touch of a finger.  As it is now, it takes two touches to select an alternate screen for web browsing, not to mention some time to paint the thumbnail screen when you have multiple web pages open.
  • Non-mobile mode – It seems that many websites nowadays detect whether one is accessing a web page from a mobile device or not and as such, shrink their text/window displays to accommodate their much smaller display screen.  With the iPad this shows up as a wasted screen space and takes more than necessary screen paging to get to data that retrievable on a single screen with a desktop/laptop.  Not sure whether the problem is in the web server or the iPad’s signaling what device it is, however it seems to me that if the iPad/Safari app could signal to web servers that it is a laptop/small-desktop, web browsing could be better.

Other Apps

There are a number of Apps freely available on the iPhone/iPod that are not available on the iPad without purchase.  For some reason, I find I can’t live without some of these:

  • Clock app – On the iPhone/iPod I use the clock app at least 3 times a day.  I time my kids use of video games, my own time to having to do something, how much time I am willing/able to spend on a task, and myriad other things.  It’s one reason why I keep the iPhone on my body or close by whenever I am at home.  I occasionally use the clock app as a stop watch and a world clock but what I really need on the iPad is a timer of some sort.  I really have been unable to find an equivalent app for the iPad that matches the functionality of the iPhone/iPod Clock app.
  • Calculator app – On the iPhone/iPod I use the calculator sporadically, mostly when I am away from my desktop/office (probably because I have a calculator on my desk).  However, I don’t have other calculators that are easily accessible throughout my household and having one on the iPad would just make my life easier.  BTW, I ended up purchasing a calculator app that Apple says is equal to the iPhone Calc App which works fine but it should have come free.
  • Weather app – This is probably the next most popular app on my iPhone.  I know this information is completely available on the web, but by the time I have to enter the url/scan my bookmarks it takes at least 3-4 touches to get the current weather forecast.  By having the Weather app available on the iPhone it takes just one touch to get this same information.  I believe there is some way to transform a web page into an app icon on the iPad but this is not the same.

IOS software tweaks

There are some things I think could make IOS much better from my standpoint and I assume all the stuff in IOS 4.2 will be coming shortly so I won’t belabor those items:

  • File access – This is probably heresy but, I would really like a way to be able to cross application boundaries to access all files on the iPad.  That is, have something besides Mail, iBook and Pages be able to access PDF file, and Mail, Photo, and Pages/Keynote be able to access photos. Specifically, some of the FTP upload utilities should be able to access any file on the iPad.  Not sure where this belongs but there should be some sort of data viewer at the IOS level that can allow access to any file on the iPad.
  • Dvorak soft keypad – Ok, maybe I am a bit weird, but I spent the time and effort to learn the Dvorak keyboard layout to be able to type faster and would like to see this same option available for the iPad soft keypad.  I currently use Dvorak with the iPad’s external BT keyboard hardware but I see no reason that it couldn’t work for the soft keypad as well.
  • Widgets – The weather app discussed above looks to me like the weather widget on my desktop iMac.  It’s unclear why IOS couldn’t also support other widgets so that the app developers/users could easily create use their desktop widgets on the iPad.

iPad hardware changes

There are some things that scream out to me for hardware changes.

  • Ethernet access – I have been burned before and wish not to be burned again but some sort of adaptor that would allow an Ethernet plug connection would make the tethered iPad a much more complete computing platform.  I don’t care if such a thing comes as a BlueTooth converter or has to use the same plug as the power adaptor but having this would just make accessing the internet (under some circumstances) that much easier.
  • USB access – This just opens up another whole dimension to storage access and information/data portability that is sorely missing from the iPad.  It would probably need some sort of “file access” viewer discussed above but it would make the iPad much more extensible as a computing platform.
  • Front facing camera – I am not an avid user of FaceTime (yet) but if I were, I would really need a front camera on the iPad.  Such a camera would also provide some sort of snapshot capability with the iPad (although a rear facing camera would make more sense for this).  In any event, a camera is a very useful device to record whiteboard notes, scan paper documents, and record other items of the moment and even a front-facing one could do this effectively.
  • Solar panels – Probably off the wall, but having to lug a power adaptor everywhere I go with the iPad is just another thing to misplace/loose.  Of course, when traveling to other countries, one also needs a plug adaptor for each country as well.  It seems to me having some sort of solar panel on the back or front could provide adequate power to charge the iPad would be that much simpler.

—-

Well that’s about it for now.   We are planning on taking a vacation soon and we will be taking both a laptop and the iPad (because we can no longer live without it).  I would rather just leave the laptop home but can’t really do that given my problems in the past with the iPad.  Some changes described above could make hauling the laptop on vacation a much harder decision.

As for how the iPad fares on the beach, I will have to let you know…

Commodity hardware always loses

Herman Miller's Embody Chair by johncantrell (cc) (from Flickr)
A recent post by Stephen Foskett has revisted a blog discussion that Chuck Hollis and I had on commodity vs. special purpose hardware.  It’s clear to me that commodity hardware is a losing proposition for the storage industry and for storage users as a whole.  Not sure why everybody else disagrees with me about this.

It’s all about delivering value to the end user.  If one can deliver equivalent value with commodity hardware than possible with special purpose hardware then obviously commodity hardware wins – no question about it.

But, and it’s a big BUT, when some company invests in special purpose hardware, they have an opportunity to deliver better value to their customers.  Yes it’s going to be more expensive on a per unit basis but that doesn’t mean it can’t deliver commensurate benefits to offset that cost disadvantage.

Supercar Run 23 by VOD Cars (cc) (from Flickr)
Supercar Run 23 by VOD Cars (cc) (from Flickr)

Look around, one sees special purpose hardware everywhere. For example, just checkout Apple’s iPad, iPhone, and iPod just to name a few.  None of these would be possible without special, non-commodity hardware.  Yes, if one disassembles these products, you may find some commodity chips, but I venture, the majority of the componentry is special purpose, one-off designs that aren’t readily purchase-able from any chip vendor.  And the benefits it brings, aside from the coolness factor, is significant miniaturization with advanced functionality.  The popularity of these products proves my point entirely – value sells and special purpose hardware adds significant value.

One may argue that the storage industry doesn’t need such radical miniaturization.  I disagree of course, but even so, there are other more pressing concerns worthy of hardware specialization, such as reduced power and cooling, increased data density and higher IO performance, to name just a few.   Can some of this be delivered with SBB and other mass-produced hardware designs, perhaps.  But I believe that with judicious selection of special purposed hardware, the storage value delivered along these dimensions can be 10 times more than what can be done with commodity hardware.

Cuba Gallery: France / Paris / Louvre / architecture / people / buildings / design / style / photography by Cuba Gallery (cc) (from Flickr)
Cuba Gallery: France / Paris / Louvre / ... by Cuba Gallery (cc) (from Flickr)

Special purpose HW cost and development disadvantages denied

The other advantage to commodity hardware is the belief that it’s just easier to develop and deliver functionality in software than hardware.  (I disagree, software functionality can be much harder to deliver than hardware functionality, maybe a subject for a different post).  But hardware development is becoming more software like every day.  Most hardware engineers do as much coding as any software engineer I know and then some.

Then there’s the cost of special purpose hardware but ASIC manufacturing is getting more commodity like every day.   Several hardware design shops exist that sell off the shelf processor and other logic one can readily incorporate into an ASIC and Fabs can be found that will manufacture any ASIC design at a moderate price with reasonable volumes.  And, if one doesn’t need the cost advantage of ASICs, use FPGAs and CPLDs to develop special purpose hardware with programmable logic.  This will cut engineering and development lead-times considerably but will cost commensurably more than ASICs.

Do we ever  stop innovating?

Probably the hardest argument to counteract is that over time, commodity hardware becomes more proficient at providing the same value as special purpose hardware.  Although this may be true, products don’t have to stand still.  One can continue to innovate and always increase the market delivered value for any product.

If there comes a time when further product innovation is not valued by the market than and only then, does commodity hardware win.  However, chairs, cars, and buildings have all been around for many years, decades, even centuries now and innovation continues to deliver added value.  I can’t see where the data storage business will be any different a century or two from now…

Cloud storage replication does not suffice for backups – revisited

Free Whipped Cream Clouds on True Blue Sky Creative Commons by Pink Sherbet Photography (cc) (from Flickr)
Free Whipped Cream Clouds on True Blue Sky Creative Commons by Pink Sherbet Photography (cc) (from Flickr)

I was talking with another cloud storage gateway provider today and I asked them if they do any sort of backup for data sent to the cloud.  His answer disturbed me – they said they depend on backend cloud storage providers replication services to provide data protection – sigh. Curtis and I have written about this before (see my Does Cloud Storage need Backup? post and Replication is not backup by W. Curtis Preston).

Cloud replication is not backup

Cloud replication is not data protection for anything but hardware failures!   Much more common than hardware failures is mistakes by end-users who inadvertently delete files, overwrite files, corrupt files, or systems that corrupt files any of which would just be replicated in error throughout the cloud storage multi-verse.  (In fact, cloud storage itself can lead to corruption see Eventual data consistency and cloud storage).

Replication does a nice job of covering a data center or hardware failure which leaves data at one site inaccessible but allows access to a replica of the data from another site.  As far as I am concerned there’s nothing better than replication for these sorts of DR purposes but it does nothing for someone deleting the wrong file. (I one time did a “rm * *” command on a shared Unix directory – it wasn’t pretty).

Some cloud storage (backend) vendors delay the deletion of blobs/containers until sometime later  as one solution to this problem.  By doing this, the data “stays around” for “sometime” after being deleted and can be restored via special request to the cloud storage vendor. The only problem with this is that “sometime” is an ill-defined, nebulous concept which is not guaranteed/specified in any way.  Also, depending on the “fullness” of the cloud storage, this time frame may be much shorter or longer.  End-user data protection cannot depend on such a wishy-washy arrangement.

Other solutions to data protection for cloud storage

One way is to have a local backup of any data located in cloud storage.  But this kind of defeats the purpose of cloud storage and has the cloud data being stored both locally (as backups) and remotely.  I suppose the backup data could be sent to another cloud storage provider but someone/somewhere would need to support some sort of versioning to be able to keep multiple iterations of the data around, e.g., 90 days worth of backups.  Sounds like a backup package front-ending cloud storage to me…

Another approach is to have the gateway provider supply some sort of backup internally using the very same cloud storage to hold various versions of data.  As long as the user can specify how many days or versions of backups can be held this works great, as cloud replication supports availability in the face of hardware failures and multiple versions support availability in the face of finger checks/logical corruptions.

This problem can be solved in many ways, but just using cloud replication is not one of them.

Listen up folks, whenever you think about putting data in the cloud, you need to ask about backups among other things.  If they say we only offer data replication provided by the cloud storage backend – go somewhere else. Trust me, there are solutions out there that really backup cloud data.