SOHO backup options

© 2010 RDX Storage Alliance. All Rights Reserved. (From their website)
© 2010 RDX Storage Alliance. All Rights Reserved. (From their website)

I must admit, even though I have disparaged DVD archive life (see CDs and DVDs longevity questioned) I still backup my work desktops/family computers to DVD and DVDdl disks.  It’s cheap (on sale 100 DVDs cost about $30 and DVDdl ~2.5 that much) and it’s convenient (no need for additional software, outside storage fees, or additional drives).  For offsite backups I take the monthly backups and store them in a safety deposit box.

But my partner (and wife) said “Your time is worth something, every time you have to swap DVDs you could be doing something else.” (… like helping around the house.)

She followed up by saying “Couldn’t you use something that was start it and forget it til it was done.”

Well this got me to thinking (as well as having multiple media errors in my latest DVDdl full backup), there’s got to be a better way.

The options for SOHO (small office/home office) Offsite backups look to be as follows: (from sexiest to least sexy)

  • Cloud storage for backup – Mozy, Norton BackupGladinetNasuni, and no doubt many others can provide secure, cloud based backup of desktop, laptop data for Macs and Window systems.  Some of these would require a separate VM or server to connect to the cloud while others would not.  Using the cloud might require the office systems to be left on at nite but that would be a small price to pay to backup your data offsite.   Benefits to cloud storage approaches are that it would get the backups offsite, could be automatically scheduled/scripted to take place off-hours and would require no (or minimal) user intervention to perform.  Disadvantages to this approach is that the office systems would need to be left powered on, backup data is out of your control and bandwidth and storage fees would need to be paid.
  • RDX devices – these are removable NFS accessed disk storage which can support from 40GB to 640GB per cartridge. The devices claim 30yr archive life, which should be fine for SOHO purposes.  Cost of cartridges is probably RDX greatest issue BUT, unlike DVDs you can reuse RDX media if you want to.   Benefits are that RDX would require minimal operator intervention for anything less than 640GB of backup data, backups would be faster (45MB/s), and the data would be under your control.  Disadvantages are the cost of the media (640GB Imation RDX cartridge ~$310) and drives (?), data would not be encrypted unless encrypted at the host, and you would need to move the cartridge data offsite.
  • LTO tape – To my knowledge there is only one vendor out there that makes an iSCSI LTO tape and that is my friends at Spectra Logic but they also make a SAS (6Gb/s) attached LTO-5 tape drive.  It’s unclear which level of LTO technology is supported with the iSCSI drive but even one or two generations down would work for many SOHO shops.  Benefits of LTO tape are minimal operator intervention, long archive life, enterprise class backup technology, faster backups and drive data encryption.  Disadvantages are the cost of the media ($27-$30 for LTO-4 cartridges), drive costs(?), interface costs (if any) and the need to move the cartridges offsite.  I like the iSCSI drive because all one would need is a iSCSI initiator software which can be had easily enough for most desktop systems.
  • DAT tape – I thought these were dead but my good friend John Obeto informed me they are alive and well.  DAT drives support USB 2.0, SAS or parallel SCSI interfaces. Although it’s unclear whether they have drivers for Mac OS/X, Windows shops could probably use them without problem. Benefits are similar to LTO tape above but not as fast and not as long a archive life.  Disadvantages are cartridge cost (320GB DAT cartridge ~$37), drive costs (?) and one would have to move the media offsite.
  • (Blu-ray, Blu-ray dl), DVD, or DVDdl – These are ok but their archive life is miserable (under 2yrs for DVDs at best, see post link above). Benefits are they’res very cheap to use, lowest cost removable media (100GB of data would take ~22 DVDs or 12 DVDdls which at $0.30/ DVD or $0.75 for DVDdl thats  ~$6.60 to $9 per backup), and lowest cost drive (comes optional on most desktops today). Disadvantages are high operator intervention (to swap out disks), more complexity to keep track of each DVDs portion of the backup, more complex media storage (you have a lot more of it), it takes forever (burning 7.7GB to a DVDdl takes around an hour or ~2.1MB/sec.), data encryption would need to be done at the host, and one has to take the media offsite.  I don’t have similar performance data for using Blu-ray  for backups other than Blu-ray dl media costs about $11.50 each (50GB).

Please note this post only discusses Offsite backups. Many SOHOs do not provide offsite backup (risky??) and for online backups I use a spare disk drive attached to every office and family desktop.

Probably other alternatives exist for offsite backups, not the least of which is NAS data replication.  I didn’t list this as most SOHO customers are unlikely to have a secondary location where they could host the replicated data copy and the cost of a 2nd NAS box would need to be added along with the bandwidth between the primary and secondary site.  BUT for those sophisticated SOHO customers out there already using a NAS box for onsite shared storage maybe data replication might make sense. Deduplication backup appliances are another possibility but suffer similar disadvantages to NAS box replication and are even less likely to be already used by SOHO customers.


Ok where to now.  Given all this I M hoping to get a Blu-ray dl writer in my next iMac.  Let’s see that would cut my DVDdl swaps down by ~3.2X for single layer and ~6.5X for dl Blu-ray.  I could easily live with that until I quadrupled my data storage, again.

Although an iSCSI LTO-5 tape transport would make a real nice addition to the office…


Cloud storage replication does not suffice for backups – revisited

Free Whipped Cream Clouds on True Blue Sky Creative Commons by Pink Sherbet Photography (cc) (from Flickr)
Free Whipped Cream Clouds on True Blue Sky Creative Commons by Pink Sherbet Photography (cc) (from Flickr)

I was talking with another cloud storage gateway provider today and I asked them if they do any sort of backup for data sent to the cloud.  His answer disturbed me – they said they depend on backend cloud storage providers replication services to provide data protection – sigh. Curtis and I have written about this before (see my Does Cloud Storage need Backup? post and Replication is not backup by W. Curtis Preston).

Cloud replication is not backup

Cloud replication is not data protection for anything but hardware failures!   Much more common than hardware failures is mistakes by end-users who inadvertently delete files, overwrite files, corrupt files, or systems that corrupt files any of which would just be replicated in error throughout the cloud storage multi-verse.  (In fact, cloud storage itself can lead to corruption see Eventual data consistency and cloud storage).

Replication does a nice job of covering a data center or hardware failure which leaves data at one site inaccessible but allows access to a replica of the data from another site.  As far as I am concerned there’s nothing better than replication for these sorts of DR purposes but it does nothing for someone deleting the wrong file. (I one time did a “rm * *” command on a shared Unix directory – it wasn’t pretty).

Some cloud storage (backend) vendors delay the deletion of blobs/containers until sometime later  as one solution to this problem.  By doing this, the data “stays around” for “sometime” after being deleted and can be restored via special request to the cloud storage vendor. The only problem with this is that “sometime” is an ill-defined, nebulous concept which is not guaranteed/specified in any way.  Also, depending on the “fullness” of the cloud storage, this time frame may be much shorter or longer.  End-user data protection cannot depend on such a wishy-washy arrangement.

Other solutions to data protection for cloud storage

One way is to have a local backup of any data located in cloud storage.  But this kind of defeats the purpose of cloud storage and has the cloud data being stored both locally (as backups) and remotely.  I suppose the backup data could be sent to another cloud storage provider but someone/somewhere would need to support some sort of versioning to be able to keep multiple iterations of the data around, e.g., 90 days worth of backups.  Sounds like a backup package front-ending cloud storage to me…

Another approach is to have the gateway provider supply some sort of backup internally using the very same cloud storage to hold various versions of data.  As long as the user can specify how many days or versions of backups can be held this works great, as cloud replication supports availability in the face of hardware failures and multiple versions support availability in the face of finger checks/logical corruptions.

This problem can be solved in many ways, but just using cloud replication is not one of them.

Listen up folks, whenever you think about putting data in the cloud, you need to ask about backups among other things.  If they say we only offer data replication provided by the cloud storage backend – go somewhere else. Trust me, there are solutions out there that really backup cloud data.