Better erasure coding for scale-out & cloud storage

LRcC(6,2,2) example layout
LRcC(6,2,2) example layout

Microsoft Azure uses a different style of erasure coding for their cloud storage than what I have encountered in the past. Their erasure coding technique was documented in a paper presented at USENIX ATC’12 (for more info check out their Erasure coding in Windows Azure Storage paper).

The new erasure coding can be optimized for rebuild read or storage space overhead. can at times correct for more errors than equivalent, more traditional, Reed-Solomon (RS) erasure coding schemes.
Continue reading “Better erasure coding for scale-out & cloud storage”

(QoM16-002): Will Intel Omni-Path GA in scale out enterprise storage by February 2016 – NO 0.91 probability

opa-cardQuestion of the month (QoM for February is: Will Intel Omni-Path (Architecture, OPA) GA in scale out enterprise storage by February 2016?

In this forecast enterprise storage are the major and startup vendors supplying storage to data center customers.

What is OPA?

OPA is Intel’s replacement for InfiniBand and starts out at 100Gbps. It’s intended more for high performance computing (HPC), to be used as an inter-cluster server interconnect or next generation fabric. Intel says it “will maintain consistency and compatibility with existing Intel True Scale Fabric and InfiniBand APIs by working through the open source OpenFabrics Alliance (OFA) software stack on leading Linux* distribution releases”. Seems like Intel is making it as easy as possible for vendors to adopt the technology.
Continue reading “(QoM16-002): Will Intel Omni-Path GA in scale out enterprise storage by February 2016 – NO 0.91 probability”

SolidFire supplies scale-out SSD storage for cloud service providers

SolidFire SF3010 node (c) 2011 SolidFire (from their website)
SolidFire SF3010 node (c) 2011 SolidFire (from their website)

I was talking with a local start up called SolidFire the other day with an interesting twist on SSD storage.  They were targeting cloud service providers with a scale-out, cluster based SSD iSCSI storage system.  Apparently a portion of their team had come from Lefthand (now owned by HP) another local storage company and the rest came from Rackspace, a national cloud service provider.

The hardware

Their storage system is a scale-out cluster of storage nodes that can range from 3 to a theoretical maximum of 100 nodes (validated node count ?). Each node comes equipped with 2-2.4GHz, 6-core Intel processors and 10-300GB SSDs for a total of 3TB raw storage per node.  Also they have 8GB of non-volatile DRAM for write buffering and 72GB read cache resident on each node.

The system also uses 2-10GbE links for host to storage IO and inter-cluster communications and support iSCSI LUNs.  There are another 2-1GigE links used for management communications.

SolidFire states that they can sustain 50K IO/sec per node. (This looks conservative from my viewpoint but didn’t state any specific R:W ratio or block size for this performance number.)

The software

They are targeting cloud service providers and as such the management interface was designed from the start as a RESTful API but they also have a web GUI built out of their API.  Cloud service providers will automate whatever they can and having a RESTful API seems like the right choice.

QoS and data reliability

The cluster supports 100K iSCSI LUNs and each LUN can have a QoS SLA associated with it.  According to SolidFire one can specify a minimum/maximum/burst level for IOPS and a maximum or burst level for throughput at a LUN granularity.

With LUN based QoS, one can divide cluster performance into many levels of support for multiple customers of a cloud provider.  Given these unique QoS capabilities it should be relatively easy for cloud providers to support multiple customers on the same storage providing very fine grained multi-tennancy capabilities.

This could potentially lead to system over commitment, but presumably they have some way to ascertain over commitment is near and not allowing this to occur.

Data reliability is supplied through replication across nodes which they call Helix(tm) data protection.  In this way if an SSD or node fails, it’s relatively easy to reconstruct the lost data onto another node’s SSD storage.  Which is probably why the minimum number of nodes per cluster is set at 3.

Storage efficiency

Aside from the QoS capabilities, the other interesting twist from a customer perspective is that they are trying to price an all-SSD storage solution at the $/GB of normal enterprise disk storage. They believe their node with 3TB raw SSD storage supports 12TB of “effective” data storage.

They are able to do this by offering storage efficiency features of enterprise storage using an all SSD configuration. Specifically they provide,

  • Thin provisioned storage – which allows physical storage to be over subscribed and used to support multiple LUNs when space hasn’t completely been written over.
  • Data compression – which searches for underlying redundancy in a chunk of data and compresses it out of the storage.
  • Data deduplication – which searches multiple blocks and multiple LUNs to see what data is duplicated and eliminates duplicative data across blocks and LUNs.
  • Space efficient snapshot and cloning – which allows users to take point-in-time copies which consume little space useful for backups and test-dev requirements.

Tape data compression gets anywhere from 2:1 to 3:1 reduction in storage space for typical data loads. Whether SolidFire’s system can reach these numbers is another question.  However, tape uses hardware compression and the traditional problem with software data compression is that it takes lots of processing power and/or time to perform it well.  As such, SolidFire has configured their node hardware to dedicate a CPU core to each physical disk drive (2-6 core processors for 10 SSDs in a node).

Deduplication savings are somewhat trickier to predict but ultimately depends on the data being stored in a system and the algorithm used to deduplicate it.  For user home directories, typical deduplication levels of 25-40% are readily attainable.  SolidFire stated that their deduplication algorithm is their own patented design and uses a small fixed block approach.

The savings from thin provisioning ultimately depends on how much physical data is actually consumed on a storage LUN but in typical environments can save 10-30% of physical storage by pooling non-written or free storage across all the LUNs configured on a storage system.

Space savings from point-in-time copies like snapshots and clones depends on data change rates and how long it’s been since a copy was made.  But, with space efficient copies and a short period of existence, (used for backups or temporary copies in test-development environments) such copies should take little physical storage.

Whether all of this can create a 4:1 multiplier for raw to effective data storage is another question but they also have a eScanner tool which can estimate savings one can achieve in their data center. Apparently the eScanner can be used by anyone to scan real customer LUNs and it will compute how much SolidFire storage will be required to support the scanned volumes.


There are a few items left on their current road map to be delivered later, namely remote replication or mirroring. But for now this looks to be a pretty complete package of iSCSI storage functionality.

SolidFire is currently signing up customers for Early Access but plan to go GA sometime around the end of the year. No pricing was disclosed at this time.

I was at SNIA’s BoD meeting the other week and stated my belief that SSDs will ultimately lead to the commoditization of storage.  By that I meant that it would be relatively easy to configure enough SSD hardware to create a 100K IO/sec  or 1GB/sec system without having to manage 1000 disk drives.  Lo and behold, SolidFire comes out the next week.  Of course, I said this would happen over the next decade – so I am off by a 9.99 years…


Top 10 storage technologies over the last decade

Aurora's Perception or I Schrive When I See Technology by Wonderlane (cc) (from Flickr)
Aurora's Perception or I Schrive When I See Technology by Wonderlane (cc) (from Flickr)

Some of these technologies were in development prior to 2000, some were available in other domains but not in storage, and some were in a few subsystems but had yet to become popular as they are today.  In no particular order here are my top 10 storage technologies for the decade:

  1. NAND based SSDs – DRAM and other technology solid state drives (SSDs) were available last century but over the last decade NAND Flash based devices have dominated SSD technology and have altered the storage industry forever more.  Today, it’s nigh impossible to find enterprise class storage that doesn’t support NAND SSDs.
  2. GMR head– Giant Magneto Resistance disk heads have become common place over the last decade and have allowed disk drive manufacturers to double data density every 18-24 months.  Now GMR heads are starting to transition over to tape storage and will enable that technology to increase data density dramatically
  3. Data DeduplicationDeduplication technologies emerged over the last decade as a complement to higher density disk drives as a means to more efficiently backup data.  Deduplication technology can be found in many different forms today, ranging from file and block storage systems, backup storage systems, to backup software only solutions.
  4. Thin provisioning – No one would argue that thin provisioning emerged last century but it took the last decade to really find its place in the storage pantheon.  One almost cannot find a data center class storage device that does not support thin provisioning today.
  5. Scale-out storage – Last century if you wanted to get higher IOPS from a storage subsystem you could add cache or disk drives but at some point you hit a subsystem performance wall.  With scale-out storage, one can now add more processing elements to a storage system cluster without having to replace the controller to obtain more IO processing power.  The link reference talks about the use of commodity hardware to provide added performance but scale-out storage can also be done with non-commodity hardware (see Hitachi’s VSP vs. VMAX).
  6. Storage virtualizationserver virtualization has taken off as the dominant data center paradigm over the last decade but a counterpart to this in storage has also become more viable as well.  Storage virtualization was originally used to migrate data from old subsystems to new storage but today can be used to manage and migrate data over PBs of physical storage dynamically optimizing data placement for cost and/or performance.
  7. LTO tape When IBM dominated IT in the mid to late last century, the tape format dejour always matched IBM’s tape technology.  As the decade dawned, IBM was no longer the dominant player and tape technology was starting to diverge into a babble of differing formats.  As a result, IBM, Quantum, and HP put their technology together and created a standard tape format, called LTO, which has become the new dominant tape format for the data center.
  8. Cloud storage Unclear just when over the last decade cloud storage emerged but it seemed to be a supplement to cloud computing that also appeared this past decade.  Storage service providers had existed earlier but due to bandwidth limitations and storage costs didn’t survive the dotcom bubble. But over this past decade both bandwidth and storage costs have come down considerably and cloud storage has now become a viable technological solution to many data center issues.
  9. iSCSI SCSI has taken on many forms over the last couple of decades but iSCSI has the altered the dominant block storage paradigm from a single, pure FC based SAN to a plurality of technologies.  Nowadays, SMB shops can have block storage without the cost and complexity of FC SANs over the LAN networking technology they already use.
  10. FCoEOne could argue that this technology is still maturing today but once again SCSI has taken opened up another way to access storage. FCoE has the potential to offer all the robustness and performance of FC SANs over data center Ethernet hardware simplifying and unifying data center networking onto one technology.

No doubt others would differ on their top 10 storage technologies over the last decade but I strived to find technologies that significantly changed data storage that existed in 2000 vs. today.  These 10 seemed to me to fit the bill better than most.