On Storage Benchmarks

What is it about storage benchmarks that speaks to me? Is it the fact that they always present new data on current products, that there are always some surprises, or that they always reveal another facet of storage performance?

There are some who say benchmarks have lost their way, become too politicized, and, as a result, become less realistic. All these faults can and do happen, but it doesn't have to be this way. Vendors can do the right thing if enough of them are engaged, and end-users can play an important part as well.

Benchmarks exist mainly to serve the end-user community by supplying an independent, auditable comparison of storage subsystem performance. To make benchmarks more useful, end-users can help ensure that they model real-world workloads. But this only happens when end-users participate in benchmark organizations, understand benchmark workloads, and understand, in detail, their own I/O workloads. Which end-users can afford to do this, especially today?

As a result, storage vendors take up the cause. They argue amongst themselves to define “realistic end-user workloads”, put some approximation out as a benchmark and tweak it over time. The more storage vendors, the better this process becomes.

When I was a manager of storage subsystem development, I hated benchmark results. Often it meant there was more work to do. Somewhere, somehow or someway we weren’t getting the right level of performance from our subsystem. Something had to change. We would end up experimenting until we convinced ourselves we were on the right track. That lasted until we exhausted that track and executed the benchmark again. It almost got to the point where I didn’t really want to know the results – almost but not quite. In the end, benchmarks caused us to create better storage, to understand the best of the storage world, and to look outside ourselves at what others could accomplish.

Is storage performance still important today? I was talking with a storage vendor a couple of months back who said that storage subsystems today perform so well that performance is no longer a major differentiator or a significant buying consideration. I immediately thought: then why all the interest in SSDs and 8GFC? To some extent, I suppose, raw storage performance is not as much of a concern today, but it will never go away completely.

Consider the automobile: it's over a century old now (see Wikipedia) and we still talk about car performance. Perhaps it's no longer raw speed, but a car's performance still matters to most of us. What's happened over time is that the definition of car performance has become more differentiated, more complex; top speed is not the only metric anymore. I am convinced that similar differentiation will happen to storage performance, and storage benchmarks must lead the way.

So my answer is yes, storage performance still matters and benchmarks ultimately define storage performance. It’s up to all of us to keep benchmarks evolving to match the needs of end-users.

Nowadays, I can enjoy looking at storage benchmarks and leave the hard work to others.

EMC's Data Domain ROI

I am trying to put EMC's price for Data Domain (DDup) into perspective but am having difficulty. According to an InfoWorld article on EMC acquisitions from '03-'06 and some other research, this $2.2B-$2.4B is more money (not inflation adjusted) than anything in EMC's previous acquisition history. The only thing that comes close was the RSA acquisition, for $2.1B in '06.

VMware cost EMC only $625M and has been, by all accounts, very successful, having been spun out of EMC in an IPO; it currently shows a market cap of ~$10.2B. Documentum cost $1.7B and Legato $1.3B, both of which are still within EMC.

Something has happened here; in a recession, valuations are supposed to be more realistic, not less. At Data Domain's TTM revenues ($300.5M), this deal will take over 7 years to break even on a straight-line view. If one considers WACC (weighted average cost of capital), it looks much worse, and looking at DDup's earnings rather than revenue makes it look worse still.
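To put numbers on this, here's a rough back-of-the-envelope sketch of the payback math in Python. The $2.3B price is just the midpoint of the reported range, the 10% WACC is my own assumption, and using revenue rather than earnings is deliberately generous to EMC:

    # Rough payback sketch for the Data Domain purchase (illustrative only).
    purchase_price = 2.3e9    # assumed midpoint of the reported $2.2B-$2.4B range
    ttm_revenue    = 300.5e6  # Data Domain trailing-twelve-month revenue
    wacc           = 0.10     # assumed 10% weighted average cost of capital

    # Straight-line payback: years of revenue needed to equal the purchase price.
    print(f"Straight-line payback: {purchase_price / ttm_revenue:.1f} years")

    # Discounted payback: accumulate revenue, discounted at the WACC, until it
    # covers the price. Assumes flat revenue and ignores cost of sales entirely,
    # both of which flatter the deal.
    cumulative, year = 0.0, 0
    while cumulative < purchase_price:
        year += 1
        cumulative += ttm_revenue / (1 + wacc) ** year
    print(f"Discounted payback: roughly {year} years")

Run as is, this comes out at a bit over 7 years straight-line and roughly twice that once the WACC discount is applied, which is why the earnings view looks even worse.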

Other than firing up EMC's marketing and sales engine to sell more DDup products, what else can EMC do to gain a better return on its DDup acquisition? (not in order)

  • Move EMC's current Disk Libraries to DDup technology and let go of the Quantum-FalconStor OEM agreements, and/or abandon the current DL product line and substitute DDup
  • Incorporate DDup technology into Legato Networker for target deduplication applications
  • Incorporate DDup technology into Mozy and Atmos
  • Incorporate DDup technology into Documentum
  • Incorporate DDup technology into Centera and Celerra

Can EMC, by selling DDup products and doing all this to better its technology, double the revenue, earnings, and savings derived from DDup products and technology? Maybe. But the incorporation of DDup into Centera and Celerra could just as easily decrease EMC revenues and profits from the storage capacity lost, depending on the relative price differences.

I figure the Disk Library, Legato, and Mozy integrations would be first on anyone’s list. Atmos next, and Celerra-Centera last.

As for what to add to DDup's product line, possible additions are at the top end and the bottom end. DDup has been moving up-market of late, and integration with the EMC DL might just help take it there. Down-market, there is a potential market of small businesses that might want to use DDup technology at the right price point.

Not sure if the money paid for DDup still makes sense, but at least it begins to look better…

BlueArc introduces Mercury

Tomorrow, BlueArc will open up a new front in their battle with the rest of the NAS vendors by introducing the Mercury 50 NAS head. This product is slated to address the more mid-range enterprise market that historically shunned the relatively higher priced Titan series.

Mercury 50 is only the first product in this series; other products to be released in the future will help fill out the top end of the line. Priced similarly to the NetApp 3140, this product has all the support of the standard BlueArc file system while limiting maximum storage capacity to 1PB. Its NFS throughput is a little better than half that of the current Titan 3100.

Mercury 50 will eventually be offered by BlueArc's OEM partner HDS. For now, however, the Mercury 50 will be sold by BlueArc's direct sales force as well as the many new channel partners that BlueArc has acquired over this past year.

This marks a departure for BlueArc into the more mainstream enterprise storage space. Historically, BlueArc has been successful in the high performance market but the real volumes and commensurate revenue are in the standard enterprise space. The problem in the past has been the high price of the BlueArc Titan systems but now with Mercury this should no longer be an issue.

That being said, the competition is much more intense as you move down market. EMC and NetApp will not stand still while their market share is eroded, and both of these companies have the wherewithal to compete on performance, pricing and features.

Exciting times ahead for the NAS users out there.

Tape v Disk v SSD v RAM

There was a time not long ago when the title of this post wouldn’t have included SSD. But, with the history of the last couple of years, SSD has earned its right to be included.

A couple of years back I was at a Rocky Mountain Magnetics Seminar (see the IEEE Magnetics Society) where a disk drive technologist stated that disks have about another 25 years of technology roadmap ahead of them. During this time they will continue to increase density, throughput and other performance metrics. After 25 years of this they will run up against some theoretical limits that will halt further density progress.

At the same seminar, the presenter said that Tape was lagging Disk technology by about 5-10 years or so. As such, tape should continue to advance for another 5-10 years after disk stops improving at which time tape would also stop increasing density.

Does all this mean the end of tape and disk? I think not. Paper density theoretically stopped advancing about 2,000 to 3,000 years ago (the papyrus scroll was the ultimate in paper "rotating media"). If we move up to the codex or book form, which in my view is a form factor advance, this took place around 400 AD (see history of scroll and codex). The paperback, another form factor advance, took place in the early 20th century (see paperback history).

Turning now to write performance, movable type was a significant paper (write) performance improvement, starting in the mid-15th century. The printing press would go on to improve (paper write) performance for the next six centuries (see printing press history) and continues today.

All this indicates that a data technology whose density was capped over 2,000 years ago can continue to advance and support valuable activity in today's world and for the foreseeable future. "Will disk and tape go away?" is the wrong question; the right question is "Can disk or tape, after SSDs reach price equivalence on a $/GB basis, still be useful to the world?"

I think yes, but that depends on how the SSD, disk, and tape technologies advance relative to one another. Assuming that someday all these technologies support equivalent Tb/SqIn, or spatial density, and that

  • SSDs retain their relative advantage in random access speed,
  • tape retains its advantage in sequential throughput, volumetric density, and long media life, and
  • disk retains its all-around, combined sequential and random access advantage,

then it seems likely that each can sustain some niche in the data center or home office of tomorrow, although probably not the niche it occupies today.

One can see trends being enacted in enterprise data centers today that are altering the relative positioning of SSDs, disks and tape. Tape is now being relegated to long-term, archive storage; disk is moving to medium-term, secondary storage; and SSDs are replacing top-tier, primary storage.
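A toy placement policy sketches what that repositioning might look like in practice. The thresholds and tier assignments below are purely illustrative assumptions on my part, not any vendor's actual policy:

    # Toy tiering policy reflecting the trend above: hot, random-access data on
    # SSD; warm data on disk; cold, long-retention data on tape. All thresholds
    # are made up for illustration only.
    def choose_tier(reads_per_day: int, days_since_last_access: int) -> str:
        if reads_per_day > 1000 and days_since_last_access < 1:
            return "SSD (primary tier: hot, random-access data)"
        elif days_since_last_access < 90:
            return "disk (secondary tier: warm data)"
        else:
            return "tape (archive tier: cold, long-retention data)"

    print(choose_tier(reads_per_day=5000, days_since_last_access=0))    # SSD
    print(choose_tier(reads_per_day=10,   days_since_last_access=30))   # disk
    print(choose_tier(reads_per_day=0,    days_since_last_access=400))  # tape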

More thoughts on this in future posts.

HDS upgrades AMS2000

Today, HDS refreshed their AMS2000 product line with a new high-density drive expansion tray holding 48 drives and up to a maximum capacity of 48TB, 8Gbps FC (8GFC) ports for the AMS2300 and AMS2500 systems, and a new NEBS Level-3 compliant, DC-powered version, the AMS2500DC.

HDS also reiterated their stance that Dynamic Provisioning will be available on the AMS2000 in the 2nd half of this year. (See my prior post on this subject for more information.)

HDS also mentioned that the AMS2000 now supports external authentication infrastructure for storage managers and will support Common Criteria Certification for more stringent data security needs. The external authentication will be available in the second half of the year.

I find the DC version pretty interesting; it signals a renewed interest in telecom OEM applications for this mid-range storage subsystem. It's unclear to me whether this is a significant market for HDS. The 2500DC only supports 4Gbps FC and is packaged with a Cisco MDS 9124 SAN switch. DC-powered storage is also more energy efficient than AC-powered storage.

Other than that, the Common Criteria Certification could be a big thing for those companies or government entities with a significant interest in secure data centers. There was no specific time frame for this certification, but presumably they have started the process.

As for the rest of this, it’s a pretty straightforward refresh.

DataDirect Networks WOS cloud storage

DataDirect Networks (DDN) announced this week a new product offering private cloud services. Apparently the new Web Object Scaler (WOS) is a storage appliance that can be clustered together across multiple sites and offers a single global file name space across all the sites. The WOS cloud also supports policy-based file replication and distribution across sites for redundancy and/or load balancing purposes.

DDN's press release said a WOS cloud can service up to 1 million random file reads per second. They did not indicate the number of nodes required to sustain this level of performance, nor did they identify the protocol used to do it. The press release implied low-latency file access but didn't define what they meant by that; 1M file reads/sec doesn't necessarily mean each file is read quickly. Also, there appears to be more work involved in a file write than a file read, and no statement on file ingest rate is provided.
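A quick Little's Law sketch shows why the aggregate read rate alone says nothing about per-file latency. The latencies below are hypothetical, not DDN's numbers:

    # Little's Law: throughput = concurrency / latency. A cluster can sustain
    # 1M reads/sec with slow individual reads as long as enough requests are
    # in flight at once. All numbers here are hypothetical.
    target_reads_per_sec = 1_000_000

    for latency_ms in (1, 10, 100):
        concurrency = target_reads_per_sec * (latency_ms / 1000)
        print(f"{latency_ms:>3} ms per read -> {int(concurrency):>7,} requests in flight")

In other words, 1M reads/sec is just as consistent with 100ms reads and 100,000 outstanding requests as it is with 1ms reads and 1,000 outstanding, which is why a latency figure would have been welcome.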

There are many systems out there touting a global name space. However, not many say their global name space spans multiple sites. I suppose cloud storage would need to support such a facility to keep file names straight across sites. Nonetheless, such name space services imply more overhead during file creation/deletion to keep everything straight, plus metadata duplication/replication/redundancy to support it.

There are many questions on how this all works together with NFS or CIFS, but it's entirely possible that WOS doesn't support either file access protocol and just depends on HTTP GET and POST, or similar web services, to access files. Moreover, assuming WOS does support NFS or CIFS, I often wonder why these sorts of announcements aren't paired with a SPECsfs(r) 2008 benchmark report, which could validate any performance claim at least at the NFS or CIFS protocol level.

I talked to one media person a couple of weeks ago and they said cloud storage is getting boring. There are a lot of projects (e.g., Atmos from EMC) out there targeting future cloud storage, I hope for their sake boring doesn’t mean no market exists for cloud storage.

What really drives storage innovation

Ongoing waves of consolidation remind me of what really drives storage innovation – companies willing to experiment. Startups can only succeed when their products can engage the marketplace.

Startups risk everything to develop a technology, an innovation or two, that can change the world. But what they ultimately discover, what they truly need, is some large and/or small company's willingness to experiment with new and untried technology. Such market engagement is essential to understanding their technology's rough edges, customer requirements, and distribution options.

Recently, I was informed that some large companies prefer to work with startups because they can better control any emerging technology direction. Also, their problems are big enough that typically no one solution can solve them. Startups allow them to cobble together (multiple) solutions that ultimately can solve their problem.

From the small company's perspective, the question becomes how to attract and begin the dialogue with innovative customers willing to invest time and money in startups. But the real problem is learning enough about a customer's environment to know whether suitable prospects for their technology exist. Armed with this knowledge, targeted marketing approaches can be applied to ultimately get a hearing with the customer.

However, what’s missing is a forum for large and small companies to describe their environment and more importantly, their serious, chronic problems. Mostly, this has been done informally or on an ad hoc basis in the past, but some formality around this could really benefit storage innovation at least from startups.

I see many possibilities to solve this, ways that companies could provide information on their environment and identify problems needing solutions. Such possibilities include:

  • an electronic forum, something like Innocentive.com, where companies could post problems and solicit solutions
  • an award to solve a particularly pressing problem, like Xprize.org, where a group of companies, perhaps in one vertical, combine to offer a significant award to help solve a particularly nasty storage/IT problem
  • an organization of sorts, like the SNIA End User Council, that could provide anonymous information on IT environments and problems needing solutions
  • a Small Business Innovation Research-like program (see SBIR.gov) that could provide a list of problems soliciting solutions

The problem with the SNIA End User Council and SBIR-like approaches is the lack of anonymity; the problem with an Xprize-like award is the inability of any one organization to fund the award. All of which is why I prefer an Innocentive.com-like approach, maybe better targeted at IT issues and less targeted at basic and materials science. Finally, perhaps there's another, unforeseen approach that might work even better – comments?

Why big storage vendors can’t be enticed to work on something like this is another conundrum and probably subject for a future post.

HDS Dynamic Provisioning for AMS

HDS announced today that their thin provisioning feature (called Dynamic Provisioning) will be available on their mid-range storage subsystem family, the AMS. Expanding the set of subsystems that support thin provisioning can only help the customer in the long run.

It's not clear whether you can add Dynamic Provisioning to an already-installed AMS subsystem or if it's only available on a fresh installation. Also, no pricing was announced for this feature. In the past, HDS charged double the price per GB of storage when it was in a thinly provisioned pool.

As you may recall, thin provisioning is a little like a room with a bunch of inflatable castles inside. Each castle starts with its initial inflation amount. As demand dictates, each castle can independently inflate to whatever level is needed to support the current workload, up to that castle's limit and the overall limit imposed by the room the castles inhabit. In this analogy, the castles are LUN storage volumes, the room the castles are located in is the physical storage pool for the thinly provisioned volumes, and the air inside the castles is the physical disk space consumed by the thinly provisioned volumes.

In contrast, hard provisioning is like building permanent castles (LUNs) in stone: any change to the size of a structure would require major renovation and/or possible destruction of the original castle (deletion of the LUN).
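For those who prefer code to castles, here's a minimal toy model of the mechanism, assuming nothing about HDS's actual implementation: each LUN advertises its full virtual size up front, but physical capacity is drawn from the shared pool only as data is written.

    # Toy thin provisioning model: LUNs report their full virtual size, but
    # physical space comes out of the shared pool only on write. Illustrative
    # only; not HDS's implementation.
    class ThinPool:
        def __init__(self, physical_gb: int):
            self.physical_gb = physical_gb   # the "room" in the analogy
            self.allocated_gb = 0

        def allocate(self, gb: int) -> None:
            if self.allocated_gb + gb > self.physical_gb:
                raise RuntimeError("pool exhausted -- add physical disk")
            self.allocated_gb += gb

    class ThinLUN:
        def __init__(self, pool: ThinPool, virtual_gb: int):
            self.pool = pool
            self.virtual_gb = virtual_gb     # size reported to the host
            self.used_gb = 0                 # physical space actually consumed

        def write(self, gb: int) -> None:
            if self.used_gb + gb > self.virtual_gb:
                raise RuntimeError("LUN full (hit its own inflation limit)")
            self.pool.allocate(gb)           # "inflate" only as data arrives
            self.used_gb += gb

    pool = ThinPool(physical_gb=1000)                           # 1TB of real disk...
    luns = [ThinLUN(pool, virtual_gb=500) for _ in range(10)]   # ...behind 5TB of LUNs
    luns[0].write(100)
    print(pool.allocated_gb, "GB physically consumed of", pool.physical_gb)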

When HDS first came out with Dynamic Provisioning it was only available for USP-V internal storage; later they released the functionality for USP-V external storage. This announcement seems to complete the rollout to all their SAN storage subsystems.

HDS also announced today a new service called the Storage Reclamation Service that helps
1) Assess whether thin provisioning will work well in your environment
2) Provide tools and support to identify candidate LUNs for thin provisioning, and
3) Configure new thinly provisioned LUNs and migrate your data over to the thinly provisioned storage.

Other products that support SAN storage thin provisioning include 3PAR, Compellent, EMC DMX, IBM SVC, NetApp and PillarData.