Latest SPECsfs2008 results: NFSops vs. #disks – chart of the month

(c) 2010 Silverton Consulting, All Rights Reserved

Above one can see a chart from our September SPECsfs2008 Performance Dispatch displaying the scatter plot of NFS Throughput Operations/Second vs. number of disk drives in the solution.  Over the last month or so there has been a lot of Twitter traffic on the theory that benchmark results such as this and Storage Performance Council's SPC-1&2 are mostly a measure of the number of disk drives in a system under test and have little relation to the actual effectiveness of a system.  I disagree.

As proof of my disagreement I offer the above chart.  On the chart we have drawn a linear regression line (supplied by Microsoft Excel) and displayed the resultant regression equation.  A couple of items to note on the chart:

  1. Regression Coefficient – Even though there are only 37 submissions, spanning anywhere from 1K to over 330K NFS throughput operations per second, we do not have a perfect correlation between #disks and NFS ops (R**2 ≈ 0.8, not 1.0); a minimal sketch of this calculation follows the list.
  2. Superior systems exist – Any storage system above the linear regression line makes more effective use of its disk resources than systems below the line.
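
Here is a minimal sketch of that regression and R**2 calculation. The (#disks, NFS ops) points below are placeholders except for the two circled systems discussed later (79 disks/131.5K ops and 592 disks/119.6K ops); substitute the 37 published submissions to reproduce the chart's regression equation.

```python
import numpy as np

# Placeholder (#disks, NFS ops/sec) points; only (79, 131500) and (592, 119600)
# come from the post -- swap in the 37 published submissions for the real chart.
disks  = np.array([  16,    48,     79,    140,    292,    592,    960,   1740])
nfsops = np.array([9500, 40000, 131500,  80000, 140000, 119600, 250000, 330000])

# Least-squares fit: NFSops ~= slope * #disks + intercept (what Excel draws)
slope, intercept = np.polyfit(disks, nfsops, 1)

# Coefficient of determination R**2 -- 1.0 would mean #disks fully explains
# NFS throughput; ~0.8 on the real data says a lot is left unexplained.
predicted = slope * disks + intercept
ss_res = np.sum((nfsops - predicted) ** 2)
ss_tot = np.sum((nfsops - nfsops.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

print(f"NFSops ~= {slope:.1f} * #disks + {intercept:.1f}, R**2 = {r_squared:.2f}")

# Systems whose actual NFS ops sit above the fitted line get more throughput
# per disk than the trend predicts.
print(f"{int((nfsops > predicted).sum())} of {len(disks)} systems sit above the line")
```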

As one example, take a look at the two circled points on the chart.

  • The one above the line is from Avere Systems and is a 6-FXT 2500 node tiered NAS storage system which has internal disk cache (8-450GB SAS disks per node) and an external mass storage NFS server (24-1TB SATA disks) for data with each node having a system disk as well, totaling 79 disk drives in the solution.  The Avere system was able to attain ~131.5K NFS throughput ops/sec on SPECsfs2008.
  • The one below the line is from Exanet Ltd. (recently purchased by Dell) and is an 8-ExaStore node clustered NAS system which has attached storage (576-146GB SAS disks) as well as mirrored boot disks (16-73GB disks), totaling 592 disk drives in the solution.  They were only able to attain ~119.6K NFS throughput ops/sec on the benchmark.

Now the two systems' respective architectures were significantly different, but if we count the data drives alone, Avere Systems (with 72 data disks) was able to attain 1.8K NFS throughput ops per second per data disk spindle while Exanet (with 576 data disks) was able to attain only 0.2K NFS throughput ops per second per data disk spindle – a 9X difference in per-drive performance on the same benchmark.

As far as I am concerned this definitively disproves the contention that benchmark results are dictated by the number of disk drives in the solution.  Similar comparisons can be seen looking horizontally at any points with equivalent NFS throughput levels.

Ray's reading: NAS system performance is driven by a number of factors and the number of disk drives is not the lone determinant of benchmark results.  Indeed, one can easily see differences of almost 10X in throughput ops per second per disk spindle for NFS storage without looking very hard.

We would contend that similar results can be seen for block and CIFS storage benchmarks as well which we will cover in future posts.

The full SPECsfs2008 performance report will go up on SCI’s website next month in our dispatches directory.  However, if you are interested in receiving this sooner, just subscribe by email to our free newsletter and we will send you the current issue with download instructions for this and other reports.

As always, we welcome any suggestions on how to improve our analysis of SPECsfs2008 performance information so please comment here or drop us a line.

SNIA illuminates storage power efficiency

Untitled by johnwilson1969 (cc) (from Flickr)

At SNW, a couple of weeks back, SNIA announced the launch of their green storage initiative's new SNIA Emerald Program and the first public draft release of their storage power efficiency test specification.  Up until now, other than SPC and some pronouncements from the EPA, there hasn't been much standardization activity on how to measure storage power efficiency.

SNIA’s Storage Power Efficiency Specification

As such, SNIA felt there was a need for an industry standard on how to measure storage power use.  SNIA's specification supplies an extensive taxonomy that can be used to define and categorize various storage systems, which should minimize problems like comparing consumer storage power use against data center storage power use.  The specification also identifies storage use attributes, such as deduplication, thin provisioning and other capacity optimization features, that can impact power efficiency.

In addition, the specification has two appendices:

  • Appendix A specifies the valid power and environmental meters that are to be used to measure power efficiency of the system under test.
  • Appendix B specifies the benchmark tool that is used to drive the system under test while its power efficiency is being measured.

Essentially, there are two approved benchmark drivers used to drive IOs in the online storage category, Iometer and vdbench, both of which are freely available.  Iometer has been employed for quite a while now in vendor benchmarking activity.  In contrast, vdbench is a relative newcomer, but I have worked with its author, Henk Vandenbergh, over many years and he is a consummate performance analyst.  I look forward to seeing how Henk's vdbench matures over time.

Given the spec’s taxonomy and the fact that it lists online, near-online, removable media, virtual media and adjunct storage device categories with multiple sub-categories for each, we will focus only on the online family of storage and save the rest for later.

SPC energy efficiency measures

As my readers should recall, the Storage Performance Council (SPC) also has benchmarks that measure energy use with their SPC-1/E and SPC-1C/E reports (see our SPC-1 IOPS per Watt post).  The interesting part about SPC-1/E results is that there are definite IOPS levels where storage power use undergoes significant transitions.

One can examine an SPC-1/E Executive Summary report and see power use at various IO intensity levels, i.e., 100%, 95%, 90%, 85%, 80%, 50%, 10% and 0% (or idle) for a storage subsystem under test.   SPC summarizes these detailed power measurements by defining profiles for "Low", "Medium" and "Heavy" storage system use.  But the devil's often in the details, and having all the above measurements allows one to calculate whatever activity profile works best for a given environment.
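
As a minimal sketch of that idea, the snippet below rolls hypothetical per-intensity wattage readings (not from any published SPC-1/E report) into a user-defined activity profile.

```python
# Hypothetical SPC-1/E-style power readings (watts) at each IO intensity
# level -- illustrative numbers only, not from any published report.
power_at_intensity = {
    1.00: 1180.0, 0.95: 1165.0, 0.90: 1150.0, 0.85: 1140.0,
    0.80: 1125.0, 0.50: 1020.0, 0.10: 940.0, 0.00: 915.0,  # 0.00 = idle
}

def average_power(profile):
    """Weighted-average watts for a profile mapping intensity -> share of the day."""
    assert abs(sum(profile.values()) - 1.0) < 1e-9, "shares must sum to 1"
    return sum(share * power_at_intensity[level] for level, share in profile.items())

# Roll your own activity profile instead of relying on the canned
# Low/Medium/Heavy definitions: e.g., 8 busy hours, 8 moderate, 8 near idle.
my_profile = {0.80: 8 / 24, 0.50: 8 / 24, 0.10: 8 / 24}
print(f"Estimated average draw: {average_power(my_profile):.0f} W")
```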

Unfortunately, only a few SPC-1/E reports have been submitted to date and it has yet to take off.

SNIA alternative power efficiency metrics

Enter SNIA's Emerald program, which is supposed to be an easier and quicker way to measure storage power use.  In addition to the specification, SNIA has established a website (see above) to hold SNIA-approved storage power efficiency results and a certification program for auditors who can verify that vendor power efficiency testing meets all specification requirements.

What’s missing from the present SNIA power efficiency test specification are the following:

  • Stricter IOPS level definitions – the specification refers to IO intensity but doesn't provide an adequate definition from my perspective.  It says that subsystem response time cannot exceed 30msec and uses this to define 100% IO intensity for the workloads.  However, given this definition, it could apply to random read, random write, or mixed workloads, and there is no separate specification for sequential versus random (and/or mixed) workloads.  This could be tightened up.
  • More IO intensity levels measured – the specification calls for power measurements at an IO intensity of 100% for all workloads and 25% for 70:30 R:W workloads for online storage.  However, we would also be interested in seeing 80% and 10% levels measured.  From a user perspective, 80% probably represents a heavy sustainable IO workload and 10% looks like a complete cache hit workload.  We would only measure these levels for the "Mixed workload" so as to minimize effort.
  • More write activity in “Mixed workloads” – the specification defines mixed workload as 70% read and 30% write random IO activity.  Given today’s O/S propensity to buffer read data, it would seem more prudent to use a 50:50 Read to Write mix.

Other items probably need more work as well, such as a standardized reporting format containing a detailed description of the HW and SW of the system under test, the benchmark driver HW and SW, a table reporting all power efficiency metrics, and inclusion of the full benchmark report (input parameter specifications and all outputs), but these are nits.

Finally, SNIA's specification goes into much detail about capacity optimization testing, which includes things like compression, deduplication, thin provisioning, delta-snapshotting, etc., with an intent to measure storage system power use when utilizing these capabilities.  It is a significant and complex undertaking to define how each of these storage features will be configured and used during power measurement testing.  Although SNIA should be commended for their efforts here, this seems too much to take on at the start.  We suggest that capacity optimization testing definitions be deferred to a later release and that the current focus remain on the more standard storage power efficiency measurements.

—-

I critique specifications at my peril.  Being wrong in the past has caused me to redouble my efforts to ensure a correct interpretation of any specification.  However, if there's something I have misconstrued or missed here that is worthy of note, please feel free to comment.

Data processing logistics

IBM System/370 Model 145 By jovike (cc) (from Flickr)

Chuck Hollis wrote a great post on "information logistics" as a new paradigm IT centers have to consider as they deploy applications around the globe and into the cloud.  The problem is that there's lots of data to move around in order to make all this work.

Supercomputing’s Solution

Big data/supercomputing groups have been thinking about this problem for a long time and have some solutions that might help, but it all harkens back to batch processing and JCL (job control language) of the last century.  In my comment to Chuck's post I mentioned the University of Wisconsin's Condor® Project, which can be used to schedule data transmission and data processing across distributed server nodes in a network.  But there are others, namely the Globus Toolkit 4 (GT4), which creates a data grid to support collaborative research on PBs of data and is currently being used by CERN for LHC data, by the EU for their data grid, and by others.  We have discussed Condor in our Free Cloud Storage and Cloud Computing post and GT4 in our 15PB a year created by CERN post.

These supercomputing projects were designed to move data around so that analysis could be done locally with results shared within the community.  However, at least with GT4, they replicate data at a number of nodes, which may not be storage efficient but does provide quicker access for data analysis.  At CERN, there is a hierarchy of nodes that participate in a GT4 data grid, and data is replicated between tiers and within peer nodes just to provide better access to it.

In olden days, …

With JCL, someone would code up a sequence of batch steps, each of which could be conditional on previous steps, that would manipulate data through transient forms into, at the end, its final form.  Sometimes JCL would invoke another job (another set of JCL) as a follow-on step if everything in this job worked as planned.  The JCL would wait in a queue until the data and execution resources were available for it.

This could mean mounting removable media, creating disk storage "datasets", or waiting until other jobs were done with the datasets being needed.  Jobs would execute in a priority sequence, and scheduling options could include using different hosts (servers) that would coordinate to provide job execution services.   For all I know, z/OS still supports JCL for batch processing, but it's been a long time since I have used JCL.
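
For readers who never wrote JCL, here is a toy sketch of the idea in Python: a job is an ordered list of steps, each optionally conditioned on the return codes of earlier steps. The step names and condition rules are invented for illustration only.

```python
# Toy sketch of the JCL idea: a job is an ordered list of steps, each of
# which can be skipped based on the outcome of earlier steps. Step names
# and return-code conventions here are invented for illustration only.
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Step:
    name: str
    action: Callable[[], int]                 # returns a numeric return code
    run_if: Callable[[Dict[str, int]], bool] = lambda rc: True

def run_job(steps):
    return_codes: Dict[str, int] = {}
    for step in steps:
        if not step.run_if(return_codes):
            print(f"{step.name}: skipped (condition not met)")
            continue
        rc = step.action()
        return_codes[step.name] = rc
        print(f"{step.name}: rc={rc}")
    return return_codes

job = [
    Step("EXTRACT", lambda: 0),
    # Only transform if the extract step ended with rc 0 (like a COND test)
    Step("TRANSFORM", lambda: 0, run_if=lambda rc: rc.get("EXTRACT") == 0),
    Step("LOAD",      lambda: 0, run_if=lambda rc: rc.get("TRANSFORM") == 0),
]
run_job(job)
```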

Cloud computing and storage services

Where does that bring us today? Cloud computing and cloud storage are bringing this execution paradigm back into vogue. But instead of batch jobs, we are talking about virtual machines, web applications or anything else that can be packaged up and run generically on anybody's hardware and storage.

The only problem is that there are only application-specific ways to control these execution activities.  I am thinking here of web services that hand off web handling to any web server that happens to have cycles to support it.  Similarly, database machines seem capable of handing off queries to any database server that has idle ergs to process with.  There are myriad others like this, but they all seem specific to one application domain.  Nothing exists that is generic or can cross many application domains.

That's where something like Condor, GT4 or, god forbid, JCL can make some sense.  In essence, all of these approaches are application independent.  Because of that, they can be used by any number of applications to take advantage of cloud computing and cloud storage services.

Just had to get this out.  Chuck’s post had me thinking about JCL again and there had to be another solution.

Poor deduplication with Oracle RMAN compressed backups

Oracle offices by Steve Parker (cc) (from Flickr)

I was talking with one large enterprise customer today and he was lamenting how poorly Oracle RMAN compressed backupsets dedupe. Apparently, non-compressed RMAN backupsets generate anywhere from 20:1 to 40:1 deduplication ratios, but when they use RMAN backupset compression, their deduplication ratios drop down to 2:1.  Given that RMAN compression probably only adds another 2:1 compression ratio, the overall data reduction becomes something like 4:1.

RMAN compression

It turns out Oracle RMAN supports two different compression algorithms: zlib (or gzip) and bzip2.  I assume the default is zlib; one can specify bzip2 for even higher compression ratios at the cost of slower, more processor-intensive compression.

  • Zlib is pretty standard repeated-string elimination followed by Huffman coding, which uses shorter bit strings to represent more frequent characters and longer bit strings to represent less frequent ones.
  • Bzip2 also uses Huffman coding but only after a number of other transforms: run length encoding (changing duplicated characters into a count:character sequence), the Burrows–Wheeler transform (rearranging the data stream so that repeating characters come together), a move-to-front transform (replacing each symbol with its position in a recently-used list so that repeated symbols become runs of small values), another run length encoding step, Huffman encoding, followed by another couple of steps to decrease the data length even more…

The net of all this is that a block of data that is bzip2 encoded may look significantly different if even one character is changed.  Similarly, even zlib compressed data will look different after a single character insertion, though perhaps not as much.  This depends on the character and where it's inserted, but even if the new character doesn't change the Huffman encoding tree, adding a few bits to a data stream will necessarily alter its byte groupings significantly downstream from that insertion. (See Huffman coding to learn more.)
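
A quick way to see the effect is to compress two nearly identical streams and compare fixed-size chunks of the compressed output. The sketch below uses Python's zlib and bz2 modules on synthetic data; it illustrates the downstream-shift problem, not RMAN's actual backupset format.

```python
# Demonstration, not Oracle RMAN itself: compress two byte streams that
# differ by a single inserted character and see how few fixed-size chunks of
# the compressed output still match -- which is why dedupe ratios collapse.
import bz2, hashlib, random, zlib

random.seed(0)
original = bytes(random.choices(b"ACGT\n", k=1_000_000))   # compressible test data
modified = original[:1000] + b"X" + original[1000:]        # one-byte insertion

def chunk_hashes(data, chunk=4096):
    return {hashlib.sha1(data[i:i + chunk]).hexdigest()
            for i in range(0, len(data), chunk)}

for name, codec in (("zlib", zlib.compress), ("bzip2", bz2.compress)):
    a, b = codec(original), codec(modified)
    shared = len(chunk_hashes(a) & chunk_hashes(b))
    print(f"{name}: {shared}/{len(chunk_hashes(a))} compressed 4KB chunks identical")

# Uncompressed, nearly every chunk of the two streams would have matched,
# which is what a dedupe appliance counts on.
```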

Deduplicating RMAN compressed backupsets

Sub-block level deduplication often depends on seeing the same sequence of data, possibly skewed or shifted by one to N bytes, between two data blocks.  But as discussed above, with bzip2 or zlib (or any Huffman-encoded) compression algorithm, the sequence of bytes looks distinctly different downstream from any character insertion.

One way to obtain decent deduplication rates from RMAN compressed backupsets would be to decompress the data at the dedupe appliance and then run the deduplication algorithm on it – but dedupe appliance ingestion rates would suffer accordingly.  Another approach is to not use RMAN compressed backupsets at all, but the advantages of compression are very appealing: less network bandwidth, faster backups (because they are not transferring as much data), and quicker restores.

Oracle RMAN OST

On the other hand, what might work is some form of Data Domain OST/Boost-like support from Oracle RMAN, which would partially deduplicate the data at the RMAN server and then send the deduplicated stream to the dedupe appliance.  This would use less network bandwidth and provide faster backups, but may not do anything for restores.  Perhaps a tradeoff worth investigating.

As for the likelihood that Oracle would make such services available to deduplication vendors, I would have said this was unlikely, but ultimately the customers have a say here.   It's unclear why Symantec created OST, but it turned out to be a money maker for them, and something similar could be supported by Oracle.  Once an Oracle RMAN OST-like capability was in place, it shouldn't take much to provide Boost functionality on top of it.  (Although EMC Data Domain is so far the only dedupe vendor with Boost, for OST or their own NetWorker Boost version.)

—-

When I first started this post I thought that if the dedupe vendors just understood the format of the RMAN compressed backupsets they would be able to have the same dedupe ratios as seen for normal RMAN backupsets.  As I investigated the compression algorithms being used I became convinced that it’s a computationally “hard” problem to extract duplicate data from RMAN compressed backupsets and ultimately would probably not be worth it.

So, if you use RMAN backupset compression, you probably ought to avoid deduplicating this data for now.

Anything I missed here?

What eMLC and eSLC do for SSD longevity

Enterprise NAND from Micron.com (c) 2010 Micron Technology, Inc.

I talked last week with some folks from Nimbus Data who were discussing their new storage subsystem.  Apparently it uses eMLC (enterprise Multi-Level Cell) NAND SSDs for its storage and has no SLC (Single Level Cell) NAND at all.

Nimbus believes that with eMLC they can keep the price/GB down and still supply the reliability required for data center storage applications.  I had never heard of eMLC before, but later that week I was scheduled to meet with Texas Memory Systems and Micron Technology, who helped get me up to speed on this new technology.

eMLC/eSLC defined

eMLC and its cousin eSLC are high-durability NAND parts that supply more erase/program cycles than are generally available from MLC and SLC respectively.  If today's NAND technology supplies roughly 10K erase/program cycles for MLC and 100K for SLC, then eMLC can supply around 30K.  I have never heard a quote for eSLC, but 300K erase/program cycles before failure might be a good working assumption.

The problem is that NAND wears out and can only sustain so many erase/program cycles before it fails.  By having more durable parts, one can either use the same technology longer (moving from MLC to eMLC) or move to cheaper parts (from SLC to eMLC) for use in new applications.
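
To see what those cycle counts mean in practice, here is a rough, back-of-envelope endurance sketch; the drive capacity, write rate, write amplification and over-provisioning figures are all assumptions chosen for illustration, not vendor specifications.

```python
# Back-of-envelope endurance estimate; all input values are assumptions
# for illustration, not vendor specifications.
def endurance_years(capacity_gb, pe_cycles, host_writes_gb_per_day,
                    write_amplification=3.0, over_provisioning=0.28):
    """Years until the rated erase/program cycles are exhausted."""
    raw_gb = capacity_gb * (1 + over_provisioning)
    total_writable_gb = raw_gb * pe_cycles / write_amplification
    return total_writable_gb / (host_writes_gb_per_day * 365)

# Same hypothetical 200GB drive written at 500GB/day, using the cycle counts
# quoted above: MLC 10K, eMLC 30K, SLC 100K.
for name, cycles in (("MLC", 10_000), ("eMLC", 30_000), ("SLC", 100_000)):
    print(f"{name}: ~{endurance_years(200, cycles, 500):.1f} years")
```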

This is what Nimbus Data has done with eMLC.  Most data center class SSD or cache NAND storage these days is based on SLC. But SLC, with only one bit per cell, is very expensive storage.  MLC has two (or three) bits per cell and can easily halve the cost of SLC NAND storage.

Moreover, the consumer market which currently drives NAND manufacturing depends on MLC technology for cameras, video recorders, USB sticks, etc.  As such, MLC volumes are significantly higher than SLC and hence, the cost of manufacturing MLC parts is considerably cheaper.

But the historic problem with MLC NAND is its reduced durability.  eMLC addresses that problem by lengthening the page programming (tProg) cycle, which creates a better, more lasting data write but slows write performance.

The fact that NAND technology already has ~5X faster random write performance than rotating media (hard disk drives) makes this slightly slower write rate less of an issue. If eMLC took this to only ~2.5X disk write speed, it would still be significantly faster.  Also, there are a number of architectural techniques for speeding up drive writes that can easily be incorporated into any eMLC SSD.

How long will SLC be around?

The industry view is that SLC will go away eventually and be replaced with some form of MLC technology because the consumer market uses MLC and drives NAND manufacturing.  The volumes for SLC technology will just be too low to entice manufacturers to support it, driving the price up and volumes even lower – creating a vicious cycle which kills off SLC technology.  Not sure how much I believe this, but that’s conventional wisdom.

The problem with this prognosis is that by all accounts the next generation MLC will be even less durable than today’s generation (not sure I understand why but as feature geometry shrinks, they don’t hold charge as well).  So if today’s generation (25nm) MLC supports 10K erase/program cycles, most assume the next generation (~18nm) will only support 3K erase/program cycles. If eMLC then can still support 30K or even 10K erase/program cycles that will be a significant differentiator.

—-

Technology marches on.  Something will replace hard disk drives over the next quarter century or so, and that something is bound to be based on transistorized logic of some kind, not the magnetized media used in disks today. Given today's technology trends, it's unlikely that this will continue to be NAND, but something else will most certainly crop up – stay tuned.

Anything I missed in this analysis?

To iPad or not to iPad – part 3

Apple iPad (wi-fi) (from apple.com)

Well, I did take the iPad and Bluetooth (BT) keypad to a short conference a couple of weeks ago and it was a disaster, unlike what I envisioned in Parts 1 & 2 of this saga.  It turns out that some WiFi logins don't work with the iPad (not sure if this is a "Flash" issue or not).  In any event, the iPad was rendered WiFi-less during the whole conference, which made for an unconnected experience to say the least (recall that I don't own a 3G version).

The hotel used T-Mobile for their WiFi connection.  I must have created my account at least 3 times and tried to log in afterward at least 5 times (persistence occasionally pays, but not this time). Each time the login screen hung and I never got in.  The conference had a different WiFi supplier, but it had the same problem, even though this time all I had to do was sign into the service with a conference-supplied SSID and password.  No such luck.  The hotel gave me two free WiFi card keys for T-Mobile, but I couldn't use them.

I even tried some of the tricks that are on the web to get around this problem but none worked. Nuts!

The blog post from hell

Of course, I didn't plan to write a blog post at the conference, but I had the time and the muse struck.  So I whipped out my trusty iPhone, paired the BT keypad with it, and used Notes and the WordPress app (WP, available free) to create a new blog post.  I power-typed it into the iPhone Notes app and copied and pasted it into WP's new post window.

I had always been curious how to add media to posts via the WP app; it turns out anything on the iPhone, including the photo library and camera photos, was accessible as new media to be added to any post.  I had used my iPhone earlier to take some pictures at the conference and easily added these to the post.  The WP app uses the more primitive editing window (not WYSIWYG), but that was OK as I didn't have a lot of fancy text layout.  What's funny is that saving in the WP app was not the same as uploading to my blog.  And once uploaded, you had to change the post status to Published to make it externally visible.

Another option would have been to use the web and update the blog post through WordPress on Safari. I can't recall exactly, but last time I used Safari & WordPress there were some scrolling incompatibilities (an inability to scroll down into the post – Flash maybe) and other nuisances, so I decided to try the WP app this time.

The only problem with using the iPhone & WP app to enter the post was that it was hard to check spelling and see the whole post to edit it properly.  I only really got to see a couple of (short) lines at a time in the iPhone WP app window, and the WP app preview was not all that useful.

Needless to say, the post was published with numerous typos, misspellings, grammatical faux pas, etc. (so what's different, Ray?).   A few readers caught the issues and DMed me on Twitter, which I picked up later that night.  I tried my best to fix them, but it still had problems a day later when I got to my desktop.  For some unknown reason, it became my most popular post – go figure.

Using the iPhone at the conference

Of course, the iPhone 4 worked fine during the conference for email, Twitter, Facebook and other social media, given its screen and soft keypad limitations.  And I was still able to take notes with the iPad; I just couldn't send them anyplace.  I would have liked to insert them into the post as an outline, but that couldn't be done.

There is just no way to get data out of an iPad without WiFi or 3G access.  Maybe I could have photographed the iPad screen with the iPhone and used an OCR app to turn it into a Notes item, getting the text onto the iPhone – but I didn't have an OCR app at the time. Also, it smacks of a Rube Goldberg contraption.

—-

I would say the WP app on the iPad looks a lot better than the one on the iPhone, but much of that is due to the increased screen space.  Had everything been working, I probably wouldn't have had as many problems using the iPad WP app to enter the post.  Of course, I would have had to mail the photos from the iPhone to the iPad to add them to the post, but this is standard practice with the iPad…

There's another conference coming up (it's conference season here in the US) and I am NOT taking the iPad. Too bad, my back hurts already just thinking about it.  I foresee either a 3G iPad or a MacBook Air laptop sometime in my near future, but for now it's lugging laptops.

Just not sure if I shouldn’t take the BT keypad to take notes on the iPhone!?

PS. Saw Rob Peglar and he had a Verizon Dongle that provided a local WiFi for his iPad and 4 other “close” friends.  Maybe that’s what I should invest in?

CommVault’s Simpana 9 release

CommVault announced a new release of their data protection product today – Simpana® 9.  The new software provides significantly enhanced support for VM backup, new source-level deduplication capabilities and other enhanced facilities.

Simpana 9 starts by defining 3 tiers of data protection based on their Snapshot Protection Client (SPC):

  • Recovery tier – using SPC, application-consistent hardware snapshots can be taken via storage array interfaces to provide content-aware, granular-level recovery.  Simpana 9 SPC now supports EMC, NetApp, HDS, Dell, HP, and IBM (including LSI) storage snapshot capabilities.  Automation supplied with Simpana 9 allows the user to schedule hardware snapshots at various intervals throughout the day such that they can be used to recover data without delay.
  • Protection tier – using mounted snapshot(s) provided by SPC above, Simpana 9 can create an extract or physical backup set copy to any disk type (DAS, SAN, NAS) providing a daily backup for retention purposes. This data can be deduplicated and encrypted for increased storage utilization and data security.
  • Compliance tier – selective backup jobs can then be sent to cloud storage and/or archive appliances such as HDS HCP or Dell DX for long term retention and compliance, preserving CommVault's deduplication and encryption.  Alternatively, compliance data can be sent to the cloud.  CommVault's previous cloud storage support included Amazon S3, Microsoft Azure, Rackspace, Iron Mountain and Nirvanix; with Simpana 9, they add EMC Atmos providers and Mezeo to the mix.

Simpana 9 VM backup support

Simpana 9 also introduces a SnapProtect-enabled Virtual Server Agent (VSA) to speed up virtual machine datastore backups.  With VSA's support for storage hardware snapshot backups and VMware facilities to provide application-consistent backups, virtual server environments can now scale to 1000s of VMs without concern for backup processing and IO impact on ongoing activity.  VSA snapshots can be mounted afterwards on a proxy server where, using VMware services, file-level content can be extracted, which CommVault can then deduplicate, encrypt and offload to other media, allowing for granular content recovery.

In addition, Simpana 9 supports auto-discovery of virtual machines with auto-assignment of data protection policies.  As such, VM guests can be automatically placed into an appropriate, pre-defined data protection regimen without the need for operator intervention after VM creation.

Also, with all the metadata content cataloguing, Simpana 9 now supplies a lightweight, file-oriented Storage Resource Manager capability via the CommVault management interface.  Such services can provide detailed file-level analytics for VM data without the need for VM guest agents.

Simpana 9 new deduplication support

CommVault's 1st-gen deduplication with Simpana 7 was at the object level.  With Simpana 8, deduplication occurred at the block level, providing content-aware variable block sizes, and added software data encryption support for disk or tape backup sets.  With today's release, Simpana 9 shifts some deduplication processing out to the source (the client), increasing backup throughput by reducing the data transferred. All this sounds similar to EMC's Data Domain Boost capability introduced earlier this year.

Such a change takes advantage of CommVault's intelligent Data Agent (iDA) running in the clients to provide pre-deduplication hashing and list creation rather than doing all of this at CommVault's Media Agent node, reducing the data to be transferred.  Further, CommVault's data deduplication can be applied across a number of clients for a global deduplication service that spans remote clients as well as central data center repositories.
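
For illustration only, here is a generic sketch of how source-side deduplication reduces data transfer; the chunk size, hashing scheme and client/server exchange below are invented for the example and are not CommVault's actual iDA or Media Agent protocol.

```python
# Generic sketch of source-side deduplication -- NOT CommVault's actual iDA
# protocol. The client hashes fixed-size chunks, asks the media server which
# hashes it already holds, and ships only the missing chunks.
import hashlib

CHUNK = 128 * 1024  # fixed 128KB chunks for simplicity; real products vary block size

class MediaServer:
    def __init__(self):
        self.store = {}                      # hash -> chunk bytes

    def missing(self, hashes):
        return [h for h in hashes if h not in self.store]

    def put(self, chunks_by_hash):
        self.store.update(chunks_by_hash)

def client_backup(data: bytes, server: MediaServer):
    chunks = [data[i:i + CHUNK] for i in range(0, len(data), CHUNK)]
    hashed = {hashlib.sha256(c).hexdigest(): c for c in chunks}
    needed = server.missing(list(hashed))    # only the hash list crosses the wire
    server.put({h: hashed[h] for h in needed})
    print(f"sent {len(needed)} of {len(chunks)} chunks "
          f"({100 * (1 - len(needed) / len(chunks)):.0f}% transfer avoided)")

server = MediaServer()
first = b"A" * CHUNK * 8 + b"B" * CHUNK * 2
client_backup(first, server)                 # full backup: only unique chunks move
client_backup(first + b"C" * CHUNK, server)  # next backup: only the new chunk moves
```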

Simpana 9 new non-CommVault backup reporting and migration capabilities

Simpana 9 provides a new data collector for NetBackup versions 6.0, 6.5, and 7.0 and TSM 6.1, which allows CommVault to discover other backup services in the environment, extract backup policies, client configurations, job histories, etc., and report on these foreign backup processes.  In addition, once the data collector is in place, Simpana 9 also supports automated procedures that can roll out and convert all these other backup services to CommVault data protection over a weekend, vastly simplifying migration from non-CommVault to Simpana 9 data protection.

Simpana 9 new software licensing

CommVault is also changing their software licensing approach to include more options for capacity-based licensing. Previously, CommVault supported limited capacity-based licensing but mostly used architectural component-level licensing.  Now they have expanded the capacity licensing offerings, and both licensing modes are available so the customer can select whichever approach proves best for them.  With CommVault's capacity-based licensing, usage can be tracked on the fly to show when customers may need to purchase a larger capacity license.

There are probably other enhancements I missed here, as Simpana 9 is a significant changeover from Simpana 8. Nonetheless, this version's best feature is the enhanced approach to VM backups, allowing more VMs to run on a single server without concern for backup overhead.  The fact that they do source-level pre-deduplication processing just adds icing to the cake.

What do you think?

Hitachi’s VSP vs. VMAX

Today's announcement of Hitachi's VSP brings another round to the competition between EMC and Hitachi/HDS in the enterprise. VSP's introduction, which is GA and orderable today, takes the rivalry to a whole new level.

I was on SiliconANGLE's live TV feed earlier today discussing the merits of the two architectures with David Floyer and Dave Vellante from Wikibon. In essence, there seems to be a religious war going on between the two.

Examining VMAX, it's obviously built around a concept of standalone nodes which all have cache, frontend, backend and processing components built in. Scaling the VMAX, aside from storage and perhaps cache, involves adding more VMAX nodes to the system. VMAX nodes talk to one another via an external switching fabric (RapidIO currently). The hardware, sophisticated packaging, IO connection technology and other internals aside, looks very much like a 2U server one could purchase from any number of vendors.

On the other hand, Hitachi's VSP is a special-built storage engine (or storage computer, as Hu Yoshida says). While the architecture is not a radical revision of the USP-V, it's a major upleveling of all the component technology, from the 5th generation cross bar switch to the new ASIC-driven front-end and back-end directors, the shared control L2 cache memory and the use of quad-core Intel Xeon processors. Much of this hardware is unique, sophistication abounds, and it looks very much like a blade system for the storage controller community.

The VSP and VMAX comparison is sort of like an open source vs. closed source discussion. VMAX plays the role of the open source champion, largely depending on commodity hardware and sophisticated packaging but with minimal ASIC technology. As evidence of the commodity hardware approach, VPLEX, EMC's storage virtualization engine, reportedly runs on VMAX hardware. Commodity hardware lets EMC ride the technology curve as it advances for other applications.

Hitachi's VSP plays the role of the closed source champion. Its functionality is locked inside a proprietary hardware architecture, ASICs and interfaces. The functionality it provides is tightly coupled with their internal architecture, and Hitachi probably believes that by doing so they can provide better performance and more tightly integrated functionality to the enterprise.

Perhaps this doesn't do justice to either development team. There is plenty of unique proprietary hardware and sophisticated packaging in VMAX, but they have taken the approach of separate but equal nodes. Hitachi, on the other hand, has distributed this functionality across various components – front-end directors (FEDs), back-end directors (BEDs), cache adapters (CAs) and virtual storage directors (VSDs) – each of which can scale independently, i.e., adding FEDs doesn't require more BEDs or CAs. Ditto for VSDs. Each can be scaled separately up to the maximum that fits inside a controller chassis; then, if needed, you can add another whole controller chassis.

One has an internal switching infrastructure (the VSP cross bar switch) and the other uses an external switching infrastructure (the VMAX RapidIO fabric). The promise of external switching, like commodity hardware, is that the R&D funding to enhance the technology is shared with its other users. But the disadvantage is that, architecturally, you may have more latency to propagate an IO to other nodes for handling.

With VSP's cross bar switch, you may still need to move IO activity between VSDs, but this can be done much faster, and any VSD can access any CA, BED or FED resource required to perform the IO, so the need to move IO is reduced considerably. Thus, the VSP provides a global pool of resources that any IO can take advantage of.
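
To make the latency point concrete, here is a toy back-of-envelope model; the microsecond figures are entirely hypothetical and only illustrate how an extra inter-node hop inflates average service time as more IOs land on remote nodes.

```python
# Toy latency model with entirely hypothetical numbers, comparing designs
# where some fraction of IOs need an extra fabric hop to a remote node
# against one where any director reaches any resource directly.
def avg_service_time_us(base_us, hop_us, remote_fraction):
    """Average IO service time when remote_fraction of IOs pay one extra hop."""
    return base_us + remote_fraction * hop_us

BASE_US = 250.0      # assumed service time handled entirely on the local node
HOP_US = 40.0        # assumed cost of one extra inter-node fabric hop

for remote_fraction in (0.0, 0.25, 0.50, 0.875):   # 0.875 ~ 7 of 8 nodes are remote
    t = avg_service_time_us(BASE_US, HOP_US, remote_fraction)
    print(f"{remote_fraction:.0%} remote IOs -> ~{t:.0f} usec average")
```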

In the end, blade systems like VSP and separate server systems like VMAX can all work their magic. Both systems have their place today and in the foreseeable future. Where blade servers shine is in dense packaging, high power/cooling efficiency and bringing a lot of horsepower to a small package. On the other hand, server systems are simple to deploy and connect together, with minimal limitations on the number of servers that can be brought together.

In a small space, blade systems can probably bring more compute (storage IO) power to bear within the same volume than multiple server systems can, but the hardware is much more proprietary and it costs lots of R&D dollars to maintain leading-edge capabilities.

Typed this out after the show, hopefully I characterized the two products properly. If I am missing anything please let me know.

[Edited for readability, grammar and numerous misspellings – last time I do this on an iPhone. Thanks to Jay Livens (@SEPATONJay) and others for catching my errors.]