Hitachi’s VSP vs. VMAX

Today’s announcement of Hitachi’s VSP brings another round to the competition between EMC and Hitachi/HDS in the enterprise. VSP’s introduction, which is GA and orderable today, takes the rivalry to a whole new level.

I was on SiliconANGLE’s live TV feed earlier today discussing the merits of the two architectures with David Floyer and Dave Vellante from Wikibon. In essence, there seems to be a religious war going on between the two.

Examining VMAX, it’s obviously built around a concept of standalone nodes which all have cache, frontend, backend and processing components built in. Scaling the VMAX, aside from storage and perhaps cache, involves adding more VMAX nodes to the system. VMAX nodes talk to one another via an external switching fabric (currently RapidIO). The hardware, its sophisticated packaging, IO interconnect technology and other internals notwithstanding, looks very much like a 2U server one could purchase from any number of vendors.

On the other hand, Hitachi’s VSP is a purpose-built storage engine (or storage computer, as Hu Yoshida says). While the architecture is not a radical revision of the USP-V, it’s a major upleveling of all the component technology, from the 5th generation crossbar switch, to the new ASIC-driven front-end and back-end directors, to the shared control L2 cache memory and the quad-core Intel Xeon processors. Much of this hardware is unique, sophistication abounds, and the result looks very much like a blade system for the storage controller community.

The VSP and VMAX comparison is sort of like an open source vs. closed source discussion. VMAX plays the role of open source champion, largely depending on commodity hardware and sophisticated packaging but with minimal ASIC technology. As evidence of the commodity hardware, VPLEX, EMC’s storage virtualization engine, reportedly runs on VMAX hardware. Commodity hardware lets EMC ride the technology curve as it advances for other applications.

Hitachi’s VSP plays the role of closed source champion. Its functionality is locked inside a proprietary hardware architecture, ASICs and interfaces. The functionality it provides is tightly coupled with the internal architecture, and Hitachi probably believes that by doing so they can provide better performance and more tightly integrated functionality to the enterprise.

Perhaps this doesn’t do justice to either development team. There is plenty of unique proprietary hardware and sophisticated packaging in VMAX, but EMC has taken the approach of separate but equal nodes. Hitachi, in contrast, has distributed this functionality across various components, the front-end directors (FEDs), back-end directors (BEDs), cache adaptors (CAs) and virtual storage directors (VSDs), each of which can scale independently, i.e., adding FEDs or CAs doesn’t require more BEDs, and ditto for VSDs. Each can be scaled separately up to the maximum that fits inside a controller chassis and then, if needed, you can add another whole controller chassis.

One has an internal switching infrastructure (the VSP crossbar switch) and the other uses an external switching infrastructure (the VMAX’s RapidIO). The promise of external switching, like that of commodity hardware, is that the R&D funding to enhance the technology is shared with other users. The disadvantage is that, architecturally, you may incur more latency to propagate an IO to other nodes for handling.

With VSP’s crossbar switch, you may still need to move IO activity between VSDs, but this can be done much faster, and any VSD can access any CA, BED or FED resource required to perform the IO, so the need to move IO is reduced considerably. The result is a global pool of resources that any IO can take advantage of.

In the end, blade systems like VSP and separate server systems like VMAX can both work their magic. Both have their place today and in the foreseeable future. Where blade systems shine is in dense packaging, power and cooling efficiency, and bringing a lot of horsepower to a small package. On the other hand, server systems are simple to deploy and connect together, with minimal limitations on the number of servers that can be brought together.

Blade systems can probably bring more compute (storage IO) power to bear within the same volume than multiple server systems, but the hardware is much more proprietary and costs lots of R&D dollars to keep at the leading edge.

I typed this out after the show, so hopefully I have characterized the two products properly. If I am missing anything, please let me know.

[Edited for readability, grammar and numerous misspellings – last time I do this on an iPhone. Thanks to Jay Livens (@SEPATONJay) and others for catching my errors.]

Latest SPC-2 results – chart of the month

SPC-2* benchmark results, spider chart for LFP, LDQ and VOD throughput

Latest SPC-2 (Storage Performance Council-2) benchmark results chart displaying the top ten in aggregate MBPS(TM), broken down into Large File Processing (LFP), Large Database Query (LDQ) and Video On Demand (VOD) throughput results. One problem with this chart is that it really only shows 4 subsystems: HDS and their OEM partner HP; the IBM DS5300 and Sun 6780 w/8GFC at RAID 5&6, which appear to be the same OEMed subsystem; the IBM DS5300 and Sun 6780 w/4GFC at RAID 5&6, which also appear to be the same OEMed subsystem; and IBM SVC 4.2 (with IBM 4700s behind it).

What’s interesting about this chart is what’s going on at the top end. Both the HDS (#1&2) and IBM SVC (#3) seem to have found some secret sauce for performing better on the LDQ workload, or conversely some dumbing down of the other two workloads (LFP and VOD). According to the SPC-2 specification:

  • LDQ is a workload consisting of 1024KiB and 64KiB transfers whereas the LFP consists of 1024KiB and 256KiB transfers and the VOD consists of only 256KiB, so transfer size doesn’t tell the whole story.
  • LDQ seems to have a lower write proportion (1%) while attempting to look like joining two tables into one or scanning a data warehouse to create output, whereas LFP processing has a read rate of 50% (R:W of 1:1) while executing a write-only phase, a read-write phase and a read-only phase, and apparently VOD has a 100% read-only workload mimicking streaming video.
  • 50% of the LDQ workload uses 4 I/Os outstanding and the remainder 1 I/O outstanding. The LFP uses only 1 I/O outstanding and VOD uses only 8 I/Os outstanding.
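
To keep these parameters straight, here’s a quick cheat sheet in code form. It’s just my own summary of the workload characteristics listed above, not part of the SPC-2 specification or any official toolkit:

```python
# A quick summary of the three SPC-2 workloads as described above.
# This is my own cheat sheet in code form, not part of the SPC-2
# specification or any official benchmark toolkit.

spc2_workloads = {
    "LFP": {  # Large File Processing
        "transfer_sizes_KiB": [1024, 256],
        "read_pct": 50,             # write-only, read-write and read-only phases
        "outstanding_ios": [1],     # 1 I/O outstanding throughout
    },
    "LDQ": {  # Large Database Query
        "transfer_sizes_KiB": [1024, 64],
        "read_pct": 99,             # ~1% writes
        "outstanding_ios": [1, 4],  # half the workload at 4, the rest at 1
    },
    "VOD": {  # Video On Demand
        "transfer_sizes_KiB": [256],
        "read_pct": 100,            # pure streaming reads
        "outstanding_ios": [8],
    },
}

for name, wl in spc2_workloads.items():
    print(f"{name}: xfers={wl['transfer_sizes_KiB']} KiB, "
          f"reads={wl['read_pct']}%, outstanding={wl['outstanding_ios']}")
```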

These seem to be the major differences between the three workloads. I would have to say that some sort of caching sophistication is evident in the HDS and SVC systems that is less present in the remaining systems. And I was hoping to provide some sort of guidance as to what that sophistication looked like, but:

  • I was going to say they must have a better sequential detection algorithm, but the VOD, LDQ and LFP workloads have 100%, 99% and 50% read ratios respectively, and sequential detection should perform better with VOD and LDQ than LFP. So that’s not all of it.
  • Next I was going to say it had something to do with I/O outstanding counts. But VOD has 8 I/Os outstanding and LFP only has 1, so if this were true VOD should perform better than LFP, while LDQ, having two sets of phases with 1 and 4 I/Os outstanding, should land somewhere in between the two. So that’s not all of it either.
  • Next I was going to say stream (or file) size is an important differentiator but “Segment Stream Size” for all workloads is 0.5GiB. So that doesn’t help.

So now I am at a complete loss to understand why the LDQ workload throughputs are so much better than the LFP and VOD workload throughputs for HDS and SVC.

I can only conclude that the little write activity (1%) thrown into the LDQ mix is enough to give the backend storage a breather and allow the subsystem to respond better to the other (99%) read activity. Why this would be so much better for the top performers than for the remaining results is not entirely evident. But I would add that being able to handle lots of writes or lots of reads is relatively straightforward, while handling an unbalanced mixture is harder to do well.

To validate this conjecture would take some effort. I thought it would be easy to understand what’s happening but as with most performance conundrums the deeper you look the more confounding the results often seem to be.

The full report on the latest SPC results will be up on my website later this year but if you want to get this information earlier and receive your own copy of our newsletter – email me at SubscribeNews@SilvertonConsulting.com?Subject=Subscribe_to_Newsletter.

I will be taking the rest of the week off so Happy Holidays to all my readers and a special thanks to all my commenters. See you next week.

ESRP results over 5K mbox – chart of the month

ESRP Results, over 5K mailbox, normalized (per 5Kmbx) read and write DB transfers as of 30 October 2009

In our quarterly study on Exchange Solution Reviewed Program (ESRP) results we show a number of charts to get a good picture of storage subsystem performance under Exchange workloads. The two of most interest to data centers are the normalized and un-normalized database transfer (DB xfer) charts. The problem with un-normalized DB xfer charts is that the subsystem supporting the largest mailbox count normally shows up best, and the rest of the results are highly correlated with mailbox count. In contrast, the normalized view of DB xfers tends to discount high mailbox counts and shows a more even-handed view of performance.
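
For those unfamiliar with the normalization, it’s simple arithmetic: divide a submission’s reported DB transfers/sec by its mailbox count and multiply by 5,000. A minimal sketch follows; the subsystem names and numbers are made up for illustration and are not actual ESRP results:

```python
# Normalizing ESRP database transfers to a per-5,000-mailbox basis.
# The subsystems and numbers below are invented for illustration only;
# they are not actual ESRP submission results.

def normalize_db_xfers(db_xfers_per_sec: float, mailboxes: int, per: int = 5000) -> float:
    """Scale raw DB transfers/sec to a per-`per`-mailbox rate."""
    return db_xfers_per_sec / mailboxes * per

examples = [
    ("Subsystem A", 1200.0, 5400),     # small config, few mailboxes
    ("Subsystem B", 9000.0, 100000),   # large config, many mailboxes
]

for name, xfers, mbx in examples:
    print(f"{name}: {xfers:,.0f} DB xfers/sec over {mbx:,} mailboxes "
          f"= {normalize_db_xfers(xfers, mbx):,.1f} per 5K mailboxes")
```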


We show above a normalized view of the ESRP results in this category that were available last month. A couple of caveats are warranted here:

  • Normalized results don’t necessarily scale – results shown in the chart range from 5,400 mailboxes (#1) to 100,000 mailboxes (#6). While normalization should let one see what a storage subsystem could do at any mailbox count, it is highly unlikely that one would configure the HDS AMS2100 to support 100,000 mailboxes, and equally unlikely that one would configure the HDS USP-V to support 5,400 mailboxes.
  • The higher mailbox count results tend to cluster when normalized – with over 20,000 mailboxes, one can no longer just use one big Exchange server, and given the multiple servers driving the single storage subsystem, results tend to shrink when normalized. So one should probably compare like mailbox counts rather than just depend on normalization to iron out the mailbox count differences.

There are a number of storage vendors in this Top 10. There are no standouts here: the midrange systems from HDS, HP, and IBM seem to hold down the top 5 and the high end subsystems from EMC, HDS, and 3PAR seem to own the bottom 5 slots.

However, Pillar is fairly unusual in that their 8.5Kmbx result came in at #4 while their 12.8Kmbx result came in at #8, even though the un-normalized results for the two runs appear exactly the same. Which brings up yet another caveat: when running two benchmarks with the same system, normalization may show a difference where none exists.

The full report on the latest ESRP results will be up on our website later this month but if you want to get this information earlier and receive your own copy of our newsletter – just subscribe by emailing us.

HDS upgrades AMS2000

Today, HDS refreshed their AMS2000 product line with a new high density drive expansion tray with 48 drives and up to a maximum capacity of 48TB, 8Gbps FC (8GFC) ports for the AMS2300 and AMS2500 systems, and a new NEBS Level-3 compliant and DC powered version, the AMS2500DC.

HDS also re-iterated their stance that Dynamic Provisioning will be available on AMS2000 in the 2nd half of this year. (See my prior post on this subject for more information).

HDS also mentioned that the AMS2000 now supports external authentication infrastructure for storage managers and will support Common Criteria Certification for more stringent data security needs. The external authentication will be available in the second half of the year.

I find the DC version pretty interesting; it signals a renewed interest in telecom OEM applications for this mid-range storage subsystem. It’s unclear to me whether this is a significant market for HDS. The 2500DC only supports 4Gbps FC and is packaged with a Cisco MDS 9124 SAN switch. DC powered storage is also more energy efficient than AC powered storage.

Other than that, the Common Criteria Certification can be a big thing for those companies or government entities with significant interest in secure data centers. There was no specific time frame for this certification, but presumably they have started the process.

As for the rest of this, it’s a pretty straightforward refresh.

HDS Dynamic Provisioning for AMS

HDS announced today that their thin provisioning feature (called Dynamic Provisioning) will be available in their mid-range storage subsystem family, the AMS. Expanding the set of subsystems that support thin provisioning can only help customers in the long run.

It’s not clear whether you can add Dynamic Provisioning to an already installed AMS subsystem or if it’s only available on a fresh installation of an AMS subsystem. Also, no pricing was announced for this feature. In the past, HDS charged double the price of a GB of storage when it was in a thinly provisioned pool.

As you may recall, thin provisioning is a little like a room with a bunch of inflatable castles inside. Each castle starts with its initial inflation amount. As demand dictates, each castle can independently inflate to whatever level is needed to support the current workload, up to that castle’s limit and the overall limit imposed by the room the castles inhabit. In this analogy, the castles are LUN storage volumes, the room the castles are located in is the physical storage pool for the thinly provisioned volumes, and the air inside the castles is the physical disk space consumed by the thinly provisioned volumes.

In contrast, hard provisioning is like building permanent castles (LUNs) in stone: any change to the size of a structure would require major renovation and/or possible destruction of the original castle (deletion of the LUN).
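
To put the analogy in more concrete terms, here is a toy sketch of the general thin provisioning idea, where LUNs draw physical pages from a shared pool only when they are actually written to. This is a generic illustration of the concept, not HDS’s Dynamic Provisioning implementation; the page size, limits and behavior are all invented:

```python
# Toy illustration of thin provisioning: LUNs draw physical pages from a
# shared pool only when written. Generic concept only, not how HDS
# Dynamic Provisioning actually works; page size and limits are invented.

PAGE_MB = 32  # hypothetical allocation unit

class ThinPool:
    def __init__(self, physical_mb: int):
        self.free_pages = physical_mb // PAGE_MB

    def allocate_page(self) -> bool:
        if self.free_pages == 0:
            return False          # pool exhausted: the "room" is full
        self.free_pages -= 1
        return True

class ThinLun:
    def __init__(self, pool: ThinPool, virtual_mb: int):
        self.pool = pool
        self.virtual_mb = virtual_mb   # what the host sees (the castle's max size)
        self.pages = set()             # pages actually backed by physical space

    def write(self, offset_mb: int) -> bool:
        page = offset_mb // PAGE_MB
        if page >= self.virtual_mb // PAGE_MB:
            return False               # beyond the LUN's advertised size
        if page not in self.pages:     # first touch: inflate by one page
            if not self.pool.allocate_page():
                return False           # out of physical space
            self.pages.add(page)
        return True

pool = ThinPool(physical_mb=1024)            # 1 GiB of real disk
lun = ThinLun(pool, virtual_mb=10 * 1024)    # host sees 10 GiB
lun.write(0)
lun.write(5000)                              # only 2 pages consumed so far
print(f"Backed: {len(lun.pages) * PAGE_MB} MB of {lun.virtual_mb} MB advertised")
```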

When HDS first came out with dynamic provisioning it was only available for USP-V internal storage, later they released the functionality for USP-V external storage. This announcement seems to complete the roll out to all their SAN storage subsystems.

HDS also announced today a new service called the Storage Reclamation Service that helps
1) Assess whether thin provisioning will work well in your environment
2) Provide tools and support to identify candidate LUNs for thin provisioning, and
3) Configure new thinly provisioned LUNs and migrate your data over to the thinly provisioned storage.

Other products that support SAN storage thin provisioning include 3PAR, Compellent, EMC DMX, IBM SVC, NetApp and PillarData.

HDS High Availability Manager (HAM)

What does HAM look like to the open systems end user? We need to break this question up into two parts – one part for USP-V internal storage and the other part for external storage.

It appears that for internal storage you first need data replication services, such as asynchronous or synchronous replication, between the two USP-V storage subsystems. You also still need some shared external storage used as a quorum disk. Then, once all this is set up under HAM, the two subsystems can automatically fail over access to the replicated internal and shared external storage from one USP-V to the other.

For external storage, it appears that the storage must be shared between the two USP-V systems, and whenever the primary one fails the secondary can take over (fail over) data storage responsibilities from the failing USP-V frontend.
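
The quorum disk’s role is easier to see with a simple decision sketch. The logic below is a generic illustration of quorum-arbitrated failover between two controllers, not HDS’s actual HAM implementation; all names and rules are invented:

```python
# Generic sketch of quorum-arbitrated failover between two storage
# controllers. Purely illustrative, not HDS's actual HAM logic;
# the function name and rules are invented for this example.

def decide_failover(primary_alive: bool,
                    secondary_alive: bool,
                    secondary_sees_quorum: bool) -> str:
    """Return which controller should serve IO."""
    if primary_alive:
        return "primary"                  # normal operation
    if secondary_alive and secondary_sees_quorum:
        return "secondary"                # quorum confirms primary is really down
    # Without quorum the secondary can't tell a dead primary from a
    # partitioned link, so it must not take over (avoids split-brain).
    return "halt"

print(decide_failover(True,  True,  True))    # -> primary
print(decide_failover(False, True,  True))    # -> secondary (failover)
print(decide_failover(False, True,  False))   # -> halt (no quorum, no takeover)
```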

What does this do for data migration? Apparently, using automated failover with HAM, one can migrate data between two different storage pools and then fail over server access from one to the other non-disruptively.

Obviously all the servers accessing storage under HAM control would need to be able to access both USP-Vs in order for this to all work properly.

Continuous availability is a hard nut to crack. HDS seems to have taken a shot at doing this from a purely storage subsystem perspective. This might be very useful for data centers running heterogeneous server environments. Typically, server clustering software is OS specific, like MSCS, with Symantec’s VCS being the lone exception in supporting multiple OSs. Such server clustering can handle storage outages but also depends on storage replication services to make this work.

It’s unclear to me which is preferable, but when you add the non-disruptive data migration, it seems that HAM might make sense.