Latest SPC-1 results – IOPS vs drive counts – chart-of-the-month

Scatter plot of SPC-1 IOPS against spindle count, with linear regression line showing Y = 186.18X + 10227 and R**2 = 0.96064
(SCISPC111122-004) (c) 2011 Silverton Consulting, All Rights Reserved

[As promised, I am trying to get up-to-date on my performance charts from our monthly newsletters. This one brings us current up through November.]

The above chart plots Storage Performance Council SPC-1 IOPS against spindle count.  On this chart, we have eliminated any SSD systems, systems with drives smaller than 140 GB and any systems with multiple drive sizes.

Alas, the coefficient of determination (R**2) of 0.96 tells us that SPC-1 IOPS performance is mainly driven by drive count.  But what’s more interesting here is that as drive counts climb above, say, 1000, the variance around the linear regression line widens – implying that system sophistication starts to matter more.
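The regression line above can double as a rough sizing rule of thumb. A minimal sketch (the 186.18 and 10227 coefficients come straight from the chart; the 1000-drive example is arbitrary, and remember the fit loosens above ~1000 drives):

```python
# Rule-of-thumb SPC-1 IOPS estimate from the chart's regression line:
#   IOPS = 186.18 * spindles + 10227   (R**2 ~ 0.96, HDD-only systems)
def predicted_spc1_iops(spindles: int) -> float:
    return 186.18 * spindles + 10227

# A 1000-drive system would be expected to land near the line;
# treat this as a baseline, not a guarantee.
print(round(predicted_spc1_iops(1000)))
```

Given the widening variance at high drive counts, such an estimate is best read as "what an average system achieves", with node count and sophistication explaining the spread around it.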

Processing power matters

For instance, the three systems centered around 2000 drives are (from lowest to highest IOPS) a 4-node IBM SVC 5.1, a 6-node IBM SVC 5.1 and an 8-node HP 3PAR V800 storage system.  This tells us that given similar spindle counts, the more processing power (nodes) you throw at an IOPS workload, the better the system performs.

System sophistication can matter too

The other interesting facet on this chart comes from examining the three systems centered around 250K IOPS that span from ~1150 to ~1500 drives.

  • The 1156 drive system is the latest HDS VSP 8-VSD (virtual storage directors, or processing nodes) running with dynamically (thinly) provisioned volumes – which is the first and only SPC-1 submission using thin provisioning.
  • The 1280 drive system is a (now HP) 3PAR T800 8-node system.
  • The 1536 drive system is an IBM SVC 4.3 8-node storage system.

One would think that thin provisioning would degrade storage performance, and maybe it did, but without a non-dynamically provisioned HDS VSP benchmark to compare against, it’s hard to tell.  However, the fact that the HDS VSP performed as well as the other systems with a much lower drive count seems to tell us that thin provisioning uses hard drives more efficiently than fat provisioning, that the 8-VSD HDS VSP is more effective than the 8-node IBM SVC 4.3 and the 8-node (HP) 3PAR T800 systems, or perhaps some combination of these.

~~~~

The full SPC performance report went out to our newsletter subscribers last November.  [The one change to this chart from the full report is that the date in the chart’s title was wrong and is fixed here.]  A copy of the full report will be up on the dispatches page of our website sometime this month (if all goes well). However, you can get performance information now and subscribe to future newsletters to receive these reports even earlier by just sending us an email or using the signup form above right.

For a more extensive discussion of block or SAN storage performance covering SPC-1&-2 (top 30) and ESRP (top 20) results please consider purchasing our recently updated SAN Storage Buying Guide available on our website.

As always, we welcome any suggestions on how to improve our analysis of SPC results or any of our other storage system performance discussions.

Comments?

SPC-1 Results IOPs vs. Capacity – chart of the month

SPC-1* IOPS vs. Capacity, (c) 2010 Silverton Consulting, All Rights Reserved

This chart is from SCI’s report last month on recent Storage Performance Council (SPC) benchmark results. There were a couple of new entries this quarter, but we decided to introduce this new chart as well.

This is a bubble scatter plot of SPC-1(TM) (online transaction workload) results. Only storage subsystems that cost less than $100/GB are included, to introduce some fairness.

  • Bubble size is a function of the total cost of the subsystem
  • Horizontal axis is subsystem capacity in GB
  • Vertical axis is peak SPC-1 IOPS(TM)

We also decided to show a linear regression line and equation to better analyze the data. As shown in the chart, there is a pretty good correlation between capacity and IOPS (R**2 of ~0.8). The equation parameters can be read from the chart, and the fit looks pretty tight from a visual perspective.
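For readers who want to reproduce this kind of fit from published SPC-1 results, an ordinary least-squares line and its R**2 take only a few lines of code. A sketch (the capacity/IOPS pairs below are made-up placeholders, not actual SPC-1 submissions):

```python
def least_squares(xs, ys):
    """Ordinary least-squares fit of y = slope*x + intercept, plus R**2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R**2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r2 = 1 - ss_res / ss_tot
    return slope, intercept, r2

# Hypothetical (capacity GB, IOPS) pairs -- placeholders only
data = [(10_000, 25_000), (20_000, 52_000), (40_000, 98_000), (80_000, 205_000)]
slope, intercept, r2 = least_squares([c for c, _ in data], [i for _, i in data])
print(f"IOPS ~= {slope:.1f} * GB + {intercept:.0f}, R**2 = {r2:.3f}")
```

Feed in real (capacity, IOPS) pairs from SPC-1 full disclosure reports and you should land near the chart's equation, outliers like SSD systems aside.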

The one significant outlier here, at ~250K IOPS, is the TMS RamSan, which uses SSD technology. The two large bubbles at the top right were two IBM SVC 5.1 runs at similar backend capacity; the top SVC run had 6 nodes and the bottom run only 4.

As always, a number of caveats to this:

  • Not all subsystems on the market today are benchmarked with SPC-1
  • The pricing cap eliminated high priced storage from this analysis
  • IOPS may or may not be similar to your workloads.

Nevertheless, most storage professionals come to realize that having more disks often results in better performance. This is often confounded by the RAID type used, disk drive performance, and cache size. However, the nice thing about SPC-1 runs is that most (nearly all) use RAID 1, have the largest cache size that makes sense, and use the best performing disk drives (or SSDs). The conclusion could not be more certain – the more RAID 1 capacity one has, the higher the number of IOPS one can attain from a given subsystem.

The full SPC report went out to our newsletter subscribers last month and a copy of the report will be up on the dispatches page of our website later this month. However, you can get this information now and subscribe to future newsletters to receive future full reports even earlier; just email us at SubscribeNews@SilvertonConsulting.com?Subject=Subscribe_to_Newsletter.

As always, we welcome any suggestions on how to improve our analysis of SPC or any of our other storage system performance results. This new chart was a result of one such suggestion.

Latest SPC-2 results – chart of the month

SPC-2* benchmark results, spider chart for LFP, LDQ and VOD throughput

The latest SPC-2 (Storage Performance Council-2) benchmark results chart displays the top ten in aggregate MBPS(TM), broken down into Large File Processing (LFP), Large Database Query (LDQ) and Video On Demand (VOD) throughput results. One problem with this chart is that it really only shows four subsystems: HDS and their OEM partner HP; the IBM DS5300 and Sun 6780 w/8GFC at RAID 5&6, which appear to be the same OEMed subsystem; the IBM DS5300 and Sun 6780 w/4GFC at RAID 5&6, which also appear to be the same OEMed subsystem; and IBM SVC 4.2 (with IBM 4700s behind it).

What’s interesting about this chart is what’s going on at the top end. Both the HDS (#1&2) and IBM SVC (#3) seem to have found some secret sauce for performing better on the LDQ workload or, conversely, some dumbing down of the other two workloads (LFP and VOD). According to the SPC-2 specification:

  • LDQ is a workload consisting of 1024KiB and 64KiB transfers whereas the LFP consists of 1024KiB and 256KiB transfers and the VOD consists of only 256KiB, so transfer size doesn’t tell the whole story.
  • LDQ has a lower write proportion (1%) while attempting to look like joining two tables into one or scanning a data warehouse to create output; LFP has a read rate of 50% (R:W of 1:1) while executing a write-only phase, a read-write phase and a read-only phase; and VOD has a 100% read-only workload mimicking streaming video.
  • 50% of the LDQ workload uses 4 I/Os outstanding and the remainder 1 I/O outstanding. The LFP uses only 1 I/O outstanding and VOD uses only 8 I/Os outstanding.
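Pulling the three bullets together, the workload differences fit in one small table. A sketch that encodes them for side-by-side comparison (values transcribed from the details quoted above; treat this as a paraphrase of the spec, not a normative copy):

```python
# SPC-2 workload parameters as described above (paraphrased, not normative)
spc2_workloads = {
    #       transfer sizes (KiB)    read %    I/Os outstanding per phase
    "LFP": {"xfer_kib": (1024, 256), "read_pct": 50,  "outstanding": (1,)},
    "LDQ": {"xfer_kib": (1024, 64),  "read_pct": 99,  "outstanding": (4, 1)},
    "VOD": {"xfer_kib": (256,),      "read_pct": 100, "outstanding": (8,)},
}

for name, w in spc2_workloads.items():
    print(f"{name}: {w['read_pct']}% read, xfer sizes {w['xfer_kib']} KiB, "
          f"{w['outstanding']} I/Os outstanding")
```

Seen this way, no single column separates LDQ from the other two, which is exactly the puzzle discussed below.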

These seem to be the major differences between the three workloads. I would have to say that some sort of caching sophistication is evident in the HDS and SVC systems that is less present in the remaining systems. I was hoping to provide some guidance as to what that sophistication looked like, but:

  • I was going to say they must have a better sequential detection algorithm, but the VOD, LDQ and LFP workloads have 100%, 99% and 50% read ratios respectively, and sequential detection should perform better with VOD and LDQ than LFP. So that’s not all of it.
  • Next I was going to say it had something to do with outstanding I/O counts. But VOD has 8 I/Os outstanding and LFP only has 1, so if this were true VOD should perform better than LFP, while LDQ, having phases with 1 and 4 I/Os outstanding, should land somewhere in between. So that’s not all of it.
  • Next I was going to say stream (or file) size is an important differentiator, but “Segment Stream Size” for all workloads is 0.5GiB. So that doesn’t help.

So now I am at a complete loss to understand why LDQ throughput is so much better than LFP and VOD throughput for HDS and SVC.

I can only conclude that the little write activity (1%) thrown into the LDQ mix is enough to give the backend storage a breather and allow the subsystem to respond better to the other (99%) read activity. Why this would help the top performers so much more than the remaining systems is not entirely evident. But I would add that being able to handle lots of writes or lots of reads is relatively straightforward; handling an unbalanced mixture is harder to do well.

To validate this conjecture would take some effort. I thought it would be easy to understand what’s happening but as with most performance conundrums the deeper you look the more confounding the results often seem to be.

The full report on the latest SPC results will be up on my website later this year but if you want to get this information earlier and receive your own copy of our newsletter – email me at SubscribeNews@SilvertonConsulting.com?Subject=Subscribe_to_Newsletter.

I will be taking the rest of the week off so Happy Holidays to all my readers and a special thanks to all my commenters. See you next week.

5 Reasons to Virtualize Storage

Storage virtualization has been out for at least 5 years now, and one can see more and more vendors offering products in this space. I have written before about storage virtualization in my “Virtualization: Tales from the Trenches” article, and I would say little has changed since then, but it’s time for a refresher.

Storage virtualization differs from file or server virtualization by focusing only on the FC storage domain. Unlike server virtualization, there is no need to change host operating environments to support most storage virtualization products.

As an aside, there may be some requirement for iSCSI storage virtualization, but to date I haven’t seen much emphasis on this. Some of the products listed below may support iSCSI frontends for FC backend storage subsystems, but I am unaware of any that can support an FC or iSCSI frontend for iSCSI backend storage.

I can think of at least the following storage virtualization products – EMC Invista, FalconStor IPStor, HDS USP-V, IBM SVC, and NetApp ONTAP. There are more than just these, but they have the lion’s share of installations. Most of these products offer similar capabilities:

  1. Ability to non-disruptively migrate data from one storage subsystem to another. This can help ease technology obsolescence by migrating data online from an old subsystem to a new one. There are some tools and/or services on the market that can help automate this process, but storage virtualization trumps them all in that it can help with tech refresh as well as provide other services.
  2. Ability to better support multiple storage tiers by migrating data from one storage tier to another. Non-disruptive data migration can also ease implementation of multiple storage tiers such as slow/high capacity disk, fast/low capacity disk and SSD storage within one storage environment. Some high end subsystems can do this with multiple storage tiers within one subsystem, but only storage virtualization can do this across storage subsystems.
  3. Ability to aggregate heterogeneous storage subsystems under one storage management environment. The other major characteristic of most storage virtualization products is that they support multiple vendor storage subsystems under one storage cluster. This can be very valuable in multi-vendor shops by providing a single management interface to provision and administer all storage under a single storage virtualization environment.
  4. Ability to scale out rather than just scale up storage performance. By aggregating storage subsystems into a single storage cluster, one can add storage performance by simply adding more storage virtualization cluster nodes. Not every storage virtualization system supports multiple cluster nodes, but those that do offer another dimension to storage subsystem performance.
  5. Ability to apply high-end functionality to low-end storage. This takes many forms not the least of which is sophisticated caching, point-in-time copies and data replication or mirroring capabilities typically found only in higher end storage subsystems. Such capabilities can be supplied to any and all storage underneath the storage virtualization environment and can make storage much easier to use effectively.
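To make capability #2 concrete, tier placement usually boils down to a threshold decision on I/O density (IOPS per GB). A toy sketch of such a policy (the thresholds and tier names here are invented for illustration; real products use far richer heuristics and move data at sub-volume granularity):

```python
def pick_tier(iops: float, capacity_gb: float) -> str:
    """Place a volume on a storage tier by its I/O density (IOPS per GB).
    Thresholds are invented for illustration only."""
    density = iops / capacity_gb
    if density >= 1.0:   # hot data: lots of IOPS per GB of capacity
        return "SSD"
    if density >= 0.1:   # warm data
        return "fast/low-capacity disk"
    return "slow/high-capacity disk"  # cold data

print(pick_tier(5_000, 1_000))   # busy database volume
print(pick_tier(50, 10_000))     # archive volume
```

The value of doing this in the virtualization layer rather than inside one array is that the tiers can live on entirely different subsystems, from different vendors.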

There are potential downsides to storage virtualization as well, not the least of which is lock-in, but this may be somewhat of a red herring. Most storage virtualization products make it easy to migrate storage into the virtualization environment. Some of these products also make it relatively easy to migrate storage out of their environment. This is more complex because data that was once on that storage could be almost anywhere in the current virtualized storage subsystems and would need to be reconstituted back in one piece on the storage being exported.

The other reason for lock-in is that the functionality provided by storage virtualization makes it harder to remove. But it would probably be more correct to say “once you virtualize storage you never want to go back”. Many customers I talk with who have had a good initial experience with storage virtualization want to do it again whenever given the chance.