Every month (or so) we do a more detailed analysis of a chart that appears in our free monthly newsletter. This one was done earlier in the year and documented the correlation between IOPS and drive counts in SPC-1 results.
Lower is better on this chart. I can’t remember the last time we showed this Top 10 $/IOPS™ chart from the Storage Performance Council SPC-1 benchmark. Recall that we prefer our IOPS/$/GB metric, which factors in subsystem size, but this past quarter two new submissions ranked well on $/IOPS. The two new systems were the all-SSD Huawei Symantec Oceanspace™ Dorado2100 (#2) and the latest Fujitsu ETERNUS DX80 S2 (#7) storage subsystems.
Most of the winners on $/IOPS are SSD systems (#1-5 and #10), and most of these were all-SSD storage systems. These systems normally achieve better $/IOPS by hitting high IOPS™ rates for the cost of their storage. But they often submit relatively small systems to SPC-1, reducing system cost and helping them place better on $/IOPS.
On the other hand, some disk-only storage systems do well by abandoning any form of protection, as with the two Sun J4400 (#6) and J4200 (#8) storage systems, which used RAID 0 but also had smaller capacities, coming in at 2.2TB and 1.2TB, respectively.
The other two disk-only storage systems here, the Fujitsu ETERNUS DX80 S2 (#7) and the Huawei Symantec Oceanspace S2600 (#9), also had relatively small capacities, at 9.7TB and 2.9TB respectively.
The ETERNUS DX80 S2 achieved ~35K IOPS at a cost of under $80K, generating a $2.25 $/IOPS. Of course, the all-SSD systems blow that away; for example, the Oceanspace Dorado2100 (#2), an all-SSD system, hit ~100K IOPS but cost nearly $90K, for a $0.90 $/IOPS.
Moreover, the largest-capacity system here, with 23.7TB of storage, was the Oracle Sun ZFS (#10) hybrid SSD-and-disk system, which generated ~137K IOPS at a cost of ~$410K, hitting just under $3.00 $/IOPS.
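The $/IOPS arithmetic here is simply the total tested system price divided by SPC-1 IOPS. A quick sketch using the approximate prices and IOPS rates quoted above (all figures rounded, so the computed ratios are approximate):

```python
# SPC-1 $/IOPS = total tested system price / SPC-1 IOPS rate.
# Prices and IOPS are the approximate (rounded) figures quoted in this post.
systems = {
    "Fujitsu ETERNUS DX80 S2":      {"price": 79_000,  "iops": 35_000},
    "Huawei Oceanspace Dorado2100": {"price": 90_000,  "iops": 100_000},
    "Oracle Sun ZFS hybrid":        {"price": 410_000, "iops": 137_000},
}

for name, s in systems.items():
    print(f"{name}: ${s['price'] / s['iops']:.2f}/IOPS")
```

Rounding aside, this reproduces the rankings discussed above: the all-SSD Dorado2100 comes in at $0.90/IOPS while the big hybrid system lands just under $3.00/IOPS despite posting the highest raw IOPS.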
We still prefer our own metric for economical performance, but each metric has its flaws. The SPC-1 $/IOPS metric is dominated by SSD systems and our IOPS/$/GB metric is dominated by disk-only systems. There is probably some better way to measure the cost of performance, but I have yet to see it.
The full SPC performance report went out in SCI’s February newsletter. A copy of the full report will be posted on our dispatches page sometime next month (if all goes well). However, you can get the full SPC performance analysis now, and subscribe to future free newsletters, by just sending us an email or using the signup form above right.
For a more extensive discussion of current SAN or block storage performance covering SPC-1 (top 30), SPC-2 (top 30) and ESRP (top 20) results please see SCI’s SAN Storage Buying Guide available on our website.
As always, we welcome any suggestions or comments on how to improve our analysis of SPC results or any of our other storage performance analyses.
[As promised, I am trying to get up to date on my performance charts from our monthly newsletters. This one brings us current through November.]
The above chart plots Storage Performance Council SPC-1 IOPS against spindle count. On this chart, we have eliminated any SSD systems, systems with drives smaller than 140 GB and any systems with multiple drive sizes.
Alas, the coefficient of determination (R**2) of 0.96 tells us that SPC-1 IOPS performance is mainly driven by drive count. But what’s more interesting here is that as drive counts get higher than, say, 1000, the variance around the linear regression line widens, implying that system sophistication starts to matter more.
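A least-squares fit like the one behind this chart can be sketched in a few lines. The drive counts and IOPS figures below are illustrative stand-ins, not the actual SPC-1 submissions:

```python
# Minimal ordinary-least-squares sketch of the IOPS-vs-drive-count regression.
# The data points are made up for illustration; real figures come from the
# published SPC-1 results.
def linear_fit(xs, ys):
    """Fit y = a*x + b and return (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

drives = [100, 200, 400, 800, 1156, 1280, 1536, 2000]
iops   = [20e3, 45e3, 85e3, 160e3, 250e3, 255e3, 250e3, 310e3]  # illustrative
slope, intercept, r2 = linear_fit(drives, iops)
print(f"~{slope:.0f} IOPS per added drive, R^2 = {r2:.2f}")
```

Note how the three made-up points above 1000 drives sit at nearly the same IOPS level; that is exactly the widening variance at high drive counts that pulls R**2 down from a perfect fit.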
Processing power matters
For instance, if you look at the three systems centered around 2000 drives, they are (from lowest to highest IOPS) a 4-node IBM SVC 5.1, a 6-node IBM SVC 5.1, and an 8-node HP 3PAR V800 storage system. This tells us that the more processing (nodes) you throw at an IOPS workload, given similar spindle counts, the better it performs.
System sophistication can matter too
The other interesting facet on this chart comes from examining the three systems centered around 250K IOPS that span from ~1150 to ~1500 drives.
The 1156 drive system is the latest HDS VSP 8-VSD (virtual storage directors, or processing nodes) running with dynamically (thinly) provisioned volumes – which is the first and only SPC-1 submission using thin provisioning.
The 1280 drive system is a (now HP) 3PAR T800 8-node system.
The 1536 drive system is an IBM SVC 4.3 8-node storage system.
One would think that thin provisioning would degrade storage performance, and maybe it did, but without a non-dynamically provisioned HDS VSP benchmark to compare against, it’s hard to tell. However, the fact that the HDS VSP performed as well as the other systems did with a much lower drive count seems to tell us that thin provisioning potentially uses hard drives more efficiently than fat provisioning, that the 8-VSD HDS VSP is more effective than the 8-node IBM SVC 4.3 and the 8-node (HP) 3PAR T800 systems, or perhaps some combination of these.
The full SPC performance report went out to our newsletter subscribers last November. [The one change to this chart from the full report is that the date in the chart’s title was wrong and is fixed here.] A copy of the full report will be up on the dispatches page of our website sometime this month (if all goes well). However, you can get performance information now, and subscribe to future newsletters to receive these reports even earlier, by just sending us an email or using the signup form above right.
For a more extensive discussion of block or SAN storage performance covering SPC-1&-2 (top 30) and ESRP (top 20) results please consider purchasing our recently updated SAN Storage Buying Guide available on our website.
As always, we welcome any suggestions on how to improve our analysis of SPC results or any of our other storage system performance discussions.
The above chart is from our May Storage Intelligence newsletter dispatch on system performance and shows the latest Storage Performance Council SPC-1 benchmark results in a scatter plot, with IO/sec [or IOPS™] on the vertical axis and number of disk drives on the horizontal axis. We have tried to remove all results that used NAND flash as a cache or SSDs. Also, this displays only results below $100/GB.
One negative view of benchmarks such as SPC-1 is that published results are almost entirely due to the hardware thrown at them, in this case the number of disk drives (or SSDs) in the system configuration. An R**2 of 0.93 shows a pretty good correlation of IOPS performance against disk drive count and would seem to bear this view out, but that is an incorrect interpretation of the results.
Just look at the wide variation beyond the 500-drive count versus below it, where there are only a few outliers and a much narrower variance. As such, we would have to say that at some point (below 500 drives), most storage systems seem to attain a reasonable rate of IOPS as a function of the number of spindles present, but after that point the relationship starts to break down. There are certainly storage systems above the 500-drive level that perform much better than average for their drive configuration and some that perform worse.
For example, consider the triangle formed by the three best-performing (IOPS) results on this chart. The one at ~300K IOPS with ~1150 disk drives is from Huawei Symantec and is their 8-node Oceanspace S8100 storage system, whereas the other system with similar IOPS performance, at ~315K IOPS, used ~2050 disk drives and is a 4-node IBM SVC (5.1) system with DS8700 backend storage. In contrast, the highest performer on this chart, at ~380K IOPS, also had ~2050 disk drives and is a 6-node IBM SVC (5.1) with DS8700 backend storage.
Given the above analysis there seems to be much more to system performance than merely disk drive count, at least at the over 500 disk count level.
The full performance dispatch will be up on our website after the middle of next month, but if you are interested in viewing this today, please sign up for our free monthly newsletter (see subscription request, above right) or subscribe by email and we’ll send you the current issue. If you need a more in-depth analysis of SAN storage performance, please consider purchasing SCI’s SAN Storage Briefing.
As always, we welcome all constructive suggestions on how to improve any of our storage performance analyses.
Other cluster-oriented systems here include all the IBM SVC submissions (#1, 2, 6, & 7) as well as the (now HP) 3PAR system coming in at number 9. One could probably argue that the IBM Power 595 w/SSDs should also be considered a clustered system, but it really only had one server (with 96 cores on it, though) with SAS-connected SSDs behind it.
It’s somewhat surprising not to see better performance from using SSDs on this chart, the only SSD systems being the IBM Power 595 and the two TMS systems. It’s apparent from this data that one can obtain superior performance just by using lots of disk drives, at least for SPC-1 IOPS.
The full performance dispatch will be up on our website after month end, but if you are interested in seeing it sooner, sign up for our free monthly newsletter (see subscription request, above right) or subscribe by email and we will send the current issue along with download instructions for this and other reports. If you need an even more in-depth analysis of SAN storage performance, please consider purchasing SCI’s SAN Storage Briefing, also available from our website.
As always, we welcome any constructive suggestions on how to improve any of our storage performance analyses.
Since our last blog post on this subject there have been six new entries in the LRT Top 10 (#3-6 & #9-10). As can be seen here, which combines SPC-1 and SPC-1/E results, response times vary considerably. Seven of these top 10 LRT results come from subsystems which either have all SSDs (#1-4, 7 & 9) or have a large NAND cache (#5). The newest members on this chart were the NetApp FAS3270A and the Xiotech Emprise 5000 (300GB disk drives), which were published recently.
The NetApp FAS3270A, a mid-range subsystem with 1TB of NAND cache (512GB in each controller), seemed to do pretty well here, with some all-SSD systems doing better than it and a pair of all-SSD systems doing worse. Coming in under 1 msec LRT is no small feat. We are certain the NAND cache helped NetApp achieve their superior responsiveness.
What the Xiotech Emprise 5000 (300GB disk drives) storage subsystem is doing here is another question. They have always done well on an IOPS/drive basis (see SPC-1 & 1/E results IOPS/Drive – chart of the month), but being top ten in LRT had not previously been their forte. How one coaxes a 1.47 msec LRT out of a 20-drive system that costs only ~$41K, 12X lower than the median price (~$509K) of the other subsystems here, is a mystery. Of course, they were using RAID 1, but so were half of the subsystems on this chart.
The full performance dispatch will be up on our website in a couple of weeks but if you are interested in seeing it sooner just sign up for our free monthly newsletter (see upper right) or subscribe by email and we will send you the current issue with download instructions for this and other reports.
As always, we welcome any constructive suggestions on how to improve our storage performance analysis.
Lost in much of the discussions on storage system performance is the need for both throughput and response time measurements.
By IO throughput I generally mean data transfer speed in megabytes per second (MB/s or MBPS), however another definition of throughput is IO operations per second (IO/s or IOPS). I prefer the MB/s designation for storage system throughput because it’s very complementary with respect to response time whereas IO/s can often be confounded with response time. Nevertheless, both metrics qualify as storage system throughput.
By IO response time I mean the time it takes a storage system to perform an IO operation from start to finish, usually measured in milliseconds, although lately some subsystems have dropped below the 1 msec threshold. (See last year’s post on SPC LRT results for information on some top response time results.)
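The two throughput definitions are tied together by average transfer size (MB/s ≈ IOPS × IO size). A tiny sketch with made-up workload numbers shows why small-block and large-block workloads report such different MB/s figures:

```python
# The two throughput definitions are linked by average transfer size:
#   MB/s ~= IOPS * (KB per IO) / 1024
# Workload numbers below are illustrative, not from any published benchmark.
def mb_per_sec(iops, kb_per_io):
    return iops * kb_per_io / 1024

# A small-block OLTP-style workload and a large-block sequential one can
# post very different MB/s despite both being "high throughput":
print(mb_per_sec(50_000, 4))    # 4KB random IOs   -> ~195 MB/s
print(mb_per_sec(2_000, 512))   # 512KB sequential -> ~1000 MB/s
```

This is exactly why IOPS numbers confound with response time and transfer size while MB/s stands on its own: 50K small-block IOPS moves far less data than 2K large-block IOPS.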
Benchmark measurements of response time and throughput
Both Standard Performance Evaluation Corporation’s SPECsfs2008 and Storage Performance Council’s SPC-1 provide response time measurements although they measure substantially different quantities. The problem with SPECsfs2008’s measurement of ORT (overall response time) is that it’s calculated as a mean across the whole benchmark run rather than a strict measurement of least response time at low file request rates. I believe any response time metric should measure the minimum response time achievable from a storage system although I can understand SPECsfs2008’s point of view.
On the other hand, SPC-1’s measurement of LRT (least response time) is just what I would like to see in a response time measurement. SPC-1 provides the time it takes to complete an IO operation at very low request rates.
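The ORT-vs-LRT distinction can be sketched with a toy set of load points. The latencies below are made up, and SPECsfs2008’s actual ORT is an operations-weighted mean, simplified here to an unweighted one:

```python
# Sketch of the difference between SPECsfs2008-style ORT (a mean over the
# whole benchmark run, simplified here to an unweighted mean) and
# SPC-1-style LRT (response time at the lowest request rate).
# Load points and latencies are made up for illustration.
load_points = [
    # (request rate as % of max, avg response time in msec)
    (10, 1.2),
    (50, 2.5),
    (80, 4.8),
    (100, 9.5),
]

lrt = load_points[0][1]                                     # lowest-load latency
ort = sum(rt for _, rt in load_points) / len(load_points)   # mean across the run

print(f"LRT = {lrt} msec, ORT = {ort} msec")
```

The same system reports a much better LRT than ORT because high-load points, where queuing dominates, drag the run-wide mean up; that is why the two metrics measure substantially different quantities.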
In regards to throughput, once again SPECsfs2008’s measurement leaves something to be desired, as it’s strictly a measurement of NFS or CIFS operations per second. This includes a number (>40%) of non-data-transfer requests as well as data transfers, so it confounds any measurement of how much data can be transferred per second. But, from their perspective, a file system needs to do more than just read and write data, which is why they mix these other requests in with their measurement of NAS throughput.
The Storage Performance Council’s SPC-1 reports throughput results as IOPS and provides no direct measure of MB/s; for that one must look to SPC-2 benchmark results. SPC-2 reports a direct measure of MBPS, which is an average over three different data-intensive workloads: large file access, video-on-demand, and a large database query workload.
Why response time and throughput matter
Historically, we used to say that OLTP (online transaction processing) performance was entirely dependent on response time: the better the storage system’s response time, the better your OLTP systems performed. Nowadays it’s a bit more complex, as some of today’s database queries can depend as much on sequential database transfers (or throughput) as on individual IO response time. Nonetheless, I feel that there is still a large class of response-time-critical workloads out there that perform much better with shorter response times.
On the other hand, high throughput has its growing gaggle of adherents as well. When it comes to high sequential data transfer workloads such as data warehouse queries, video or audio editing/download or large file data transfers, throughput as measured by MB/s reigns supreme – higher MB/s can lead to much faster workloads.
The only question that remains is who needs higher throughput as measured by IO/s rather than MB/s. I would contend that mixed workloads, which contain random as well as sequential IOs and typically smaller data transfers, can benefit from high-IO/s storage systems. The only confounding matter is that these workloads obviously benefit from better response times as well. That’s why throughput as measured by IO/s is a much more difficult number to interpret than any pure MB/s number.
Now, there is a contingent of performance gurus today who believe that IO response times no longer matter. In fact, if one looks at SPC-1 results, it takes some effort to find the LRT measurement; it’s not included in the summary report.
Also, in the post mentioned above there appears to be a definite bifurcation of storage subsystems with respect to response time, i.e., some subsystems are focused on response time while others are not. I would have liked to see some more of the top enterprise storage subsystems represented in the top LRT subsystems but alas, they are missing.
Call me old fashioned but I feel that response time represents a very important and orthogonal performance measure with respect to throughput of any storage subsystem and as such, should be much more widely disseminated than it is today.
For example, there is a substantive difference between a fighter jet’s or race car’s top speed and its maneuverability. I would compare top speed to storage throughput and maneuverability to IO response time. Perhaps this doesn’t matter as much for a jet liner or family car, but it can matter a lot in the right domain.
Now, do you want your storage subsystem to be a jet fighter or a jet liner? You decide.
There were not a lot of Storage Performance Council (SPC) benchmark submissions this past quarter, just a new SPC-1/E from HP StorageWorks on their 6400 EVA with SSDs and a new SPC-1 run for the Oracle Sun StorageTek 6780. Recall that SPC-1/E executes all the same tests as SPC-1 but adds more testing with power monitoring equipment attached, to measure power consumption at a number of performance levels.
With this chart we take another look at storage energy consumption (see my previous discussion on SSD vs. drive energy use). As shown above, we graph the IOPS/watt for three different performance environments: Nominal, Medium, and High, as defined by SPC. These are contrived storage usage workloads designed to measure the variability in power consumed by a subsystem. SPC defines the workloads as follows:
Nominal usage is 16 hours of idle time and 8 hours of moderate activity
Medium usage is 6 hours of idle time, 14 hours of moderate activity, and 4 hours of heavy activity
High usage is 0 hours of idle time, 6 hours of moderate activity and 18 hours of heavy activity
As for activity, SPC defines moderate activity at 50% of the subsystem’s maximum SPC-1 reported performance and heavy activity is at 80% of its maximum performance.
With that behind us, now on to the chart. The HP 6400 EVA had eight 73GB SSDs for storage, while the two Xiotech submissions had 146GB/15Krpm and 600GB/15Krpm drives with no flash. As expected, the HP SSD subsystem delivered considerably more IOPS/watt at the high-usage workload: ~2X the Xiotech with 600GB drives and ~2.3X the Xiotech with 146GB drives. The multipliers were slightly less for medium usage but still substantial nonetheless.
SSD nominal usage power consumption
However, the nominal usage result bears some explanation. Here both Xiotech subsystems beat out the HP EVA SSD subsystem at nominal usage, with the 600GB-drive Xiotech box supporting ~1.3X the IOPS/watt of the HP SSD system. How can this be? SSD idle power consumption is the culprit.
The HP EVA SSD subsystem consumed ~463.1W at idle, while the Xiotech 600GB drive subsystem consumed only ~23.5W and the Xiotech 146GB drive subsystem consumed ~23.4W. I would guess that the drives, and perhaps the Xiotech subsystem itself, have considerable power-saving algorithms that shed power when idle. For whatever reason, the SSDs and the HP EVA don’t seem to have anything like this. So nominal usage, with 16 hours of idle time, penalizes the HP EVA SSD system, resulting in the poor IOPS/watt for nominal usage shown above.
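A simplified back-of-the-envelope model shows how idle power dominates the nominal-usage day (16 hours idle plus 8 hours moderate activity). The idle wattages are the figures quoted above; the moderate-activity wattages are assumptions for illustration, not SPC-1/E data:

```python
# Simplified sketch of why idle power dominates the nominal-usage result.
# Idle wattages are the figures quoted above; the moderate-activity wattages
# are assumptions for illustration, not SPC-1/E measurements.
def nominal_avg_watts(idle_w, moderate_w, idle_hours=16, moderate_hours=8):
    """Average power over SPC's nominal day: 16h idle + 8h moderate activity."""
    total_wh = idle_w * idle_hours + moderate_w * moderate_hours
    return total_wh / (idle_hours + moderate_hours)

hp_ssd  = nominal_avg_watts(idle_w=463.1, moderate_w=500.0)  # moderate_w assumed
xiotech = nominal_avg_watts(idle_w=23.5,  moderate_w=350.0)  # moderate_w assumed

print(f"HP EVA SSD avg: {hp_ssd:.1f}W, Xiotech 600GB avg: {xiotech:.1f}W")
```

Under these assumptions the SSD box averages roughly 3.6X the power of the disk box over the nominal day, so even if it delivered about twice the IOPS during its 8 active hours, that ratio would erase its IOPS/watt advantage at nominal usage.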
Ray’s reading: SSDs are not meant to be idled a lot, and disk drives, especially the ones Xiotech is using, have very sophisticated power management that maybe SSD vendors and/or HP should take a look at adopting.
The full SPC performance report will go up on SCI’s website next month in our dispatches directory. However, if you are interested in receiving this sooner, just subscribe by email to our free newsletter and we will send you the current issue with download instructions for this and other reports.
As always, we welcome any suggestions on how to improve our analysis of SPC performance information so please comment here or drop us a line.