The chart shown here reflects information from a SCI StorInt(tm) dispatch on the latest Storage Performance Council benchmark performance results and depicts the top IO operations done per second per installed drive for SPC-1 and SPC-1/E submissions. This particular storage performance metric is one of the harder ones to game. For example, adding more drives to perform better does nothing for this view.
The recent SPC-1 submissions were from Huawei Symantec’s Oceanspace S2600 and S5600, Fujitsu’s Eternus DX400 and DX8400, and the latest IBM DS8700 with EasyTier, SSD and SATA drives. Of these results, the only one to show up on this chart was the low-end Huawei Symantec S2600. It used only 48 drives and attained ~17K IOPS as measured by SPC-1.
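As a rough illustration of the metric, here is a minimal Python sketch that divides SPC-1 IOPS by installed drive count, using the approximate S2600 figures quoted above. The function name `iops_per_drive` is ours, not anything from the SPC reports, and the inputs are rounded.

```python
def iops_per_drive(total_iops: float, drive_count: int) -> float:
    """Return benchmark IOPS divided by the number of installed drives."""
    if drive_count <= 0:
        raise ValueError("drive_count must be positive")
    return total_iops / drive_count


# Approximate figures quoted in the text: ~17K IOPS on 48 drives.
s2600 = iops_per_drive(17_000, 48)
print(f"S2600: {s2600:.0f} IOPS/drive")  # prints "S2600: 354 IOPS/drive"

# Hypothetical: doubling both the drives and the IOPS leaves the metric
# unchanged, which is why simply adding drives does not help on this chart.
doubled = iops_per_drive(34_000, 96)
print(f"Doubled config: {doubled:.0f} IOPS/drive")
```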
Other changes to this chart included the addition of Xiotech’s Emprise 5000 SPC-1/E runs with both 146GB and 600GB drives. We added the SPC-1/E results because they execute the exact same set of tests and generate the same performance summaries.
It’s very surprising to see the first use of 600GB drives in an SPC-1/E benchmark show up well here, and the very respectable #2 result from the 146GB drive version indicates an excellent performance yield per drive. The only other non-146GB drive result was for the Fujitsu DX80, which used 300GB drives.
Also, as readers of our storage performance dispatches may recall, the Sun (now Oracle) J4400 array provided no RAID support for its benchmark run. We view this as an unusable configuration, although its advantages vis a vis IOPS/drive are probably debatable.
A couple of other caveats to this comparison:
We do not include pure SSD configurations as they would easily dominate this metric.
We do not include benchmarks that use 73GB drives as they would offer a slight advantage and such small drives are difficult to purchase nowadays.
We are somewhat in a quandary about showing mixed drive (capacity) configurations. In fact, an earlier version of this chart without the two Xiotech SPC-1/E results showed the IBM DS8700 EasyTier configuration with SSDs and rotating SATA disks. In that version the DS8700 came in at a rough tie with the then 7th place Fujitsu ETERNUS2000 subsystem. For the time being, we have decided not to include mixed drive configurations in this comparison but would welcome any feedback on this decision.
As always, we appreciate any comments on our performance analysis. Also if you are interested in receiving your own free copy of our newsletter with the full SPC performance report in it please subscribe to our newsletter. The full report will be made available on the dispatches section of our website in a couple of weeks.
This chart is from SCI’s report last month on recent Storage Performance Council (SPC) benchmark results. There were a couple of new entries this quarter but we decided to introduce this new chart as well.
This is a bubble scatter plot of SPC-1(TM) (online transaction workloads) results. Only storage subsystems that cost less than $100/GB are shown, in an attempt to introduce some fairness.
Bubble size is a function of the total cost of the subsystem
Horizontal axis is subsystem capacity in GB
Vertical axis is peak SPC-1 IOPS(TM)
Also we decided to show a linear regression line and equation to better analyze the data. As shown in the chart there is a pretty good correlation between capacity and IOPS (R**2 of ~0.8). The equation parameters can be read from the chart but it seems pretty tight from a visual perspective.
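For readers who want to reproduce this kind of analysis on their own data, here is a minimal Python sketch of the $/GB filter followed by an ordinary least-squares fit and R**2 calculation. The rows below are made-up placeholders, not real SPC-1 results, and the names `linear_fit`, `subsystems` and `PRICE_CAP_PER_GB` are ours. We also assume the price cap is simply total cost divided by capacity.

```python
from statistics import mean


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, plus R**2."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Hypothetical (capacity_gb, peak_iops, total_cost_usd) rows, NOT real results.
subsystems = [
    (10_000, 30_000, 200_000),
    (20_000, 52_000, 450_000),
    (40_000, 95_000, 900_000),
    (5_000, 60_000, 800_000),  # $160/GB, so the price cap drops it
]

PRICE_CAP_PER_GB = 100.0  # the cap used for the chart
kept = [(cap, iops) for cap, iops, cost in subsystems
        if cost / cap < PRICE_CAP_PER_GB]

slope, intercept, r2 = linear_fit([c for c, _ in kept], [i for _, i in kept])
print(f"IOPS ~= {slope:.3f} * GB + {intercept:.0f}  (R**2 = {r2:.2f})")
```

An R**2 near 1 means capacity explains most of the variation in IOPS; the ~0.8 reported above still leaves room for outliers.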
The one significant outlier here at ~250K IOPS is TMS RAMSAN which uses SSD technology. The two large bubbles at the top right were two IBM SVC 5.1 runs at similar backend capacity. The top SVC run had 6 nodes and the bottom SVC run only had 4.
As always, a number of caveats to this:
Not all subsystems on the market today are benchmarked with SPC-1
The pricing cap eliminated high priced storage from this analysis
IOPS may or may not be similar to your workloads.
Nevertheless, most storage professionals come to realize that having more disks can often result in better performance. This is often confounded by the RAID type used, disk drive performance, and cache size. However, the nice thing about SPC-1 runs is that most (nearly all) use RAID 1, have the largest cache size that makes sense, and use the best performing disk drives (or SSDs). The conclusion cannot be more certain – the more RAID 1 capacity one has, the higher the number of IOPS one can attain from a given subsystem.
Latest SPC-2 (Storage Performance Council-2) benchmark results chart displaying the top ten in aggregate MBPS(TM) broken down into Large File Processing (LFP), Large Database Query (LDQ) and Video On Demand (VOD) throughput results. One problem with this chart is that it really only shows 4 subsystems: HDS and their OEM partner HP; IBM DS5300 and Sun 6780 w/8GFC at RAID 5&6 appear to be the same OEMed subsystem; IBM DS5300 and Sun 6780 w/ 4GFC at RAID 5&6 also appear to be the same OEMed subsystem; and IBM SVC4.2 (with IBM 4700’s behind it).
What’s interesting about this chart is what’s going on at the top end. Both the HDS (#1&2) and IBM SVC (#3) seem to have found some secret sauce for performing better on the LDQ workload or conversely some dumbing down of the other two workloads (LFP and VOD). According to the SPC-2 specification:
LDQ is a workload consisting of 1024KiB and 64KiB transfers whereas the LFP consists of 1024KiB and 256KiB transfers and the VOD consists of only 256KiB, so transfer size doesn’t tell the whole story.
LDQ seems to have a low write proportion (1%) while attempting to look like joining two tables into one or scanning a data warehouse to create output, whereas LFP has a read rate of 50% (R:W of 1:1) while executing a write-only phase, a read-write phase and a read-only phase, and VOD apparently has a 100% read-only workload mimicking streaming video.
50% of the LDQ workload uses 4 I/Os outstanding and the remainder 1 I/O outstanding. The LFP uses only 1 I/O outstanding and VOD uses only 8 I/Os outstanding.
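The three workload descriptions above can be put side by side in a small Python structure, which makes the comparisons easier to eyeball. This is only a sketch of the parameters as summarized in this post; the class and field names are ours, and the SPC-2 specification remains the authority on the actual values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spc2Workload:
    name: str
    transfer_sizes_kib: tuple  # transfer sizes used, in KiB
    read_pct: int              # percentage of I/O that is reads
    outstanding_ios: tuple     # outstanding I/O counts used


# Values as described in the list above.
WORKLOADS = [
    Spc2Workload("LDQ", (1024, 64), 99, (4, 1)),
    Spc2Workload("LFP", (1024, 256), 50, (1,)),
    Spc2Workload("VOD", (256,), 100, (8,)),
]

# Rank by read percentage, highest first.
by_read = sorted(WORKLOADS, key=lambda w: w.read_pct, reverse=True)
for w in by_read:
    print(f"{w.name}: {w.read_pct}% reads, "
          f"transfers {w.transfer_sizes_kib} KiB, "
          f"outstanding I/Os {w.outstanding_ios}")
```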
These seem to be the major differences between the three workloads. I would have to say that some sort of caching sophistication is evident in the HDS and SVC systems that is less present in the remaining systems. I was hoping to provide some sort of guidance as to what that sophistication looked like, but:
I was going to say they must have a better sequential detection algorithm, but the VOD, LDQ and LFP workloads have 100%, 99% and 50% read ratios respectively, and sequential detection should perform better with VOD and LDQ than LFP. So that’s not all of it.
Next I was going to say it had something to do with I/O outstanding counts. But VOD has 8 I/Os outstanding and the LFP only has 1, so if this were true VOD should perform better than LFP, while LDQ, having two sets of phases with 1 and 4 I/Os outstanding, should have results somewhere in between these two. So that’s not all of it.
Next I was going to say stream (or file) size is an important differentiator but “Segment Stream Size” for all workloads is 0.5GiB. So that doesn’t help.
So now I am at a complete loss to understand why the LDQ workload throughputs are so much better than the LFP and VOD workload throughputs for HDS and SVC.
I can only conclude that the little write activity (1%) thrown into the LDQ mix is enough to give the backend storage a breather and allow the subsystem to respond better to the other (99%) read activity. Why this would be so much better for the top performers than the remaining results is not entirely evident. But I would add that being able to handle lots of writes or lots of reads is relatively straightforward, whereas handling an unbalanced mixture is harder to do well.
To validate this conjecture would take some effort. I thought it would be easy to understand what’s happening but as with most performance conundrums the deeper you look the more confounding the results often seem to be.
The above chart shows the top 12 LRT(tm) (least response time) results for Storage Performance Council’s SPC-1 benchmark. The vertical axis is the LRT in milliseconds (msec.) for the top benchmark runs. As can be seen, the two subsystems from TMS (RamSan400 and RamSan320) dominate this category with LRTs significantly less than 2.5msec. IBM DS8300 and its turbo cousin come in next, followed by a slew of others.
The 1msec. barrier
Aside from the blistering LRT from the TMS systems, one significant item in the chart above is that the two IBM DS8300 systems crack the <1msec. barrier using rotating media. I didn’t think I would ever see the day; of course, this happened 3 or more years ago. Still, it’s kind of interesting that there haven’t been more vendors with subsystems that can achieve this.
LRT is probably most useful for high cache hit workloads. For these workloads the data comes directly out of cache and the only thing between a server and its data is subsystem IO overhead, measured here as LRT.
Encryption cheap and fast?
The other interesting tidbit from the chart is that the DS5300 with full drive encryption (FDE) (drives which I believe come from Seagate) cracks into the top 12 at 1.8msec, exactly equivalent to the IBM DS5300 without FDE. Now FDE from Seagate is a hardware drive encryption capability and its overhead might not be measurable at a subsystem level. Nonetheless, it shows that having data security need not reduce performance.
What is not shown in the above chart is that adding FDE to the base subsystem only cost an additional US$10K (base DS5300 listed at US$722K and FDE version at US$732K). Seems like a small price to pay for data security, which in this case is simply a matter of turning it on, generating keys, and forgetting it.
FDE is a hard drive feature where the drive itself encrypts all data written to the drive and decrypts all data read from it, and requires a subsystem-supplied drive key at power on/reset. In this way the data is never in plaintext on the drive itself. If the drive were taken out of the subsystem and attached to a drive tester, all one would see is ciphertext. Similar capabilities have been available in enterprise and SMB tape drives in the past, but to my knowledge the IBM DS5300 FDE is the first disk storage benchmark with drive encryption.
I believe the key manager for the DS5300 FDE is integrated within the subsystem. Most shops would need a separate, standalone key manager for more extensive data security. I believe the DS5300 can also interface with a standalone (IBM) key manager. In any event, it’s still an easy and simple step towards increased data security for a data center.
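To make the behavior concrete, here is a toy Python model of a drive that only ever stores ciphertext and needs a key supplied after each power cycle. It is purely illustrative: the class `ToyEncryptingDrive` and the SHA-256 based keystream are our own stand-ins, chosen so the sketch runs with only the standard library. Real FDE drives do this in hardware with proper ciphers, and nothing here reflects Seagate’s or IBM’s actual design.

```python
import hashlib


def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream from SHA-256(key + counter); NOT a production cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


class ToyEncryptingDrive:
    """Hypothetical model: the media holds ciphertext only."""

    def __init__(self):
        self._media = b""   # what a drive tester would see
        self._key = None    # supplied by the subsystem at power on/reset

    def unlock(self, key: bytes) -> None:
        self._key = key

    def power_off(self) -> None:
        self._key = None    # key is lost; the drive is locked again

    def _xor(self, data: bytes) -> bytes:
        if self._key is None:
            raise PermissionError("drive is locked: no key supplied")
        stream = _keystream(self._key, len(data))
        return bytes(d ^ s for d, s in zip(data, stream))

    def write(self, plaintext: bytes) -> None:
        self._media = self._xor(plaintext)

    def read(self) -> bytes:
        return self._xor(self._media)

    def raw_media(self) -> bytes:
        return self._media


drive = ToyEncryptingDrive()
drive.unlock(b"subsystem-supplied-key")
drive.write(b"customer records")
print(drive.read())       # plaintext, while unlocked
print(drive.raw_media())  # ciphertext, as a drive tester would see it
```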
Recently, the Storage Performance Council (SPC) has introduced a new benchmark series, the SPC-1C/E, which provides detailed energy usage for storage subsystems. So far there have been only two published submissions in this category but we look forward to seeing more in the future. The two submissions are for IBM SSD and Seagate Savvio (10Krpm) SAS attached storage subsystems.
My only issue with the SPC-1C/E reports is that they focus on a value of nominal energy consumption rather than reporting peak and idle energy usage. I understand that this is probably closer to what an actual data center would see as energy cost but it buries some intrinsic energy use profile differences.
SSD vs Drive power profile differences
The deltas for reported energy consumption for the two current SPC-1C/E submissions show a ~9.6% difference in peak versus nominal energy use for rotating media storage. Similar results for the SSD storage show a difference of ~1.7%. Taking the same results for peak versus idle periods shows a difference of 28.5% for rotating media and ~2.8% for SSD.
So, the upside for SSD is that you can drive them as hard as you want and it will cost you only a little bit more energy. In contrast, the downside is that if you leave them idle it will cost almost as much as if you were driving them at peak IO rates.
Rotating media storage seems to have a much more responsive power profile. Drive them hard and it will consume more power, leave them idle and it consumes less power.
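The shape of the two power profiles can be sketched in a few lines of Python. The peak-versus-idle percentages come from the figures above; everything else is an assumption on our part. In particular, we treat the percentages as peak relative to idle, assume power rises linearly with utilization, and use the same made-up idle wattage for both device types so that only the slope differs. The names `average_power`, `IDLE_WATTS` and `PROFILES` are ours.

```python
def average_power(idle_watts: float, peak_over_idle_pct: float,
                  utilization: float) -> float:
    """Estimate draw at a utilization in [0, 1], assuming a linear rise."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    peak_watts = idle_watts * (1 + peak_over_idle_pct / 100)
    return idle_watts + (peak_watts - idle_watts) * utilization


IDLE_WATTS = 100.0  # hypothetical, same for both so only the slope differs

# Peak-versus-idle differences quoted in the text.
PROFILES = {"rotating media": 28.5, "SSD": 2.8}

for name, pct in PROFILES.items():
    watts = [average_power(IDLE_WATTS, pct, u) for u in (0.0, 0.5, 1.0)]
    print(name, [f"{w:.1f} W" for w in watts])
```

With these assumptions the SSD line barely moves between idle and peak, while the rotating media line climbs by more than a quarter.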
Data center view of storage power
Now these differences might not seem significant, but given the amount of storage in most shops they could represent significant cost differentials. Although SSD storage consumes less power, its energy use profile is significantly flatter than rotating media and it will always consume that level of power (when powered on). On the other hand, rotating media consumes more power on average but its power profile is more slanted than that of SSDs, and at peak workload it consumes much more power than when idle.
Usually, it’s unwise to generalize from two results. However, everything I know says that these differences in their respective power profiles should persist across other storage subsystem results. As more results are submitted it should be easy to verify whether I am right.