We return now to the preeminent block storage benchmark, the Storage Performance Council SPC results*. Since our last report there have been five new SPC-1 submissions, the Fujitsu DX440 S2, Huawei Oceanspace Dorado5100, Kaminario K2-D (DRAM), IBM Storwize® V7000 (2-node, all SSDs), and NetApp FAS6240 (6-node cluster), and two new SPC-2 submissions, the HDS VSP and IBM SVC 6.4 (8-node) with Storwize V7000 backend storage. As just about every one of these showed up in one or more top 10 charts, we will review both SPC-1 and SPC-2 results in this analysis.
We start our discussion with the top 10 I/O operations per second (IOPS™) performance for SPC-1.
Figure 1 SPC-1* Top 10 IOPS
Higher is better on the IOPS chart. We had to double the vertical scale to show the new Kaminario K2-D (mostly DRAM) submission at over 1.2M IOPS. The K2 used twelve 8GB DRAM SSDs per data module, with 33 data modules and another 15 IO director modules, each with 24GB of DRAM cache (?). We suppose this was a foregone conclusion, but it's amazing what you can do with ~3.5TB of DRAM in a storage system. The next new system (#2) used a more conventional all-flash configuration: the Huawei Oceanspace Dorado5100 used ~19TB of flash made up of 96 200GB SLC SSDs.
The other three new submissions did not place in the top 10 for IOPS, with the dual-node IBM Storwize V7000 all-flash configuration hitting ~120K IOPS, the NetApp FAS6240 (6-node cluster) reaching ~250K IOPS, and the Fujitsu DX440 S2 achieving ~103K IOPS, respectable for midrange disk storage. Soon we are going to need to segregate SPC-1 results into all-flash/DRAM arrays, hybrid storage systems, and disk-only systems, each with its own top 10 chart.
Next we review Least Response Time (LRT™).
Figure 2 SPC-1 Top 10 LRT results
Lower is better on LRT results. New results are at #4, #5, and #9 on this top 10 LRT chart. It's interesting to note that the mostly DRAM Kaminario submission only reached #4 on response time, with a ~0.37msec LRT. But the IBM V7000 all-flash array also showed well here, with a ~0.45msec LRT, as did the Huawei Dorado5100, with a ~0.62msec LRT. The only disk-backend system left on this chart is the NetApp FAS3270, which had 120 300GB SAS drives but still used a 512MB FlashCache card. We have seen steady improvement in LRT as flash has become a larger component of SPC-1 benchmarked storage systems, but the older TMS RamSan-400, also an all-DRAM system, puts the others to shame at 0.09msec LRT.
Next we turn to storage IOPS/$/GB.
Figure 3 Top 10 SPC-1 IOPS/$/GB
In contrast to the SPC-1 reported $/IOPS metric, we prefer our own IOPS/$/GB. For one thing, it seems less biased toward flash arrays. In any event, the two new systems here are the NetApp FAS6240 6-node cluster (#6) and the Fujitsu DX440 S2 (#9). In all honesty I must report that the FAS6240 cluster did have 3TB of flash cache across its 6 nodes, and the #10 Oracle Sun ZFS system had both read flash (4TB) and write flash (~0.6TB) for IO acceleration. All the rest of the systems on this chart were disk-only storage.
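To make the difference between the two metrics concrete, here is a minimal Python sketch comparing $/IOPS with IOPS/$/GB for two hypothetical systems. The prices, IOPS, and capacities below are made-up illustrations, not figures from any actual SPC-1 submission.

```python
# Illustrative comparison of $/IOPS vs. IOPS/$/GB for two hypothetical
# systems -- all numbers below are invented for demonstration only.

def dollars_per_iops(price, iops):
    """SPC-1 reported price-performance metric: lower is better."""
    return price / iops

def iops_per_dollar_per_gb(price, iops, capacity_gb):
    """IOPS divided by price per GB of capacity: higher is better."""
    return iops / (price / capacity_gb)

# A small all-flash array: high IOPS, low capacity.
flash = dict(price=500_000, iops=200_000, capacity_gb=5_000)
# A large disk array at the same price: modest IOPS, big capacity.
disk = dict(price=500_000, iops=50_000, capacity_gb=100_000)

for name, s in (("flash", flash), ("disk", disk)):
    print(name,
          dollars_per_iops(s["price"], s["iops"]),
          iops_per_dollar_per_gb(s["price"], s["iops"], s["capacity_gb"]))
```

The capacity normalization is the point: in this made-up pair, the flash box wins handily on $/IOPS while the disk box wins on IOPS/$/GB, which is why the metric is less tilted toward flash arrays.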
Finally, we turn to our bubble chart plotting IOPS vs. LRT using system price as bubble size.
Figure 4 Bubble chart: IOPS vs. LRT with system pricing
For some reason we really like this chart. It easily shows how storage that is priced similarly can perform vastly differently. For example, looking at the 1msec vertical line (LRT), we see two systems at approximately the same price, ~$1.5M: one (IBM DS8700) at ~30K IOPS with slightly less than 1msec response time, and another (NetApp FAS6240 6-node cluster) at ~250K IOPS with almost exactly 1msec response time. As a second example, look at the ~250K IOPS horizontal. We already discussed the system at 1msec LRT, but there are two slightly more expensive systems (over $2M) at around 2msec LRT: the one (HP 3PAR InServ T800) right at 2msec LRT did ~225K IOPS, while the one (Huawei Symantec Oceanspace S8100 8-node) at ~2.2msec achieved ~300K IOPS.
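One design note on bubble charts like Figure 4: the third variable (system price here) is conventionally mapped to bubble *area*, not radius, otherwise a 2X-priced system looks 4X as big. A short Python sketch of that scaling, using illustrative prices rather than the actual submission figures:

```python
import math

# Scale bubble radius so that bubble AREA is proportional to price.
# Area ~ radius**2, so radius must scale with sqrt(price).

def bubble_radius(price, max_price, max_radius=40.0):
    """Radius (in points) giving area proportional to price."""
    return max_radius * math.sqrt(price / max_price)

# Illustrative prices only: two ~$1.5M systems and one $2M system.
prices = [1_500_000, 1_500_000, 2_000_000]
top = max(prices)
print([round(bubble_radius(p, top), 1) for p in prices])
```

With this scaling, a system at a quarter of the top price gets half the radius, so its bubble covers a quarter of the area, matching the price ratio the eye is supposed to read.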
We turn now to the SPC-2 throughput benchmark results.
Figure 5 Top 10 SPC-2 MBPS™
The new IBM SVC 6.4 8-node with Storwize V7000 backend storage is our new top performer on the MBPS metric at ~14.6GB/sec. The new SVC submission had 8 Storwize V7000s behind the 8 SVC nodes, which together included 768 146GB disk drives. The other new submission was the HDS VSP (tied for #2), which exactly matched the HP P9500 storage array, an HP OEMed version of the HDS VSP; each of these included 512 300GB disk drives.
Another plot we occasionally like to see is the MBPS per disk spindle scatter plot.
Figure 6 Scatter plot: SPC-2 MBPS per disks
In Figure 6 we have also plotted the Excel-determined linear regression line for the data. The regression coefficient (R²) is not that good at 0.42, indicating wide variance in results on a per-disk-spindle basis, especially above 150 disk drives. For example, looking at the vertical at around 800 disk drives, we see three systems: one at ~4,500 MBPS, one at ~9,700 MBPS, and the latest at ~14,600 MBPS. These are an older IBM SVC 4.1, an IBM DS8800, and the latest submission, the IBM SVC 6.4 with Storwize V7000s, respectively.
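For readers who want to reproduce this kind of trendline without Excel, here is a short Python sketch of an ordinary least-squares fit and its R² (the coefficient of determination Excel reports for a linear trendline). The (drives, MBPS) points are invented for illustration, not the actual Figure 6 data.

```python
# Ordinary least-squares fit of MBPS vs. drive count, plus R**2,
# matching what Excel's linear trendline computes.

def linfit(xs, ys):
    """Return (slope, intercept) of the least-squares line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Invented example points (drive count, MBPS) -- not the real submissions.
drives = [96, 300, 400, 800, 800]
mbps = [2000, 6000, 5000, 4500, 14600]
m, b = linfit(drives, mbps)
print(round(r_squared(drives, mbps, m, b), 2))
```

A low R² like the 0.42 in Figure 6 means the line explains well under half the variance in MBPS, i.e. drive count alone is a poor predictor of throughput.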
Benchmarks are often taken to task as being entirely determined by how much hardware one throws at them; for the SPC-2 benchmark that does not seem to be the case. There are plenty of submissions at ~300 to ~400 disk drives that provide higher MBPS than other systems with 2X to almost 5X the number of drives.
It's good to see some new SPC-1 and SPC-2 results show up in our top 10 charts. Why Kaminario decided to benchmark their DRAM system instead of their SSD system, I don't know. Quite possibly they were more interested in its stellar results and the press associated with them. It's hard for me to conceive of a need for a DRAM system like Kaminario's, but high-frequency trading shops might be interested.
The all-flash arrays from traditional vendors are currently dominating SPC-1 Top 10 IOPS and LRT charts, but we have yet to see submissions from most of the new SSD startups, except for Kaminario. When the others start to benchmark their systems, that’s when SPC-1 results should start to really heat up.
SPC-2 and throughput-intensive applications seem to be one of the last bastions of all-disk systems. It's not clear why, considering flash read speeds vastly outperform disk. We surmise it has something to do with the more sequential nature of the SPC-2 workload, which favors media that performs well on sequential reads rather than on randomly accessed data. TMS is the only all-flash SPC-2 submission I can recall.
As always, suggestions on how to improve any of our performance analyses are welcomed. Additionally, if you are interested in more block performance details, we now provide a fuller version (top 30 results) of all these charts and a new ChampionsChart™ for SAN storage in SCI’s SAN Storage Buying Guide available from our website.
[This performance dispatch was originally sent out to our newsletter subscribers in August of 2012. If you would like to receive this information via email, please consider signing up for our free monthly newsletter (see subscription request, above right) and we will send our current issue along with download instructions for this and other reports.]
Silverton Consulting, Inc. is a Storage, Strategy & Systems consulting services company, based in the USA offering products and services to the data storage community.