Above is a chart from our September SPECsfs2008 Performance Dispatch showing a scatter plot of NFS throughput operations/second vs. the number of disk drives in the solution. Over the last month or so there has been a lot of Twitter traffic on the theory that benchmark results such as this and the Storage Performance Council’s SPC-1 and SPC-2 are mostly a measure of the number of disk drives in a system under test and have little relation to the actual effectiveness of a system. I disagree.
As proof of my disagreement I offer the above chart. On the chart we have drawn a linear regression line (supplied by Microsoft Excel) and displayed the resultant regression equation. A couple of items to note on the chart:
- Coefficient of determination – Across 37 submissions spanning anywhere from 1K to over 330K NFS throughput operations a second, the correlation between #disks and NFS ops is far from perfect (R**2=~0.8, not 1.0). Roughly a fifth of the variance in throughput is not explained by disk count.
- Superior systems exist – Any storage system above the linear regression line makes more effective use of its disk resources than the systems below the line.
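For readers who want to repeat this kind of check on their own data, here is a minimal sketch of an ordinary least-squares fit, its R**2, and an above-or-below-the-line test. The function names are my own, and the four data points are made-up placeholders, not the 37 SPECsfs2008 submissions behind the chart.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def above_line(x, y, slope, intercept):
    """True when a point sits above the fitted line."""
    return y > slope * x + intercept


# Placeholder data only: disk counts and throughput in thousands of ops/sec.
toy_disks = [10, 20, 30, 40]
toy_kops = [12, 18, 33, 37]

slope, intercept = fit_line(toy_disks, toy_kops)
r2 = r_squared(toy_disks, toy_kops, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.3f}")
for d, k in zip(toy_disks, toy_kops):
    side = "above" if above_line(d, k, slope, intercept) else "below"
    print(f"{d} disks, {k}K ops/sec: {side} the line")
```

An R**2 of 1.0 would mean disk count alone predicts throughput exactly. Anything less leaves room for other factors to matter.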
As one example, take a look at the two circled points on the chart.
- The one above the line is from Avere Systems and is a 6-node FXT 2500 tiered NAS storage system which has internal disk cache (8-450GB SAS disks per node) and an external mass storage NFS server (24-1TB SATA disks) for data. Each node has a system disk as well, for a total of 79 disk drives in the solution. The Avere system was able to attain ~131.5K NFS throughput ops/sec on SPECsfs2008.
- The one below the line is from Exanet Ltd. (recently purchased by Dell) and is an 8-node ExaStore clustered NAS system which has attached storage (576-146GB SAS disks) as well as mirrored boot disks (16-73GB disks), for a total of 592 disk drives in the solution. It was only able to attain ~119.6K NFS throughput ops/sec on the benchmark.
Now the two systems' respective architectures were significantly different, but if we just count the data drives alone, Avere Systems (with 72 data disks) was able to attain ~1.8K NFS throughput ops per second per data disk spindle and Exanet (with 576 data disks) was able to attain only ~0.2K NFS throughput ops per second per data disk spindle. That is roughly a 9X difference in per-drive performance on the same benchmark.
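The per-spindle arithmetic is easy to check. The sketch below uses only the approximate figures quoted in this post; the dictionary layout and the `ops_per_data_disk` helper are my own naming.

```python
# Approximate figures quoted in the post: NFS throughput ops/sec and data disks.
systems = {
    "Avere": {"nfs_ops_per_sec": 131_500, "data_disks": 72},
    "Exanet": {"nfs_ops_per_sec": 119_600, "data_disks": 576},
}


def ops_per_data_disk(nfs_ops_per_sec, data_disks):
    """NFS throughput ops/sec delivered per data disk spindle."""
    return nfs_ops_per_sec / data_disks


avere = ops_per_data_disk(**systems["Avere"])
exanet = ops_per_data_disk(**systems["Exanet"])
ratio = avere / exanet

print(f"Avere:  {avere:,.0f} ops/sec per data disk")
print(f"Exanet: {exanet:,.0f} ops/sec per data disk")
print(f"Ratio:  {ratio:.1f}X")
```

Counting all 79 and 592 drives instead of just the data drives changes the per-disk numbers but not the conclusion.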
As far as I am concerned this definitively disproves the contention that benchmark results are dictated by the number of disk drives in the solution. Similar comparisons can be seen looking horizontally at any points with equivalent NFS throughput levels.
Ray's reading: NAS system performance is driven by a number of factors, and the number of disk drives is not the lone determinant of benchmark results. Indeed, one can easily see differences of almost 10X in throughput ops per second per disk spindle for NFS storage without looking very hard.
We would contend that similar results can be seen for block and CIFS storage benchmarks as well, which we will cover in future posts.
The full SPECsfs2008 performance report will go up on SCI’s website next month in our dispatches directory. However, if you are interested in receiving this sooner, just subscribe by email to our free newsletter and we will send you the current issue with download instructions for this and other reports.
As always, we welcome any suggestions on how to improve our analysis of SPECsfs2008 performance information so please comment here or drop us a line.