The above chart comes from our August performance analysis [yes, I am a bit behind] and is a scatter plot of Storage Performance Council SPC-2 submissions. In the above we plot MBPS™ on the vertical axis and the number of disk drives in the submission on the horizontal. We have also added a linear regression line using the data with the regression formula listed.
Unlike SPC-1 performance results and IOPS™ vs. drives documented in an earlier post, SPC-2 MPBS results have a much wider variance and the regression coefficient (R**2) at ~0.42, shows it. In the earlier SPC-1 post the IOPS-drive count linear regression had a R**2 of ~0.96.
Why would SPC-1 IOPS be more driven by drive counts than MBPS? We can only speculate of course, but it seems to me that SPC-2 MBPS is more a function of system caching effectiveness rather than pure IO transaction speed.
All the SPC-2 workloads (VOD, LFP, and LDQ) are sequential in nature and as such, caching sequential lookahead sophistication can make more effective use of fewer spindles. In contrast, SPC-1 IOPS workloads are almost inherently random in nature and as such, are poor cache candidates which by natur depend on high counts of spindles to perform well.
In additon, SPC-2 has never been as popular as SPC-1 and as a result, doesn’t have as many submissions. It’s never been clear to me why this is the case as not all enterprise class workloads are random and as such, sequential activity is a necessary requirement for many enterprise storage systems.
The complete SPC-1 & SPC-2 performance report with more top 10 charts went out in SCI’s August newsletter. But a copy of the report will be posted on our dispatches page sometime this month (if all goes well). However, you can get the latest SPC performance analysis now and subscribe to future free newsletters by just using the signup form above right.
For a more extensive discussion of current SAN block system storage performance covering SPC (Top 30) results as well as ESRP results with our new ChampionsChart™ for SAN storage systems, please see SCI’s SAN Storage Buying Guide available from our website.
As always, we welcome any suggestions or comments on how to improve our analysis of SPC results or any of our other storage performance analyses.
SSD and/or SSS (solid state storage) performance is a mystery to most end-users. The technology is inherently asymmetrical, i.e., it reads much faster than it writes. I have written on some of these topics before (STEC’s new MLC drive, Toshiba’s MLC flash, Tape V Disk V SSD V RAM) but the issue is much more complex when you put these devices behind storage subsystems or in client servers.
Some items that need to be considered when measuring SSD/SSS performance include:
Is this a new or used SSD?
What R:W ratio will we use?
What blocksize should be used?
Do we use sequential or random I/O?
What block inter-reference interval should be used?
This list is necessarily incomplete but it’s representative of the sort of things that should be considered to measure SSD/SSS performance.
New device or pre-conditioned
Hard drives show little performance difference whether new or pre-owned, defect skips notwithstanding. In contrast, SSDs/SSSs can perform very differently when they are new versus when they have been used for a short period depending on their internal architecture. A new SSD can write without erasure throughout it’s entire memory address space but sooner or later wear leveling must kick in to equalize the use of the device’s NAND memory blocks. Wear leveling causes both reads and rewrites of data during it’s processing. Such activity takes bandwidth and controller processing away from normal IO. If you have a new device it may take days or weeks of activity (depending on how fast you write) to attain the device’s steady state where each write causes some sort of wear leveling activity.
Historically, hard drives have had slightly slower write seeks than reads, due to the need to be more accurately positioned to write data than to read it. As such, it might take .5msec longer to write than to read 4K bytes. But for SSDs the problem is much more acute, e.g. read times can be in microseconds while write times can almost be in milliseconds for some SSDs/SSSs. This is due to the nature of NAND flash, having to erase a block before it can be programmed (written) and the programming process taking a lot’s longer than a read.
So the question for measuring SSD performance is what read to write (R:W) ratio to use. Historically a R:W of 2:1 was used to simulate enterprise environments but most devices are starting to see more like 1:1 for enterprise applications due to the caching and buffering provided by controllers and host memory. I can’t speak as well for desktop environments but it wouldn’t surprise me to see 2:1 used to simulate desktop workloads as well.
SSDs operate a lot faster if their workload is 1000:1 than for 1:1 workloads. Most SSD data sheets tout a significant read I/O rate but only for 100% read workloads. This is like a subsystem vendor quoting a 100% read cache hit rate (which some do) but is unusual in the real world of storage.
Blocksize to use
Hard drives are not insensitive to blocksizes, as blocks can potentially span tracks which will require track-to-track seeks to be read or written. However, SSDs can also have some adverse interaction with varying blocksizes. This is dependent on the internal SSD architecture and is due to over optimizing write performance.
With an SSD, you erase a block of NAND and write a page or sector of NAND at a time. As writes takes much longer than reads, many SSD vendors add parallelism to improve write throughput. Parallelism writes or programs multiple sectors at the same time. Thus, if your blocksize is an integral multiple of the multi-sector size written performance is great, if not, performance can suffer.
In all honesty, similar issues exist with hard drive sector sizes. If your blocksize is an integral multiple of the drive sector size then performance is great, if not too bad. In contrast to SSDs, drive sector size is often configurable at the device level.
Sequential vs. random IO
Hard drives perform sequential IO much better than random IO. For SSDs this is not much of a problem, as once wear leveling kicks in, it’s all random to the NAND flash. So when comparing hard drives to SSDs the level of sequentiality is a critical parameter to control.
Cache hit rate
The block inter-reference interval is simply measures how often the same block is re-referenced. This is important for caching devices and systems because it ultimately determines the cache hit rate (reading data directly from cache instead of the device storage). Hard drives have onboard cache of 8 to 32MB today. SSD drives also have a DRAM cache for data buffering and other uses. SSDs typically publicize their cache size so in order to insure 0 cache hits one needs an block inter-reference interval close to the device’s capacity. Not a problem today with 146GB devices but as they move to 300GB and larger it becomes more of a problem to completely characterize device performance.
So how do we get a handle on SSD performance? SNIA and others are working on a specification on how to measure SSD performance that will one day become a standard. When the standard is available we will have benchmarks and service groups that can run these benchmarks to validate SSD vendor performance claims. Until then – caveat emptor.
Of course most end users would claim that device performance is not as important as (sub)system performance which is another matter entirely…