Why SSD performance is a mystery?

SSDs! :) by gimpbully (cc) (from flickr)
SSDs! 🙂 by gimpbully (cc) (from flickr)

SSD and/or SSS (solid state storage) performance is a mystery to most end-users. The technology is inherently asymmetrical, i.e., it reads much faster than it writes. I have written on some of these topics before (STEC’s new MLC drive, Toshiba’s MLC flash, Tape V Disk V SSD V RAM) but the issue is much more complex when you put these devices behind storage subsystems or in client servers.

Some items that need to be considered when measuring SSD/SSS performance include:

  • Is this a new or used SSD?
  • What R:W ratio will we use?
  • What blocksize should be used?
  • Do we use sequential or random I/O?
  • What block inter-reference interval should be used?

This list is necessarily incomplete but it’s representative of the sort of things that should be considered to measure SSD/SSS performance.

New device or pre-conditioned

Hard drives show little performance difference whether new or pre-owned, defect skips notwithstanding. In contrast, SSDs/SSSs can perform very differently when they are new versus when they have been used for a short period depending on their internal architecture. A new SSD can write without erasure throughout it’s entire memory address space but sooner or later wear leveling must kick in to equalize the use of the device’s NAND memory blocks. Wear leveling causes both reads and rewrites of data during it’s processing. Such activity takes bandwidth and controller processing away from normal IO. If you have a new device it may take days or weeks of activity (depending on how fast you write) to attain the device’s steady state where each write causes some sort of wear leveling activity.

R:W Ratio

Historically, hard drives have had slightly slower write seeks than reads, due to the need to be more accurately positioned to write data than to read it. As such, it might take .5msec longer to write than to read 4K bytes. But for SSDs the problem is much more acute, e.g. read times can be in microseconds while write times can almost be in milliseconds for some SSDs/SSSs. This is due to the nature of NAND flash, having to erase a block before it can be programmed (written) and the programming process taking a lot’s longer than a read.

So the question for measuring SSD performance is what read to write (R:W) ratio to use. Historically a R:W of 2:1 was used to simulate enterprise environments but most devices are starting to see more like 1:1 for enterprise applications due to the caching and buffering provided by controllers and host memory. I can’t speak as well for desktop environments but it wouldn’t surprise me to see 2:1 used to simulate desktop workloads as well.

SSDs operate a lot faster if their workload is 1000:1 than for 1:1 workloads. Most SSD data sheets tout a significant read I/O rate but only for 100% read workloads. This is like a subsystem vendor quoting a 100% read cache hit rate (which some do) but is unusual in the real world of storage.

Blocksize to use

Hard drives are not insensitive to blocksizes, as blocks can potentially span tracks which will require track-to-track seeks to be read or written. However, SSDs can also have some adverse interaction with varying blocksizes. This is dependent on the internal SSD architecture and is due to over optimizing write performance.

With an SSD, you erase a block of NAND and write a page or sector of NAND at a time. As writes takes much longer than reads, many SSD vendors add parallelism to improve write throughput. Parallelism writes or programs multiple sectors at the same time. Thus, if your blocksize is an integral multiple of the multi-sector size written performance is great, if not, performance can suffer.

In all honesty, similar issues exist with hard drive sector sizes. If your blocksize is an integral multiple of the drive sector size then performance is great, if not too bad. In contrast to SSDs, drive sector size is often configurable at the device level.

Sequential vs. random IO

Hard drives perform sequential IO much better than random IO. For SSDs this is not much of a problem, as once wear leveling kicks in, it’s all random to the NAND flash. So when comparing hard drives to SSDs the level of sequentiality is a critical parameter to control.

Cache hit rate

The block inter-reference interval is simply measures how often the same block is re-referenced. This is important for caching devices and systems because it ultimately determines the cache hit rate (reading data directly from cache instead of the device storage). Hard drives have onboard cache of 8 to 32MB today. SSD drives also have a DRAM cache for data buffering and other uses. SSDs typically publicize their cache size so in order to insure 0 cache hits one needs an block inter-reference interval close to the device’s capacity. Not a problem today with 146GB devices but as they move to 300GB and larger it becomes more of a problem to completely characterize device performance.

The future

So how do we get a handle on SSD performance? SNIA and others are working on a specification on how to measure SSD performance that will one day become a standard. When the standard is available we will have benchmarks and service groups that can run these benchmarks to validate SSD vendor performance claims. Until then – caveat emptor.

Of course most end users would claim that device performance is not as important as (sub)system performance which is another matter entirely…

One thought on “Why SSD performance is a mystery?

Comments are closed.