Latest ESRPv3 (Exchange 2010) results analysis for 1K-to-5K mailboxes – chart of the month

The chart is from SCI’s October newsletter/performance dispatch on Exchange 2010 Solution Reviewed Program (ESRP v3.0) and shows the mailbox database access latencies for read, write and log write.  For this report we are covering solutions supporting from 1001 up to 5000 mailboxes (1K-to-5Kmbx); larger and (a few) smaller configurations have been covered in previous performance dispatches.  On latency charts like this, lower is better.

We like this chart because in our view this represents a reasonable measure of email user experience.  As users read and create new emails they are actually reading Exchange databases and writing database and logs.  Database and log latencies should show up as longer or shorter delays in these activities.  (Ok, not exactly true, email client and Exchange server IO aren’t the same thing.  But ultimately every email sent has to be written to an Exchange database and log sometime and every new email read-in has to come from an Exchange database as well).

A couple of caveats are in order for this chart.

  • Xiotech’s top run (#1) did not use database redundancy or DAGs (Database Availability Groups) in their ESRPv3 run. Their feeling is that this technology is fairly new and it will take some time before it’s widely adopted.
  • There is quite the mix of SAS (#2,3,6,7,9&10), FC (#1,5&8) and iSCSI (#4) connected storage in this mailbox range.  Some would say that SAS connected storage should have an advantage here but that’s not obvious from the rankings.
  • Vendors get to select the workload intensity for any ESRPv3/Jetstress run, e.g. the solutions shown here used between 0.15 IO/sec/mailbox (#9&10) and 0.36 IO/sec/mailbox (#1).  IO intensity is just one of the myriad tweakable Jetstress parameters that make analyzing ESRP so challenging.  Normally this would only matter for database and log access counts, but heavier workloads can impact latencies as well.

Wide variance between read and write latencies

The other thing of interest in this chart is the wide span between read latencies and write (database and log) latencies for the same solution. Take the #10 Dell PowerEdge system for example.  It showed a database read latency of ~18msec but a database write latency of ~0.4msec.  Why?

It turns out this Dell system had only 6 disk drives (2TB/7200 RPM).  So few disk drives don’t seem adequate to support the read workload, and as a result the system shows up poorly in database read latencies.  However, write activity can mostly be masked with cache until it fills up, forcing write delays.  With only 1100 mailboxes and 0.15 IOs/sec/mailbox, the write workload apparently fits in cache well enough to be destaged over time, without delaying ongoing write activity.  Similar results appear for the other Dell PowerEdge (#6) and the HP Smart Array (#7), which had 12-2TB/7200 RPM and 24-932GB/7200 RPM drives respectively.
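As a rough back-of-the-envelope check on those numbers, the sketch below computes the aggregate target IO rate and the average load per spindle for the #10 Dell configuration. It assumes the IO is spread evenly across all six drives and ignores RAID overhead and cache hits, neither of which is broken out here; the function name is ours, for illustration only.

```python
def per_drive_load(mailboxes, io_per_mailbox, drives):
    """Return (total IO/sec, IO/sec per drive), assuming an even spread."""
    total = mailboxes * io_per_mailbox
    return total, total / drives


# Figures quoted above for the #10 Dell PowerEdge run.
total, per_drive = per_drive_load(mailboxes=1100, io_per_mailbox=0.15, drives=6)
print(f"{total:.1f} IO/sec total, {per_drive:.1f} IO/sec per drive")
```

Treat the per-drive figure as indicative only, since the IO that actually reaches the disks depends on RAID level and cache behavior.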

On the other hand, Xiotech’s #1 position had 20-360GB/15Krpm drives and EMC’s Celerra #4 run had 15-400GB/10Krpm drives, both of which were able to sustain a more balanced performance across reads and writes (database and logs).  For Xiotech’s #5 run they used 40-500GB/10Krpm drives.

It seems there is a clear relationship between drive speed and database read latencies, with faster drives showing lower latencies.  Most of the systems in the bottom half of this chart have 7200 RPM drives (except for #8, HP StorageWorks MSA) and the top 3 all had 15Krpm drives.  However, write latencies don’t seem to be as affected by drive speed and have more to do with the balance between workload, cache size and effective destaging.

The other thing that’s apparent from this chart is that SAS connected storage continues to be an effective solution for this range of Exchange configurations, following a trend first shown in ESRP v2 (Exchange 2007) results.  We reported on this in our January ESRPv2 analysis dispatch for this year.

As mentioned previously ESRP/Jetstress results are difficult to compare/analyze and we continue to welcome any constructive suggestions on how to improve.

SPC-1/E IOPS per watt – chart of the month

SPC*-1/E IOPs per Watt as of 27Aug2010

Not a lot of Storage Performance Council (SPC) benchmark submissions this past quarter, just a new SPC-1/E from HP StorageWorks on their 6400 EVA with SSDs and a new SPC-1 run for Oracle Sun StorageTek 6780.  Recall that SPC-1/E executes all the same tests as SPC-1 but adds more testing with power monitoring equipment attached to measure power consumption at a number of performance levels.

With this chart we take another look at storage energy consumption (see my previous discussion on SSD vs. drive energy use). As shown above, we graph the IOPS/watt for three different performance environments: Nominal, Medium, and High as defined by SPC.  These are contrived storage usage workloads to measure the variability in power consumed by a subsystem.  SPC defines the workloads as follows:

  • Nominal usage is 16 hours of idle time and 8 hours of moderate activity
  • Medium usage is 6 hours of idle time, 14 hours of moderate activity, and 4 hours of heavy activity
  • High usage is 0 hours of idle time, 6 hours of moderate activity and 18 hours of heavy activity

As for activity, SPC defines moderate activity at 50% of the subsystem’s maximum SPC-1 reported performance and heavy activity is at 80% of its maximum performance.
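To make these definitions concrete, here is a minimal sketch of how an IOPS/watt figure could be derived for each environment. It assumes the figure is the time-weighted average IOPS divided by the time-weighted average power over the 24-hour profile; the SPC-1/E specification is the authority on the exact formula. The maximum IOPS and per-state power draws are hypothetical placeholders, not taken from any submission.

```python
# Hours per day in each state for the three SPC usage environments.
PROFILES = {
    "nominal": {"idle": 16, "moderate": 8, "heavy": 0},
    "medium": {"idle": 6, "moderate": 14, "heavy": 4},
    "high": {"idle": 0, "moderate": 6, "heavy": 18},
}

# Fraction of maximum SPC-1 IOPS delivered in each state.
LOAD = {"idle": 0.0, "moderate": 0.5, "heavy": 0.8}


def iops_per_watt(hours, max_iops, watts):
    """Time-weighted average IOPS over time-weighted average power (assumed formula)."""
    total_hours = sum(hours.values())
    avg_iops = sum(h * LOAD[s] * max_iops for s, h in hours.items()) / total_hours
    avg_watts = sum(h * watts[s] for s, h in hours.items()) / total_hours
    return avg_iops / avg_watts


# Hypothetical subsystem: 10,000 max IOPS and made-up power draw (watts) per state.
example_watts = {"idle": 400.0, "moderate": 450.0, "heavy": 500.0}
for name, hours in PROFILES.items():
    print(name, round(iops_per_watt(hours, 10_000, example_watts), 2))
```

Because idle hours add power but no IOPS, a subsystem with a high idle draw is hit hardest in the nominal environment under this formulation.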

With that behind us, now on to the chart.  The HP 6400 EVA had 8-73GB SSD drives for storage while the two Xiotech submissions had 146GB/15Krpm and 600GB/15Krpm drives with no flash.  As expected, the HP SSD subsystem delivered considerably more IOPS/watt at the high usage workload: ~2X the Xiotech with 600GB drives and ~2.3X the Xiotech with 146GB drives.  The multipliers were slightly less for medium usage but still substantial.

SSD nominal usage power consumption

However, the nominal usage bears some explanation.  Here both Xiotech subsystems beat out the HP EVA SSD subsystem at nominal usage with the 600GB drive Xiotech box supporting ~1.3X the IOPS/watt of the HP SSD system. How can this be?  SSD idle power consumption is the culprit.

The HP EVA SSD subsystem consumed ~463.1W at idle while the Xiotech 600GB drive subsystem only consumed ~23.5W and the Xiotech 146GB drive subsystem consumed ~23.4W.  I would guess that the drives and perhaps the Xiotech subsystem have considerable power savings algorithms that shed power when idle.  For whatever reason the SSDs and HP EVA don’t seem to have anything like this.  So nominal usage with 16 hours of idle time penalizes the HP EVA SSD system, resulting in the poor IOPS/watt for nominal usage shown above.
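To put those idle figures in perspective, the short calculation below converts them into energy spent idling per day under the nominal profile's 16 idle hours. It uses only the idle wattages quoted in this post; the variable names are ours.

```python
IDLE_HOURS = 16  # idle hours per day in the nominal usage profile

# Idle power draw in watts, as quoted in this post.
idle_watts = {
    "HP EVA SSD": 463.1,
    "Xiotech 600GB": 23.5,
    "Xiotech 146GB": 23.4,
}

# Energy in watt-hours spent idling each day under nominal usage.
idle_wh = {name: watts * IDLE_HOURS for name, watts in idle_watts.items()}
for name, wh in idle_wh.items():
    print(f"{name}: {wh:.1f} Wh idle per day")
```

On these numbers the HP EVA SSD subsystem spends roughly 20X the idle energy of either Xiotech subsystem, all of it for hours that contribute no IOPS.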

Ray’s reading: SSDs are not meant to be idled a lot, and disk drives, especially the ones that Xiotech is using, have very sophisticated power management that maybe SSDs and/or HP should take a look at adopting.

As always, we welcome any suggestions on how to improve our analysis of SPC performance information so please comment here or drop us a line.

SPC-1&-1/E results IOPS/Drive – chart of the month

Top IOPS(tm) per drive for SPC-1 & -1/E results as of 27May2010

The chart shown here reflects information from a SCI StorInt(tm) dispatch on the latest Storage Performance Council benchmark performance results and depicts the top IO operations done per second per installed drive for SPC-1 and SPC-1/E submissions.  This particular storage performance metric is one of the harder ones to game.  For example, adding more drives to boost total performance does nothing for this view.

The recent SPC-1 submissions were from Huawei Symantec’s Oceanspace S2600 and S5600, Fujitsu Eternus DX400 and DX8400, and the latest IBM DS8700 with EasyTier, SSD and SATA drives. Of these results, the only one to show up on this chart was the low-end Huawei Symantec S2600.  It used only 48 drives and attained ~17K IOPS as measured by SPC-1.
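For a sense of scale, the metric is simply reported IOPS divided by installed drive count. The sketch below applies that to the S2600 figures quoted above (the ~17K IOPS is approximate, so the result is too) and shows why adding drives does not help: under a hypothetical, perfectly linear scaling assumption, doubling both drives and IOPS leaves the per-drive figure unchanged.

```python
def iops_per_drive(iops, drives):
    """Reported IOPS divided by the number of installed drives."""
    return iops / drives


# Huawei Symantec S2600 figures quoted above (~17K IOPS is approximate).
s2600 = iops_per_drive(17_000, 48)
print(f"S2600: ~{s2600:.0f} IOPS/drive")

# Hypothetical: doubling the drive count with perfectly linear scaling
# doubles IOPS too, so the per-drive figure does not change.
scaled = iops_per_drive(2 * 17_000, 2 * 48)
print(f"Doubled configuration: ~{scaled:.0f} IOPS/drive")
```

In practice, scaling is rarely perfectly linear, so larger configurations tend to do no better, and often worse, on this view.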

Other changes to this chart included the addition of Xiotech’s Emprise 5000 SPC-1/E runs with both 146GB and 600GB drives.  We added the SPC-1/E results because they execute the exact same set of tests and generate the same performance summaries.

It’s very surprising to see the first use of 600GB drives in an SPC-1/E benchmark show up well here, and the very respectable #2 result from Xiotech’s 146GB drive version indicates excellent drive performance yields.  The only other non-146GB drive result was for the Fujitsu DX80, which used 300GB drives.

Also, as readers of our storage performance dispatches may recall, the Sun (now Oracle) J4400 array provided no RAID support for their benchmark run.  We view this as an unusable configuration, although its advantages vis-a-vis IOPS/drive are probably debatable.

A couple of other caveats to this comparison:

  • We do not include pure SSD configurations as they would easily dominate this metric.
  • We do not include benchmarks that use 73GB drives as they would offer a slight advantage and such small drives are difficult to purchase nowadays.

We are somewhat in a quandary about showing mixed drive (capacity) configurations.  In fact, an earlier version of this chart without the two Xiotech SPC-1/E results showed the IBM DS8700 EasyTier configuration with SSDs and rotating SATA disks.  In that version the DS8700 came in at a rough tie with the then 7th place Fujitsu ETERNUS2000 subsystem.  For the time being, we have decided not to include mixed drive configurations in this comparison but would welcome any feedback on this decision.

As always, we appreciate any comments on our performance analysis.