Latest Microsoft ESRP ChampionsChart™ for over 5K mailboxes – chart of the month

(c) 2013 Silverton Consulting, Inc., All Rights Reserved

The above chart, from our November 2012 StorInt™ Performance Dispatch, is another of our ChampionsCharts™ showing optimum storage performance.  This one displays the Q4-2012 Exchange Solution Reviewed Program (ESRP) champions for over-5,000-mailbox solutions.

All of SCI’s ChampionsCharts are divided into four quadrants: the best is the upper right, labeled Champions; the next best is the upper left, labeled Marathoners; then come the Sprinters (lower right) and finally the Slowpokes (lower left).

The ESRP chart shows relative database latency across the horizontal axis and relative log playback performance on the vertical axis for all ESRP submissions in the over-5,000-mailbox category.  Systems provide better relative log playback performance the higher they are placed in the chart and better relative database latencies the further to the right they sit.

We believe that ESRP performance is a great way to see how well a similarly configured storage system will perform on the more diverse and mixed workloads seen in everyday data center application environments.

All SCI ChampionsCharts represent normalized storage performance for the metrics we choose.  For ESRP this means that, given the hardware used in the submission, Champions performed relatively better than expected in both log playback and database latency.  Marathoner systems performed relatively better in log playback but not as well in database latency.  Similarly, systems in the Sprinters quadrant provided relatively better database latency but worse log playback performance.  Systems in the Slowpokes quadrant performed relatively worse on both performance measures.
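To make the quadrant logic concrete, here is a minimal sketch in Python.  The expected-performance baselines and the function itself are hypothetical, since SCI's actual hardware-normalization model isn't published; this only illustrates the classification rule.

```python
# Hypothetical sketch of the ChampionsChart quadrant rule. The "expected"
# values stand in for SCI's hardware-normalized baselines, which are not
# published; all names and thresholds here are illustrative only.

def quadrant(log_playback: float, latency_score: float,
             expected_playback: float, expected_latency: float) -> str:
    """Classify a submission by whether it beat expectations on each axis.

    Higher is better for both inputs; database latency is expressed as a
    score so that lower raw latency maps to a higher value."""
    better_playback = log_playback > expected_playback
    better_latency = latency_score > expected_latency
    if better_playback and better_latency:
        return "Champion"      # upper right
    if better_playback:
        return "Marathoner"    # upper left
    if better_latency:
        return "Sprinter"      # lower right
    return "Slowpoke"          # lower left

# A system that beats its expected log playback but not its latency score:
print(quadrant(1.3, 0.9, 1.0, 1.0))  # -> Marathoner
```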

There is a gaggle of storage systems in the Champions quadrant.  Here we identify the top five systems that stand out over all the rest; they are readily separated into three groups.  Specifically:

Group 1 contains a single system, the IBM DS8700, which had both the best relative log playback and the best relative database latency of these storage systems. The IBM DS8700 was configured to support 20,000 mailboxes.

Group 2 contains four storage systems: two HDS USP-Vs (the previous generation of HDS VSP enterprise-class storage) and two HDS AMS 2100s (the previous generation of HDS entry-level, midrange storage). The two USP-V systems were configured for 96,000 and 32,000 mailboxes respectively; the two AMS 2100 systems were configured to support 17,000 and 5,800 mailboxes respectively.

Group 3 contains a single storage system, the HP 4400 EVA, configured with 6,000 mailboxes.

The November ESRP Performance Dispatch identified top storage system performers for the 1,001-to-5,000 and 1,000-and-under mailbox categories as well.  More performance information and ChampionsCharts for ESRP, SPC-1 and SPC-2 are available in our SAN Storage Buying Guide, available for purchase on our website.

~~~~

The complete ESRP performance report went out in SCI’s November 2012 newsletter.  But a copy of the report will be posted on our dispatches page sometime this month (if all goes well).  However, you can get the latest storage performance analysis now and subscribe to future free newsletters by just using the signup form above right.

As always, we welcome any suggestions or comments on how to improve our ESRP performance reports or any of our other storage performance analyses.

SCI’s Latest ESRP (v3) Performance Analysis for Over 5K mailboxes – chart of the month

Bar chart showing ESRP Top 10 total database backup throughput results
(SCIESRP120125-001) (c) 2012 Silverton Consulting, All Rights Reserved

This chart comes from our analysis of Microsoft Exchange Reviewed Solutions Program (ESRP) v3 (Exchange 2010) performance results for the over 5000 mailbox category, a report sent out to SCI Storage Intelligence Newsletter subscribers last month.

The total database backup throughput reported here is calculated as the MB read per second per database times the number of databases in the storage configuration. ESRP currently reports two metrics for database backups: backup throughput on a per-database basis (used in our calculation above) and backup throughput on a per-server basis.  I find neither of these that useful from a storage system perspective, so we have calculated a new metric for our ESRP analysis representing total Exchange database backup throughput per storage system.
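As a minimal sketch of that calculation (the example numbers below are made up for illustration, not actual ESRP results):

```python
# Total database backup throughput per storage system, as described above:
# the per-database backup rate (MB read/sec) times the number of databases
# in the configuration.

def total_backup_throughput(mb_per_sec_per_db: float, num_databases: int) -> float:
    """Total Exchange database backup throughput for a storage system (MB/s)."""
    return mb_per_sec_per_db * num_databases

# Illustrative only: 40 MB/s per database across 96 databases
print(total_backup_throughput(40.0, 96))  # -> 3840.0 MB/s
```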

I find three submissions (#1, #3 & #8 above) for the same storage system (HDS USP-V) somewhat unusual in any top 10 chart; as such, they provide a unique opportunity to understand how to optimize storage for Exchange 2010.

For example, the first thing I noticed when looking at the details behind the above chart is that disk speed doesn’t necessarily speed up database backup throughput.  The #1, #3 and #8 submissions above (HDS USP-V using Dynamic Provisioning, i.e., thin provisioning) used 7200rpm, 15Krpm and 7200rpm disks respectively, with 512 disks each.

So what were the significant differences between the USP-V submissions (#1, #3 and #8), aside from mailbox counts and disk speed?

  • Mailboxes per server differed from 7000 to 6000 to 4700 respectively, with the top 10 median = 7500
  • Mailboxes per database differed from 583 to 1500 to 392, with the top 10 median = 604
  • Number of databases per host (Exchange server) differed from 12 to 4 to 12, with the top 10 median = 12
  • Disk size differed from 2TB to 600GB to 2TB, with the top 10 median = 2TB
  • Mailbox size differed from 1GB to 1GB to 3GB, with the top 10 median = 1.0 GB
  • % storage capacity used by Exchange databases differed from 27.4% to 80.0% to 55.1%, with the top 10 median = 60.9%

If I had to guess, the reason the HDS USP-V system with faster disks didn’t back up as well as the #1 system is that its mailbox databases spanned multiple physical disks.  In the case of the #3 system with 15Krpm/600GB FC disks, it took at least three disks to hold one 1.5TB mailbox database.  For the #1-performing 7200rpm/2TB SATA disk system, a single disk could hold almost four 583GB databases.  The slowest performer (#8), also with 7200rpm/2TB SATA disks, could hold at most one ~1.2TB mailbox database per disk.
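To check that guess, here is a rough back-of-the-envelope in Python using the mailboxes-per-database, mailbox-size and disk-size figures listed above (decimal GB throughout, so treat the outputs as approximations):

```python
# Approximate database size vs. physical disk size for the three HDS USP-V
# submissions, using the figures quoted above. A ratio below 1.0 means a
# single database must span multiple disks.

submissions = {
    "#1": {"mbx_per_db": 583,  "mbx_gb": 1.0, "disk_gb": 2000},  # 7200rpm/2TB
    "#3": {"mbx_per_db": 1500, "mbx_gb": 1.0, "disk_gb": 600},   # 15Krpm/600GB
    "#8": {"mbx_per_db": 392,  "mbx_gb": 3.0, "disk_gb": 2000},  # 7200rpm/2TB
}

for name, s in submissions.items():
    db_gb = s["mbx_per_db"] * s["mbx_gb"]     # approximate database size
    dbs_per_disk = s["disk_gb"] / db_gb       # how many DBs fit on one disk
    print(f"{name}: DB ~{db_gb:,.0f}GB, ~{dbs_per_disk:.1f} databases/disk")

# #1: DB ~583GB,   ~3.4 databases/disk  (best backup throughput)
# #3: DB ~1,500GB, ~0.4 databases/disk  (each DB spans >= 3 disks)
# #8: DB ~1,176GB, ~1.7 databases/disk  (one DB per disk in practice)
```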

One other thing that might be a factor between the #1 and #3 submissions is that the data backed up per host was significantly different: a host in the #1 HDS USP-V solution backed up only ~4TB, whereas a host in the #3 submission had to back up ~9TB.  However, this doesn’t help explain the #8 solution, which only had to back up 5.5TB per host.

How thin provisioning and average space utilization might have affected all this is another question entirely.  RAID 10 was used for all the USP-V configurations, with a 2d+2d disk configuration per RAID group.  The LDEV configuration in the RAID groups was quite different: for #1 and #8 there were two LDEVs, one of 2.99TB and the other of 0.58TB, whereas for #3 there was a single 1.05TB LDEV.  These LDEVs were then added to Dynamic Provisioning pools for database and log files.  (There may be something in how the LDEVs were mapped to V-VOL groups, but I can’t seem to figure it out.)

There is probably something else I’m missing here, but I believe a detailed study of these three HDS USP-V systems’ ESRP performance would illustrate some principles for accelerating Exchange storage performance.

I suppose the obvious points to come away with here are to keep your Exchange databases smaller than your physical disk sizes and don’t overburden your Exchange servers.

~~~~

The full ESRP performance report went out to our newsletter subscribers this past January.  A copy of the full report will be up on the dispatches page of our website sometime next month (if all goes well). However, you can get the full ESRP performance analysis now and subscribe to future newsletters by just sending us an email or using the signup form above right.

For a more extensive discussion of block or SAN storage performance covering SPC-1&-2 (top 30) and ESRP (top 20) results please consider purchasing SCI’s SAN Storage Buying Guide available on our website.

As always, we welcome any suggestions on how to improve our analysis of ESRP results or any of our other storage performance analyses.


Microsoft Exchange Performance, ESRP v3.0 results – chart of the month

(c) 2010 Silverton Consulting, Inc.

There have been a number of Microsoft ESRP submissions this past quarter, especially in the over-5K-mailbox category, which now totals 12 submissions on its own.

The above chart is one of a series of charts from our recent StorInt™ dispatch on Exchange performance.   This chart displays an Exchange email counterpart to last month’s SpecSFS 2008 CIFS ORT chart, only this time depicting the top 10 Exchange database read, write and log latencies (sorted by read latency).

Except for the HP Smart Array (at #4) and Dell PowerVault MD1200 (#7), all the remaining submissions are FC attached subsystems.  The HP Smart Array and Dell exceptions used SAS attached storage.

For some reason the HP Smart Array had an almost immeasurable log write response time (<~0.1msec.) and a very respectable database read response time of 8.4msec.

As log writes are essentially sequential, we would expect a SAS/JBOD to do well here. But the random database reads and writes seem indicative of a well tuned, caching (sub-)system, not a JBOD!?

One secret to good Exchange 2010 JBOD performance appears to be matching your Exchange email database and log LUN size to disk drive size.  This seems to be a significant difference between Dell’s SAS storage and HP’s SAS storage.  For instance, both systems had 15Krpm SAS drives at ~600GB, but Dell’s LUN size was 13.4TB while HP’s database and log LUN size was 558GB.   Database and log LUN size relative to disk size didn’t seem to significantly impact Exchange performance for FC subsystems.
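A quick ratio, computed from the LUN and disk sizes quoted above, shows just how differently the two SAS systems were laid out:

```python
# LUN size relative to disk size for the two SAS submissions, using the
# figures quoted above. A ratio well above 1.0 means each LUN must be
# striped across many drives.

disk_gb = 600  # ~600GB 15Krpm SAS drives in both systems

for system, lun_gb in {"Dell MD1200": 13_400, "HP Smart Array": 558}.items():
    print(f"{system}: LUN/disk ratio ~{lun_gb / disk_gb:.1f}")

# Dell MD1200: ~22.3 (one LUN spans ~22 disks)
# HP Smart Array: ~0.9 (one LUN fits on a single disk)
```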

The other secret to good SAS Exchange 2010 performance is to stick with relatively small mailbox counts.  Both the HP and Dell JBODs had the smallest mailbox counts of this category at 6K and 7.2K respectively.

Exchange database write latency

There appears to be little correlation between read and write latencies in this data.  All of these results used Exchange database resiliency or DAGs (Database Availability Groups), so they had similar types of database activity to contend with. Also, the number of DAGs typically increased with higher mailbox counts, but this wasn’t universal, e.g., the HDS AMS 2100 (#1) with 17.2K mailboxes had four DAGs while the last two IBM XIVs (#9 & #10) with 40K mailboxes had one each.  But the number of database availability groups shouldn’t matter much to Exchange database latencies.

On the other hand, the number of DAG copies may matter to Exchange write performance.  It is unclear how DAG copy writes are measured/simulated in Jetstress, the program used to drive ESRP workloads.   But the number of database copies was either two (#1, 2, 5, 8 & 10) or three (#3, 4, 6, 7 & 9) for all these submissions, with no significant advantage for fewer copies.  So that’s not the answer.

I will take a stand here and say that the high variability between read and write database latencies has something to do with storage (sub-)system caching effectiveness and Exchange 2010’s larger block sizes, but it’s not clear from the available data.   It could easily be an artifact of the limited data available.

Why we like database access latency metrics

In our view, database read latencies correlate well with average Microsoft Exchange user experience for email read/search activities.  Also, log write and database write times can be good substitutes for Exchange Server email send times.  We like to think of database latencies as an end-user view of Exchange email performance.

The full ESRP v3.0 performance report will go up on SCI’s website next month in our dispatches directory.  However, if you are interested in receiving this sooner, just subscribe by email to our free newsletter and we will send you the current issue with download instructions for this and other reports.

Exchange 2010 is just a year old now and everyone is still trying to figure out how to perform well within the new architecture, so I expect some significant revisions to this chart over time.  Nonetheless, the current crop clearly indicates that there is a wide disparity in Exchange storage performance.

As always, we welcome any constructive comments on how to improve our analysis of ESRP results.

ESRP results over 5K mailboxes – chart of the month

ESRP Results, over 5K mailbox, normalized (per 5Kmbx) read and write DB transfers as of 30 October 2009

In our quarterly study of Exchange Solution Reviewed Program (ESRP) results we show a number of charts to get a good picture of storage subsystem performance under Exchange workloads. The two that are of interest to most data centers are the normalized and un-normalized database transfer (DB xfer) charts. The problem with un-normalized DB xfer charts is that the subsystem supporting the largest mailbox count normally shows up best, and the rest of the results are highly correlated to mailbox count. In contrast, the normalized view of DB xfers tends to discount high mailbox counts and shows a more even-handed view of performance.
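For reference, the normalization itself is simple: scale each submission’s database transfers per second to a common 5,000-mailbox basis. A minimal sketch follows (the sample numbers are illustrative, not actual ESRP results):

```python
# Per-5K-mailbox normalization of ESRP database transfers, as used in the
# normalized chart above. Sample values are illustrative only.

def normalize_db_xfers(db_xfers_per_sec: float, mailboxes: int,
                       basis: int = 5_000) -> float:
    """Database transfers/sec scaled to `basis` mailboxes."""
    return db_xfers_per_sec * basis / mailboxes

# A 100,000-mailbox system doing 20,000 DB xfers/sec normalizes to 1,000:
print(normalize_db_xfers(20_000, 100_000))  # -> 1000.0
```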

 

We show above a normalized view of the ESRP results for this category that were available last month. A couple of caveats are warranted here:

  • Normalized results don’t necessarily scale – results shown in the chart range from 5,400 mailboxes (#1) to 100,000 mailboxes (#6). While normalization should allow one to see what a storage subsystem could do at any mailbox count, it is highly unlikely that one would configure the HDS AMS2100 to support 100,000 mailboxes, and equally unlikely that one would configure the HDS USP-V to support 5,400 mailboxes.
  • The higher count mailbox results tend to cluster when normalized – With over 20,000 mailboxes, one can no longer just use one big Exchange server and given the multiple servers driving the single storage subsystem, results tend to shrink when normalized. So one should probably compare like mailbox counts rather than just depend on normalization to iron out the mailbox count differences.

There are a number of storage vendors in this Top 10, but no standouts: the midrange systems from HDS, HP, and IBM seem to hold down the top 5 and the high-end subsystems from EMC, HDS, and 3PAR seem to own the bottom 5 slots.

However, Pillar is fairly unusual in that its 8.5Kmbx result came in at #4 while its 12.8Kmbx result came in at #8. In contrast, the un-normalized results for these two submissions appear essentially the same. Which brings up yet another caveat: when comparing two benchmarks of the same system, normalization may show a difference where none exists.

The full report on the latest ESRP results will be up on our website later this month but if you want to get this information earlier and receive your own copy of our newsletter – just subscribe by emailing us.