Microsoft ESRP database access latencies – chart of the month

sciesrp1030-001
The above chart was included in last month’s SCI e-Newsletter and depicts recent Microsoft Exchange (2013) Solution Reviewed Program (ESRP) results for database access latencies. Storage systems new to this 5000 mailbox and over category include, Oracle, Pure and Tegile. As all of these systems are all flash arrays we are starting to see significant reductions in database access latencies and the #4 system (Nimble Storage) was a hybrid (disk and flash) array.

As you recall, ESRP reports on three database access latencies: read database, write database and log write. All three are shown above but we sort and rank this list based on database read activity alone.

Hard to see above, but reading the ESRP reports, one finds that the top 3 systems had 1.04, 1.06 and 1.07 milliseconds. average read database latencies. So the separation between the top 3 is less than 40 microseconds.

The top 3 database write access latencies were 1.75, 1.62 and 3.07 milliseconds, respectively. So if we were ranking the above on write response times Pure would have come in #1.

The top 3 log write access latencies were 0.67, 0.41 and 0.82 milliseconds and once again if we were ranking based on log response times Pure would be #1.

It’s unclear whether Exchange customers would want to deploy AFAs for their database and log files but these three ESRP reports and Nimble’s show that there should be no problem with the performance of AFAs in these environments.

What about data reduction?

Unclear to me is how much data reduction technologies played in the AFA and hybrid solution ESRP performance. Data reduction advantages would most likely show up in database IOPS counts more so than response times but if present, may still reduce access latencies, as there would be potentially less data to be transferred to-from the backend of the storage system into-out of storage system cache.

ESRP reports do not officially report on a vendor’s data reduction effectiveness, so we are left with whatever the vendor decides to say.

In that respect, Pure FlashArray//m20 indicated in their ESRP report that their “data reduction is significantly higher” than what they see normally (4:1) because Jetstress (ESRP benchmark program) generates lots of duplicated data.

I couldn’t find anything that Tegile T3800 or Nimble Storage said similar to the above, indicating how well their data reduction technologies worked in Jetstress as compared to normal. However, they did make a reference to their compression effectiveness in database size but I have found this number to be somewhat less effective as it historically showed the amount of over provisioning used by disk-only systems and for AFA’s and hybrid – storage, it’s unclear how much is data reduction effectiveness vs. over provisioning.

For example, Pure, Tegile and Nimble also reported a “database capacity utilization” of 4.2%60% and 74.8% respectively. And Nimble did report that over their entire customer base, Exchange data has on average, a 61.2% capacity savings.

So you tell me what was the effective data reduction for their Pure’s, Tegile’s and Nimble’s respective Jetstress runs? From my perspective Pure’s report of 4.2% looks about right (that says that actual database data fit in 4.2% of SSD storage for a ~23.8:1 reduction effectiveness for Jetstress/ESRP data. I find it harder to believe what Tegile and Nimble have indicated as it doesn’t seem to be as believable as they would imply a 1.7:1 and 1.33:1 reduction effectiveness for Jetstress/ESRP data.

Oracle FS1-2 doesn’t seem to have any data reduction capabilities and reported a 100% storage capacity used by Exchange database.

So that’s it, Jetstress uses “significantly reducible” data for some AFAs systems. But in the field the advantage of data reduction techniques are much less so.

I think it’s time that ESRP stopped using significantly reducible data in their Jetstress program and tried to more closely mimic real world data.

Want more?

The October 2016 and our other ESRP reports have much more information on Microsoft Exchange performance. Moreover, there’s a lot more performance information, covering email and other (OLTP and throughput intensive) block storage workloads, in our SAN Storage Buying Guide, available for purchase on our website. More information on file and block protocol/interface performance is also included in SCI’s SAN-NAS Buying Guidealso available from our website. And if your interested in file system performance please consider purchasing our NAS Buying Guide also available on our website.

~~~~

The complete ESRP performance report went out in SCI’s October 2016 Storage Intelligence e-newsletter.  A copy of the report will be posted on our SCI dispatches (posts) page over the next quarter or so (if all goes well).  However, you can get the latest storage performance analysis now and subscribe to future free SCI Storage Intelligence e-newsletters, by just using the signup form in the sidebar or you can subscribe here.

As always, we welcome any suggestions or comments on how to improve our ESRP  performance reports or any of our other storage performance analyses.

 

Microsoft ESRP database transfer performance by storage interface – chart of the month

SCIESRP160728-001The above chart was included in our e-newsletter Microsoft Exchange Solution Reviewed Program (ESRP) performance report, that went out at the end of July. ESRP reports on a number of metrics but one of the more popular is total (reads + writes) Exchange database transfers per second.

Categories reported on in ESRP include: over 5,000 mailboxes; 1001 to 5000 mailboxes; and 1000 and under mailboxes. For the above chart we created our own category using all submissions up to 10,000 mailboxes. Then we grouped the data using the storage  interface between the host Exchange servers and the storage, and only included ESRP reports that had 10 KRPM disk drives.
Continue reading “Microsoft ESRP database transfer performance by storage interface – chart of the month”

Microsoft Exchange database backup performance – chart of the month

Microsoft Exchange 1001-5000 mailboxes, top 10 database backup per server
In last month’s Storage Intelligence newsletter we discussed the latest Exchange storage system performance for 1001 to 5000 mailboxes. One  charts we updated was the above Exchange database backup on a per server basis. The were two new submissions for this quarter, and both the Dell PowerEdge R730xd (#2 above) and the HP D3600 drive shelf with P441 storage controller (#10) ranked well on this metric.

This ESRP reported metric only measures backup throughput at a server level. However, because these two new submissions only had one server, it’s not as much of a problem here.

The Dell system had a SAS connected JBOD with 14-4TB 7200RPM disks and the HP system had a SAS connected JBOD with 11-6TB 7200RPM disks. The other major difference is that the HP system had 4GB of “flash backed write cache” and the Dell system only had 2GB of  “flash backed cache”.

As far as I can tell the fact that the Dell storage managed ~2.3GB/sec. and the HP storage only managed ~1.1GB/sec is probably mostly due to their respective drive configurations than anything else.

RAID 0 vs. RAID 1

One surprising characteristic of the HP setup is that they used RAID 0 while the Dell system used RAID1. This would offer a significant benefit to the Dell system during heavy read activity, but as I understand it, the database backup activity is run with a standard email stress environment. So in this case, there is a healthy mix of reads/writes going on at the time the backup activity. So the Dell system would have an advantage for reads and a penalty for writes (writing two copies of all data). So Dell’s RAID advantage is probably a wash.

Whether RAID 0 vs. RAID 1 would have made any difference to other ESRP metrics (database transfers per second, read/write/log access latencies, log processing, etc.) is subject for another post.

Of course,  with Exchange DAG’s there’s built in database redundancy so maybe RAID 0 is an OK configuration for some customers. Software based redundancy does seem to be Microsoft’s direction, at least since Exchange 2010, so maybe I’m the one that’s out of touch.

Still for such a small configuration I’m not sure I would have gone with RAID 0…

Comments?

Latest Microsoft ESRP ChampionsChart™ for over 5K mailboxes – chart of the month

(c) 2013 Silverton Consulting, Inc., All Rights Reserved
(c) 2013 Silverton Consulting, Inc., All Rights Reserved

The above, from our November 2012 StorInt Performance Dispatch, is another of our ChampionsCharts™ showing optimum storage performance.  This one displays the  Q4-2012, Exchange Solution Reviewed Program champions for the over 5,000 mailbox solutions.

All of SCI’s ChampionsCharts are divided into four quadrants, the best is upper right and is labeled Champions, the next best is upper left and is labeled Marathoners, then Sprinters and finally Slowpokes.

The ESRP chart shows relative database latency across the horizontal axis and relative log playback performance on the vertical access for all ESRP submissions for the over 5,000 mailbox category.  Systems provide better relative log playback performance the higher in the chart they are placed and better relative database latencies the further to the right one goes.

We believe that ESRP performance is a great way to see how well a similarly configured storage system will perform on the more diverse and mixed workloads seen in every day data center application environments.

All SCI ChampionsCharts represent normalized storage performance for the metrics we choose and as such for ESRP indicates that given the hardware used in the submission, Champions performed relatively better than expected in both log playback and database latency.  Marathoner systems performed relatively better in log playback and not as well in database latency.  Similarly, for the Sprinters quadrant, these systems provided relatively better database latency but worse log playback performance. In the Slowpokes quadrant these systems performed relatively worse on both performance measures\.

There are a gaggle of storage systems in the Champions quadrant.  Here we identify the top 5 systems that stand out over all the rest and are readily separated into 3 groups. Specifically;

Group 1 is a single system and is the IBM DS8700, which had both best relative log playback and database latency in their group of storage systems. The IBM DS8700 was configured to support 20,000 mailboxes.

Group 2 represents four storage systems of which two are the HDS USP-V (the previous generation of HDS VSP enterprise class storage) and the other two HDS AMS 2100 storage systems (the previous generation entry-level, mid-range class storage systems from HDS). The two USP-V systems were configured for 96,000 and 32,000 mailboxes respectively. The two AMS 2100 storage systems were configured to support 17,000 and 5,800 mailboxes respectively.

Group 3 represents a single storage system and is the HP4400 EVA configured with 6000 mailboxes.

The November ESRP Performance Dispatch identified top storage system performers for both the 1,001 to 5000 and 1000 and under mailbox categories as well.  More performance information and ChampionCharts for ESRP, SPC-1 and SPC-2 are  available in our SAN Storage Buying Guide, available for purchase on our website.

~~~~

The complete ESRP performance report went out in SCI’s November 2012 newsletter.  But a copy of the report will be posted on our dispatches page sometime this month (if all goes well).  However, you can get the latest storage performance analysis now and subscribe to future free newsletters by just using the signup form above right.

As always, we welcome any suggestions or comments on how to improve our ESRP  performance reports or any of our other storage performance analyses

Latest ESRP results for 1K and under mailboxes – chart of the month

SCIESRP120724(004) (c) 2012 Silverton Consulting, All Rights Reserved

The above chart was from our July newsletter Exchange Solution Reviewed Program (ESRP) performance analysis for 1000 and under mailbox submissions. I have always liked response times as they seem to be mostly the result of tight engineering, coding and/or system architecture.  Exchange response times represent a composite of how long it takes to do a database transaction (whether read, write or log write).  Latencies are measured at the application (Jetstress) level.

On the chart we show the top 10 data base read response times for this class of storage.  We assume that DB reads are a bit more important than writes or log activity but they are all probably important.  As such,  we also show the response times for DB writes and log writes but the ranking is based on DB reads alone.

In the chart above, I am struck by the variability in write and log write performance.  Writes range anywhere from ~8.6 down to almost 1 msec. The extreme variability here begs a bunch of questions.  My guess is the wide variability probably signals something about caching, whether it’s cache size, cache sophistication or drive destage effectiveness is hard to say.

Why EMC seems to dominate DB read latency in this class of storage is also interesting. EMC’s Celerra NX4, VNXe3100, CLARiiON CX4-120, CLARiiON AX4-5i, Iomega ix12-300 and VNXe3300 placed in the top 6 slots, respectively.  They all had a handful of disks (4 to 8), mostly 600GB or larger and used iSCSI to access the storage.  It’s possible that EMC has a great iSCSI stack, better NICs or just better IO scheduling. In any case, they have done well here at least with read database latencies.  However, their write and log latency was not nearly as good.

We like ESRP because it simulates a real application that’s pervasive in the enterprise today, i.e., email.  As such, it’s less subject to gaming, and typically shows a truer picture of multi-faceted storage performance.

~~~~

The complete ESRP performance report with more top 10 charts went out in SCI’s July newsletter.  But a copy of the report will be posted on our dispatches page sometime next month (if all goes well).  However, you can get the ESRP performance analysis now and subscribe to future free newsletters by just using the signup form above right.

For a more extensive discussion of current SAN block system storage performance covering SPC (Top 30) results as well as ESRP results with our new ChampionsChart™ for SAN storage systems, please see SCI’s SAN Storage Buying Guide available from our website.

As always, we welcome any suggestions or comments on how to improve our analysis of ESRP results or any of our other storage performance analyses.


Latest ESRP 1K-5K mailbox DB xfers/sec/disk results – chart-of-the-month

(SCIESRP120429-001) 2012 (c) Silverton Consulting, All Rights Reserved

The above chart is from our April newsletter on Microsoft Exchange 2010 Solution Reviewed Program (ESRP) results for the 1,000 (actually 1001) to 5,000 mailbox category.  We have taken the database transfers per second, normalized them for the number of disk spindles used in the run and plotted the top 10 in the chart above.

A couple of caveats first, we chart disk-only systems in this and similar charts  on disk spindle performance. Although, it probably doesn’t matter as much at this mid-range level, for other categories SSD or Flash Cache can be used to support much higher performance on a per spindle performance measure like the above.  As such, submissions with SSDs or flash cache are strictly eliminated from these spindle level performance analysis.

Another caveat, specific to this chart is that ESRP database transaction rates are somewhat driven by Jetstress parameters (specifically simulated IO rate) used during the run.  For this mid-level category, this parameter can range from a low of 0.10 to a high of 0.60 simulated IO operations per second with a median of ~0.19.  But what I find very interesting is that in the plot above we have both the lowest rate (0.10 in #6, Dell PowerEdge R510 1.5Kmbox) and the highest (0.60 for #9, HP P2000 G3 10GbE iSCSI MSA 3.2Kmbx).  So that doesn’t seem to matter much on this per spindle metric.

That being said, I always find it interesting that the database transactions per second per disk spindle varies so widely in ESRP results.  To me this says that storage subsystem technology, firmware and other characteristics can still make a significant difference in storage performance, at least in Exchange 2010 solutions.

Often we see spindle count and storage performance as highly correlated. This is definitely not the fact for mid-range ESRP (although that’s a different chart than the one above).

Next, we see disk speed (RPM) can have a high impact on storage performance especially for OLTP type workloads that look somewhat like Exchange.  However, in the above chart the middle 4 and last one (#4-7 & 10) used 10Krpm (#4,5) or slower disks.  It’s clear that disk speed doesn’t seem to impact Exchange database transactions per second per spindle either.

Thus, I am left with my original thesis that storage subsystem design and functionality can make a big difference in storage performance, especially for ESRP in this mid-level category.  The range in the top 10 contenders spanning from ~35 (Dell PowerEdge R510) to ~110 (Dell EqualLogic PS Server) speaks volumes on this issue or a multiple of over 3X from top to bottom performance on this measure.  In fact, the overall range (not shown in the chart above spans from ~3 to ~110 which is a factor of almost 37 times from worst to best performer.

Comments?

~~~~

The full ESRP 1K-5Kmailbox performance report went out in SCI’s April newsletter.  But a copy of the full report will be posted on our dispatches page sometime next month (if all goes well). However, you can get the full SPC performance analysis now and subscribe to future free newsletters by just sending us an email or using the signup form above right.

For a more extensive discussion of current SAN or block storage performance covering SPC-1 (top 30)SPC-2 (top 30) and all three levels of ESRP (top 20) results please see SCI’s SAN Storage Buying Guide available on our website.

As always, we welcome any suggestions or comments on how to improve our analysis of ESRP results or any of our other storage performance analyses.


SCI’s Latest ESRP (v3) Performance Analysis for Over 5K mailboxes – chart of the month

Bar chart showing ESRP Top 10 total database backup throughput results
(SCIESRP120125-001) (c) 2012 Silverton Consulting, All Rights Reserved

This chart comes from our analysis of Microsoft Exchange Reviewed Solutions Program (ESRP) v3 (Exchange 2010) performance results for the over 5000 mailbox category, a report sent out to SCI Storage Intelligence Newsletter subscribers last month.

The total database backup throughput reported here is calculated based on the MB read per second per database times the number of databases in a storage configuration. ESRP currently reports two metrics for database backups the first used in our calculation above is the backup throughput on a database basis and the second is backup throughput on a server basis.  I find neither of these that useful from a storage system perspective so we have calculated a new metric for our ESRP analysis which represents the total Exchange database backup per storage system.

I find three submissions (#1, #3 & #8 above) for the same storage system (HDS USP-V) somewhat unusual in any top 10 chart and as such, provides a unique opportunity to understand how to optimize storage for Exchange 2010.

For example, the first thing I noticed when looking at the details behind the above chart is that disk speed doesn’t speed up database throughput.  The #1, #3 and #8 submissions above (HDS USP-V using Dynamic or thin provisioning) had 7200rpm, 15Krpm and 7200rpm disks respectively with 512 disk each.

So what were the significant differences between the USP-V submissions (#1, #3 and #8) aside from mailbox counts and disk speed:

  • Mailboxes per server differed from 7000 to 6000 to 4700 respectively, with the top 10 median = 7500
  • Mailboxes per database differed from 583 to 1500 to 392, with the top 10 median = 604
  • Number of databases per host (Exchange server) differed from 12 to 4 to 12, with the top 10 median = 12
  • Disk size differed from 2TB to 600GB to 2TB, with the top 10 median = 2TB
  • Mailbox size differed from 1GB to 1GB to 3GB, with the top 10 median = 1.0 GB
  • % storage capacity used by Exchange databases differed from 27.4% to 80.0% to 55.1%, with the top 10 median = 60.9%

If I had to guess, the reason the HDS USP-V system with faster disks didn’t backup as well as the #1 system is that its mailbox databases spanned multiple physical disks.  For instance, in the case of the (#3) 15Krpm/600GB FC disk system it took at least 3 disks to hold a 1.5TB mailbox database.  For the #1 performing 7200rpm/2TB SATA disk system, a single disk could hold almost 4-583GB databases on a single disk.  The slowest performer (#8) also with 7200rpm/2TB SATA disks could hold at most 1-1.2TB mailbox database per disk.

One other thing that might be a factor between the #1 and #3 submissions is that the data being backed up per host was significantly different.  Specifically for a host in the #1 HDS USP-V solution they only backed up  ~4TB but for a host in the #3 submission they had to backup ~9TB.   However, this doesn’t help explain the #8 solution, which only had to backup 5.5TB per host.

How thin provisioning and average space utilization might have messed all this up is another question entirely.  RAID 10 was used for all the USP-V configurations, with a 2d+2d disk configuration per RAID group.  The LDEV configuration in the RAID groups was pretty different, i.e., for #1 & #8 there were two LDEVs one 2.99TB and the other .58TB whereas for the #3 there was one LDEV of 1.05TB.  These LDEVs were then added to Dynamic Provisioning pools for database and log files.  (There might be something in how the LDEVs were mapped to V-VOL groups but I can’t seem to figure it out.)

Probably something else I might be missing here but I believe a detailed study of these three HDS USP-V systems ESRP performance would illustrate some principles on how to accelerate Exchange storage performance.

I suppose the obvious points to come away with here are to keep your Exchange databases  smaller than your physical disk sizes and don’t overburden your Exchange servers.

~~~~

The full ESRP performance report went out to our newsletter subscriber’s this past January.  A copy of the full report will be up on the dispatches page of our website sometime next month (if all goes well). However, you can get the full ESRP performance analysis now and subscribe to future newsletters by just sending us an email or using the signup form above right.

For a more extensive discussion of block or SAN storage performance covering SPC-1&-2 (top 30) and ESRP (top 20) results please consider purchasing SCI’s SAN Storage Buying Guide available on our website.

As always, we welcome any suggestions on how to improve our analysis of ESRP results or any of our other storage performance analyses.

Comments?

 

ESRP v3 (Exchange 2010) log playback results, 1Kmbox&under – chart-of-the-month

(SCIESRP111029-003) (c) 2011 Silverton Consulting, All Rights Reserved
(SCIESRP111029-003) (c) 2011 Silverton Consulting, All Rights Reserved

The above chart is from our last Exchange [2010] Solution Review Program (ESRP) performance dispatch released in our October newsletter (sign-up upper right).  The 1K mailbox and under category for ESRP represents Exchange storage solutions for SMB data centers.

As one can see from the above the NetApp FAS2040 has done well but an almost matching result came in from the HP P2000 G3 MSA system.  What’s not obvious here is that the FAS2040 had 8 disks and the P2000 had 78 so there was quite a difference in the spindle counts. The #3&4 runs from EMC VNXe3100 also posted respectable results (within 1sec of top performer) and only had 5 and 7 disks respectively, so they were much more inline with the FAS2040 run.  The median number of drives for this category is 8 drives which probably makes sense for SMB storage solutions.

Why log playback

I have come to prefer a few metrics in the Exchange 2010 arena that seem to me to capture a larger part of the information available from an ESRP report.  The Log Playback metric is one of them that seems to me to fit the bill nicely.  Specifically:

  • It doesn’t depend on the Jetstress IO/rate parameter that impacts the database transfers per second rate.  The log playback is just the average time it takes to playback a 1MB log file against a database.
  • It is probably a clear indicator of how well a storage system (configured matching the ESRP) can support DAG log processing.

In addition, I believe Log Playback is a great stand-in for any randomized database transaction processing. Now I know that Exchange is not necessarily a pure relational database but it does have a significant component of indexes, tables, and sequentiality to it.

My problem is that there doesn’t seem to be any other real database performance benchmark out there for storage.  I know that TPC has a number of benchmarks tailored to database transaction activity but these seem to be more a measure of the database server than the storage.  SPC-2 has some database oriented queries but it’s generally focused on through put and doesn’t really represent randomized database activity and for other reasons it’s not as highly used as SPC-1 or ESRP so there is not as much data to report on.

That leaves ESRP.  For whatever reason (probably the popularity of Exchange), almost everyone submits for ESRP. Which makes it ripe for product comparisons.

Also, there are a number of other good metrics in ESRP results that I feel have general applicability outside Exchange as well.  I will be reporting on them in future posts.

~~~~

Comments?

Sorry, I haven’t been keeping up with our chart-of-the-month posts, but I promise to do better in the future.  I plan to be back in synch with our newsletter dispatches before month end.

The full ESRP performance report for the 1K and under mailbox category went out to our newsletter subscriber’s last October.  A copy of the full report will be up on the dispatches page of our website sometime this month (if all goes well). However, you can get performance information now and subscribe to future newsletters to receive these reports even earlier by just sending us an email or using the signup form above right.

For a more extensive discussion of block storage performance in ESRP (top 20) and SPC-1&-2 (top 30) results please consider purchasing our recently updated SAN Storage Buying Guide available on our website.

As always, we welcome any suggestions on how to improve our analysis of ESRP results or any of our other storage system performance discussions.