Latest Microsoft ESRP ChampionsChart™ for over 5K mailboxes – chart of the month

(c) 2013 Silverton Consulting, Inc., All Rights Reserved

The above chart, from our November 2012 StorInt Performance Dispatch, is another of our ChampionsCharts™ showing optimum storage performance.  This one displays the Q4-2012 Exchange Solution Reviewed Program (ESRP) champions for over-5,000 mailbox solutions.

All of SCI’s ChampionsCharts are divided into four quadrants: the best is the upper right, labeled Champions; the next best is the upper left, labeled Marathoners; then come Sprinters (lower right) and finally Slowpokes (lower left).

The ESRP chart shows relative database latency across the horizontal axis and relative log playback performance on the vertical axis for all ESRP submissions in the over-5,000 mailbox category.  Systems placed higher on the chart provide better relative log playback performance, and systems placed further to the right provide better relative database latencies.

We believe that ESRP performance is a great way to see how well a similarly configured storage system will perform on the more diverse and mixed workloads seen in everyday data center application environments.

All SCI ChampionsCharts represent normalized storage performance for the metrics we choose.  For ESRP this means that, given the hardware used in the submission, Champions performed relatively better than expected in both log playback and database latency.  Marathoner systems performed relatively better in log playback but not as well in database latency.  Sprinters, in contrast, provided relatively better database latency but worse log playback performance.  Slowpoke systems performed relatively worse on both performance measures.

There are a gaggle of storage systems in the Champions quadrant.  Here we identify the top systems that stand out over all the rest, which readily separate into three groups.  Specifically:

Group 1 is a single system, the IBM DS8700, which had both the best relative log playback and the best relative database latency among this group of storage systems. The IBM DS8700 was configured to support 20,000 mailboxes.

Group 2 consists of four storage systems: two HDS USP-V systems (the previous generation of HDS VSP enterprise class storage) and two HDS AMS 2100 systems (HDS’s previous generation entry-level, mid-range class storage). The two USP-V systems were configured for 96,000 and 32,000 mailboxes respectively, and the two AMS 2100 systems were configured to support 17,000 and 5,800 mailboxes respectively.

Group 3 is a single storage system, the HP4400 EVA, configured for 6,000 mailboxes.

The November ESRP Performance Dispatch identified top storage system performers for the 1,001 to 5,000 and 1,000 and under mailbox categories as well.  More performance information and ChampionsCharts for ESRP, SPC-1 and SPC-2 are available in our SAN Storage Buying Guide, available for purchase on our website.

~~~~

The complete ESRP performance report went out in SCI’s November 2012 newsletter.  A copy of the report will be posted on our dispatches page sometime this month (if all goes well).  However, you can get the latest storage performance analysis now and subscribe to future free newsletters by using the signup form above right.

As always, we welcome any suggestions or comments on how to improve our ESRP performance reports or any of our other storage performance analyses.

New deduplication solutions from Sepaton and NEC

In the last few weeks both Sepaton and NEC have announced new data deduplication appliance hardware and, for Sepaton at least, new functionality. Both vendors compete against solutions from EMC Data Domain, IBM ProtecTIER, HP StoreOnce and others.

Sepaton v7.0 Enterprise Data Protection

From Sepaton’s point of view, data growth is exploding with little increase in organizational budgets, system environments are becoming more complex, and data risks are expanding, not shrinking. To address these challenges Sepaton has introduced a new version of their hardware appliance with new functionality to help address the rising data risks.

Their new S2100-ES3 Series 2925 Enterprise Data Protection Platform with the latest Sepaton software now supports up to 80 TB/hour of cluster data ingest (presumably with Symantec OST) and up to 2.0 PB of raw storage in an 8-node cluster. The new appliances support four 8Gbps FC and two 10GbE host ports per node and are based on HP DL380p Gen8 servers with dual 8-core, 2.9GHz Intel Xeon E5-2690 processors, 128 GB of DRAM and a new high performance compression card from EXAR. With the bigger capacity and faster throughput, enterprise customers can now support large backup data streams with fewer appliances, reducing complexity and maintenance/licensing fees. S2100-ES3 Platforms can scale from 2 to 8 nodes in a single cluster.

The new appliance supports data-at-rest encryption for customer data security as well as data compression, both of which are hardware based, so there is no performance penalty. Data encryption is an optional licensed feature and uses OASIS KMIP 1.0/1.1 to integrate with RSA, Thales and other KMIP-compliant enterprise key management solutions.

NEC HYDRAstor Gen 4

With Gen4, HYDRAstor introduces a new Hybrid Node that packs both the logic of an accelerator node and the capacity of a storage node into one 2U rackmount server. Before the hybrid node, similar capacity and accessibility would have required 4U of rack space: 2U for the accelerator node and another 2U for the storage node.

The HS8-4000 HN supports ingest rates of 4.9TB/hr standard, or 5.6TB/hr per node with NetBackup OST IO express, and holds twelve 4TB 3.5in SATA drives, for up to 48TB of raw capacity per node. NEC has also introduced the HS8-4000 SN, which consists of just the 48TB of additional storage capacity. Gen4 is the first use of 4TB drives we have seen anywhere and quadruples raw capacity per node over the Gen3 storage nodes. HYDRAstor clusters can scale from 2 to 165 nodes and performance scales linearly with the number of cluster nodes.

With the new HS8-4000 systems, maximum capacity for a 165-node cluster is now 7.9PB raw, and an all-HS8-4000 HN cluster supports up to 920.7 TB/hr of ingest (almost a PB/hr, need to recalibrate my units). Of course, how many customers need a PB/hr of backup ingest is another question. Let alone 7.9PB of raw storage, which of course gets deduplicated to an effective capacity of over 100PB of backup data (or 0.1EB, units change again).
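For those who like to check the arithmetic, here is a rough back-of-the-envelope calculation (my numbers, not NEC’s; the small gap between the naive 924 TB/hr and the quoted 920.7 TB/hr presumably reflects some per-cluster overhead):

```python
# Back-of-the-envelope check of the Gen4 cluster numbers quoted above.
# Assumes an all-HS8-4000 HN cluster; per-node figures come from the post.
nodes = 165
raw_tb_per_node = 48         # twelve 4TB drives per node
ingest_tb_hr_per_node = 5.6  # with NetBackup OST IO express

raw_pb = nodes * raw_tb_per_node / 1000.0     # ~7.9 PB raw
ingest_tb_hr = nodes * ingest_tb_hr_per_node  # ~924 TB/hr, vs. the quoted 920.7

# "over 100PB" of effective backup capacity implies roughly this dedupe ratio
implied_dedupe_ratio = 100_000 / (nodes * raw_tb_per_node)  # ~12.6:1

print(f"raw: {raw_pb:.1f} PB, ingest: {ingest_tb_hr:.0f} TB/hr, "
      f"implied dedupe ratio: {implied_dedupe_ratio:.1f}:1")
```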

NEC has also introduced a new low-end appliance, the HS3-410, for remote/branch office environments, with 3.2TB/hr of ingest and up to 24TB of raw storage. It is only available as a single-node system.

~~~~
Maybe Facebook could use a 0.1EB backup repository?

Image: Intel Team Inside Facebook Data Center by IntelFreePress


Cold hands, better exercise

Read an article a couple of weeks ago on Stanford researchers who are developing a glove that can warm or cool someone’s hands (see Stanford’s cooling glove research).  They found that the palm is one of the easiest places to warm or cool a body.

Originally they were researching how bears cooled themselves during summer and discovered that certain areas of their skin were optimized for cooling.  These patches of skin had more blood vessels than necessary for nutrient delivery and seemed to be optimized for blood flow and bodily thermal management. In the case of the bears they were studying, these thermal control areas happened to be their palms and foot pads.

They next took their idea, created a crude prototype of a warming glove and used it to warm up patients after surgery.  This usually takes the better part of 2-3 hours, but with the warming glove on, they were able to warm patients in 8-10 minutes instead of hours.

It appears that all mammals have a built-in cooling mechanism; for some it’s the ears (rabbits), for others it’s the tongue (dogs), and for humans and other primates it’s the palms and foot pads.  These areas are used primarily for bodily thermal management and serve essentially as a way to cool off hairy mammals such as ourselves.

But why a cooling glove?

Unclear to me where they got the idea, but somehow they discovered that one thing limiting exercise intensity is the overheating of muscle tissue.  As muscles are exercised they warm up, and as an enzyme used by muscles to generate energy heats up, it breaks down and works less effectively, providing a built-in safety switch against muscle overheating.  It turns out that heat is a key factor limiting muscle recovery and inducing muscle fatigue.

What the researchers found was that cooling the palms after exercise allowed a person to continue to work at their maximum level without fatigue or degradation. In the case of pull-ups, they found that a person who was properly cooled could continue to do their maximum pull-ups time after time, without any reduction in reps.  Indeed, one gym rat was able to work up from a maximum of 160 pull-ups to a maximum of 620 pull-ups in just six weeks. “Better than steroids” and legal.

So the glove is being developed that can be used to cool athletes down. Prototypes are currently in use by Stanford’s football team, the Oakland Raiders, the San Francisco 49ers, and Manchester United.

Other potential applications

This research suggests many more possibilities than just a cooling glove.  For example:

  • Bicycles – handlebars that, instead of being insulated, were bare metal and perforated to increase air flow and cool down the palms of bicycle racers
  • Weight machines – hand holds suffused with liquid and attached to some sort of radiator device that would cool the liquid, and thereby the hands
  • Barbell refrigerators – similar to the above, only keeping the bars in a refrigerator so they stay cool enough to lower the temperature of weight lifters’ palms as they lift.  Ditto for dumbbells, kettlebells, medicine balls, etc.
  • Treadmills – with an onboard cooling mechanism to cool the hands of people using them. Ditto for rowing machines, NordicTracks, ellipticals, etc.
  • Tennis rackets – with perforated, uninsulated handles that could cool down tennis players’ hands. Ditto for racquetball rackets, squash rackets, etc.
  • Baseball bats, golf clubs, hockey sticks, lacrosse sticks, etc. – essentially any other sporting equipment with a hand-held artifact could be improved by some sort of built-in cooling mechanism, or worst case, a cabinet (bag), etc., which could cool these to the proper temperature.

I suppose one key to any of this is what the proper cooling temperature is and whether any of these sports cause some (any) muscles to overheat.  Perforations could be tailored to reach the proper cooling temperature for the sport and the speed with which the artifact is moved.  Probably another reason for running barefoot or using Vibram FiveFingers shoes, as they seem to cool down the foot pads.

I find myself looking for places to cool my palms between workout sets to reduce fatigue. It may be only psychosomatic, because it’s certainly not scientifically controlled, but it seems to be helping.  Also, when I run/jog nowadays I do so with an open hand rather than a closed fist, hoping that this helps cool me down.

~~~~

I read this a while back and couldn’t stop thinking about all the possibilities inherent in their research. Yes, a glove is probably a great, portable and universal way to cool people during exercise, but there are so many other possibilities that could be employed even more easily to the same effect.

Image Credits: bangkok by Roberto Trm; bunch o racquets by dennis

Thinly provisioned compute clouds

Thin provisioning has been around in storage since StorageTek’s Iceberg hit the enterprise market in 1995.  However, thin provisioning has never taken off for system servers or virtual machines (VMs).

But recently a paper out of MIT, Making cloud computing more efficient, discusses some research that came up with the idea of monitoring system activity to model and predict application performance.

So how does this enable thinly provisioned VMs?

With a model like this in place, one could conceivably provide a thinly provisioned virtual server that guarantees a QoS while still minimizing resource consumption.  For example, have the application VM consume just the resources it needs at any instant in time, adjusted as demands on the system change.  Thus, as an application’s needs grew, more resources could be supplied, and as its needs shrank, resources could be given up for other uses.
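As a toy illustration (with invented numbers) of the difference between fat provisioning for the peak and thin provisioning that follows demand:

```python
# Toy illustration of a thinly provisioned VM: vCPUs follow predicted demand
# rather than being pre-allocated for the peak. All numbers are made up.
demand_vcpus = [2, 2, 3, 6, 10, 7, 3, 2]  # predicted vCPUs needed per interval
HEADROOM = 1.2                            # safety margin over the prediction

fat_provisioned = max(demand_vcpus)       # configured for peak, held at all times
thin_allocated = [min(max(round(d * HEADROOM), 1), fat_provisioned)
                  for d in demand_vcpus]  # grows and shrinks with demand

print("fat :", [fat_provisioned] * len(demand_vcpus))
print("thin:", thin_allocated)
# Cores the thin VM isn't holding in any interval are free for other tenants.
```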

With this sort of server QoS, certain classes of application VMs would need to have variable or no QoS guarantees, so that they could be sacrificed in times of need to those requiring guaranteed QoS. But in a cloud service environment, a multiplicity of service classes like these could be offered at different price points.

Thin provisioning grew up in storage because it’s relatively straightforward for a storage subsystem to understand capacity demands at any instant in time.  A storage system only needs to monitor data write activity: if a data block has been written, it is backed by real storage; if it has never been written, it’s relatively easy to fabricate a block of zeros should it ever be read.
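Here’s a minimal sketch of that mechanism, a toy thinly provisioned volume (not any vendor’s actual implementation) that backs blocks only on first write and fabricates zeros for blocks that were never written:

```python
# Toy thinly provisioned volume: physical space is consumed only when a
# block is first written; reads of never-written blocks return zeros.
BLOCK_SIZE = 4096

class ThinVolume:
    def __init__(self, logical_blocks):
        self.logical_blocks = logical_blocks
        self.backing = {}                      # block number -> real data

    def write(self, block_no, data):
        self.backing[block_no] = data          # capacity consumed on first write

    def read(self, block_no):
        # Never-written blocks are fabricated as zeros; no storage is needed
        return self.backing.get(block_no, b"\x00" * BLOCK_SIZE)

    def allocated_bytes(self):
        return len(self.backing) * BLOCK_SIZE  # what the array actually uses

vol = ThinVolume(logical_blocks=1_000_000)     # ~4GB of logical capacity
vol.write(42, b"x" * BLOCK_SIZE)
print(vol.allocated_bytes())                   # 4096 bytes backed, not 4GB
```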

Prior to thin provisioning, fat provisioned storage had to be configured for the maximum capacity required of it. Similarly, fully (or fat) provisioned VMs must be configured for peak workloads. With the advent of thin provisioning on storage, otherwise wasted resources (capacity in the case of storage) could be shared across multiple thinly provisioned volumes (LUNs), freeing up those resources for other users.

Problems with server thin provisioning

I see some potential problems with the model and my assumptions as to how a thinly provisioned VM would work. First, the modeled performance is a lagging indicator at best.  Just as system transactions start to get slower, the hypervisor would need to interrupt the VM to add more physical (or virtual) resources.  Naturally, during the interruption system performance would suffer.

It would be helpful if resources could be added to a VM dynamically, in real time, without impacting the applications running in the VM. But it seems to me that adding physical or virtual CPU cores, memory, bandwidth, etc., to a VM would require at least some sort of interruption to a pair of VMs: the one giving up the resource(s) and the one gaining the freed-up resource(s).

Similar issues occur for thinly provisioned storage. As storage is consumed for a thinly provisioned volume, allocating more physical capacity takes some amount of storage subsystem resources and time to accomplish.

How does the model work?

It appears that the software model works by predicting system performance based on a limited set of measurements. Indeed, their model is bi-modal; that is, there are two approaches:

  • Black box model – tracks server or VM indicators such as “number and type of user requests” as well as system performance, and uses AI to correlate the two (a minimal sketch of this idea appears after this list). This works well for moderate fluctuations in demand but doesn’t help when requests for services fall beyond those boundaries.
  • Grey box model – is more sophisticated and is based on an understanding of a specific database’s functionality, such as how frequently it flushes host buffers, commits transactions to disk logs, etc.  In this case, they are able to predict system performance when demand peaks at 4X to 400X current system requirements.
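As a minimal sketch of the black box idea (purely illustrative: invented data, Python 3.10’s statistics.linear_regression, and not the researchers’ actual code):

```python
# Black-box sketch: correlate an observed workload indicator (requests/sec)
# with a performance measure (CPU utilization) and use the fit to predict
# resource needs. The data here is invented.
import statistics

requests_per_sec = [100, 200, 300, 400, 500]
cpu_utilization  = [12, 23, 33, 45, 55]   # percent, measured alongside

# Ordinary least-squares fit of cpu = slope * requests + intercept
slope, intercept = statistics.linear_regression(requests_per_sec, cpu_utilization)

def predict_cpu(rps):
    return slope * rps + intercept

print(f"predicted CPU at 450 req/s: {predict_cpu(450):.0f}%")
# Works within the range it was trained on; for demand far outside those
# boundaries the grey box (database-aware) model is needed.
```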

They have implemented the grey box model for MySQL and are in the process of doing the same for PostgreSQL.

Model validation and availability

They tested their prediction algorithm against published TPC-C benchmark results and were able to come within 80% accuracy for CPU use and 99% accuracy for disk bandwidth consumption.

It appears that the team has released their code as open source. At least one database vendor, Teradata, is porting it over to their own database machine to better allocate physical resources to data warehouse queries.

It seems to me that this would be a natural fit for cloud compute providers and even more important for hypervisor solutions such as vSphere, Hyper-V, etc.  Anywhere one could use more flexibility in assigning virtual or physical resources to an application or server would find use for this performance modeling.

~~~~

Now, if they could just do something to help create thinly provisioned highways, …

Image: Intel Team Inside Facebook Data Center by IntelFreePress

The shrinking low-end

I was updating my features list for my SAN Buying Guide the other day when I noticed that low-end storage systems were getting smaller.

That is, NetApp, HDS and others have recently reduced the number of drives they support on their newest low-end storage systems (e.g., see specs for HDS HUS-110 vs. AMS 2100 and NetApp FAS2220 vs. FAS2040). And as the number of drives determines system capacity, the size of their SMB storage is shrinking.

But what about the data deluge?

With the data explosion going on, data growth in most IT organizations is running at something like 65%.  But these problems seem to be primarily in larger organizations or in data warehouse databases used for operational analytics.  In the case of analytics, the work is typically done on database machines or Hadoop clusters and doesn’t use low-end storage.

As for larger organizations, the most recent storage systems all seem to be flat to growing in capacity, not shrinking. So, the shrinking capacity we are seeing in new low-end storage doesn’t seem to be an issue in these other market segments.

What else could explain this?

I believe the introduction of SSDs is changing the drive requirements for low-end storage.  In the past, prior to SSDs, organizations would often over-provision their storage, buying more disk spindles than the capacity required, just to generate better IO performance.

But with most low-end systems now supporting SSDs, over-provisioning is no longer an economical way to increase performance.  For those needing higher IO performance, the most economical solution (CapEx and OpEx) is to buy a small amount of SSD capacity in conjunction with the remaining storage in disk capacity.

That and the finding that maybe SMB data centers don’t need as much disk storage as was originally thought.

The downturn begins

So this is the first downturn in capacity to come along in my long history with data storage.  Never before have I seen capacities shrink in new versions of storage systems designed for the same market space.

But if SSDs are driving the reduction in SMB storage systems, shouldn’t we start to see the same trends in mid-range and enterprise class systems?

But disk enclosure re-tooling may be holding these system capacities flat.  It takes time, effort and expense to re-implement disk enclosures for storage systems.  And as the reductions we are seeing in the low-end are not that significant, maybe it’s just not worth it for these other systems – just yet.

But it would be useful to see something that showed the median capacity shipped per storage subsystem. I suppose weighted averages are available from something like IDC disk system shipments and overall capacity shipped. But there’s no real way to derive a median from these measures, and I think that’s the only statistic that might show how this trend is being felt in other market segments.
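As a quick illustration of why an average can’t stand in for the median here (the shipment numbers below are invented):

```python
# Why average capacity shipped per system can hide what's happening at the
# low end: a few very large systems dominate the mean. Numbers are invented.
import statistics

capacities_tb = [6, 8, 10, 12, 12, 14, 16, 500, 800, 1200]  # per-system shipments

print("mean:  ", statistics.mean(capacities_tb), "TB")    # ~258 TB, pulled up by the big boxes
print("median:", statistics.median(capacities_tb), "TB")  # 13 TB, the typical system
```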

Comments?

Image credit: Photo of Dell EqualLogic PSM4110 Blade Array disk drawer, taken at Dell Storage Forum 2012


Latest SPECsfs2008 results NFS vs. CIFS – chart-of-the-month

SCISFS121227-010(001) (c) 2013 Silverton Consulting, Inc. All Rights Reserved

We return to our perennial quest to understand file storage system performance and our views on NFS vs. CIFS performance.  As you may recall, SPECsfs2008 believes that there is no way to compare the two protocols because

  • CIFS/SMB is “stateful” and NFS is “stateless”
  • The two protocols are issuing different requests.

Nonetheless, I feel it’s important to go beyond these concerns and see if there is any way to assess the relative performance of the two protocols.  But first a couple of caveats on the above chart:

  • There are 25 CIFS/SMB submissions, most of them for small and medium business environments, vs. 64 NFS submissions which are all over the map
  • There are about 12 systems that have submitted the exact same configurations for both CIFS/SMB and NFS SPECsfs2008 benchmarks.
  • This chart does not include any SSD or FlashCache systems, just disk drive only file storage.

All that being said, let us now see what the plot has to tell us. First, the regression lines are computed by Excel and are linear regressions.  The regression coefficient for CIFS/SMB, at 0.98, is much better than the 0.80 for NFS. But this just means that there is a better correlation between CIFS/SMB throughput operations per second and the number of disk drives in the benchmark submission than is seen for NFS.
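For anyone who wants to run this sort of comparison on their own data, here is a rough sketch of the fit and correlation calculation; the data points below are placeholders, not actual SPECsfs2008 submissions:

```python
# Fit throughput ops/sec against number of disk drives for each protocol and
# compare slope (ops/sec per drive) and R^2. Data points are placeholders.
import numpy as np

def fit(drives, ops):
    drives, ops = np.asarray(drives, float), np.asarray(ops, float)
    slope, intercept = np.polyfit(drives, ops, 1)   # linear regression
    predicted = slope * drives + intercept
    ss_res = np.sum((ops - predicted) ** 2)
    ss_tot = np.sum((ops - ops.mean()) ** 2)
    return slope, 1 - ss_res / ss_tot                # slope, R^2

cifs_slope, cifs_r2 = fit([24, 48, 96, 192], [12000, 24500, 47000, 95000])
nfs_slope,  nfs_r2  = fit([24, 48, 96, 192], [ 9000, 20000, 35000, 78000])
print(f"CIFS/SMB: {cifs_slope:.0f} ops/s per drive, R^2 = {cifs_r2:.2f}")
print(f"NFS:      {nfs_slope:.0f} ops/s per drive, R^2 = {nfs_r2:.2f}")
```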

Second, the equations and slopes of the two lines are a clear indicator that CIFS/SMB provides more throughput operations per second per disk than NFS. What this tells me is that, given the same hardware and all things being equal, the CIFS/SMB protocol should perform better than the NFS protocol for file storage access.

Just for the record, the CIFS/SMB version used by SPECsfs2008 is currently SMB2 and the NFS version is NFSv3.  SMB3 was just released last year by Microsoft, and there aren’t many vendors (other than Windows Server 2012) that support it in the field yet; SPECsfs2008 has yet to adopt it as well.   NFSv4 has been out since 2000, but SPECsfs2008 and most vendors never adopted it.  NFSv4.1 came out in 2010 and has seen little adoption to date.

So these results are based on older, but current versions of both protocols available in the market today.

So, given all that, if I had an option I would run CIFS/SMB protocol for my file storage.

Comments?

More information on SPECsfs2008 performance results as well as our NFS and CIFS/SMB ChampionsCharts™ for file storage systems can be found in our NAS Buying Guide available for purchase on our web site.

~~~~

The complete SPECsfs2008 performance report went out in SCI’s December newsletter.  A copy of the report will be posted on our dispatches page sometime this month (if all goes well).  However, you can get the latest storage performance analysis now and subscribe to future free newsletters by using the signup form above right.

As always, we welcome any suggestions or comments on how to improve our SPECsfs2008 performance reports or any of our other storage performance analyses.