Storage changes in vSphere 5.5 announced at VMworld 2013

Pat Gelsinger, VMworld2013 Keynote, vSphere 5.5 storage changesVMworld2013 is going on in San Francisco this week. The big news is the roll out of network virtualization in NSX and vCloud Hybrid Service (vCHS) but there were a few tidbits in the storage arena worth discussing.

  • Virtual SAN public beta – VSAN was released as a public beta and customers can now download a copy of VSAN from www.vsanbeta.com. VSAN will construct a pool of storage out of local attached disks and flash across two or more hosts. It uses the flash as a read-write cache for the local disks. With VSAN customers can elect to have multiple tiers of storage be supported within a single VSAN pool, as well as support different availability (replication) levels, and some other, select characteristics. VSAN can easily scale in performance and capacity by just adding more hosts that have local storage. Now all that stranded local storage and flash server level resources can be used as a VM storage pool. VMware stated that they see VSAN as usefull for tier 2/tier 3 application storage and/or backup-archive storage uses. However they showed one chart with a View Planner application simulation using a 3-host VSAN (presumably with lots of SSD and disk storage) compared against an all-flash array (vendor unknown). In this benchmark the VSAN exactly matched the all-flash external storage in performance (VMs supported). [late update] Lot’s of debate on what VSAN means to enterprise storage but it appears to be a limited in scope and mainly focused on SMB applications.  Chad Sakac did a (real) lengthy post on EMC’s perspective on VSAN and Software Defined Storage if you want to know more check it out.
  • Virsto – VMware announced GA of Virsto which uses any external storage and creates a new global storage pool out of them. Apparently, it maps a log structured file system across the external SAN storage. By doing this it sequentializes all the random write IO coming off of ESX hosts. It supports thin provisioning, snapshot and read-write clones. One could see this as almost a write cache for VM IO activity but read IOs are also by definition spread across (extremely wide striped) across the storage pool which should improve read performance as well. You configure external storage as normal and present those LUNs to Virsto which then converts that storage pool into “vDisks” which can then be configured as VM storage. Probably more to see here but it’s available today. Before acquisition one had to install Virsto into each physical host that was going to define VMs using Virsto vDisks. It’s unclear how much Virsto has been integrated into the hypervisor but over time one would assume like VSAN this would be buried underneath the hypervisor and be available to any vSphere host.
  • vSphere Flash Read Cache – customers with PCIe flash cards and vCenter Ops Manager, can now use them to support a read cache for data access. vSphere Flash Read Cache is apparently vmotion aware such that as you move VMs from one ESX host to another the read cache buffer will move with it. Flash Read Cache is transparent to the VMs and can be assigned on a VMDK basis.
  • vSphere 5.5 low-latency support – unclear what VMware actually did but they now claim vSphere 5.5 now supports low latency applications, like FinServ apps. They claim to have reduced the “jitter” or variability in IO latency that was present in previous versions of vSphere. Presumably they shortened the IO and networking paths through the hypervisor which should help.  I suppose if you have a VMDK which ends up on an SSD storage someplace one can have a more predictable response time. But the critical question is how much overhead does the hypervisor IO path add to the base O/S. When all-flash arrays now sporting latencies under 100 µsecs, adding another 10 or 100 µsecs can make a big difference. In VMware’s quest to virtualize any and all mission critical apps, low-latency apps are one of the last bastions of physical server apps left to conquer. Consider this a step to accommodate them.
  • vVols – VMware keeps talking about vVols as an attempt to extend their VSAN “policy driven control plane” functionality out to networked storage but there’s still no GA yet. The (VASA 2 or vVol) spec’s seem to be out for awhile now, and I have heard from at least two “major” vendors that they have support in place today but VMware still isn’t announcing formal availability yet. Unclear what the hold up is, but maybe the spec’s are more in a state of flux than what’s depicted externally.

Most of this week was spent talking about NSX, VMware’ network virtualization and vCloud Hybrid Services. When they flashed the list of NSX partners on the screen Cisco was absent. Not sure what this means but perhaps there’s some concern that NSX will take revenue away from Cisco.

As for vCHS apparently this is a VMware run public cloud with two now expanding to three data centers in US, that customers can use to support their own hybrid cloud services. VMware announced that SAVVIS is now offering vCHS services as well as VMware with data centers in NY and Chicago.  There was some talk about vCHS offering object storage services like Amazon’s S3 but there was nothing specific about when. [Late update] Pat did mention that a future offering will provide DR-as-a-Service using vCHS as a target for SRM. That seems to be matching what Microsoft seems to be planning for Azzure and Hyper-V DR.

That’s about it as far as I can tell. Didn’t hear any other news on storage changes in vSphere 5.5. But this is the year of network virtualization. Can’t wait to see what they roll out next year.

Peak server, the cloud & NetApp storage for AWS

I was at a conference a month or so ago and one speaker mentioned that the number of x86 servers being sold has peaked and is dropping. I can imagine a number of reasons for this and the main one being server virtualization. But this speaker had a different view and it seemed to be the cloud.

Peak server is here.

He said that three companies were purchasing over 1/2 the x86 servers these days. I feel that there should be at least four Google, Facebook, Amazon & Microsoft and maybe five, if you add in Apple.

Something has happened over the past year or so. Enterprise IT has continued along its merry way but the adoption of cloud services is starting to take off.

I have seen this before, with mainframes, then mini-computers, and now client-server. Minicomputers came out and were so easy to use and develop/deploy applications on, that people stopped creating new apps on the mainframe. Mainframes never died out, and probably have never really stopped shipping increasing MIPS every year. But the share of WW MIP installations for mainframes has been shrinking for decades and have never got going again.

Ultimately, the proprietary minicomputer was just a passing fad and only lasted about 25 years or so. It was wounded by the PC, and then killed off by proprietary Unix workstations.

Then it happened again, the new upstart this time was Windows Server and Linux. Once again it was just easier to build apps on these new and cheaper servers, than any of the older Unix servers. Of course there’s still plenty of business in proprietary Unix servers, but again I would venture to say that their share of WW installed MIPS has been shrinking for a long time.

Nowadays, the cloud is mortally wounding the server market. Server virtualization is helping a lot but it’s also enabling the cloud to eliminate many physical server sales. This is because new applications, new IT environments are being ported/moved/deployed onto the cloud.

Peak server means less enterprise networking, storage and server hardware

In this new, cloud world, customers need less servers, less networking and less enterprise class storage. Yes not every application is suitable to cloud deployment but that’s why there’s still mainframes, still Unix servers, and a continuing need for standalone, physical or virtual x86 servers in the enterprise. But their share of MIPs will start shrinking soon if it hasn’t already.

Ok, so enterprise data center share of MIPs will start shrinking vis a vis cloud data centers. But what happens to networking and storage. My view is that networking becomes software defined and there’s a component of that which operates on special purpose hardware. This will increase in shipments but the more complex, enterprise class networking equipment will flatline and never see any more substantial growth.

And up until yesterday I felt much the same about enterprise class storage. Software defined storage in my future, DAS and SSDs for the capacity and the smarts exist in software if at all. Today, most of the cloud and many service providers have been moving off enterprise class storage and onto DAS.

NetApp’s new enterprise storage in AWS

But yesterday I heard about NetApp private storage for the cloud. This is a configuration of NetApp storage installed in a CoLo facility with a “direct connection” to Amazon compute cloud. In this way, enterprise customers can maintain data stewardship/ownership/governance over their data while at the same time deploying applications onto AWS compute cloud.

This seems to be one of the sticking points to enterprise customers adopting the cloud. By having (data) storage owned lock/stock&barrel by the enterprise it seems much easier and less risky to deploy new and old applications to the cloud.

Whether this pans out and can provide enough value to cover the added expense of the enterprise class storage, only the market can decide. But this is the first time I can remember, where any vendor has articulated a role for enterprise class storage in the cloud. Let’s hope it works.

Image: PDP8/s by ajmexico

Windows Server 2012 R2 storage changes announced at TechEd

Microsoft TechEd Trends driving IT todayMicrosoft TechEd USA is this week and they announced a number of changes to the storage services that come with Windows Server 2012 R2

  • Azure DRaaS – Microsoft is attempting to democratize DR by supporting a new DR-as-a-Service (DRaaS).  They now have an Azure service that operates in conjunction with Windows Server 2012 R2 that provides orchestration and automation for DR site failover and fail back to/from remote sites.  Windows Server 2012 R2 uses Hyper-V Replica to replicate data across to the other site. Azure DRaaS supports DR plans (scripts) to identify groups of Hyper-V VMs which need to be brought up and their sequencing. VMs within a script group are brought up in parallel but different groups are brought up in sequence.  You can have multiple DR plans, just select the one to execute. You must have access to Azure to use this service. Azure DR plans can pause for manual activities and have the ability to invoke PowerShell scripts for more fine tuned control.  There’s also quite a lot of setup that must be done, e.g. configure Hyper-V hosts, VMs and networking at both primary and secondary locations.  Network IP injection is done via mapping primary to secondary site IP addresses. The Azzure DRaaS really just provides the orchestration of failover or fallback activity. Moreover, it looks like Azure DRaaS is going to be offered by service providers as well as private companies. Currently, Azure’s DRaaS has no support for SAN/NAS replication but they are working with vendors to supply an SRM-like API to provide this.
  • Hyper-V Replica changes – Replica support has been changed from a single fixed asynchronous replication interval (5 minutes) to being able to select one of 3 intervals: 15 seconds; 5 minutes; or 30 minutes.
  • Storage Spaces Automatic Tiering – With SSDs and regular rotating disk in your DAS (or JBOD) configuration , Windows Server 2012 R2 supports automatic storage tiering. At Spaces configuration time one dedicates a certain portion of SSD storage to tiering.  There is a scheduled Windows Server 2012 task which is then used to scan the previous periods file activity and identify which file segments (=1MB in size) that should be on SSD and which should not. Then over time file segments are moved to an  appropriate tier and then, performance should improve.  This only applies to file data and files can be pinned to a particular tier for more fine grained control.
  • Storage Spaces Write-Back cache – Another alternative is to dedicate a certain portion of SSDs in a Space to write caching. When enabled, writes to a Space will be cached first in SSD and then destaged out to rotating disk.  This should speed up write performance.  Both write back cache and storage tiering can be enabled for the same Space. But your SSD storage must be partitioned between the two. Something about funneling all write activity to SSDs just doesn’t make sense to me?!
  • Storage Spaces dual parity – Spaces previously supported mirrored storage and single parity but now also offers dual parity for DAS.  Sort of like RAID6 in protection but they didn’t mention the word RAID at all.  Spaces dual parity does have a write penalty (parity update) and Microsoft suggests using it only for archive or heavy read IO.
  • SMB3.1 performance improvements of ~50% – SMB has been on a roll lately and R2 is no exception. Microsoft indicated that SMB direct using a RAM DISK as backend storage can sustain up to a million 8KB IOPS. Also, with an all-flash JBOD, using a mirrored Spaces for backend storage, SMB3.1 can sustain ~600K IOPS.  Presumably these were all read IOPS.
  • SMB3.1 logging improvements – Changes were made to SMB3.1 event logging to try to eliminate the need for detail tracing to support debug. This is an ongoing activity but one which is starting to bear fruit.
  • SMB3.1 CSV performance rebalancing – Now as one adds cluster nodes,  Cluster Shared Volume (CSV) control nodes will spread out across new nodes in order to balance CSV IO across the whole cluster.
  • SMB1 stack can be (finally) fully removed – If you are running Windows Server 2012, you no longer need to install the SMB1 stack.  It can be completely removed. Of course, if you have some downlevel servers or clients you may want to keep SMB1 around a bit longer but it’s no longer required for Server 2012 R2.
  • Hyper-V Live Migration changes – Live migration can now take advantage of SMB direct and its SMB3 support of RDMA/RoCE to radically speed up data center live migration. Also, Live Migration can now optionally compress the data on the current Hyper-V host, send compressed data across the LAN and then decompress it at target host.  So with R2 you have three options to perform VM Live Migration traditional, SMB direct or compressed.
  • Hyper-V IO limits – Hyper-V hosts can now limit the amount of IOPS consumed by each VM.  This can be hierarchically controlled providing increased flexibility. For example one can identify a group of VMs and have a IO limit for the whole group, but each individual VM can also have an IO limit, and the group limit can be smaller than the sum of the individual VM limits.
  • Hyper-V supports VSS backup for Linux VMs – Windows Server 2012 R2 has now added support for non-application consistent VSS backups for Linux VMs.
  • Hyper-V Replica Cascade Replication – In Windows Server 2012, Hyper V replicas could be copied from one data center to another. But now with R2 those replicas at a secondary site can be copied to a third, cascading the replication from the first to the second and then the third data center, each with their own replication schedule.
  • Hyper-V VHDX file resizing – With Windows Server 2012 R2 VHDX file sizes can now be increased or reduced for both data and boot volumes.
  • Hyper-V backup changes – In previous generations of Windows Server, Hyper-V backups took two distinct snapshots, one instantaneously and the other at quiesce time and then the two were merged together to create a “crash consistent” backup. But with R2, VM backups only take a single snapshot reducing overhead and increasing backup throughput substantially.
  • NVME support – Windows Server 2012 R2 now ships with a Non-Volatile Memory Express (NVME) driver for PCIe flash storage.  R2’s new NVME driver has been tuned for low latency and high bandwidth and can be used for non-clustered storage spaces to improve write performance (in a Spaces write-back cache?).
  • CSV memory read-cache – Windows Server 2012 R2 can be configured to set aside some host memory for a CSV read cache.  This is different than the Spaces Write-Back cache.  CSV caching would operate in conjunction with any other caching done at the host OS or elsewhere.

That’s about it. Some of the MVPs had a preview of R2 up in Redmond, but all of this was to be announced in TechEd, New Orleans, this week.

~~~~

Image: Microsoft TechEd by BetsyWeber





EMCworld 2013 day 1

Lines for coffee at the Cafe were pretty long this morning and I missed my opportunity to have breakfast to do some work. But eventually made my way to the press room and got some food and coffee.

Spent the morning in Analyst sessions mostly under NDA but it seems safe to say that EMC sees plenty of opportunity ahead.

The first session Q&A with BRS executives and customers was enlightening but the main message from the customers was that data protection is hard, legacy systems often can’t adjust quick enough and sometimes a completely new architecture is warranted. The executives were upbeat about current BRS business and where they were headed in the future.

20130506-142735.jpgRest of the morning was with Jeremy Burton EVP Product, Operations and Marketing and John Roese, the new SVP and CTO of EMC (6 months on the job). Jeremy talked about an IDC insight that there’s a new world emerging so-called 3rd platform applications based on mobile and consumer grade technology  with literally billions of users, millions of apps built on mobile-cloud-bigdata-social infrastructure which complements the 2nd platform built on lan/wan, client server frameworks.

For an example of this environment Jeremy mentioned that AT&T provisions 12PB of storage a month.

What’s needed for this new platform is a new type of storage built for the 3rd platform but taking advantage of current enterprise storage characteristics.  This is ViPR (more on that later)

John comes by way of Huawei, Nortel and myriad others and offers a broad insight to the way forward for EMC. It looks like a bright future ahead if they can do half of what John has outlined.

John talked about the intersections between the carrier market (or services), enterprise IT and consumer market.  There is convergence between these regions and at each of these intersections new technology is going to answer many of the problems which exist. For instance in the carrier space:

  • The amount of information they gather is frightening they know everything about you. Pivotal will be the key here because its good at 1) ability to correlate information across different information sources. Most carriers have a whole bunch of disparate information stores; and 2) It’s not just focused on Big Data as a non-realtime problem but also provides realtime analytics as well.
  • Capital costs are going down but $/bits are going way down.  VMware & Software defined data center is the right way to drive down costs.  Today servers are ~50% virtualized but networking is not virtualized at all.
  • Customers are dissatisfied with service providers (carriers).  Again Pivotal is key here. One carrier customer was focused on customer churn and tried to figure out how to minimize this. They used  Gemfire’ high speed infrastructure that could watchc all transactions on cell tower infrastructure pick out dropped calls, send it to Greenplum and correlate this with the customer attributes (good or bad), and within 100msec supply an interaction with the customer in to apologize and offer some services to make it better.
  • Internet is the new wild west –use at your own risk,  spoofing websites, respond to email could be anyone, chaos to security. RSA can become the trusted internet provider by looking at the internet holistically, combining information from many customers, aggregating and sharing these interactions to deterimine the trust of every transaction. Trust is becoming a new big data problem.
  • Hybrid and public cloud is their biggest opportunity but they don’t know how to attack it. VMware and SDDC will evolve to provide orchestrated movement from private to public and closed to open.

The thinking seems pretty straightforward given what they are trying to accomplish and the framework he applied to EMC’s strategy going forward made a lot of sense.

20130506-172955.jpgBrian Gallagher did a keynote on enterprise storage new functions and features which covered VMAX, VPLEX, RecoverPoint, and XtremIO/SF/SW. Mentioned RecoverPoint virtual appliance and sort of a statement of direction on being able to move application functionality directly on VMAX. He kind of demoed this with VPLEX running on VMAX.

He also talked about FAST speed of reaction versus the competition, mentioned that FAST provides information about the storage tiering to up to 4 different VMAX arrays. Showed a comparison of VMAX 10K against another prime competitor that looked downright embarrassing.  And talked about VMAX cloud edition.

20130506-173022.jpgAfter that 1 on 1 meetings all under strict NDA. But then the big Keynote with Jeremy again and David Goulden President and COO on ViPR. They have implemented software defined storage (SDS).  Last week I did a post on SDS trying to layout some of the problems and promises of SDS (please see The promise of SDS post).

But what I missed was the data path transformation that ViPR can do to provide object and HDFS access to traditional and commodity storage systems.  ViPR starts out primarily in the control layer providing automated provisioning, self management, across heterogeneous storage pools. With ViPR one can define virtual storage arrays and then configure virtual storage pools across those arrays regardless of the physical infrastructure underneath them.

More on ViPR in a separate post but suffice it to say EMC has been working on this for awhile now. But how it’s positioned with VPLEX and the other storage virtualization capabilities in VMAX and other products is another matter. But it seems they are carving out a space for ViPR between and above the current storage solutions.

End of day one is in the Expo and then cocktail parties… stay tuned for day 2.

 

SNWUSA Spring 2013 summary

SNWUSA, SNIA, partyFor starters the parties were a bit more subdued this year although I heard Wayne’s suite was hopping to 4am last night (not that I would ever be there that late).

And a trend seen the past couple of years was even more evident this year, many large vendors and vendor spokespeople went missing. I heard that there were a lot more analyst presentations this SNW than prior ones although it was hard to quantify.  But it did seem that the analyst community was pulling double duty in presentations.

I would say that SNW still provides a good venue for storage marketing across all verticals. But these days many large vendors find success elsewhere, leaving SNW Expo mostly to smaller vendors and niche products.  Nonetheless, there were a\ a few big vendors (Dell, Oracle and HP) still in evidence. But EMC, HDS, IBM and NetApp were not   showing on the floor.

I would have to say the theme for this years SNW was hybrid storage. It seemed last fall the products that impressed me were either cloud storage gateways or all flash arrays but this year there weren’t as many of these at the show but hybrid storage certainly has arrived.

Best hybrid storage array of the show

It’s hard to pick just one hybrid storage vendor as my top pick, especially since there were at least 3 others talking to me under NDA, but from my perspective the Hybrid vendor of the show had to be Tegile (pronounced I think, as te’-jile). They seemed to have a fully functional system with snapshot, thin provisioning, deduplication and pretty good VMware support (only time I have heard a vendor talk about VASA “stun” support for thin provisioned volumes).

They made the statement that SSD in their system is used as a cache, not a tier. This use is similar to NetApp’s FlashCache and has proven to be a particularly well performing approach to use of hybrid storage. (For more information on that take a look at some of my NFS and recent SPC-1 benchmark review dispatches. How well this is integrated with their home grown dedupe logic is another question.

On the negative side, they seem to be lacking a true HA/dual controller version but could use two separate systems with synch (I think) replication between them to cover this ground?? They also claimed their dedupe had no performance penalty, a pretty bold claim that cries out for some serious lab validation and/or benchmarking to prove. They also offer an all flash version of their storage (but then how can it be used as a cache?).

The marketing team seemed pretty knowledgeable about the market space and they seem to be going after mid-range storage space.

The product supports file (NFS and CIFS/SMB), iSCSI and FC with GigE, 10GbE and 8Gbps FC. They quote “effective” capacities with dedupe enabled but it can be disabled on a volume basis.

Overall, I was impressed by their marketing and the product (what little I saw).

Best storage tool of the show

Moving onto other product categories, it was hard to see anything that caught my eye. Perhaps I have just been to too many storage conferences but I did get somewhat excited when I looked at SwiftTest.  Essentially they offer a application profiling, storage modeling, workload generating tool set.

The team seems to be branching out of their traditional vendor market focus and going after large service providers and F100 companies with large storage requirements.

Way back, when I was in Engineering, we were always looking for some information as to how customers actually used storage products. Well what SwiftTest has, is an appliance to instrument your application environment (through network taps/hardware port connections) to monitor your storage IO and create a statistical operational profile of your I/O environment. Then take that profile and play it against a storage configuration model to show how well it’s likely to perform.  And if that’s not enough the same appliance can be used to drive a simulated version of the operational profile back onto a storage system.

It offers NFS (v2,v3, v4) CIFS/SMB (SMB1, SMB2, SMB3) FC, iSCSI, and HTTP/REST (what no FCoe?). They mentioned an $8oK price tag for the base appliance (one protocol?) but grows up pretty fast, if you want all of them.  They also seem to have three levels of appliances (my guess more performance and more protocols come with the bigger boxes).

Not sure where they top out but simulating an operational profile can be quite complex especially when you have to be able to control data patterns to match deduplication potential in customer data, drive markov chains with probability representations of operational profiles, and actually execute IO operations. They said something about their ports have dedicated CPU cores to insure adequate performance or something similar but still it seems to little to hit high IO workloads.

Like I said, when I was in engineering were searching for this type of solution back in the late 90s and we would have probably bought it in a moment, if it was available.

GoDaddy.com, the domain/web site services provider was one of their customers that used the appliance to test storage configurations. They presented at SNW on some of their results but I missed their session (the case study is available on SwiftTests website).

~~~~

In short, SNW had a diverse mixture of end user customers but lacked a full complement of vendors to show off to them.   The ratio of vendors to customers has definitely shifted to end-users the last couple of years and if anything has gotten more skewed to end-users, (which paradoxically should appeal to more storage vendors?!).

I talked with lots of end-users, from companies like FedEx, Northrop Grumman and AOL to name just a few big ones. But there were plenty of smaller ones as well.

The show lasted three days and had sessions scheduled all the way to the end. I was surprised at the length and the fact that it started on Tuesday rather than Monday as in years past.  Apparently, SNIA and Computerworld are still tweaking the formula.

It seemed to me there were more cancelled sessions than in years past but again this was hard to quantify.

Some of the customers I talked with thought SNW should go to a once a year and can’t understand why it’s still twice a year.  Many mentioned VMworld as having taken the place of SNW in being a showplace for storage vendors of all sizes and styles.  That and the vendor specific shows from EMC, IBM, Dell and others.

The fall show is moving to Long Beach, CA. Probably, a further experiment to find a formula that works.  Let’s hope they succeed.

Comments?

 

vSphere 5.1 storage enhancements and future vision

We discussed last year’s vSphere 5 storage changes in a previous post.  And at last week’s VMworld2012 in San Francisco, VMware announced a few new enhancements for vSphere 5.1 but showed more on their vision for the future of storage in VMware environments.

vSphere 5.1 storage enhancements were not as significant as last year’s enhancements.  Specifically, vSphere 5.1 storage oriented changes include:

  • VDP – vSphere Data Protector is a new agentless, deduplicating backup solution from VMware (and EMC) which is now bundled into vSphere and comes free for all users at the Essentials+ level and above. VDP is based on EMC’s Avamar Virtual Edition and provides a new integrated data protection management tab in vCenter Operations Manager GUI.  VDP replaces VDR.
  • vMotion changes – vMotion now supports non-shared storage and specifically, VSA storage environments.  To do this vMotion will now perform a standard storage vMotion to the targeted host before the VM vMotion takes place to move the data to the new location.
  • vSphere replication auto-failback with SRM – SRM 5.1 now supports vSphere replication service automated failback. SRM 5 supported storage array based replication automated failback but had no support for the then announced new VMware, host based replication service. This has been rectified with SRM 5.1.
  • SRM packaging changes – SRM standard now comes at no additional charge with the vCloud Suite Standard license option.  And a new entry level SRM (for 6 CPUs, 3 hosts) comes with Essentials+ to match and provide DR services for VSA environments.

VMware storage vision

VMware took the opportunity to discuss their vision for future offerings in the storage arena.  Specifically,

  • vSphere volumes (vVols) –  vVols will become the new defacto standard unit of granularity and abstraction for storage systems, providing a new allocation unit behind VMDKs and eliminating VMFS.  vVols are intended to define a new interface between vSphere and networked storage systems so that VMDKs can now be replicated, snapshot, cloned, etc.  alone without impacting other VMDKs on the storage system.  vVols are intended to replace LUNs and/or files used as previous holding containers for VMDKs.  vVols -should eliminate the mess of having to define 1000s of LUNs required to support VDI or cloud data centers implementations
  • Virtual flash – VMware’s first internal support for server side flash.  VMware will now be able to partition and allocate the flash on PCIe cards to VMs executing in the ESX server just like physical memory and vCPUs are today.  Also VMware will be able to copy flash cache contents when vMotion-ing VMs to other physical servers.  The intent is to fully support PCIe flash cards for vMotion by warm starting the flash in the target server and bring fast access storage closer to VMs.
  • VSAN – also called distributed storage, takes VSA like services and scales it out to support many more hosts/CPUs and networked storage.   The ultimate goal here seems to be to provide a shared, mid-tier, distributed storage system based on VMware DAS, which will better support vSphere execution and high availability.  VSAN will provide compute and storage within the same host.  It’s intended that VSAN be easier to configure, deploy and manage than current VM shared storage solutions.

Where are they going with all this?

I believe VMware is signaling an intent to get more involved in the storage arena.  Last years move with VSA now seems like just the beginning.

If examined together with their other thrusts for the virtual data center, it all starts to make sense. When these three future storage capabilities are in place, VMware should be better able to configure and support virtual cloud data centers (VCD) built out of commodity servers, commodity storage and commodity networking gear.  With all this in place VCDs should be better able to compete with AWS and other cloud service providers.

The end of enterprise storage, …

I was talking with one IT analyst, Dr. Kevin McIsaac with IBRS in Australia who feels when these three capabilities start rolling out, it signals the beginning of the end of enterprise storage as we know it.  He compares  this to what happened to specialized Unix servers (from HP, Sun, IBM, etc.) prominent at the end of the last century and early this century with the introduction of VMware and commodity high-performance, Intel servers/microprocessor chips.  Although these proprietary Unix servers still exist they are no longer growing market share.

In Kevin’s view, VMware is just following that playbook again, only this time it’s enterprise storage in their sights.  Of course, the other side of this is the enterprise networking that starts to be commoditized by all the virtual networking capabilities VMware is rolling out in VxLAN and Nicira integration as well. (Perhaps subject for another post).

… Not quite yet.

I understand his point and can’t help but agree with parts of it at least at the low end and potentially mid-tier storage.  IMHO however, enterprise storage vendors have a viable defense to all this but it involves providing even more functionality, performance and capabilities than they available today in their systems.

I see it every time I look at my performance charts, anytime you start getting over 300 disk drives, storage sophistication matters more to performance, than just throwing more hardware in the mix.  For an example of this effect checkout my last post on SPC-2 performance correlations.

And of course, VMware might be straining their very profitable relationship with storage vendors today such as Dell, HP, IBM, NetApp, EMC, etc. all of which today highlight and push their virtualization solution throughout their partner community.   If they decide to stop recommending VMware and start focusing on other virtualization offerings this might also stall VMware’s vision.

~~~~

In the end I can’t help but feel that in VMware’s view their challenge, in the long run will come from AWS, Google and other cloud service providers. Whatever they can do to better prepare to compete with this gaggle of cloud purveyors, the better they succeed for their enterprise customer. And ultimately that means more business for VMware.  If enterprise networking and storage vendors have to adapt to that vision, then so be it.

Comments?

Data hypervisor

(c) 2012 Silverton Consulting, Inc. All rights reserved

With all this talk of software defined networking and server virtualization where does storage virtualization stand.  I blogged about some problems with storage virtualization a week or so ago in my post on Storage Utilization is broke and this post takes it to the next level.  Also I was at a financial analyst conference this week in Vail where I heard Mark Lewis of Tekrocket but formerly of EMC discuss the need for a data hypervisor to provide software defined storage.

I now believe what we really need for true storage virtualization is a renewed focus on data hypervisor functionality.  The data hypervisor would need both a control plane and a data plane in order to function properly.   Ideally the control plane would set up the interface and routing for the data plane hardware and the server and/or backend storage would be none the wiser.

DMs everywhere

I envision a scenario where a customer’s application data is packaged with a data hypervisor which runs on a commodity data switch hardware with data plane and control plane software running on it.  Sort of creating (virtual) data machines or DMs.

All enterprise and nowadays most midrange storage provide most of the functionality of a storage control plane such as defining units of storage, setting up physical to logical storage mapping, incorporating monitoring, and management of the physical storage layer, etc.  So control planes are pervasive in today’s storage but proprietary.

In addition most storage systems have data plane functionality which operates to connect a host IO request to the actual data which resides in backend storage or internal cache.  But again although data planes are everywhere in storage today they are all proprietary to a specific vendor’s storage system.

Data switch needed

But in order to utilize a data hypervisor and create a more general purpose control plane layer, we need a more generic data plane layer that operates on commodity hardware. This is different from today’s SAN storage switches or DCB switches but similar in a some ways.

The functions of the data switch/data plane layer would be to take routing instructions from the control plane layer and direct the server IO request to the proper storage unit using the data plane layer.  Somewhere in this world view, probably at the data plane level it would introduce data protection services like RAID or other erasure coding schemes, point in time copy/clone services and replication services and other advanced storage features needed by enterprise storage today.

Also it would need to provide some automated storage movement across and within tiers of physical storage and it would connect server storage interfaces at the front end to storage interfaces at the backend.  Not unlike SAN or DCB switches but with much more advanced functionality.

Ideally data switch storage interfaces could attach to dedicated JBOD, Flash arrays as well as systems using DAS  storage.  In addition, it would be nice if the data switch could talk to real storage arrays on SAN, IP/SANs or NFS&CIFS/SMB storage systems.

The other thing one would like out of a data switch is support for a universal translator that would map one protocol to another, such as iSCSI to SAS, NFS to FC, or FC to NFS and any other combination, depending on the needs of the server and the storage in the configuration.

Now if the data switch were built on top of commodity x86 hardware and software with the data switch as just a specialized application that would create the underpinnings for a true data hypervisor with a control and data plane that could be independent and use anybody’s storage.

Data hypervisor

Assuming all this were available then we would have true storage virtualization.  With these capabilities, storage could be repurposed on the fly, added to, subtracted from, and in general be a fungible commodity not unlike server processing MIPs under VMware or Hyper-V.

Application data would then needed to be packaged into a data machine which would offer all the host services required to support host data access.  The data hypervisor would handle the linkages required to interface with the control and data layers.

Applications could be configured to utilize available storage at ease and storage could grow,  shrink or move to accommodate the required workload just as easily as VMs can be deployed today.

How we get there

Aside from the VMware, Citrix, Microsoft thrusts towards virtual storage there are plenty of storage virtualization solutions that can control most backend enterprise SAN storage. However, the problem with these solutions is that in general the execute only on a specific vendors hardware and don’t necessarily talk to DAS or JBOD storage.

In addition, not all of the current generation storage virtualization solutions are unified. That is most of these today only talk FC, FCoE or iSCSI and don’t support NFS or CIFS/SMB.

These don’t appear to be insurmountable obstacles and with proper allocation of R&D funding, could all be solved.

However the more problematic is that none of these solutions operate on commodity hardware or commodity software.

The hardware is probably the easiest to deal with. Today many enterprise storage systems are built ontop of x86 processor storage controllers. Albeit sometimes they incorporate specialized packaging for redundancy and high availability.

The harder problem may be commodity software. Although the genesis for a few storage virtualization systems might come from BSD or other “commodity” software operating systems. They have been modified over the years to no longer represent anything that can run on standard off the shelf operating systems.

Then again some storage virtualization systems started out with special home grown hardware and software. As such, converting these over to something more commodity oriented would be a major transition.

But the challenge is how to get there from here and would anyone want to take this on.  The other problem is that the value add that storage vendors supply currently would be somewhat eroded.  Not unlike what happened to proprietary Unix systems with the advent of VMware.

But this will not take place overnight and the company that takes this on and makes a go at it can have a significant software monopoly that would be hard to crack.

Perhaps it will take a startup to do this but I believe the main enterprise storage vendors are best positioned to take this on.

Comments?

Latest ESRP 1K-5K mailbox DB xfers/sec/disk results – chart-of-the-month

(SCIESRP120429-001) 2012 (c) Silverton Consulting, All Rights Reserved

The above chart is from our April newsletter on Microsoft Exchange 2010 Solution Reviewed Program (ESRP) results for the 1,000 (actually 1001) to 5,000 mailbox category.  We have taken the database transfers per second, normalized them for the number of disk spindles used in the run and plotted the top 10 in the chart above.

A couple of caveats first, we chart disk-only systems in this and similar charts  on disk spindle performance. Although, it probably doesn’t matter as much at this mid-range level, for other categories SSD or Flash Cache can be used to support much higher performance on a per spindle performance measure like the above.  As such, submissions with SSDs or flash cache are strictly eliminated from these spindle level performance analysis.

Another caveat, specific to this chart is that ESRP database transaction rates are somewhat driven by Jetstress parameters (specifically simulated IO rate) used during the run.  For this mid-level category, this parameter can range from a low of 0.10 to a high of 0.60 simulated IO operations per second with a median of ~0.19.  But what I find very interesting is that in the plot above we have both the lowest rate (0.10 in #6, Dell PowerEdge R510 1.5Kmbox) and the highest (0.60 for #9, HP P2000 G3 10GbE iSCSI MSA 3.2Kmbx).  So that doesn’t seem to matter much on this per spindle metric.

That being said, I always find it interesting that the database transactions per second per disk spindle varies so widely in ESRP results.  To me this says that storage subsystem technology, firmware and other characteristics can still make a significant difference in storage performance, at least in Exchange 2010 solutions.

Often we see spindle count and storage performance as highly correlated. This is definitely not the fact for mid-range ESRP (although that’s a different chart than the one above).

Next, we see disk speed (RPM) can have a high impact on storage performance especially for OLTP type workloads that look somewhat like Exchange.  However, in the above chart the middle 4 and last one (#4-7 & 10) used 10Krpm (#4,5) or slower disks.  It’s clear that disk speed doesn’t seem to impact Exchange database transactions per second per spindle either.

Thus, I am left with my original thesis that storage subsystem design and functionality can make a big difference in storage performance, especially for ESRP in this mid-level category.  The range in the top 10 contenders spanning from ~35 (Dell PowerEdge R510) to ~110 (Dell EqualLogic PS Server) speaks volumes on this issue or a multiple of over 3X from top to bottom performance on this measure.  In fact, the overall range (not shown in the chart above spans from ~3 to ~110 which is a factor of almost 37 times from worst to best performer.

Comments?

~~~~

The full ESRP 1K-5Kmailbox performance report went out in SCI’s April newsletter.  But a copy of the full report will be posted on our dispatches page sometime next month (if all goes well). However, you can get the full SPC performance analysis now and subscribe to future free newsletters by just sending us an email or using the signup form above right.

For a more extensive discussion of current SAN or block storage performance covering SPC-1 (top 30)SPC-2 (top 30) and all three levels of ESRP (top 20) results please see SCI’s SAN Storage Buying Guide available on our website.

As always, we welcome any suggestions or comments on how to improve our analysis of ESRP results or any of our other storage performance analyses.