New deduplication solutions from Sepaton and NEC

In the last few weeks both Sepaton and NEC have announced new data deduplication appliance hardware and, for Sepaton at least, new functionality. Both of these vendors compete against solutions from EMC Data Domain, IBM ProtecTIER, HP StoreOnce and others.

Sepaton v7.0 Enterprise Data Protection

From Sepaton’s point of view, data growth is exploding with little increase in organizational budgets, and system environments are becoming more complex with data risks expanding, not shrinking. To address these challenges, Sepaton has introduced a new version of their hardware appliance with new functionality to help address the rising data risks.

Their new S2100-ES3 Series 2925 Enterprise Data Protection Platform with the latest Sepaton software now supports up to 80 TB/hour of cluster data ingest (presumably with Symantec OST) and up to 2.0 PB of raw storage in an 8-node cluster. The new appliances support four 8Gbps FC and two 10GbE host ports per node and are based on HP DL380p Gen8 servers with dual 8-core, 2.9GHz Intel Xeon E5-2690 processors, 128 GB DRAM and a new high performance compression card from EXAR. With the bigger capacity and faster throughput, enterprise customers can now support large backup data streams with fewer appliances, reducing complexity and maintenance/licensing fees. S2100-ES3 Platforms can scale from 2 to 8 nodes in a single cluster.

The new appliance supports data-at-rest encryption for customer data security as well as data compression, both of which are hardware based, so there is no performance penalty. Also, data encryption is an optional licensed feature and uses OASIS KMIP 1.0/1.1 to integrate with RSA, Thales and other KMIP compliant, enterprise key management solutions.

NEC HYDRAstor Gen 4

With Gen4 HYDRAstor introduces a new Hybrid Node which contains both the logic for accelerator nodes and capacity for storage nodes, in one 2U rackmounted server. Before the hybrid node similar capacity and accessibility would have required 4U of rack space, 2U for the accelerator node and another 2U for the storage node.

The HS8-4000 HN supports per-node ingest rates of 4.9TB/hr standard, or 5.6TB/hr with NetBackup OST IO express, and holds twelve 4TB, 3.5in SATA drives for up to 48TB of raw capacity. They have also introduced an HS8-4000 SN which just consists of the 48TB of additional storage capacity. Gen4 is the first use of 4TB drives we have seen anywhere and quadruples raw capacity per node over the Gen3 storage nodes. HYDRAstor clusters can scale from 2 to 165 nodes and performance scales linearly with the number of cluster nodes.

With the new HS8-4000 systems, maximum capacity for a 165 node cluster is now 7.9PB raw and supports up to 920.7 TB/hr (almost a PB/hr, need to recalibrate my units) with an all 165-HS8-4000 HN node cluster. Of course, how many customers need a PB/hr of backup ingest is another question. Let alone, 7.9PB of raw storage which of course gets deduplicated to an effective capacity of over 100PBs of backup data (or 0.1EB, units change again).
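As a quick sanity check on those cluster numbers, here is a back-of-the-envelope sketch in Python. The per-node figures come from the text above; the function names, the decimal TB-to-PB conversion and the assumption of perfectly linear scaling are mine. The 5.6TB/hr per-node rate is presumably rounded, which would explain why the product lands slightly above the quoted 920.7 TB/hr.

```python
def cluster_totals(nodes, raw_tb_per_node, ingest_tb_hr_per_node):
    """Scale per-node figures linearly across a cluster (assumed ideal scaling)."""
    raw_pb = nodes * raw_tb_per_node / 1000.0  # decimal units, 1PB = 1000TB
    ingest_tb_hr = nodes * ingest_tb_hr_per_node
    return raw_pb, ingest_tb_hr


def implied_dedupe_ratio(effective_pb, raw_pb):
    """Ratio needed to turn raw capacity into the quoted effective capacity."""
    return effective_pb / raw_pb


# 165 HS8-4000 hybrid nodes, 48TB raw and 5.6TB/hr (OST) each
raw_pb, ingest_tb_hr = cluster_totals(165, 48, 5.6)
ratio = implied_dedupe_ratio(100, raw_pb)  # "over 100PB" effective
print(f"{raw_pb:.2f}PB raw, {ingest_tb_hr:.1f}TB/hr ingest, {ratio:.1f}:1 implied dedupe")
```

So an effective capacity of "over 100PB" implies a deduplication ratio of a bit more than 12:1 across the cluster.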

NEC has also introduced a new low end appliance the HS3-410 for remote/branch office environments that has a 3.2TB/hr ingest with up to 24TB of raw storage. This is only available as a single node system.

~~~~
Maybe Facebook could use a 0.1EB backup repository?

Image: Intel Team Inside Facebook Data Center by IntelFreePress

 

SPC-2 performance results MBPS/drive – chart of the month

(SCISPC121029-005B) (c) 2013 Silverton Consulting, Inc. All Rights Reserved

The above chart is from our October newsletter and is one of 5 charts we discussed in the Storage Performance Council benchmarks analysis.  There’s something intriguing about the above chart. Specifically, the band of results in numbers 2 through 10 range from a high of 45.7 to a low of 41.5 MBPS/drive.  The lone outlier is the SGI InfiniteStorage system which managed to achieve 67.7 MBPS/drive.

It turns out that the SGI system is actually a NetApp E5460 (from their LSI acquisition) with 60-146GB disk drives in a RAID 6 configuration.  Considering that the configuration ASU (storage capacity used during the test) was 7TB and the full capacity was 8TB, it seemed to use all the drives to the fullest extent possible.  The only other interesting tidbit about the SGI/NetApp system was the 16GB of system memory (which I assume was mostly used for caching).  Other than that it just seemed to be a screamer of a system from a throughput perspective.

Earlier this year I was at an analyst session with NetApp where they were discussing their thoughts on where the E-Series was going to focus. One of the items was going to be high throughput intensive applications. From what we see here, they seem to have the right machine to go after this market.

The only storage to come close was an older Oracle J4200 series system which had no RAID protection, which we would not recommend for any data application.   Not sure what the IBM DS5300 series storage is OEMed from but it might be another older E-Series system.

A couple of caveats are in order for our MBPS/drive charts:

  • These are disk-only systems; any system using SSDs or FlashCache is excluded from this analysis
  • These systems all use 140GB disks or larger. (Some earlier SPC benchmarks used 36GB drives).

Also, please note the MBPS SPC-2 metric is a composite (average) of Video-on-demand, Large database query and Large file processing workload.
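To make the metric concrete, here is a minimal sketch of how a MBPS/drive figure falls out of the numbers. The function names are mine, the simple three-way average follows the "composite (average)" description above rather than the SPC-2 specification itself, and the three workload values in the second example are placeholders, not benchmark results.

```python
def composite_mbps(video_on_demand, large_db_query, large_file_processing):
    """Simple average of the three SPC-2 workload results (assumed equal weights)."""
    return (video_on_demand + large_db_query + large_file_processing) / 3.0


def mbps_per_drive(mbps, drive_count):
    """Normalize aggregate throughput by the number of disk drives."""
    return mbps / drive_count


def implied_aggregate_mbps(per_drive, drive_count):
    """Work backwards from a per-drive chart value to aggregate throughput."""
    return per_drive * drive_count


# SGI/NetApp result from the chart: 67.7 MBPS/drive on 60 drives
sgi_aggregate = implied_aggregate_mbps(67.7, 60)
print(f"SGI implied composite: ~{sgi_aggregate:.0f} MBPS")

# Placeholder workload numbers, only to show the mechanics
example = mbps_per_drive(composite_mbps(300.0, 200.0, 100.0), 10)
print(f"Example system: {example:.1f} MBPS/drive")
```

In other words, the SGI system's 67.7 MBPS/drive works out to roughly 4,000 MBPS of composite throughput from just 60 drives.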

More information on SPC-2 performance as well as our SPC-1, SPC-2 and ESRP ChampionsCharts for block storage systems can be found in our SAN Storage Buying Guide, available for purchase on our web site.

~~~~

The complete SPC-1 and SPC-2 performance report went out in SCI’s October newsletter.  But a copy of the report will be posted on our dispatches page sometime this month (if all goes well).  However, you can get the latest storage performance analysis now and subscribe to future free newsletters by just using the signup form above right.

As always, we welcome any suggestions or comments on how to improve our SPC  performance reports or any of our other storage performance analyses.


 

HDS Influencer Summit wrap up

[Sorry for the length, it was a long day] There was an awful lot of information supplied today. The morning sessions were all open but most of the afternoon was under NDA.

Jack Domme, HDS CEO, started the morning off talking about the growth in HDS market share. Another 20% y/y growth in revenue for HDS. They seem to be hitting the right markets with the right products. They have found a lot of success in emerging markets in Latin America, Africa and Asia. As part of this thrust into emerging markets HDS is opening up a manufacturing facility in Brazil and a Sales/Solution center in Colombia.

Jack spent time outlining the infrastructure cloud to content cloud to information cloud transition that they believe is coming in the IT environment of the future.   In addition, there has been even greater alignment within Hitachi Ltd and consolidation of engineering teams to tackle new converged infrastructure needs.

Randy DeMont, EVP and GM Global Sales, Services and Support got up next and talked about their success with the channel. About 50% of their revenue now comes from indirect sources. They are focusing some of their efforts to try to attract global system integrators that are key purveyors to Global 500 companies and their business transformation efforts.

Randy talked at length about some of their recent service offerings including managed storage services. As customers begin to trust HDS with their storage they start considering moving their whole data center to HDS. Randy said this was a $1B opportunity for HDS and the only thing holding them back is finding the right people with the skills necessary to provide this service.

Randy also mentioned that over the last 3-4 years HDS has gained 200-300 new clients a quarter, which is introducing a lot of new customers to HDS technology.

Brian Householder, EVP, WW Marketing, Business Development and Partners got up next and talked about how HDS has been delivering on their strategic vision for the last decade or so.    With HUS VM, HDS has moved storage virtualization down market, into a rack mounted 5U storage subsystem.

Brian mentioned that 70% of their customers are now storage virtualized (meaning that they have external storage managed by VSP, HUS VM or prior versions).  This is phenomenal seeing as how only a couple of years back this number was closer to 25%.  Later at lunch I probed as to what HDS thought was the reason for this rapid adoption, but the only explanation was the standard S-curve adoption rate for new technologies.

Brian talked about some big data applications where HDS and Hitachi Ltd business units collaborate to provide business solutions. He mentioned the London Summer Olympics sensor analytics, medical imaging analytics, and heavy construction equipment analytics. Another example he mentioned was financial analysis firms using satellite images of retail parking lots to predict retail revenue growth or loss. HDS’s big data strategy seems to be vertically focused, building on the strength in Hitachi Ltd’s portfolio of technologies. This was the subject of a post-lunch discussion between John Webster of Evaluator Group, myself and Brian.

Brian talked about their storage economics professional services engagement. HDS has done over 1200 storage economics engagements and has written books on the topic as well as iPad apps to support it. In addition, Brian mentioned that in a recent TheInfoPro survey, HDS was rated number 1 in value for storage products.

Brian talked some about HDS strategic planning frameworks, one of which was an approach to identify investments to maximize share of IT spend across various market segments. From 2003, when HDS was an 80% hardware revenue company, to today, where they are over 50% Software and Services revenue, they seem to have broadened their portfolio extensively.

John Mansfield, EVP Global Solutions Strategy and Development and Sean Moser, VP Software Platforms Product Management spoke next and talked about HCP and HNAS integration over time. It was just 13 months ago that HDS acquired BlueArc and today they have integrated BlueArc technology into HUS VM and HUS storage systems (it was already the guts of HNAS).

They also talked about the success HDS is having with HCP their content platform. One bank they are working with plans to have 80% of their data in an HCP object store.

In addition there was a lot of discussion on UCP Pro and UCP Select, HDS’s converged server, storage and networking systems for VMware environments. With UCP Pro the whole package is ordered as a single SKU. In contrast, with UCP Select partners can order different components and put it together themselves.  HDS had a demo of their UCP Pro orchestration software under VMware vSphere 5.1 vCenter that allowed VMware admins to completely provision, manage and monitor servers, storage and networking for their converged infrastructure.

They also talked about their new Hitachi Accelerated Flash storage which is an implementation of a Flash JBOD using MLC NAND but with extensive Hitachi/HDS intellectual property. Together with VSP microcode changes, the new flash JBOD provides great performance (1 Million IOPS) in a standard rack.  The technology was developed specifically by Hitachi for HDS storage systems.

Mike Walkey, SVP Global Partners and Alliances, got up next and talked about their vertical oriented channel strategy. HDS is looking for channel partners that can expand their reach to new markets, provide services along with the equipment and make a difference to these markets. They have been spending more time and money on vertical shows such as VMworld, SAPPHIRE, etc. rather than horizontal storage shows (such as SNW). Mike mentioned key high level partnerships with Microsoft, VMware, Oracle, and SAP as helping to drive solutions into these markets.

Hicham Abdessamad, SVP, Global Services got up next and talked about the level of excellence available from HDS services. He indicated that professional services grew by 34% y/y while managed services grew 114% y/y. He related a McKinsey study that showed that IT budget priorities will change over the next couple of years away from pure infrastructure to more analytics and collaboration. Hicham talked about a couple of large installations of HDS storage and what they are doing with it.

There were a few sessions of one on ones with HDS executives and a couple of other speakers later in the day, mainly on NDA topics. That’s about all I took notes on. I was losing steam toward the end of the day.

Comments?

Fall SNWUSA wrap-up

Attended SNWUSA this week in San Jose. It’s hard to see the show gradually change when you attend each one but it does seem that the end-user content and attendance is increasing proportionally. This should bode well for future SNWs. There was always a good number of end users at the show, but the bulk of the attendees in the past were from storage vendors.

Another large storage vendor dropped their sponsorship.  HDS no longer sponsors the show and the last large vendor still standing at the show is HP.  Some of this is cyclical, perhaps the large vendors will come back for the spring show, next year in Orlando, Fl.  But EMC, NetApp and IBM seemed to have pretty much dropped sponsorship for the last couple of shows at least.

SSD startup of the show

Skyhawk hardware (c) 2012 Skyera, all rights reserved (from their website)

The best, new SSD startup had to be Skyera. A 48TB raw flash dual controller system supporting iSCSI block protocol and using real commercial grade MLC.  The team at Skyera seem to be all ex-SandForce executives and technical people.

Skyera’s team have designed a 1U box called the Skyhawk, with a phalanx of NAND chips, their own controller(s) and other logic as well. They support software compression and deduplication as well as specially designed RAID logic that claims to reduce extraneous writes to something just over 1 for RAID 6, dual drive failure equivalent protection.

Skyera’s underlying belief is that just as consumer HDAs took over from the big monster 14″ and 11″ disk drives in the 90’s, sooner or later commercial NAND will take over from eMLC and SLC. And if one elects to stay with the eMLC and SLC technology you are destined to be one to two technology nodes behind. That is, commercial MLC (in USB sticks, SD cards etc) is currently manufactured with 19nm technology. The eMLC and SLC NAND technology is back at 24 or 25nm technology. But 80-90% of the NAND market is being driven by commercial MLC NAND. Skyera came out this past August.

Coming in second place was Arkologic an all flash NAS box using SSD drives from multiple vendors. In their case a fully populated rack holds about 192TB (raw?) with an active-passive controller configuration.  The main concern I have with this product is that all their metadata is held in UPS backed DRAM (??) and they have up to 128GB of DRAM in the controller.

Arkologic’s main differentiation is supporting QOS on a file system basis and having some connection with a NIC vendor that can provide end to end QOS. The other thing they have is a new RAID-AS which is specially designed for Flash.

I just hope their UPS is pretty hefty and they don’t sell it someplace where power is very flaky, because when that UPS gives out, kiss your data goodbye as your metadata is held nowhere else – at least that’s what they told me.

Cloud storage startup of the show

There was more cloud stuff going on at the show. Talked to at least three or four cloud gateway providers. But the cloud startup of the show had to be Egnyte. They supply storage services that span cloud storage and on premises storage with an in-band or out-of-band solution and provide file synchronization services for file sharing across multiple locations. They have some hooks into NetApp and other major storage vendor products that allow them to be out-of-band for these environments but would need to be in-band for other storage systems. Seems an interesting solution that if successful may help accelerate the adoption of cloud storage in the enterprise, as it makes transparent whether storage you access is local or in the cloud. How they deal with the response time differences is another question.

Different idea startup of the show

The new technology showplace had a bunch of vendors some I had never heard of before but one that caught my eye was Actifio. They were at VMworld but I never got time to stop by.  They seem to be taking another shot at storage virtualization. Only in this case rather than focusing on non-disruptive file migration they are taking on the task of doing a better job of point in time copies for iSCSI and FC attached storage.

I assume they are in the middle of the data path in order to do this and they seem to be using copy-on-write technology for point-in-time snapshots.  Not sure where this fits, but I suspect SME and maybe up to mid-range.
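For readers who haven't looked at how copy-on-write snapshots work, here is a generic sketch of the technique. This is not Actifio's implementation (I have no visibility into that); the class and method names are hypothetical and the "volume" is just an in-memory list of blocks. The point is that a snapshot costs nothing when taken, and old data is only copied aside the first time a block is overwritten afterwards.

```python
class CowVolume:
    """Toy block volume with copy-on-write point-in-time snapshots."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        # One dict per snapshot: block number -> data preserved from snapshot time
        self.snapshots = []

    def snapshot(self):
        """Take a snapshot; no data is copied yet. Returns a snapshot id."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, block_no, data):
        """Preserve the old block for any snapshot that still needs it, then overwrite."""
        for preserved in self.snapshots:
            if block_no not in preserved:
                preserved[block_no] = self.blocks[block_no]
        self.blocks[block_no] = data

    def read_snapshot(self, snap_id, block_no):
        """Read a block as it looked when the snapshot was taken."""
        preserved = self.snapshots[snap_id]
        # Unchanged blocks are still read from the live volume
        return preserved.get(block_no, self.blocks[block_no])


vol = CowVolume(4)
vol.write(0, b"monday")
snap = vol.snapshot()
vol.write(0, b"tuesday")
print(vol.blocks[0], vol.read_snapshot(snap, 0))
```

Note that every first overwrite after a snapshot turns one write into a read plus two writes, which is part of why being in the data path matters for performance.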

Most enterprise vendors have solved these problems a long time ago but at the low end, it’s a little more variable.  I wish them luck but although most customers use snapshots if their storage has it, those that don’t, seem unable to understand what they are missing.  And then there’s the matter of being in the data path?!

~~~~

If there was a hybrid startup at the show I must have missed them. Did talk with Nimble Storage and they seem to be firing on all cylinders.  Maybe someday we can do a deep dive on their technology.  Tintri was there as well in the new technology showcase and we talked with them earlier this year at Storage Tech Field Day.

The big news at the show was Microsoft purchasing StorSimple a cloud storage gateway/cache.  Apparently StorSimple did a majority of their business with Microsoft’s Azure cloud storage and it seemed to make sense to everyone.

The SNIA suite was hopping as usual and the venue seemed to work well. Although I would say the exhibit floor and lab area was a bit too big. But everything else seemed to work out fine.

On Wednesday, the CIO from Dish talked about what it took to completely transform their IT environment from a management and leadership perspective.  Seemed like an awful big risk but they were able to pull it off.

All in all, SNW is still a great show to learn about storage technology at least from an end-user perspective.  I just wish some more large vendors would return once again, but alas that seems to be a dream for now.

Shingled magnetic recording disks

A couple of weeks ago I attended a day of the SNIA Storage Developers Conference (SDC) where Garth Gibson of Carnegie Mellon University Parallel Data Lab (CMU PDL) and Panasas was giving a talk of what they are up to at CMU’s storage lab.  His talk at the conference was on shingled magnetic recording (SMR) disks. We have discussed this topic before in our posts on Sequential only disks?!  and in Disk trends revisited.  SMR may require a re-thinking of how we currently access disk storage.

Recall that shingled magnetic recording uses a write head that overwrites multiple tracks at a time (see graphic above), with one track being properly written and the adjacent (inward) tracks being overwritten. As the head moves to the next track, that track can be properly written but more adjacent (inward) tracks are overwritten, etc. In this fashion data can be written sequentially, on overlapping write passes.  In contrast, read heads can be much narrower and are able to read a single track.

In my post, I assumed that this would mean that the new shingled magnetic recording disks would need to be accessed sequentially not unlike tape. Such a change would need a massive rewrite to only write data sequentially.  I had suggested this could potentially work if one were to add some SSD or other NVRAM to the device to help manage the mapping of the data to the disk.  Possibly that plus a very sophisticated drive controller, not unlike SSD wear leveling today, could handle mapping a physically sequentially accessed disk to a virtually randomly accessed storage protocol.

Garth’s approach to the SMR dilemma

Garth and his team of researchers are taking another tack at the problem. In his view there are multiple groups of tracks on an SMR disk (zones or bands). Each band can be either written sequentially or randomly but all bands can be read randomly. One can break up the disk to include sections of multiple shingled bands that are sequentially written and fewer non-shingled bands that can be randomly written. Of course there would be a gap between the shingled bands in order not to overwrite adjacent bands. And there would also be gaps between the randomly written tracks in a non-shingled partition to allow for the wider track writing that occurs with the SMR write head.

His pitch at the conference dealt with some characteristics of such a multi-band disk device, such as:

  • How to determine the density for a device that has multiple bands of both shingled write data and randomly written data.
  • How big or small a shingled band should be in order to support “normal” small block and randomly accessed file IO.
  • How many randomly written tracks or what the capacity of the non-shingled bands would need to be to support “normal” file IO activity.

For maximum areal density one would want large shingled bands.  There are other interesting considerations that were not as obvious but I won’t go into here.
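To see why larger bands help density, here is a toy model of my own, not something from Garth's talk. It assumes the write head is some whole number of read-track widths wide (3 here, a made-up figure), so every band wastes (head width - 1) tracks as a guard gap at its end, and a non-shingled "band" is just a band of one track.

```python
def band_efficiency(tracks_per_band, write_head_width=3):
    """Fraction of track positions in a band that hold readable data.

    Assumed model: each band of n shingled tracks occupies
    n + (write_head_width - 1) track positions, the extra ones being
    the gap that keeps the last write from clobbering the next band.
    """
    if tracks_per_band < 1 or write_head_width < 1:
        raise ValueError("band size and head width must be at least 1")
    return tracks_per_band / (tracks_per_band + write_head_width - 1)


for n in (1, 10, 100, 1000):
    print(f"{n:5d} tracks/band -> {band_efficiency(n):.1%} of positions usable")
```

Under these assumptions a one-track (randomly writable) band uses only a third of the available track positions, while bands of hundreds of tracks approach full density. The trade-off is that bigger bands mean more data to rewrite when anything in the band changes.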

SCSI protocol changes for SMR disks

The other, more interesting section of Garth’s talk was on recently proposed T10 and T13 changes to support SMR disks with shingled and non-shingled partitions, and on what needed to be done to support SMR devices.

The SCSI protocol changes being considered to support SMR devices include:

  • A new write cursor for shingled write bands that indicates the next LBA to be written.  The write cursor starts out at a relative band address of 0 and as each LBA is written consecutively in the band it’s incremented by one.
  • A write cursor can be reset (to zero), indicating that the band has been erased.
  • Each drive maintains the band map and current cursor position within each band and this can be requested by SCSI drivers to understand the configuration of the drive.

Probably other changes are required as well but these seem sufficient to flesh out the problem.
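The write cursor behavior described above can be sketched in a few lines of Python. This is only my illustration of the semantics as I understood them from the talk: the class and method names are hypothetical and do not correspond to any actual T10/T13 command set.

```python
class ShingledBand:
    """One sequentially written band with a write cursor."""

    def __init__(self, num_lbas):
        self.num_lbas = num_lbas
        self.data = [None] * num_lbas
        self.write_cursor = 0  # next relative LBA that may be written

    def write(self, rel_lba, block):
        if self.write_cursor >= self.num_lbas:
            raise ValueError("band is full; reset it before writing again")
        if rel_lba != self.write_cursor:
            raise ValueError("shingled bands must be written at the cursor")
        self.data[rel_lba] = block
        self.write_cursor += 1  # advances by one per consecutive LBA

    def read(self, rel_lba):
        # Reads are random, but only LBAs below the cursor hold valid data
        if rel_lba >= self.write_cursor:
            raise ValueError("LBA has not been written since the last reset")
        return self.data[rel_lba]

    def reset(self):
        """Reset the cursor to zero, which logically erases the band."""
        self.write_cursor = 0
        self.data = [None] * self.num_lbas


class SmrDrive:
    """Drive that keeps the band map and per-band cursor positions."""

    def __init__(self, band_sizes):
        self.bands = [ShingledBand(n) for n in band_sizes]

    def report_bands(self):
        """What a driver might query: (band number, size, cursor) per band."""
        return [(i, b.num_lbas, b.write_cursor) for i, b in enumerate(self.bands)]


drive = SmrDrive([4, 4])
drive.bands[0].write(0, b"a")
drive.bands[0].write(1, b"b")
print(drive.report_bands())
```

A driver or file system that can query the band map and cursors like this has what it needs to append data safely and to decide when a band is worth erasing.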

SMR device software support

Garth and his team implemented an SMR device, emulated in software using real random accessed devices.  They then implemented an SMR device driver that used the proposed standards changes and finally, implemented a ShingledFS file system to use this emulated SMR disk to see how it would work.  (See their report on Shingled Magnetic Recording for Big Data Applications for more information.)

The CMU team implemented a log structured file system for the ShingledFS that only wrote data to the emulated SMR disk shingled partition sequentially, except for mapping and meta-data information which was written and updated randomly in a non-shingled partition.

You may recall that a log structured file system is essentially written as a sequential stream of data (not unlike a log).  But there is additional mapping required that indicates where file data is located in the log which allows for randomly accessing the file data.
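Here is a minimal sketch of that idea, assuming nothing about how ShingledFS is actually coded. The names are hypothetical: the `log` list stands in for the sequentially written shingled partition and the `index` dict stands in for the mapping data that lives in the randomly written, non-shingled partition.

```python
class LogStructuredStore:
    """Toy log-structured store: append-only data log plus a lookup map."""

    def __init__(self):
        self.log = []    # records are only ever appended (sequential writes)
        self.index = {}  # (file name, block number) -> position in the log

    def write(self, name, block_no, data):
        """An update never overwrites in place; it appends and remaps."""
        self.log.append((name, block_no, data))
        self.index[(name, block_no)] = len(self.log) - 1

    def read(self, name, block_no):
        """Random reads go through the map to find the latest copy."""
        return self.log[self.index[(name, block_no)]][2]

    def live_fraction(self):
        """Share of log records that are still current (the rest is garbage)."""
        return len(self.index) / len(self.log) if self.log else 1.0


store = LogStructuredStore()
store.write("report.txt", 0, b"draft")
store.write("report.txt", 1, b"appendix")
store.write("report.txt", 0, b"final")  # update lands at the end of the log
print(store.read("report.txt", 0), store.live_fraction())
```

The stale "draft" record is still sitting in the log, which is why real log-structured file systems need some form of cleaning or garbage collection, and on an SMR disk that presumably means copying live data out of a band before resetting it.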

In their report and at the conference, Garth presented some benchmark results for a big data application called Terasort (essentially Teragen, Terasort and Teravalidate) which seems to use Hadoop to sort a large body of data.   Not sure I can replicate this information here but suffice it to say at the moment the emulated SMR device with ShingledFS did not beat a base EXT3 or FUSE using the same hardware for these applications.

Now the CMU project was done by a bunch of smart researchers but it’s still relatively new and not necessarily that optimized. Thus, there’s probably some room for improvement in the ShingledFS and maybe even the emulated SMR device and/or the SMR device driver.

At the moment Garth and his team seem to believe that SMR devices are certainly feasible and would take only modest changes to the SCSI protocols to support such devices. As for file system support, there is plenty of history surrounding log structured file systems so these are certainly doable, but they would probably require extensive development to be implemented in various OSs to support an SMR device. The device driver changes don’t seem to be as significant.

~~~~

It certainly looks like there’s going to be SMR devices in our future.  It’s just a question whether they will be ever as widely supported as the randomly accessed disk device we know and love today.  Possibly, this could all be behind a storage subsystem that makes the technology available as networked storage capacity and over time maybe SMR devices could be implemented in more standard OS device drivers and file systems.  Nevertheless, to keep capacity and areal density on their current growth trajectory, SMR disks are coming, it’s just a matter of time.

Comments?

Image: (c) 2012 Hitachi Global Storage Technologies, from IEEE SCV Magnetics Society presentation by Roger Wood

 

Latest ESRP results for 1K and under mailboxes – chart of the month

SCIESRP120724(004) (c) 2012 Silverton Consulting, All Rights Reserved

The above chart was from our July newsletter Exchange Solution Reviewed Program (ESRP) performance analysis for 1000 and under mailbox submissions. I have always liked response times as they seem to be mostly the result of tight engineering, coding and/or system architecture.  Exchange response times represent a composite of how long it takes to do a database transaction (whether read, write or log write).  Latencies are measured at the application (Jetstress) level.

On the chart we show the top 10 database read response times for this class of storage. We assume that DB reads are a bit more important than writes or log activity but they are all probably important. As such, we also show the response times for DB writes and log writes but the ranking is based on DB reads alone.
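In code terms the chart is just a sort on one column with the other two carried along for display. The sketch below shows that mechanic only: the system names and latencies are placeholders I made up, not ESRP results, and the function name is hypothetical.

```python
# Placeholder rows, NOT real ESRP submissions
results = [
    {"system": "System A", "db_read_ms": 9.1, "db_write_ms": 1.2, "log_write_ms": 0.9},
    {"system": "System B", "db_read_ms": 6.4, "db_write_ms": 8.6, "log_write_ms": 3.0},
    {"system": "System C", "db_read_ms": 7.8, "db_write_ms": 2.5, "log_write_ms": 1.1},
]


def rank_by_db_read(rows, top_n=10):
    """Order submissions by DB read latency only; lower is better."""
    return sorted(rows, key=lambda r: r["db_read_ms"])[:top_n]


ranked = rank_by_db_read(results)
for place, row in enumerate(ranked, start=1):
    print(place, row["system"], row["db_read_ms"], row["db_write_ms"], row["log_write_ms"])
```

Notice that the placeholder system with the best read latency has the worst write latency, which is exactly the kind of variability a read-only ranking can hide.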

In the chart above, I am struck by the variability in write and log write performance.  Writes range anywhere from ~8.6 down to almost 1 msec. The extreme variability here begs a bunch of questions.  My guess is the wide variability probably signals something about caching, whether it’s cache size, cache sophistication or drive destage effectiveness is hard to say.

Why EMC seems to dominate DB read latency in this class of storage is also interesting. EMC’s Celerra NX4, VNXe3100, CLARiiON CX4-120, CLARiiON AX4-5i, Iomega ix12-300 and VNXe3300 placed in the top 6 slots, respectively.  They all had a handful of disks (4 to 8), mostly 600GB or larger and used iSCSI to access the storage.  It’s possible that EMC has a great iSCSI stack, better NICs or just better IO scheduling. In any case, they have done well here at least with read database latencies.  However, their write and log latency was not nearly as good.

We like ESRP because it simulates a real application that’s pervasive in the enterprise today, i.e., email.  As such, it’s less subject to gaming, and typically shows a truer picture of multi-faceted storage performance.

~~~~

The complete ESRP performance report with more top 10 charts went out in SCI’s July newsletter.  But a copy of the report will be posted on our dispatches page sometime next month (if all goes well).  However, you can get the ESRP performance analysis now and subscribe to future free newsletters by just using the signup form above right.

For a more extensive discussion of current SAN block system storage performance covering SPC (Top 30) results as well as ESRP results with our new ChampionsChart™ for SAN storage systems, please see SCI’s SAN Storage Buying Guide available from our website.

As always, we welcome any suggestions or comments on how to improve our analysis of ESRP results or any of our other storage performance analyses.


vSphere 5.1 storage enhancements and future vision

We discussed last year’s vSphere 5 storage changes in a previous post.  And at last week’s VMworld2012 in San Francisco, VMware announced a few new enhancements for vSphere 5.1 but showed more on their vision for the future of storage in VMware environments.

vSphere 5.1 storage enhancements were not as significant as last year’s enhancements.  Specifically, vSphere 5.1 storage oriented changes include:

  • VDP – vSphere Data Protection is a new agentless, deduplicating backup solution from VMware (and EMC) which is now bundled into vSphere and comes free for all users at the Essentials+ level and above. VDP is based on EMC’s Avamar Virtual Edition and provides a new integrated data protection management tab in vCenter Operations Manager GUI.  VDP replaces VDR.
  • vMotion changes – vMotion now supports non-shared storage and specifically, VSA storage environments.  To do this vMotion will now perform a standard storage vMotion to the targeted host before the VM vMotion takes place to move the data to the new location.
  • vSphere replication auto-failback with SRM – SRM 5.1 now supports vSphere replication service automated failback. SRM 5 supported storage array based replication automated failback but had no support for the then announced new VMware, host based replication service. This has been rectified with SRM 5.1.
  • SRM packaging changes – SRM standard now comes at no additional charge with the vCloud Suite Standard license option.  And a new entry level SRM (for 6 CPUs, 3 hosts) comes with Essentials+ to match and provide DR services for VSA environments.

VMware storage vision

VMware took the opportunity to discuss their vision for future offerings in the storage arena.  Specifically,

  • vSphere volumes (vVols) – vVols will become the new de facto standard unit of granularity and abstraction for storage systems, providing a new allocation unit behind VMDKs and eliminating VMFS. vVols are intended to define a new interface between vSphere and networked storage systems so that VMDKs can now be replicated, snapshotted, cloned, etc. alone without impacting other VMDKs on the storage system. vVols are intended to replace LUNs and/or files used as previous holding containers for VMDKs. vVols should eliminate the mess of having to define 1000s of LUNs required to support VDI or cloud data center implementations.
  • Virtual flash – VMware’s first internal support for server side flash.  VMware will now be able to partition and allocate the flash on PCIe cards to VMs executing in the ESX server just like physical memory and vCPUs are today.  Also VMware will be able to copy flash cache contents when vMotion-ing VMs to other physical servers.  The intent is to fully support PCIe flash cards for vMotion by warm starting the flash in the target server and bring fast access storage closer to VMs.
  • VSAN – also called distributed storage, takes VSA like services and scales it out to support many more hosts/CPUs and networked storage.   The ultimate goal here seems to be to provide a shared, mid-tier, distributed storage system based on VMware DAS, which will better support vSphere execution and high availability.  VSAN will provide compute and storage within the same host.  It’s intended that VSAN be easier to configure, deploy and manage than current VM shared storage solutions.

Where are they going with all this?

I believe VMware is signaling an intent to get more involved in the storage arena. Last year’s move with VSA now seems like just the beginning.

If examined together with their other thrusts for the virtual data center, it all starts to make sense. When these three future storage capabilities are in place, VMware should be better able to configure and support virtual cloud data centers (VCD) built out of commodity servers, commodity storage and commodity networking gear.  With all this in place VCDs should be better able to compete with AWS and other cloud service providers.

The end of enterprise storage, …

I was talking with one IT analyst, Dr. Kevin McIsaac of IBRS in Australia, who feels that when these three capabilities start rolling out, it will signal the beginning of the end of enterprise storage as we know it.  He compares this to what happened to the specialized Unix servers (from HP, Sun, IBM, etc.) that were prominent at the end of the last century and early in this one, once VMware and commodity, high-performance Intel servers/microprocessor chips were introduced.  Although these proprietary Unix servers still exist, they are no longer growing market share.

In Kevin’s view, VMware is just following that playbook again, only this time it’s enterprise storage in their sights.  Of course, the other side of this is enterprise networking, which is starting to be commoditized as well by all the virtual networking capabilities VMware is rolling out with VxLAN and Nicira integration. (Perhaps a subject for another post.)

… Not quite yet.

I understand his point and can’t help but agree with parts of it, at least for low-end and potentially mid-tier storage.  IMHO however, enterprise storage vendors have a viable defense to all this, but it involves providing even more functionality, performance and capabilities than are available today in their systems.

I see it every time I look at my performance charts: anytime you start getting over 300 disk drives, storage sophistication matters more to performance than just throwing more hardware into the mix.  For an example of this effect, check out my last post on SPC-2 performance correlations.

And of course, VMware might be straining their very profitable relationship with today’s storage vendors such as Dell, HP, IBM, NetApp, EMC, etc., all of which highlight and push VMware’s virtualization solution throughout their partner communities.  If these vendors decide to stop recommending VMware and start focusing on other virtualization offerings, this might also stall VMware’s vision.

~~~~

In the end I can’t help but feel that, in VMware’s view, their long-run challenge will come from AWS, Google and other cloud service providers. The better VMware can prepare to compete with this gaggle of cloud purveyors, the better they will serve their enterprise customers. And ultimately that means more business for VMware.  If enterprise networking and storage vendors have to adapt to that vision, then so be it.

Comments?

Data hypervisor

(c) 2012 Silverton Consulting, Inc. All rights reserved

With all this talk of software defined networking and server virtualization, where does storage virtualization stand?  I blogged about some problems with storage virtualization a week or so ago in my post on Storage Utilization is broke, and this post takes it to the next level.  Also, I was at a financial analyst conference this week in Vail where I heard Mark Lewis of Tekrocket, formerly of EMC, discuss the need for a data hypervisor to provide software defined storage.

I now believe what we really need for true storage virtualization is a renewed focus on data hypervisor functionality.  The data hypervisor would need both a control plane and a data plane in order to function properly.   Ideally the control plane would set up the interface and routing for the data plane hardware and the server and/or backend storage would be none the wiser.

DMs everywhere

I envision a scenario where a customer’s application data is packaged with a data hypervisor which runs on commodity data switch hardware with data plane and control plane software running on it.  This would, in effect, create (virtual) data machines or DMs.

All enterprise storage, and nowadays most midrange storage, provides most of the functionality of a storage control plane, such as defining units of storage, setting up physical to logical storage mapping, and monitoring and managing the physical storage layer.  So control planes are pervasive in today’s storage, but proprietary.

In addition most storage systems have data plane functionality which operates to connect a host IO request to the actual data which resides in backend storage or internal cache.  But again although data planes are everywhere in storage today they are all proprietary to a specific vendor’s storage system.

Data switch needed

But in order to utilize a data hypervisor and create a more general purpose control plane layer, we need a more generic data plane layer that operates on commodity hardware. This is different from today’s SAN storage switches or DCB switches but similar in some ways.

The functions of the data switch/data plane layer would be to take routing instructions from the control plane layer and direct the server IO request to the proper storage unit using the data plane layer.  Somewhere in this world view, probably at the data plane level it would introduce data protection services like RAID or other erasure coding schemes, point in time copy/clone services and replication services and other advanced storage features needed by enterprise storage today.
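To make the split concrete, here is a minimal sketch in Python of a control plane that installs routing instructions and a data plane that only follows them. All of the names (Route, DataPlane, ControlPlane, the backend labels) are hypothetical and chosen for illustration; no real product works exactly this way, and data protection, caching and error handling are left out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Route:
    """Where a logical volume lives (hypothetical fields for illustration)."""
    backend: str       # e.g. "jbod-1" or "array-a"
    base_block: int    # starting block on that backend


@dataclass
class DataPlane:
    """Forwards IO using whatever routes the control plane has installed."""
    routes: dict = field(default_factory=dict)

    def forward(self, volume: str, block: int) -> tuple:
        route = self.routes[volume]   # raises KeyError if no route is installed
        return (route.backend, route.base_block + block)


@dataclass
class ControlPlane:
    """Decides placement and pushes routing instructions to the data plane."""
    data_plane: DataPlane

    def map_volume(self, volume: str, backend: str, base_block: int) -> None:
        self.data_plane.routes[volume] = Route(backend, base_block)


data_plane = DataPlane()
control_plane = ControlPlane(data_plane)

control_plane.map_volume("app-vol", "jbod-1", base_block=1000)
first_target = data_plane.forward("app-vol", 42)

# Remapping changes where the IO lands; the server still asks for "app-vol".
control_plane.map_volume("app-vol", "array-a", base_block=0)
second_target = data_plane.forward("app-vol", 42)
```

The point of the sketch is that the server keeps issuing the same request while the control plane decides, and can later change, which backend actually services it.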

Also it would need to provide some automated storage movement across and within tiers of physical storage and it would connect server storage interfaces at the front end to storage interfaces at the backend.  Not unlike SAN or DCB switches but with much more advanced functionality.

Ideally data switch storage interfaces could attach to dedicated JBOD, Flash arrays as well as systems using DAS  storage.  In addition, it would be nice if the data switch could talk to real storage arrays on SAN, IP/SANs or NFS&CIFS/SMB storage systems.

The other thing one would like out of a data switch is support for a universal translator that would map one protocol to another, such as iSCSI to SAS, NFS to FC, or FC to NFS and any other combination, depending on the needs of the server and the storage in the configuration.
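One way to picture such a translator is as a lookup table keyed by (front-end protocol, back-end protocol) pairs. The Python sketch below assumes a made-up request format (a dict with "op", "target" and "offset" keys) and placeholder translation logic; a real file-to-block translation such as NFS to FC would need a genuine file-to-LUN mapping, which is only stubbed here.

```python
from typing import Callable, Dict, Tuple

# Hypothetical request shape: a dict with "op", "target" and "offset" keys.
Request = Dict[str, object]
Translator = Callable[[Request], Request]

TRANSLATORS: Dict[Tuple[str, str], Translator] = {}


def register(front: str, back: str):
    """Decorator that records a translator for one (front, back) pair."""
    def wrap(fn: Translator) -> Translator:
        TRANSLATORS[(front, back)] = fn
        return fn
    return wrap


@register("iscsi", "sas")
def iscsi_to_sas(req: Request) -> Request:
    # Both are block protocols, so in this toy only the transport label changes.
    return {**req, "protocol": "sas"}


@register("nfs", "fc")
def nfs_to_fc(req: Request) -> Request:
    # File-to-block needs a file-to-LUN lookup; this name is a placeholder.
    lun = "lun-for-" + str(req["target"])
    return {"op": req["op"], "target": lun,
            "offset": req["offset"], "protocol": "fc"}


def translate(front: str, back: str, req: Request) -> Request:
    """Pass the request through unchanged, or via the registered translator."""
    if front == back:
        return req
    fn = TRANSLATORS.get((front, back))
    if fn is None:
        raise ValueError(f"no translator for {front} -> {back}")
    return fn(req)


block_req = translate("iscsi", "sas",
                      {"op": "read", "target": "disk0", "offset": 8})
file_req = translate("nfs", "fc",
                     {"op": "read", "target": "/exports/db", "offset": 0})
```

Supporting another combination, such as FC to NFS, would then mean registering one more function rather than changing the switch itself.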

Now if the data switch were built on top of commodity x86 hardware and software, with the data switch as just a specialized application, that would create the underpinnings for a true data hypervisor, with a control and data plane that could be independent and use anybody’s storage.

Data hypervisor

Assuming all this were available then we would have true storage virtualization.  With these capabilities, storage could be repurposed on the fly, added to, subtracted from, and in general be a fungible commodity not unlike server processing MIPs under VMware or Hyper-V.

Application data would then need to be packaged into a data machine which would offer all the host services required to support host data access.  The data hypervisor would handle the linkages required to interface with the control and data layers.

Applications could be configured to utilize available storage at ease and storage could grow,  shrink or move to accommodate the required workload just as easily as VMs can be deployed today.
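As a toy illustration of storage as a fungible commodity, the Python sketch below pools capacity from different backends and lets a data machine's allocation grow or shrink on demand. The StoragePool class, the data machine names and the sizes are all invented for the example, and it ignores placement, performance tiers and data movement entirely.

```python
class StoragePool:
    """Toy pool that treats capacity from any backend as interchangeable."""

    def __init__(self) -> None:
        self.capacity_gb = 0
        self.allocations = {}   # data machine name -> GB allocated

    def add_backend(self, size_gb: int) -> None:
        """Add a backend's capacity to the shared pool."""
        self.capacity_gb += size_gb

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - sum(self.allocations.values())

    def resize(self, dm: str, size_gb: int) -> None:
        """Grow or shrink a data machine's allocation to size_gb."""
        if size_gb < 0:
            raise ValueError("size must not be negative")
        current = self.allocations.get(dm, 0)
        if size_gb - current > self.free_gb:
            raise ValueError("not enough free capacity in the pool")
        if size_gb == 0:
            self.allocations.pop(dm, None)
        else:
            self.allocations[dm] = size_gb


pool = StoragePool()
pool.add_backend(500)             # e.g. a JBOD shelf
pool.add_backend(1500)            # e.g. an existing SAN array

pool.resize("orders-db", 800)     # initial allocation
pool.resize("orders-db", 1200)    # grow
pool.resize("web-logs", 600)
pool.resize("orders-db", 400)     # shrink, freeing capacity for others
```

Nothing in the calls above names a particular array or shelf, which is the property the data hypervisor is meant to provide.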

How we get there

Aside from the VMware, Citrix and Microsoft thrusts towards virtual storage, there are plenty of storage virtualization solutions that can control most backend enterprise SAN storage. However, the problem with these solutions is that in general they execute only on a specific vendor’s hardware and don’t necessarily talk to DAS or JBOD storage.

In addition, not all of the current generation storage virtualization solutions are unified. That is, most of these today only talk FC, FCoE or iSCSI and don’t support NFS or CIFS/SMB.

These don’t appear to be insurmountable obstacles and with proper allocation of R&D funding, could all be solved.

However, the more problematic issue is that none of these solutions operate on commodity hardware or commodity software.

The hardware is probably the easiest to deal with. Today many enterprise storage systems are built on top of x86 processor storage controllers, albeit sometimes with specialized packaging for redundancy and high availability.

The harder problem may be commodity software. Although the genesis for a few storage virtualization systems might have come from BSD or other “commodity” operating systems, they have been modified over the years to the point where they no longer represent anything that can run on standard, off-the-shelf operating systems.

Then again some storage virtualization systems started out with special home grown hardware and software. As such, converting these over to something more commodity oriented would be a major transition.

But the challenge is how to get there from here, and whether anyone would want to take this on.  The other problem is that the value add that storage vendors supply currently would be somewhat eroded, not unlike what happened to proprietary Unix systems with the advent of VMware.

But this will not take place overnight, and the company that takes this on and makes a go of it could have a significant software monopoly that would be hard to crack.

Perhaps it will take a startup to do this but I believe the main enterprise storage vendors are best positioned to take this on.

Comments?