Data hypervisor

(c) 2012 Silverton Consulting, Inc. All rights reserved

With all this talk of software defined networking and server virtualization, where does storage virtualization stand?  I blogged about some problems with storage virtualization a week or so ago in my post on Storage Utilization is broke, and this post takes it to the next level.  Also, I was at a financial analyst conference this week in Vail where I heard Mark Lewis of Tekrocket, formerly of EMC, discuss the need for a data hypervisor to provide software defined storage.

I now believe what we really need for true storage virtualization is a renewed focus on data hypervisor functionality.  The data hypervisor would need both a control plane and a data plane in order to function properly.   Ideally the control plane would set up the interface and routing for the data plane hardware and the server and/or backend storage would be none the wiser.

DMs everywhere

I envision a scenario where a customer’s application data is packaged with a data hypervisor which runs on commodity data switch hardware, with data plane and control plane software running on it.  This would sort of create (virtual) data machines or DMs.

All enterprise and, nowadays, most midrange storage systems provide most of the functionality of a storage control plane, such as defining units of storage, setting up physical to logical storage mapping, and monitoring and managing the physical storage layer.  So control planes are pervasive in today’s storage, but proprietary.
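To make the physical to logical mapping idea concrete, here is a minimal sketch in Python. It is not how any particular vendor does it. The `Extent` and `LogicalVolume` names, the device names and the block numbers are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Extent:
    """A contiguous run of blocks on one physical device (hypothetical model)."""
    device: str   # illustrative device name only
    start: int    # first physical block of the extent
    length: int   # number of blocks in the extent


@dataclass
class LogicalVolume:
    """A unit of storage as a control plane might define it."""
    name: str
    extents: list = field(default_factory=list)

    def size(self) -> int:
        """Total number of logical blocks in the volume."""
        return sum(e.length for e in self.extents)

    def resolve(self, lba: int):
        """Map a logical block address to a (device, physical block) pair."""
        if lba < 0:
            raise ValueError("negative logical block address")
        offset = lba
        for extent in self.extents:
            if offset < extent.length:
                return extent.device, extent.start + offset
            offset -= extent.length
        raise ValueError(f"LBA {lba} is beyond the end of {self.name}")


# One logical volume stitched together from pieces of two physical devices.
vol = LogicalVolume("app-data", [Extent("disk-a", 1000, 100),
                                 Extent("disk-b", 0, 50)])
first = vol.resolve(0)      # lands on disk-a
spill = vol.resolve(100)    # first block that spills over to disk-b
```

The host only ever sees `app-data` and its logical block addresses. Which devices sit underneath is the control plane's business.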

In addition, most storage systems have data plane functionality which connects a host IO request to the actual data residing in backend storage or internal cache.  But again, although data planes are everywhere in storage today, they are all proprietary to a specific vendor’s storage system.

Data switch needed

But in order to utilize a data hypervisor and create a more general purpose control plane layer, we need a more generic data plane layer that operates on commodity hardware. This is different from today’s SAN storage switches or DCB switches but similar in some ways.

The functions of the data switch/data plane layer would be to take routing instructions from the control plane layer and direct the server IO request to the proper storage unit.  Somewhere in this world view, probably at the data plane level, it would introduce data protection services like RAID or other erasure coding schemes, point in time copy/clone services, replication services and other advanced storage features needed by enterprise storage today.
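As a rough sketch of that split, assume the control plane does nothing but push routing instructions and the data plane does nothing but follow them. The class names, method names and backend identifiers below are hypothetical, and data protection services are left out entirely.

```python
class DataPlane:
    """Forwards IO requests using routes it is given, never ones it computes."""

    def __init__(self):
        self.routes = {}  # volume name -> backend identifier

    def install_route(self, volume, backend):
        self.routes[volume] = backend

    def forward(self, volume, op, lba):
        """Return a description of where the IO request would be sent."""
        backend = self.routes.get(volume)
        if backend is None:
            raise LookupError(f"no route installed for volume {volume!r}")
        return {"backend": backend, "op": op, "lba": lba}


class ControlPlane:
    """Decides placement and pushes routing instructions to the data plane."""

    def __init__(self, data_plane):
        self.data_plane = data_plane

    def provision(self, volume, backend):
        self.data_plane.install_route(volume, backend)

    def migrate(self, volume, new_backend):
        # A real system would copy the data first; this sketch only
        # repoints the route to show that the server never notices.
        self.data_plane.install_route(volume, new_backend)


dp = DataPlane()
cp = ControlPlane(dp)
cp.provision("vol1", "jbod-1")
before = dp.forward("vol1", "read", 42)
cp.migrate("vol1", "flash-array-2")
after = dp.forward("vol1", "read", 42)
```

The server issues the same read for `vol1` before and after the migration. Only the backend chosen by the data plane changes.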

Also, it would need to provide some automated storage movement across and within tiers of physical storage, and it would connect server storage interfaces at the front end to storage interfaces at the backend.  This is not unlike SAN or DCB switches but with much more advanced functionality.
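Automated movement across tiers could be driven by something as simple as access counts. The sketch below is only one possible policy. The two tier names, the thresholds and the `plan_tier_moves` function are assumptions of mine, not a description of any shipping product.

```python
def plan_tier_moves(access_counts, current_tier,
                    hot_threshold=100, cold_threshold=10):
    """Return {extent: target_tier} for extents that should change tier.

    access_counts: extent name -> IO count over some recent window
    current_tier:  extent name -> "flash" or "disk"
    Thresholds are arbitrary illustrative values.
    """
    moves = {}
    for extent, count in access_counts.items():
        tier = current_tier[extent]
        if count >= hot_threshold and tier != "flash":
            moves[extent] = "flash"   # promote hot data
        elif count <= cold_threshold and tier != "disk":
            moves[extent] = "disk"    # demote cold data
    return moves


moves = plan_tier_moves(
    {"e1": 500, "e2": 3, "e3": 50, "e4": 900},
    {"e1": "disk", "e2": "flash", "e3": "disk", "e4": "flash"},
)
```

Here `e1` gets promoted, `e2` gets demoted, and `e3` and `e4` stay where they are.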

Ideally, data switch storage interfaces could attach to dedicated JBODs and flash arrays as well as systems using DAS storage.  In addition, it would be nice if the data switch could talk to real storage arrays on SANs, IP SANs, or NFS and CIFS/SMB storage systems.

The other thing one would like out of a data switch is support for a universal translator that would map one protocol to another, such as iSCSI to SAS, NFS to FC, or FC to NFS and any other combination, depending on the needs of the server and the storage in the configuration.
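Not all of these translations are equally hard. A small sketch shows the kinds of mapping a universal translator would have to handle, depending on whether each side speaks a block or a file protocol. The category names and the `translation_kind` function are hypothetical and only classify the work. They don't do any actual protocol conversion.

```python
BLOCK_PROTOCOLS = {"iSCSI", "FC", "FCoE", "SAS"}
FILE_PROTOCOLS = {"NFS", "CIFS/SMB"}


def translation_kind(front_end: str, back_end: str) -> str:
    """Classify the mapping needed between a server-side and a storage-side protocol."""
    known = BLOCK_PROTOCOLS | FILE_PROTOCOLS
    for protocol in (front_end, back_end):
        if protocol not in known:
            raise ValueError(f"unknown protocol: {protocol}")
    if front_end == back_end:
        return "passthrough"
    front_is_block = front_end in BLOCK_PROTOCOLS
    back_is_block = back_end in BLOCK_PROTOCOLS
    if front_is_block and back_is_block:
        return "block-to-block"   # e.g. iSCSI to SAS: re-encapsulate block commands
    if not front_is_block and not back_is_block:
        return "file-to-file"     # e.g. NFS to CIFS/SMB
    if front_is_block:
        return "block-on-file"    # e.g. FC to NFS: present a file as a block device
    return "file-on-block"        # e.g. NFS to FC: host a file system on block storage


examples = {pair: translation_kind(*pair)
            for pair in [("iSCSI", "SAS"), ("NFS", "FC"), ("FC", "NFS")]}
```

The block-to-block cases are closest to what SAN gateways already do. The mixed cases mean the data switch would have to run a file system or emulate a block device itself.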

Now if the data switch were built on top of commodity x86 hardware and software, with the data switch as just a specialized application, that would create the underpinnings for a true data hypervisor with a control and data plane that could be independent and use anybody’s storage.

Data hypervisor

Assuming all this were available, then we would have true storage virtualization.  With these capabilities, storage could be repurposed on the fly, added to, subtracted from, and in general be a fungible commodity not unlike server processing MIPS under VMware or Hyper-V.

Application data would then need to be packaged into a data machine which would offer all the host services required to support host data access.  The data hypervisor would handle the linkages required to interface with the control and data layers.

Applications could be configured to utilize available storage with ease, and storage could grow, shrink or move to accommodate the required workload just as easily as VMs can be deployed today.

How we get there

Aside from the VMware, Citrix and Microsoft thrusts towards virtual storage, there are plenty of storage virtualization solutions that can control most backend enterprise SAN storage. However, the problem with these solutions is that, in general, they execute only on a specific vendor’s hardware and don’t necessarily talk to DAS or JBOD storage.

In addition, not all of the current generation storage virtualization solutions are unified. That is, most of these today only talk FC, FCoE or iSCSI and don’t support NFS or CIFS/SMB.

These don’t appear to be insurmountable obstacles and, with proper allocation of R&D funding, could all be solved.

However, the more problematic issue is that none of these solutions operate on commodity hardware or commodity software.

The hardware is probably the easiest to deal with. Today many enterprise storage systems are built on top of x86 processor storage controllers, albeit sometimes with specialized packaging for redundancy and high availability.

The harder problem may be commodity software. Although the genesis for a few storage virtualization systems might come from BSD or other “commodity” operating systems, they have been modified over the years and no longer represent anything that can run on standard off the shelf operating systems.

Then again some storage virtualization systems started out with special home grown hardware and software. As such, converting these over to something more commodity oriented would be a major transition.

But the challenge is how to get there from here, and whether anyone would want to take this on.  The other problem is that the value add that storage vendors supply currently would be somewhat eroded.  This is not unlike what happened to proprietary Unix systems with the advent of VMware.

But this will not take place overnight, and the company that takes this on and makes a go of it could have a significant software monopoly that would be hard to crack.

Perhaps it will take a startup to do this but I believe the main enterprise storage vendors are best positioned to take this on.

Comments?

OpenFlow, the next wave in networking

OpenFlow Logo (from www.OpenFlow.org)

I read two articles recently about how OpenFlow’s Software Defined Networking is going to take over the networking world, just like VMware and its brethren have taken over the server world.

Essentially, OpenFlow is a network protocol that separates the control management of a networking switch or router (control plane) from its data path activities (data plane).  For most current switches, control management consists of vendor supplied, special purpose software which differs for each and every vendor and sometimes even varies across a vendor’s product lines.

In contrast, data path activities are fairly similar for most of today’s switches and are generally implemented in custom hardware so as to be lightning fast.

However, the main problem with today’s routers and switches is that there is no standard way to talk to, or even modify, the control management software in order to change its data plane activities.

OpenFlow to the rescue

OpenFlow changes all that. First, it specifies a protocol or interface between a switch’s control plane and its data plane.  This allows that control plane to run on any server and still provide management for a router’s or switch’s data path activities.  By doing this, OpenFlow provides Software Defined Networking (SDN).
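As I understand it, the heart of that interface is a flow table. The controller installs entries that match on packet fields and name an action, and the switch just looks packets up. The Python below is a much simplified sketch of that idea and not the OpenFlow wire protocol. The field names, action strings and the choice to hand unmatched packets to the controller are assumptions of the sketch.

```python
class FlowTable:
    """A toy match/action table of the kind a controller would program."""

    def __init__(self):
        self.entries = []  # list of (priority, match dict, action)

    def add_flow(self, priority, match, action):
        """Called on behalf of the control plane to install an entry."""
        self.entries.append((priority, match, action))
        # Highest priority first; the sort is stable, so ties keep insertion order.
        self.entries.sort(key=lambda entry: entry[0], reverse=True)

    def lookup(self, packet):
        """Called in the data path: return the action of the first matching entry."""
        for _, match, action in self.entries:
            if all(packet.get(key) == value for key, value in match.items()):
                return action
        return "send_to_controller"  # table miss (assumed behaviour)


table = FlowTable()
table.add_flow(10, {"dst_ip": "10.0.0.5"}, "output:port2")
table.add_flow(20, {"dst_ip": "10.0.0.5", "tcp_port": 22}, "drop")

web = table.lookup({"dst_ip": "10.0.0.5", "tcp_port": 80})
ssh = table.lookup({"dst_ip": "10.0.0.5", "tcp_port": 22})
unknown = table.lookup({"dst_ip": "192.168.1.1", "tcp_port": 80})
```

The switch never decides policy here. Everything it knows about SSH being blocked came from whoever called `add_flow`, and that caller can run on any server.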

Once OpenFlow switches and control software are in place, the SDN can better control and manage networking activity to optimize for performance, utilization or any number of other parameters.

Products are starting to come out which support OpenFlow protocols.  For example, a new OpenFlow compatible Ethernet switch is available from IBM (their RackSwitch G8264 & G8264T) and HP has recently released OpenFlow software for their Ethernet switches (see OpenFlow blog post).  At least some in the industry are starting to see the light.

Google implements OpenFlow

The surprising thing is that one article I read recently is about Google running an OpenFlow network on its data center backbone (see Wired’s Google goes with the Flow article).   The article describes how a top Google scientist, speaking at the Open Networking Summit yesterday, explained how they implemented OpenFlow for their internal network architecture.

Google’s internal network connects its multiple data centers together to provide Google Apps and other web services.  Apparently, Google has been secretly creating/buying OpenFlow networking equipment and creating its own OpenFlow software. This new SDN they have constructed has given them the ability to make changes to their internal network backbone in minutes that would have taken days, weeks or even months before. Also, OpenFlow has given Google the ability to simulate network changes ahead of time, allowing them to see what potential changes will do for them.

One key metric is that Google now runs their backbone network close to 100% utilized at all times whereas before they worked hard to get it to 30-40% utilization.

Nicira revolutionizes networking

The other article I read was about a startup called Nicira out of Palo Alto, CA which is taking OpenFlow to the next level by defining a Network Virtualization Platform (NVP) and Open vSwitches (OVS).

  • An NVP is a network virtualization platform controller which consists of a cluster of x86 servers running the network virtualization control software, provides a RESTful web services API and defines/manages virtual networks.
  • An OVS is Open vSwitch software designed for remote control that either runs as a complete software only service in various hypervisors or as gateway software connecting VLANs running on proprietary vendor hardware to the SDN.

OVS gateway services can be used with current generation switches/routers or be used with high performing, simple L3 switches specifically designed for OpenFlow management.

Nonetheless, deploying NVP and OVS over your networking hardware removes many of the limitations inherent in current networking services.  For example, Nicira network virtualization allows the movement of application workloads across subnets while maintaining L2 adjacency, scalable multi-tenant isolation and the ability to repurpose physical infrastructure on demand.

By virtualizing the network, the network switching/router hardware becomes a pool of IP-switching services, available to be repurposed and/or reprogrammed at a moment’s notice.  This is not unlike what VMware did with servers through virtualization.

Customers for Nicira include eBay, Rackspace and AT&T to name just a few.  It seems that networking virtualization is especially valuable to big web services and cloud services companies.

~~~~

Virtualization takes on another industry, this time networking, and changes it forever.

We really need something like OpenFlow for storage: taking storage administration out of the vendors’ hands and placing it elsewhere, and defining an open storage management protocol that all storage vendors would honor.

The main problem with storage virtualization today is it’s kind of like VLANs, all vendor specific.   Without something like a standard protocol that prescribes a storage management plane’s capabilities and a storage data plane’s capabilities, we cannot really have storage virtualization.