OpenFlow, the next wave in networking

OpenFlow Logo (from http://www.OpenFlow.org)

I read two articles recently about how OpenFlow's Software Defined Networking is going to take over the networking world, just like VMware and its brethren have taken over the server world.

Essentially, OpenFlow is a network protocol that separates the control management of a networking switch or router (the control plane) from its data path activities (the data plane).  For most current switches, control management consists of vendor-supplied, special-purpose software which differs for each and every vendor and sometimes even varies across vendor product lines.

In contrast, data path activities are fairly similar across most of today's switches and are generally implemented in custom hardware so as to be lightning fast.

However, the main problem with today's routers and switches is that there is no standard way to talk to, or modify, the control management software in order to change its data plane activities.

OpenFlow to the rescue

OpenFlow changes all that. First, it specifies a protocol or interface between a switch's control plane and its data plane.  This allows the control plane to run on any server and still manage a router's or switch's data path activities.  By doing this, OpenFlow provides Software Defined Networking (SDN).
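
To make the split concrete, here is a minimal, purely illustrative Python sketch of the idea, assuming nothing about the real OpenFlow wire protocol: a controller process installs match/action rules into a switch's flow table, and the data plane does nothing but look packets up against those rules. All class, field and action names below are my own invention for illustration.

```python
# Toy sketch of the control plane / data plane split (not the real OpenFlow protocol).
from dataclasses import dataclass

@dataclass
class FlowEntry:
    match: dict        # e.g. {"in_port": 1, "eth_dst": "aa:bb:cc:dd:ee:ff"}
    actions: list      # e.g. ["output:2"]
    priority: int = 100

class ToySwitch:
    """Data plane: only matches packets against controller-installed rules."""
    def __init__(self):
        self.flow_table = []

    def install_flow(self, entry: FlowEntry):
        # In real OpenFlow this would arrive as a flow-mod message from the controller.
        self.flow_table.append(entry)
        self.flow_table.sort(key=lambda e: -e.priority)

    def forward(self, packet: dict):
        for entry in self.flow_table:
            if all(packet.get(k) == v for k, v in entry.match.items()):
                return entry.actions
        return ["send_to_controller"]   # table miss: punt to the control plane

# The "control plane" is now just software that can run on any server:
switch = ToySwitch()
switch.install_flow(FlowEntry(match={"eth_dst": "aa:bb:cc:dd:ee:ff"}, actions=["output:2"]))
print(switch.forward({"in_port": 1, "eth_dst": "aa:bb:cc:dd:ee:ff"}))  # ['output:2']
```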

Once OpenFlow switches and control software are in place, the SDN can better control and manage networking activity to optimize for performance, utilization or any number of other parameters.

Products are starting to come out which support OpenFlow protocols.  For example, a new OpenFlow-compatible Ethernet switch is available from IBM (their RackSwitch G8264 & G8264T) and HP has recently released OpenFlow software for their Ethernet switches (see OpenFlow blog post).  At least some in the industry are starting to see the light.

Google implements OpenFlow

The surprising thing is that one article I read recently was about Google running an OpenFlow network on its data center backbone (see Wired's Google goes with the Flow article).  The article discusses how a top Google scientist described their implementation of OpenFlow for their internal network architecture at the Open Networking Summit yesterday.

Google's internal network connects its multiple data centers together to provide Google Apps and other web services. Apparently, Google has been secretly creating/buying OpenFlow networking equipment and writing its own OpenFlow software. This new SDN has given them the ability to change their internal network backbone in minutes, something that would have taken days, weeks or even months before. OpenFlow has also given Google the ability to simulate network changes ahead of time, letting them see what potential changes will do before rolling them out.

One key metric is that Google now runs their backbone network close to 100% utilized at all times, whereas before they worked hard to get it to 30-40% utilization.

Nicira revolutionizes networking

The other article I read was about a startup called Nicira out of Palo Alto, CA which is taking OpenFlow to the next level by defining a Network Virtual Platform (NVP) and Open vSwitches (OVS).

  • An NVP is a network virtualization platform controller: a cluster of x86 servers running network virtualization control software that provides a RESTful web services API and defines/manages virtual networks (a hypothetical API call is sketched after this list).
  • An OVS is an Open vSwitch, software designed for remote control that either runs as a complete software-only service in various hypervisors or as gateway software connecting VLANs running on proprietary vendor hardware to the SDN.
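
To show the flavor of driving such a controller programmatically, here is a hedged sketch of what a REST call to create a virtual network might look like. The controller hostname, resource path, payload fields and credentials are all invented for illustration; Nicira's actual NVP API will differ.

```python
# Hypothetical example of calling a network virtualization controller's REST API.
import requests

NVP_CONTROLLER = "https://nvp-controller.example.com"   # invented hostname

payload = {
    "display_name": "tenant-42-web-tier",   # invented virtual network name
    "transport_zone": "tz-datacenter-1",    # invented transport zone id
}

resp = requests.post(
    f"{NVP_CONTROLLER}/v1/logical-switch",  # hypothetical resource path
    json=payload,
    auth=("admin", "secret"),               # demo credentials only
    timeout=10,
)
resp.raise_for_status()
print("created virtual network:", resp.json())
```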

OVS gateway services can be used with current-generation switches/routers or with high-performing, simple L3 switches specifically designed for OpenFlow management.

Nonetheless, with NVP and OVS deployed over your networking hardware, many of the limitations inherent in current networking services are removed.  For example, Nicira network virtualization allows the movement of application workloads across subnets while maintaining L2 adjacency, scalable multi-tenant isolation, and the ability to repurpose physical infrastructure on demand.

By virtualizing the network, the network switching/router hardware becomes a pool of IP-switching services, available to be repurposed and/or reprogrammed at a moment's notice.  Not unlike what VMware did with servers through virtualization.

Customers for Nicira include eBay, RackSpace and AT&T, to name just a few.  It seems that network virtualization is especially valuable to big web services and cloud services companies.

~~~~

Virtualization takes on another industry, this time networking, and changes it forever.

We really need something like OpenFlow for storage: something that takes storage administration out of the vendors' hands and defines an open storage management protocol that all storage vendors would honor.

The main problem with storage virtualization today is that it's kind of like VLANs: all vendor specific.  Without something like a standard protocol that prescribes a storage management plane's capabilities and a storage data plane's capabilities, we cannot really have storage virtualization.
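
Just to make that wish concrete, here is a purely speculative sketch of the sort of vendor-neutral storage control-plane interface I have in mind. Nothing like this exists today; every operation name below is hypothetical.

```python
# Speculative sketch: a vendor-neutral storage control plane, in the spirit of OpenFlow.
from abc import ABC, abstractmethod

class StorageControlPlane(ABC):
    """Management operations any conforming storage array would expose."""

    @abstractmethod
    def create_volume(self, name: str, size_gb: int, tier: str) -> str:
        """Provision a volume and return its identifier."""

    @abstractmethod
    def migrate_volume(self, volume_id: str, target_pool: str) -> None:
        """Move a volume between pools or arrays without host disruption."""

    @abstractmethod
    def snapshot(self, volume_id: str) -> str:
        """Take a point-in-time copy and return the snapshot's identifier."""
```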

Top 10 storage technologies over the last decade

Aurora's Perception or I Schrive When I See Technology by Wonderlane (cc) (from Flickr)

Some of these technologies were in development prior to 2000, some were available in other domains but not in storage, and some were in a few subsystems but had yet to become popular as they are today.  In no particular order here are my top 10 storage technologies for the decade:

  1. NAND based SSDs – DRAM and other technology solid state drives (SSDs) were available last century but over the last decade NAND Flash based devices have dominated SSD technology and have altered the storage industry forever more.  Today, it’s nigh impossible to find enterprise class storage that doesn’t support NAND SSDs.
  2. GMR heads – Giant Magneto Resistance disk heads have become commonplace over the last decade and have allowed disk drive manufacturers to double data density every 18-24 months.  Now GMR heads are starting to transition over to tape storage and will enable that technology to increase data density dramatically.
  3. Data deduplication – Deduplication technologies emerged over the last decade as a complement to higher density disk drives and as a means to more efficiently back up data.  Deduplication technology can be found in many different forms today, ranging from file and block storage systems and backup storage systems to backup-software-only solutions.  (A toy illustration of the hashing idea behind deduplication appears after this list.)
  4. Thin provisioning – No one would argue that thin provisioning emerged last century but it took the last decade to really find its place in the storage pantheon.  One almost cannot find a data center class storage device that does not support thin provisioning today.
  5. Scale-out storage – Last century if you wanted to get higher IOPS from a storage subsystem you could add cache or disk drives but at some point you hit a subsystem performance wall.  With scale-out storage, one can now add more processing elements to a storage system cluster without having to replace the controller to obtain more IO processing power.  The link reference talks about the use of commodity hardware to provide added performance but scale-out storage can also be done with non-commodity hardware (see Hitachi’s VSP vs. VMAX).
  6. Storage virtualization – Server virtualization has taken off as the dominant data center paradigm over the last decade, and its counterpart in storage has also become more viable.  Storage virtualization was originally used to migrate data from old subsystems to new storage, but today it can be used to manage and migrate data over PBs of physical storage, dynamically optimizing data placement for cost and/or performance.
  7. LTO tape – When IBM dominated IT in the mid-to-late last century, the tape format du jour always matched IBM's tape technology.  As the decade dawned, IBM was no longer the dominant player and tape technology was starting to diverge into a babble of differing formats.  As a result, IBM, Quantum, and HP put their technology together and created a standard tape format, called LTO, which has become the new dominant tape format for the data center.
  8. Cloud storage – It's unclear just when over the last decade cloud storage emerged, but it seemed to be a supplement to cloud computing, which also appeared this past decade.  Storage service providers had existed earlier but, due to bandwidth limitations and storage costs, didn't survive the dotcom bubble.  Over this past decade both bandwidth and storage costs have come down considerably, and cloud storage has now become a viable technological solution to many data center issues.
  9. iSCSI – SCSI has taken on many forms over the last couple of decades, but iSCSI has altered the dominant block storage paradigm from a single, pure FC-based SAN to a plurality of technologies.  Nowadays, SMB shops can have block storage without the cost and complexity of FC SANs, over the LAN networking technology they already use.
  10. FCoE – One could argue that this technology is still maturing today, but once again SCSI has opened up another way to access storage.  FCoE has the potential to offer all the robustness and performance of FC SANs over data center Ethernet hardware, simplifying and unifying data center networking onto one technology.
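
As promised above, here is a toy illustration of the core idea behind block-level deduplication: store each unique chunk once, keyed by its content hash, and keep per-backup recipes of chunk hashes. Real products add variable-sized chunking, compression and extensive metadata management; this is only a sketch.

```python
# Toy block-level deduplication: unique chunks stored once, indexed by content hash.
import hashlib

CHUNK_SIZE = 4096
store = {}      # content hash -> chunk bytes (each unique chunk stored once)
recipes = {}    # backup name -> ordered list of chunk hashes

def backup(name: str, data: bytes) -> None:
    hashes = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)     # duplicate chunks are not stored again
        hashes.append(digest)
    recipes[name] = hashes

def restore(name: str) -> bytes:
    return b"".join(store[digest] for digest in recipes[name])

backup("monday.img", b"A" * 10000)
backup("tuesday.img", b"A" * 10000 + b"B" * 100)    # mostly duplicate data
print(len(store), "unique chunks stored")            # far fewer than the chunks written
assert restore("monday.img") == b"A" * 10000
```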

No doubt others would differ on their top 10 storage technologies over the last decade, but I strove to find technologies that significantly changed data storage between 2000 and today.  These 10 seemed to me to fit the bill better than most.

Comments?

5 Reasons to Virtualize Storage

Storage virtualization has been out for at least 5 years now and one can see more and more vendors offering products in this space. I have written before about storage virtualization in my "Virtualization: Tales from the Trenches" article, and I would say little has changed since then, but it's time for a refresher.

Storage virtualization differs from file or server virtualization by focusing only on the FC storage domain. Unlike server virtualization, there is no need to change host operating environments to support most storage virtualization products.

As an aside, there may be some requirement for iSCSI storage virtualization, but to date I haven't seen much emphasis on this. Some of the products listed below may support iSCSI frontends for FC backend storage subsystems, but I am unaware of any that can support FC or iSCSI frontends for iSCSI backend storage.

I can think of at least the following storage virtualization products – EMC Invista, FalconStor IPStor, HDS USP-V, IBM SVC, and NetApp ONTAP. There are more than just these, but they have the lion's share of installations. Most of these products offer similar capabilities:

  1. Ability to non-disruptively migrate data from one storage subsystem to another. This can be used to help ease technology obsolescence by migrating data online from an old subsystem to a new one. There are some tools and/or services on the market which can help automate this process, but storage virtualization trumps them all in that it can help with tech refresh as well as provide other services. (A toy sketch of the mapping idea behind this follows the list.)
  2. Ability to better support multiple storage tiers by migrating data from one storage tier to another. Non-disruptive data migration can also ease implementation of multiple storage tiers, such as slow/high-capacity disk, fast/low-capacity disk and SSD storage, within one storage environment. Some high-end subsystems can do this with multiple storage tiers within one subsystem, but only storage virtualization can do this across storage subsystems.
  3. Ability to aggregate heterogeneous storage subsystems under one storage management environment. The other major characteristic of most storage virtualization products is that they support multiple vendor storage subsystems under one storage cluster. This can be very valuable in multi-vendor shops by providing a single management interface to provision and administer all storage under a single storage virtualization environment.
  4. Ability to scale out rather than just scale up storage performance. By aggregating storage subsystems into a single storage cluster, one can add storage performance by simply adding more storage virtualization cluster nodes. Not every storage virtualization system supports multiple cluster nodes, but those that do offer another dimension to storage subsystem performance.
  5. Ability to apply high-end functionality to low-end storage. This takes many forms not the least of which is sophisticated caching, point-in-time copies and data replication or mirroring capabilities typically found only in higher end storage subsystems. Such capabilities can be supplied to any and all storage underneath the storage virtualization environment and can make storage much easier to use effectively.
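
As mentioned in item 1, the essence of all of the above is a mapping layer: hosts address a stable virtual volume while the virtualization layer decides, and can change, which back-end array actually holds the data. Here is a toy sketch of that idea; the class and method names are illustrative only, not any vendor's implementation.

```python
# Toy sketch of storage virtualization: virtual volumes mapped onto back-end arrays,
# with data migration that is invisible to the host.

class VirtualizationLayer:
    def __init__(self):
        self.backends = {}    # array name -> {backend volume id -> data}
        self.mapping = {}     # virtual volume id -> (array name, backend volume id)

    def add_backend(self, array: str) -> None:
        self.backends[array] = {}

    def create_volume(self, vvol: str, array: str, data: bytes = b"") -> None:
        self.backends[array][vvol] = data
        self.mapping[vvol] = (array, vvol)

    def read(self, vvol: str) -> bytes:
        array, backend_id = self.mapping[vvol]
        return self.backends[array][backend_id]

    def migrate(self, vvol: str, new_array: str) -> None:
        """Move data between arrays; hosts keep using the same virtual volume id."""
        old_array, backend_id = self.mapping[vvol]
        self.backends[new_array][backend_id] = self.backends[old_array].pop(backend_id)
        self.mapping[vvol] = (new_array, backend_id)

virt = VirtualizationLayer()
virt.add_backend("old_array")
virt.add_backend("new_array")
virt.create_volume("vol1", "old_array", b"application data")
virt.migrate("vol1", "new_array")      # tech refresh: data moves, host path stays the same
print(virt.read("vol1"))               # b'application data'
```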

There are potential downsides to storage virtualization as well, not the least of which is lock-in, but this may be somewhat of a red herring. Most storage virtualization products make it easy to migrate storage into the virtualization environment. Some of these products also make it relatively easy to migrate storage out of their environment as well. This is more complex because data that was once on this storage could be almost anywhere in the current virtualized storage subsystems and would need to be reconstituted back in one piece on the storage being exported.

The other reason for lock-in is that the functionality provided by storage virtualization makes it harder to remove. But it would probably be more correct to say "once you virtualize storage, you never want to go back". Many customers I talk with that have had a good initial experience with storage virtualization want to do it again whenever given the chance.