Intel acquires InfiniBand fabric technology from Qlogic

[InfiniBand interconnected] Isilon Packaging by ChrisDag (cc) (from Flickr)

Intel announced today that they are going to acquire the InfiniBand (IB) fabric technology business from Qlogic.

From many analysts’ perspective, IB is one of the few technologies out there that can efficiently interconnect a cluster of commodity servers into a supercomputing system.

What’s InfiniBand?

Recall that IB is one of three reigning data center fabric technologies available today, the other two being 10GbE and 16 Gb/s FC.  IB is currently available in DDR, QDR and FDR modes of operation, that is, 5Gb/s, 10Gb/s or 14Gb/s per single lane, respectively, according to the IB update (see IB trade association (IBTA) technology update).  Systems can aggregate multiple IB lanes in units of 4 or 12 (see wikipedia IB article), such that an IB QDRx4 link supports 40Gb/s and an IB FDRx4 link currently supports 56Gb/s.
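To make the lane arithmetic concrete, here is a minimal Python sketch.  The per-lane rates are the figures quoted above; the names `LANE_RATE_GBPS` and `link_rate_gbps` are my own, made up for illustration, and are not part of any IB tooling.

```python
# Per-lane rates in Gb/s, as quoted from the IBTA update above.
LANE_RATE_GBPS = {"DDR": 5, "QDR": 10, "FDR": 14}


def link_rate_gbps(mode, lanes):
    """Aggregate rate of an IB link: per-lane rate times number of lanes."""
    return LANE_RATE_GBPS[mode] * lanes


# Print the 4-lane and 12-lane rates for each mode.
for mode in LANE_RATE_GBPS:
    for lanes in (4, 12):
        print(f"IB {mode}x{lanes}: {link_rate_gbps(mode, lanes)} Gb/s")
```

The QDRx4 and FDRx4 rows should match the 40Gb/s and 56Gb/s figures mentioned above.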

The IBTA pitch cited above showed that IB is the most widely used interface for the top supercomputing systems and supports the most power efficient interconnect available (although how that’s calculated is not described).

Where else does IB make sense?

One thing IB has going for it is low latency through the use of RDMA (remote direct memory access).  That same report says that an SSD directly connected through FC takes about 45 μsec to do a read, whereas the same SSD directly connected through IB using RDMA would take only about 26 μsec.

However, RDMA technology is now also coming out on 10GbE through RDMA over Converged Ethernet (RoCE, pronounced “rocky”).  But the IBTA claims that IB RDMA has a 0.6 μsec latency and RoCE has 1.3 μsec.  Although at these speeds 0.7 μsec doesn’t seem to be a big thing, it more than doubles the latency.
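For what it’s worth, the latency gaps can be put side by side with a few lines of Python.  The numbers are the ones quoted from the IBTA update above; the function name `compare` and the dictionary names are just my illustration.

```python
# Latencies in microseconds, as quoted from the IBTA update above.
SSD_READ_US = {"FC": 45.0, "IB RDMA": 26.0}
RDMA_US = {"RoCE": 1.3, "IB RDMA": 0.6}


def compare(slow_us, fast_us):
    """Return (absolute gap in microseconds, slow/fast ratio)."""
    return slow_us - fast_us, slow_us / fast_us


gap, ratio = compare(SSD_READ_US["FC"], SSD_READ_US["IB RDMA"])
print(f"SSD read, FC vs IB RDMA: {gap:.1f} usec gap, {ratio:.2f}x")

gap, ratio = compare(RDMA_US["RoCE"], RDMA_US["IB RDMA"])
print(f"RDMA, RoCE vs IB: {gap:.1f} usec gap, {ratio:.2f}x")
```

The absolute gap on the RDMA side is tiny, but the ratio is what matters for latency-sensitive cluster traffic.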

Nonetheless, Intel’s purchase is an interesting play.  I know that Intel is focusing on supporting an ExaFLOP HPC computing environment by 2018 (see their release).  But IB is already a pretty active technology in the HPC community and doesn’t seem to need their support.

In addition, IB has been gradually making inroads into enterprise data centers via storage products like the Oracle Exadata Storage Server, which uses 40 Gb/s IB QDRx4 interconnects.  There are a number of other storage products out that use IB as well, from EMC Isilon, SGI, Voltaire, and others.

Of course, where IB can mostly be found today is in computer-to-computer interconnects, and just about every server vendor out today, including Dell, HP, IBM, and Oracle, supports IB interconnects on at least some of their products.

Who’s left standing?

With Qlogic out, I guess this leaves Cisco (de-emphasized lately), Flextronics, Mellanox, and Intel as the only companies that supply IB switches.  Mellanox, Intel (from Qlogic) and Voltaire supply the HCA (host channel adapter) cards which provide the server interface to the switched IB network.

It’s probably a logical choice for Intel to go after some of this technology, if only to keep it moving forward, and especially if they want to be seriously involved in the network business.

IB use in Big Data?

On the other hand, Hadoop and other big data applications could conceivably make use of IB speeds, and as these are mainly vast clusters of commodity systems, IB would be a logical choice.

There is some interesting research on the advantages of IB in HDFS (Hadoop) system environments (see Can high performance interconnects boost Hadoop distributed file system performance) out of Ohio State University.  This research essentially says that Hadoop HDFS can perform much better when you combine IB with IPoIB (IP over IB, see OpenFabrics Alliance article) and SSDs.  But SSDs alone do not provide as much benefit.   (Although my reading of the performance charts seems to indicate it’s not that much better than 10GbE with TOE?).

It’s possible other Big data analytics engines are considering using IB as well.  It would seem to be a logical choice if you had even more control over the software stack.

~~~~

Comments?

 

New wireless technology augmenting data center cabling

1906 Patent for Wireless Telegraphy by Wesley Fryer (cc) (from Flickr)

I read a report today in Technology Review about how Bouncing data would speed up data centers, which talked about using wireless technology and special ceiling tiles to create dedicated data links between servers.  The wireless signal was in the 60GHz range and would yield something on the order of a couple of Gb per second.

The cable mess

Wireless could solve a problem evident to anyone who has looked under data center floor tiles today: cabling.  Underneath our data centers today there is a spaghetti-like labyrinth of cables connecting servers to switches to storage and back again.  The cabling underneath some data centers is so deep and impenetrable that some shops don’t even try to extract old cables when replacing equipment, just leaving them in place and layering on new ones as the need arises.

Bouncing data around a data center

The nice thing about the new wireless technology is that you can easily set up a link between two servers (or servers and switches) by just properly positioning antennas and ceiling tiles, without needing any cables.  However, in order to increase bandwidth and reduce interference, the signal has to be narrowly focused, which makes the technology point-to-point, requiring line of sight between the end points.  But with signal-bouncing ceiling tiles, a “line-of-sight” pathway could readily be created around the data center.

This could easily be accomplished with differently shaped ceiling tiles such as pyramids, flat panels, or other geometric configurations that would guide the radio signal to the correct transceiver.

I see it all now: the data center of the future would have its ceiling studded with geometrical figures protruding below the tiles, providing wave guides for wireless data paths, routing the signals around obstacles to their final destinations.

Other questions probably remain.

  • It appears the technology can only support 4 channels per stream, which means it might not scale up to much beyond current speeds.
  • Electromagnetic radiation is something most IT equipment tries to eliminate rather than transmit.  Having something generate and receive radio waves in a data center may require different equipment regulations, and having those types of signals bouncing around a data center may make proper shielding more of a concern.
  • Signaling interference is a real problem which might make routing these signals even more of a problem than routing cables, which is why I believe some sort of multi-directional wireless switching equipment might help.

In the report, there wasn’t any discussion as to the energy costs of the wireless technology and that may be another issue to consider. However, any reduction in cabling can only help IT labor costs which are a major factor in today’s data center economics.

~~~~

It’s just in investigation stages now but Intel, IBM and others are certainly thinking about how wireless technology could help the data centers of tomorrow reduce costs, clutter and cables.

All this gives a whole new meaning to top of rack switching.

Comments?

Why Open-FCoE is important

FCoE Frame Format (from Wikipedia, http://en.wikipedia.org/wiki/File:Ff.jpg)

I don’t know much about O/S drivers but I do know lots about storage interfaces. One thing that’s apparent from yesterday’s announcement from Intel is that Fibre Channel over Ethernet (FCoE) has taken another big leap forward.

Chad Sakac’s chart of FC vs. Ethernet target unit shipments (meaning, storage interface types, I think) clearly indicates a transition to Ethernet is taking place in the storage industry today. Of course Ethernet targets can be used for NFS, CIFS, Object storage, iSCSI and FCoE, so this doesn’t necessarily mean that FCoE is winning the game just yet.

WikiBon did a great post on FCoE market dynamics as well.

The advantage of FC, and iSCSI for that matter, is that every server, every OS, and just about every storage vendor in the world supports them. Also, there is a plethora of economical fabric switches available from multiple vendors that can support multi-port switching with high bandwidth. And there are many support matrices identifying server HBAs, O/S drivers for those HBAs, and compatible storage products to ensure compatibility. So there is no real problem (other than wading through the support matrices) in implementing either one of these storage protocols.

Enter Open-FCoE, the upstart

What’s missing from 10GbE FCoE is perhaps a really cheap solution, one that is universally available, uses commodity parts and can be had for next to nothing. The new Open-FCoE drivers together with Intel’s X520 10GbE NIC have the potential to answer that need.

But what is it? Essentially, Intel’s Open-FCoE is an O/S driver for Windows and Linux plus 10GbE NIC hardware from Intel. It’s unclear whether Intel’s Open-FCoE driver is a derivative of Open-FCoE.org’s Linux driver or not, but either driver performs some of the specialized FCoE functions in software rather than in hardware, as is done by CNA cards available from other vendors. Using server processing MIPS rather than ASIC processing capabilities should make FCoE adoption even cheaper in the long run.

What about performance?

The proof of this will be in benchmark results, but it’s quite possible that performance will be a non-issue, especially if there is not a lot of extra processing involved in an FCoE transaction. For example, if Open-FCoE only takes, let’s say, 2-5% of server MIPS and bandwidth to perform the added FCoE frame processing, then this might be in the noise for most standalone servers and would show up only minimally in storage benchmarks (which always use big, standalone servers).

Yes, but what about virtualization?

However, real-world virtualized servers are another matter. I believe that virtualized servers generally demand more intensive I/O activity anyway, and as one creates 5-10 VMs on an ESX server, it’s almost guaranteed to have 5-10X the I/O happening. If each standalone VM requires 2-5% of a standalone processor to perform Open-FCoE processing, then it could easily represent 5-7 X 2-5% on a 10 VM ESX server (this assumes some optimization for virtualization; if virtualization degrades driver processing, it could be much worse), which would represent a serious burden.

Now these numbers are just guesses on my part, but there is some price to pay for using host server MIPS for every FCoE frame and it does multiply for use with virtualized servers, that much I can guarantee you.
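As a back-of-the-envelope sketch of that guess, here it is in Python.  The 2-5% per-server overhead and the 5-7X multiplier for a 10 VM host are the guessed figures from above, not measurements; the function name `fcoe_cpu_overhead_pct` is hypothetical and exists only for this illustration.

```python
def fcoe_cpu_overhead_pct(per_server_pct, vm_multiplier):
    """Multiply a (low, high) per-server overhead range, in percent,
    by a (low, high) virtualization multiplier range."""
    low = per_server_pct[0] * vm_multiplier[0]
    high = per_server_pct[1] * vm_multiplier[1]
    return low, high


# Guessed inputs from the text: 2-5% per standalone server,
# 5-7X on a 10 VM ESX server (assumes some optimization).
low, high = fcoe_cpu_overhead_pct((2, 5), (5, 7))
print(f"Estimated host CPU spent on software FCoE: {low}-{high}%")
```

Even the low end of that range would be noticeable on a busy host, which is the point of the guess.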

But the (storage) world is better now

Nonetheless, I must applaud Intel’s Open-FCoE thrust, as it can open up a whole new potential market space that today’s CNAs maybe couldn’t touch. If it does that, it introduces low-end systems to the advantages of FCoE; then, as they grow, moving their environments to real CNAs should be a relatively painless transition. And this is where the real advantage lies: getting smaller data centers on the right path early in life will make any subsequent adoption of hardware-accelerated capabilities much easier.

But is it really open?

One problem I am having with the Intel announcement is the lack of other NIC vendors jumping in. In my mind, it can’t really be “open” until any 10GbE NIC can support it.

Which brings us back to Open-FCoE.org. I checked their website and could see no listing for a Windows driver and there was no NIC compatibility list. So, I am guessing their work has nothing to do with Intel’s driver, at least as presently defined. Too bad.

However, when Open-FCoE is really supported by any 10GbE NIC, then the economies of scale can take off and it could really represent a low-end cost point for storage infrastructure.

It’s unclear to me what Intel has that is special in their X520 NIC to support Open-FCoE (maybe some TOE H/W with other special sauce), but anything special needs to be defined and standardized to allow broader adoption by other vendors. Then and only then will Open-FCoE reach its full potential.

—-

So great for Intel, but it could be even better if a standardized definition of an “Open-FCoE NIC” were available, so other NIC manufacturers could readily adopt it.

Comments?