Is hardware innovation accelerating – hardware vs. software innovation (round 6)

There's something happening in the IT industry that hasn't happened in a couple of decades or so: hardware innovation is back. We've been covering bits and pieces of it in our hardware vs. software innovation series (see our Open source ASICs – HW vs. SW innovation [round 5] post).


Hardware innovation never really went away; Intel, AMD, Apple and others have always worked on new compute chips, and DRAM and NAND have taken giant leaps over the last two decades. But these were all major hardware suppliers. Special purpose chips, non-CPU compute engines, and hardware accelerators had been relegated to the dustbin of history as the CPU giants kept assimilating their functionality into the next round of CPU chips.

And then something happened. It made some sense for GPUs to be their own chips, as their SIMD architecture is intrinsically different from the SISD, standard von Neumann architecture of X86 and ARM CPUs.

But for some reason it didn't stop there. We first started seeing inklings of new hardware innovation in the AI space, with a number of special purpose DL NN accelerators coming online over the last 5 years or so (see Google TPU, SC20-Cerebras, GraphCore GC2 IPU chip, AI at the Edge Mythic and Syntiant's IPU chips, and neuromorphic chips from BrainChip, Intel, IBM, and others). Again, one could look at these as taking the SIMD model of GPUs in a slightly different direction. SIMD is probably one reason GPUs were so useful for AI-ML-DL, but further acceleration was now possible.

But it hasn't stopped there either. In the last year or so we have seen SPUs (Nebulon Storage), DPUs (Fungible, NVIDIA Networking, others), and computational storage (NGD Systems, ScaleFlux Storage, others) all come online and become available to the enterprise. And most of these are for more normal workload environments, i.e., not AI-ML-DL workloads.

I thought at first these were just FPGAs implementing different logic but now I understand that many of these include ASICs as well. Most of these incorporate a standard von Neumann CPU (mostly ARM) along with special purpose hardware to speed up certain types of processing (such as low latency data transfer, encryption, compression, etc.).

What happened?

It’s pretty easy to understand why non-von Neumann computing architectures should come about. Witness all those new AI-ML-DL chips that have become available. And why these would be implemented outside the normal X86-ARM CPU environment.

But SPUs, DPUs and computational storage all have typical von Neumann CPUs (mostly ARM) as well as other special purpose logic on them.

Why?

I believe there are a few reasons, but the main two are that Moore's law (which halves the size of transistors every ~2 years, effectively doubling transistor counts in the same area) is slowing down, and Dennard scaling (as you reduce the size of transistors, their power consumption goes down and speed goes up) has all but stopped. Both of these have caused major CPU chip manufacturers to focus on adding cores to boost performance rather than adding more transistors to the same core to increase functionality.

This hasn’t stopped adding instruction functionality to each CPU, but it has slowed considerably. And single (core) processor speeds (GHz) have reached a plateau.

But what it has stopped is the ready availability of real estate on a CPU chip to absorb lots of additional hardware functionality, which had been the case since the 1980s.

I was talking with a friend who used to work on math co-processors, like the 8087, 80287, & 80387, that performed floating point arithmetic. But after the 486, floating point logic was completely integrated into the CPU chip itself, killing off the co-processor business.

Hardware design is getting easier & chip fabrication is becoming a commodity

We wrote a post a couple of weeks back about an open foundry (see the HW vs. SW innovation round 5 post noted above) that would take a hardware design and manufacture the ASICs for you for free (or at little cost). This says that the tool chain for chip design is becoming more standardized and much less complex. Does this mean it takes less than 18 months to create an ASIC? I don't know, but it seems so.

But the really interesting aspect of this is that world class foundries are now available outside the major CPU developers. And these foundries, for a fair but high price, would be glad to fabricate a thousand or a million chips for you.

Yes, your basic state-of-the-art fab probably costs $12B plus these days. But all that has meant is that A) they will take any chip design and manufacture it, B) they need to keep factory volume up by manufacturing chips in order to amortize the fab's high price, and C) they have to keep their technology competitive or chip manufacturing will go elsewhere.

So chip fabrication is not quite a commodity. But there are enough state-of-the-art fabs in existence to make it seem so.

But it’s also physics

The extremely low latencies available with NVMe storage and higher speed networking (100GbE & above) demand a lot more processing power to keep up. And just the physics of how long it takes to transfer data across a distance (e.g., across racks) is starting to consume too much overhead and impact other work that could be done.

When we start measuring IO latencies at under 50 microseconds, there just aren't a lot of CPU instructions and task switches that can go on anymore. Yes, you could devote a whole core or two to this process and keep up with it. But wouldn't the data center be better served keeping that core busy with normal work and offloading that low-latency, realtime(-like) work to a hardware accelerator that could be executing on the network rather than behind a NIC?
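To put some rough numbers on that (a back-of-the-envelope sketch with an assumed clock speed and context switch cost, not measurements from any particular system):

```python
# Back-of-the-envelope IO latency budget; clock speed and switch costs are assumptions.
CPU_GHZ = 3.0          # assumed core clock
IO_LATENCY_US = 50     # NVMe-class IO latency budget from the text
CTX_SWITCH_US = 3      # assumed cost of one context switch (typically 1-5us)
INTERRUPT_US = 1       # assumed interrupt entry/exit overhead

cycles_per_io = IO_LATENCY_US * 1e-6 * CPU_GHZ * 1e9
overhead_us = 2 * CTX_SWITCH_US + INTERRUPT_US   # sleep + wake the issuing thread, plus IRQ
print(f"cycles available per IO: {cycles_per_io:,.0f}")              # ~150,000
print(f"switching overhead: {overhead_us}us, {overhead_us/IO_LATENCY_US:.0%} of the budget")
```

Burn ~15% of every IO's budget just getting on and off the core and it's easy to see why dedicated hardware starts to look attractive.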

So realtime processing has to get faster; or rather, the amount of time available to execute the CPU instructions that switch tasks and process data in realtime, in order to keep up with faster line speeds, is becoming shorter.

So that explains DPUs, smart NICs, & SPUs. What about the other hardware accelerator cards?

  • AI-ML-DL is becoming such an important and data AND compute intensive workload that just like GPUs before them, TPUs & IPUs are becoming a necessary evil if we want to service those workloads effectively and expeditiously.
  • Computational storage is becoming more widespread because, although data compression can easily be done at the CPU, it can be done faster (less data needs to be transferred back and forth) at the smart drive.

My guess is we haven't seen the end of this at all. When you open up the possibility of a long term business model focused on hardware accelerators, there would seem to be a lot of work that needs to be done and could be done faster and more effectively outside the core CPU.

There was a point over the last decade when software was destined to "eat the world". I get a lot of flak for saying that was BS and that hardware innovation is really eating the world. Now that hardware innovation's back, it seems to be a little of both.

Comments?


Can we back up a PB?

Tradition says no way. IT backup history says not on your life. Common sense would say never in a million years.

Most organizations with a PB of data or more depend on remote replication to protect against a data center outage or massive loss of data. This of course costs ~2X your original data center. And for some organizations one copy is not enough, so make that ~3X.

I don't know what PB scale data storage costs these days, but I can't believe it's under a couple million dollars (USD) in hw and sw costs, and probably at least another million or so in OpEx/year. Multiply that by 2 or 3X and you're now talking real money.

How could backup help?

Well for one you wouldn’t need replicas, so that would cut your hw & sw acquisition costs by a factor of 2 or 3. But backup storage is not free either. So you’d probably need to add back 30-50% of the original data center in hw & sw costs for backups.

You certainly wouldn’t need as many admins. And power for backup storage should also be substantially less. So maybe your OpEx would only be 1.5X in total for the original PB and its backups.
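Running the rough numbers above through a quick calculation shows the shape of the trade-off; the capex, opex and backup overhead figures are the guesses from this post, not quotes from any vendor:

```python
# Strictly illustrative, using the rough numbers guessed at in the text.
primary_capex = 2.0     # $M hw+sw for ~1PB of primary storage (assumed)
primary_opex  = 1.0     # $M/year OpEx for that primary storage (assumed)

replication_capex = 3 * primary_capex        # primary + 2 remote replicas
replication_opex  = 3 * primary_opex

backup_capex = primary_capex * 1.4           # primary + ~30-50% extra for backup storage
backup_opex  = primary_opex * 1.5            # fewer admins, less power for backup copies

print(f"replication: ${replication_capex:.1f}M capex, ${replication_opex:.1f}M/yr opex")
print(f"backup:      ${backup_capex:.1f}M capex, ${backup_opex:.1f}M/yr opex")
```

Even with generous assumptions for backup overhead, the backup approach comes in at roughly half the cost of keeping two replicas around.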

But what could possibly back up a PB of data?

We were talking with Igneous at Cloud Field Day 8 (CFD8, see their video here) a couple of weeks back and they said they can and do back up PBs of data for customers today. A while back, we also talked with them on a GreyBeards on Storage podcast.

The problems with backing up a PB seem insurmountable. First you have to be able to scan a PB of data. This means looking into multiple file systems on many different hardware platforms, across potentially multiple data centers, and that’s just to get a baseline of what all needs to be backed up.

Then at some point you actually have to store all that data on backup storage. So, to gain some cost advantage, you’d want to compress and deduplicate a PB of data, so that the first full backup wouldn’t take a full PB of backup storage.

Then of course you have to transfer a PB of data to your backup storage, in something that wouldn’t take months to perform. And that just gets you the first full backup.

Next comes the daily scan of what's changed. This has to re-scan your PB of data to find the 100TB or so that's changed over the last 24 hrs. Sometime after that scan completes, all that 100TB or so of changed data needs to be compressed, deduped and transferred again to backup storage.
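Some rough arithmetic (the line rates and a 4:1 data reduction ratio are my assumptions) shows why dedupe, compression and parallelism aren't optional at this scale:

```python
# Rough transfer-time arithmetic; line rates and 4:1 reduction are assumptions.
PB, TB = 1e15, 1e12

def transfer_days(data_bytes, gbps, reduction=1.0):
    return (data_bytes / reduction) * 8 / (gbps * 1e9) / 86400

for gbps in (10, 100):
    full  = transfer_days(1 * PB, gbps)
    fullr = transfer_days(1 * PB, gbps, reduction=4)
    delta = transfer_days(100 * TB, gbps, reduction=4) * 24
    print(f"{gbps:>3}GbE: 1PB full = {full:.1f} days raw, {fullr:.1f} days at 4:1; "
          f"100TB daily delta = {delta:.1f} hours at 4:1")
```

At 10GbE the daily 100TB delta alone eats ~5-6 hours of wall clock even with 4:1 reduction, which is why the scan and the data movement both have to run across many streams in parallel.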

And if that's not enough, you have to do it all over again, every day, from now on, almost forever. And data continues to grow. So 1PB today is likely to be 2PB or more in 12 months (it's great to be in the storage business).

So those are the challenges. How can it be done effectively, day in and day out, enough so that IT can depend on their data being backed up?

Igneous to the rescue…

First, Igneous came out of stealth a while back (listen to our podcast) with a couple of unique capabilities needed for massive data repository discovery and analysis. That is, they built a unique engine to scan and index PB scale data repositories. This was so they could provide administrators better visibility into their PB scale data repositories. But this isn't about that product, it's about backup.

But some of the capabilities they needed to support that product helped them perform backups as well. For instance, their scan needed to handle PBs of data. They came up with AdaptiveSCAN, which doesn't use the standard NFS and SMB data transfer protocols to gain access to file metadata. Opening a file over NFS or SMB takes quite a lot of NFS or SMB transactions. But to access metadata only, one doesn't have to use all those NFS and SMB capabilities; it can be done with much less overhead, even when using NFS or SMB.
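This isn't Igneous's AdaptiveSCAN, but a minimal sketch of the general idea, walking a mounted NFS/SMB tree and touching only directory entries and stat() metadata rather than opening and reading every file, shows why a metadata-only pass is so much cheaper:

```python
#!/usr/bin/env python3
# Minimal metadata-only scan sketch; not Igneous's AdaptiveSCAN, just the general idea.
import os, sys

def scan(root):
    """Yield (path, size, mtime) for every file under root without ever opening one."""
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path, follow_symlinks=False)   # metadata only, no open()/read()
            except OSError:
                continue
            yield path, st.st_size, st.st_mtime

if __name__ == "__main__":
    root = sys.argv[1]                                       # e.g. an NFS mount point
    last_scan = float(sys.argv[2]) if len(sys.argv) > 2 else 0.0
    # Anything modified since the last scan is an incremental-backup candidate.
    for path, size, mtime in scan(root):
        if mtime > last_scan:
            print(f"{size:>14} {path}")
```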

Of course, having a way to scan billions of files was a major accomplishment, but then where do you put all that metadata? And how can you access it effectively to support backing up a PB scale data repository? They needed some serious data indexing capabilities and so came up with InfiniteINDEX.

Now, a trillion item index seems a bit much, even for PB scale repositories. But my guess is they have eyes on taking their PB scale backups and going after even bigger fish. That is, offering backups for EB scale data repositories. And that might just take a trillion item index.

Next, moving PBs or even TBs of data quickly is no small trick. As the development team at Igneous mostly came from unstructured data providers, they also understood and have access to APIs for most storage vendors (NetApp, Dell-EMC Isilon, Pure FlashBlade, Qumulo, etc.). As such, where available, they utilized those native vendor storage API calls to help them move data rather than having to open an NFS or SMB file and read it.

Of course, even doing all that, moving 100TBs of data around or scanning PB sized data repositories is going to take a lot of processing and IO bandwidth to do in a reasonable period of time. 

So another capability they developed is massive parallelism. That is, being able to distribute scan, indexing or data movement work out to multiple systems, so that it can be accomplished in significantly less wall clock time.
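A toy version of that fan-out (my sketch, not their implementation) might partition the namespace by top-level directory and scan the partitions across a pool of workers:

```python
# Toy scan fan-out across processes; a sketch of the idea, not Igneous's implementation.
import os
from concurrent.futures import ProcessPoolExecutor

def scan_partition(root):
    """Tally file count and bytes for one partition of the namespace."""
    files = total_bytes = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            try:
                total_bytes += os.stat(os.path.join(dirpath, name)).st_size
                files += 1
            except OSError:
                pass
    return root, files, total_bytes

def parallel_scan(top, workers=8):
    partitions = [entry.path for entry in os.scandir(top) if entry.is_dir()]
    with ProcessPoolExecutor(max_workers=workers) as pool:
        for root, files, total_bytes in pool.map(scan_partition, partitions):
            print(f"{root}: {files} files, {total_bytes/1e12:.2f} TB")

if __name__ == "__main__":
    parallel_scan("/mnt/filer")      # hypothetical mount point
```

The real trick, of course, is doing this across multiple appliances while keeping the index and data pipeline fed, but the wall clock win comes from the same divide-and-conquer idea.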

Well, with all that, they pretty much had the guts of a backup application for PB scale data repositories, but they still didn't have the glue to put it all together. But recently they announced just that: Igneous DataProtect, a full scale backup application for PBs of data.

I suppose I haven’t done justice to all of what they have developed or talked about at their session, so I would suggest viewing their talk at CFD8 and listening to our GBoS podcast to learn more. They did demo their product at CFD8 but I believe it was a canned demo.

I didn’t think I’d see the day when some vendor would offer backup services for PBs of data let alone be shooting for more, but there you have it. Igneous means to take your PB scale data repositories and make them as easy to operate as TB scale data repositories. They call that democratizing data.

Comments?

See these other CFD8 bloggers write ups on Igneous.

CFD8  – Igneous Follow Up  by Nate Avery (@Nathaniel_Avery)

Picture credit(s): All from screen saves during Igneous’s session at CFD8

Breaking optical data transmission speed records

Read an article this week about records being set in optical transmission speeds (see IEEE Spectrum, Optical labs set terabit records). Although these are all lab based records, the (data center) single mode optical transmission speed record discussed below is not that far ahead of the single mode fibre speeds commercially available today. But the multi-mode long haul (undersea transmission) speed record below will probably take a while longer until it's ready for prime time.

First up, data center optical transmission speeds

Not sure what your data center transmission rates are, but it seems pretty typical to see 100Gbps these days, and inter-switch links at 200Gbps are commercially available. Last year at the industry's annual Optical Fiber Communications (OFC) conference, vendors were announcing commercial availability of 400Gbps and pushing to achieve 800Gbps soon.

Since then, researchers at Nokia Bell Labs have been able to transmit 1.52Tbps through a single mode fiber over an 80 km distance. (Unclear why a data center needs an 80km single mode fibre link, but maybe this is more for a metro area than just a data center.)

Diagram of a single mode (SM) optical fiber: 1.- Core 8-10 µm; 2.- Cladding 125 µm; 3.- Buffer 250 µm; & 4.- Jacket 400 µm

The key to transmitting data faster across single mode fibre is how quickly one can encode/decode data (symbols), both on the digital to analog encoding (transmitting) end and the analog to digital decoding (receiving) end.

The team at Nokia used a new generation silicon-germanium chip (55nm CMOS process) able to generate 128 gigabaud symbol transmission (encoding/decoding) with 6.2 bits per symbol across single mode fiber.

Using optical erbium amplifiers, the team at Nokia was able to achieve 1.4Tbps over 240km of single mode fibre.
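The arithmetic behind the 1.52Tbps headline is simple enough, if you assume (my assumption, not something spelled out in the article) that the figure covers two polarizations and that the gap from the raw rate is FEC/framing overhead:

```python
# Rough line-rate arithmetic; dual polarization and FEC overhead are my assumptions.
symbol_rate_gbaud = 128     # from the Nokia Bell Labs result
bits_per_symbol   = 6.2
polarizations     = 2       # assumed

raw_tbps = symbol_rate_gbaud * bits_per_symbol * polarizations / 1000
print(f"raw rate: {raw_tbps:.2f} Tbps")                       # ~1.59 Tbps
print(f"1.52 Tbps net implies ~{1 - 1.52/raw_tbps:.0%} overhead")
```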

A wall-mount cabinet containing optical fiber interconnects. The yellow cables are single mode fibers; the orange and aqua cables are multi-mode fibers: 50/125 µm OM2 and 50/125 µm OM3 fibers respectively.

Used to be that transmitting data across single mode fibre was all about how quickly one could turn laser/light on and off. These days, with coherent transmission, data is being encoded/decoded in amplitude modulation, phase modulation and polarization (see Coherent data transmission defined article).

Nokia Bell Labs is attempting to double the current 800Gbps data transmission speed and reach 1.6Tbps. At 1.52Tbps, they're not far off that mark.

It’s somewhat surprising that optical single mode fibre technology is advancing so rapidly and yet, at the same time, commercially available technology is not that far behind.

Long haul optical transmission speed

Undersea or long haul optical transmission uses multi-core/multi-mode fibre to transmit data across continents or an ocean. With multi-core/multi-mode fibre, researchers at the Japan National Institute for Communications Technology (NICT) have demonstrated a 3 core, 125 micrometer wide, long haul optical fibre transmission system that is able to transmit 172Tbps.

The new technology utilizes close-coupled multi-core fibre where signals in each individual core end up intentionally coupled with one another creating a sort of optical MIMO (Multi-input/Multi-output) transmission mechanism which can be disentangled with less complex electronics.

Although the technology is not ready for prime time, the closest competing technology is a 6-core fiber transmission cable which can transmit 144Tbps. Deployments of that cable are said to be starting soon.

Shouldn't there be a Moore's law for optical transmission speeds?

Ran across this chart in a LightTalk Blog discussing how Moore’s law and optical transmission speeds are tracking one another. It seems to me that there’s a need for a Moore’s law for optical cable bandwidth. The blog post suggests that there’s a high correlation between Moore’s law and optical fiber bandwidth.

Indeed, any digital to analog optical encoding/decoding involves transistor logic by definition, so there's at least a high correlation between the speed of electronic switching/processing and optical bandwidth. A correlation between transistor counts (which is what the chart actually shows) and optical bandwidth makes less intuitive sense, except that processing speed is itself highly correlated with transistor counts these days.

In any case, the chart shows optical bandwidth and transistor counts tracking each other very closely.

~~~~

So, we all thought 100Gbps was great, 200Gbps was extraordinary and anything over that was wishful thinking. With 400Gbps, 800Gbps and 1.6Tbps all rolling out soon, data center transmission bottlenecks will become a thing of the past.


Where should IoT data be processed – part 1

I was at FlashMemorySummit 2019 (FMS2019) this week and there was a lot of talk about computational storage (see our GBoS podcast with Scott Shadley, NGD Systems). There was also a lot of discussion about IoT and the need for data processing done at the edge (or in near-edge computing centers/edge clouds).

At the show, I was talking with Tom Leyden of Excelero and he mentioned there was a real need for some insight on how to determine where IoT data should be processed.

For our discussion let’s assume a multi-layered IoT architecture, with 1000s of sensors at the edge, 100s of near-edge processing/multiplexing stations, and 1 to 3 core data center or cloud regions. Data comes in from the sensors, is sent to near-edge processing/multiplexing and then to the core data center/cloud.

Data size

Dans la nuit des images (Grand Palais) by dalbera (cc) (from flickr)

When deciding where to process data, one key aspect is the size of the data. This is typically in GB or TB but, in today's world, can be PB as well. This lone parameter has multiple impacts and affects many other considerations, such as the cost and time to transfer the data, the cost of data storage, the amount of time to process the data, etc. All of these sub-factors depend on the size of the data to be processed.

Data size can be the largest single determinant of where to process the data. If we are talking about GB of data, it could probably be processed anywhere from the sensor edge, to near-edge station, to core. But if we are talking about TB the processing requirements and time go up substantially and are unlikely to be available at the sensor edge, and may not be available at the near-edge station. And PB take this up to a whole other level and may require processing only at the core due to the infrastructure requirements.

Processing criticality

Human or machine safety may depend on quick processing of sensor data, e.g., in a self-driving car, on a factory floor, for flood gauges, etc. In these cases, some amount of data processing (sufficient to ensure human/machine safety) needs to be done at the lowest point in the hierarchy that has the processing power to perform this activity.

This could be in the self-driving car or the factory automation that controls a mechanism. Similar situations would probably apply for any robots and auto pilots. Anywhere an IoT sensor array is used to control an entity that could jeopardize human life or the safety of machines, safety level processing needs to be done at the lowest level in the hierarchy.

If processing doesn't involve safety, then it could potentially be done at the near-edge stations or at the core.

Processing time and infrastructure requirements

Although we talked about this in data size above, infrastructure requirements must also play a part in where data is processed. Yes sensors are getting more intelligent and the same goes for near-edge stations. But if you’re processing the data multiple times, say for deep learning, it’s probably better to do this where there’s a bunch of GPUs and some way of keeping the data pipeline running efficiently. The same applies to any data analytics that distributes workloads and data across a gaggle of CPU cores, storage devices, network nodes, etc.

There’s also an efficiency component to this. Computational storage is all about how some workloads can better be accomplished at the storage layer. But the concept applies throughout the hierarchy. Given the infrastructure requirements to process the data, there’s probably one place where it makes the most sense to do this. If it takes a 100 CPU cores to process the data in a timely fashion, it’s probably not going to be done at the sensor level.

Data information funnel

We make the assumption that raw data comes in through sensors, and more processed data is sent to higher layers. This would mean at a minimum, some sort of data compression/compaction would need to be done at each layer below the core.

We were at a conference a while back where they talked about updating deep learning neural networks. It's possible that each near-edge station could perform a mini deep learning training cycle and share its learning with the core periodically, which could then send this information back down to the lowest level to be used (see our Swarm Intelligence @ #HPEDiscover post).

All this means that there’s a minimal level of processing of the data that needs to go on throughout the hierarchy between access point connections.

Pipe availability

binary data flow

The availability of a networking access point may also have some bearing on where data is processed. For example, a self driving car could generate TB of data a day, but access to a high speed, inexpensive data pipe to send that data may be limited to a service bay and/or a garage connection.

So some processing may need to be done between access point connections, and this will need to take place at the lower levels. That way, there's no need to send the data while the car is out on the road; rather, it could be sent whenever the car is attached to an access point.
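A quick calculation (the per-day data volume, link speeds and reduction ratio are all assumed) shows why the access point matters so much:

```python
# Assumed numbers: raw capture per day, garage-class link speeds, on-board reduction.
TB = 1e12
raw_per_day = 4 * TB                 # assumed raw sensor capture for one car per day
reduced     = raw_per_day / 100      # assume ~100:1 reduction from on-board processing

for gbps in (1, 10):
    raw_hours   = raw_per_day * 8 / (gbps * 1e9) / 3600
    reduced_min = reduced     * 8 / (gbps * 1e9) / 60
    print(f"{gbps:>2}Gbps link: raw = {raw_hours:.1f} hours/day, reduced = {reduced_min:.1f} minutes/day")
```

Shipping raw data needs the car parked on a fast pipe for hours; shipping processed results fits into any overnight garage connection.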

Compliance/archive requirements

Any sensor data probably needs to be stored for a long time and as such will need access to a long term archive. Depending on the extent of this data, it may help dictate where processing is done. That is, if all the raw data needs to be held, then maybe the processing of that data can be deferred until it's already at the core and on its way to archive.

However, any safety oriented data processing needs to be done at the lowest level and may need to be reprocessed higher up in the hierarchy. This would be done to ensure proper safety decisions were made. And needless to say, all this data would need to be held.

~~~~

I started this post with 40 or more factors but that was overkill. In the above, I tried to summarize the 6 critical factors which I would use to determine where IoT data should be processed.
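Just to make those factors concrete, here's a toy decision function (my sketch with made-up thresholds, not a recommendation engine) that weighs a few of them to pick a processing tier:

```python
# Toy tier-selection sketch combining a few of the factors above.
# The thresholds are illustrative assumptions, not recommendations.
def where_to_process(data_gb, safety_critical, needs_big_infrastructure,
                     pipe_available_now, must_archive_raw):
    if safety_critical:
        return "sensor edge"                 # safety processing stays at the lowest level
    if needs_big_infrastructure or data_gb > 100_000:   # ~100TB+, GPU farms, etc.
        return "core/cloud"
    if not pipe_available_now:
        return "near-edge"                   # reduce/hold the data until a pipe is available
    if must_archive_raw:
        return "core/cloud"                  # defer processing until the data is near the archive
    return "near-edge" if data_gb > 1_000 else "sensor edge"

print(where_to_process(50, True, False, False, False))        # -> sensor edge
print(where_to_process(250_000, False, True, True, True))     # -> core/cloud
```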

My intent is to work through some examples in a part 2 to this post. If there's any one example that you feel may be instructive, please let me know.

Also, if there are other factors that you would use to determine where to process IoT data, let me know.

A steampunk Venusian rover

Read an article last week in The Engineer on "Designing a mechanical rover to explore … Venus", about a group at JPL, led by Jonathon Sauder, that is working on a mechanical rover to study Venus.

Venus has a surface temperature of ~470°C, hot enough to melt lead, which will fry most electronics in seconds. Moreover, the Venusian surface is under a lot of pressure, roughly equivalent to a mile under water or ~160X the air pressure at Earth's surface (from NASA Venus in depth). Extreme conditions for any rover.

Going mobile

Sauder and his team were brainstorming mechanical rovers that operate similarly to Theo Jansen's StrandBeest, which walks using wind energy alone. (Check out the video of the BEEST walking.)

Jansen had told Sauder’s team that his devices work much better on smooth surfaces and that uneven, beach like surfaces presented problems.

So, Sauder's team started looking at using something with tracks instead of legs/feet, sort of like a World War 1 tank, that could operate upside down as well as right side up.

Rather than sails (as on the StrandBeest), they plan to use multiple vertical axis wind turbines, called Savonius rotors, located inside the tank to create energy and store that energy in springs for future use.

Getting data

They're not planning to ditch electronics altogether but need to minimize the rover's reliance on electronics.

There are some electronics, based on silicon carbide and gallium nitride, that can operate at 450°C, although they have a very low level of integration at this time, just ~100 transistors per chip. The team could use these to add electronic processing and control to their mechanical rover.

Solar panels can supply electricity to the high temperature electronics and can also operate at 450°C.

But to get information off the rover and back to Earth, they plan to use a highly radio reflective spot on the rover and a mechanical shutter mechanism. By opening and closing the shutter while an orbiting satellite generates radio pulses and records whether or not the rover reflects them, the rover can send Morse code to the satellite. The orbiting satellite could record this information and then transmit it to Earth.
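As a thought experiment (purely my sketch of the idea, not JPL's actual design or timing), the shutter scheme amounts to on-off keying: each reading becomes a schedule of shutter open/closed intervals that the satellite samples as it pings the rover overhead:

```python
# Illustrative sketch of the shutter-as-Morse idea; units and encoding are my assumptions.
MORSE_DIGITS = {'0': '-----', '1': '.----', '2': '..---', '3': '...--', '4': '....-',
                '5': '.....', '6': '-....', '7': '--...', '8': '---..', '9': '----.'}

def reading_to_shutter_schedule(value, unit_s=1.0):
    """Return a list of (shutter_open, seconds) intervals encoding a numeric reading."""
    schedule = []
    for digit in str(value):
        for mark in MORSE_DIGITS[digit]:
            schedule.append((True, unit_s if mark == '.' else 3 * unit_s))   # dot or dash
            schedule.append((False, unit_s))                                 # intra-digit gap
        schedule.append((False, 2 * unit_s))                                 # gap between digits
    return schedule

# e.g., a wind speed reading of 42 becomes a sequence of open/closed intervals
for is_open, seconds in reading_to_shutter_schedule(42):
    print("OPEN  " if is_open else "closed", f"{seconds:.0f}s")
```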

The rover will make use of simple chemical reactions to measure soil, rock and atmospheric chemistry. Soil and rocks suitable for analysis can be scooped up, drilled out and moved to the analysis chamber(s) via mechanical devices. Wind speed and direction can be sensed with simple mechanical devices.

In order to avoid obstacles while roving around the planet, they plan to use a mechanical probe out the front (and back?) of the rover, with control systems attached to it. This way the rover can move around more of the planet's surface.

Such a mechanical rover with high temperature electronics might also be suitable for other worlds in the solar system: Mercury for sure, but moons of the Jovian planets also have extreme pressure environments.

And such an electro-mechanical rover might also work great for probing volcanoes on Earth, although the temperatures there are 700 to 1200°C, ~2 to 3X that of Venus. Maybe such a rover could be used in highly radioactive environments to record information and send it back to personnel outside the environment, or even effect some preprogrammed repairs. Ocean vents could be another place where such a rover might work well.

Possible improvements

Mechanical probes would need to move vertically and swing horizontally to be effective, and would necessarily have to poke outside the tank's envelope to read obstacles ahead.

Sonar could work better. Sounds or clicks could be produced mechanically and their reflections could be also received mechanically (a mic is just a mechanical transducer). At the pressures on Venus, sound should travel far.

Morse code was designed to efficiently send alpha-numerics and not much else. It would seem that another codec could be designed to send scientific information faster. And if one mechanical spot is good, multiple spots would be better assuming the satellite could detect multiple radio reflective spots located in close proximity to one another on the rover.

Radio works, but why not use infrared? If there were some way to read an infrared signal from the probe, it could present more information per pass.

For instance, an infrared photo of the rover's bottom or top, with a flat surface, could encode information in cold and hot spots located across that surface.

This could work at whatever infrared resolution available from the satellite orbiting overhead and would send much more information per orbital pass.

In fact, such an infrared surface readout might allow the rover to send B&W pictures up to the satellite. Sonar could provide a mechanism to record a (sound) picture of the environment being scanned. The infrared information could be encoded across the surface via pipes of cool and hot liquids, sort of like core memory of old.

What about steam power? At 450°C there ought to be more than enough heat to boil some liquid and have it cool via expansion. The cooled liquid could be used to cool electronics, chemical and solar devices. And as the high temperatures on Venus seem constant, steam power and liquid cooling would be available all the time, eliminating any need for springs to store energy.

And the cooling liquid from steam engines could be used to support an infrared signaling mechanism.

Still not sure why we need any electronics. A suitably configured, shrunken, analytical engine could provide the rudimentary information processing necessary to work the shutter or other transmitter mechanisms, initiate, readout and store mechanical/chemical/sonar sensors and control the other items on the rover.

And with a suitably complex analytical engine there might be some way to mechanically program it with various modes using something like punched tape or cards. Such a device could be used to hold and load information for separate programs in minimal space and could also be used to store information for later transmission, supplying a 100% mechanical storage device.

Going 100% mechanical could also lead to a potentially longer lived rover than one using some electronics and mostly mechanical devices on a planet like Venus. Mechanical devices can fail, but their failure modes are normally less catastrophic and well understood. Perhaps with sufficient mechanical redundancy and attention to tribology, such a 100% mechanical rover could last an awfully long time without any maintenance, like a Swiss watch.

Comments?

Photo Credit(s): World War One tank – mark 1 by Photos of the Past

Vintage Philmor morse code practice … by Joe Haupt

Accompanied by an instructor… by vy pham;

Core memory more detail by Kenneth Moore;

Model of the Analytical Engine By Bruno Barral (ByB), CC BY-SA 2.5;

Punched tape by Rositslav Lisovy

Steam locomotives by Jim Phillips

Data hypervisor

(c) 2012 Silverton Consulting, Inc. All rights reserved

With all this talk of software defined networking and server virtualization, where does storage virtualization stand? I blogged about some problems with storage virtualization a week or so ago in my post Storage Utilization is broke, and this post takes it to the next level. Also, I was at a financial analyst conference this week in Vail where I heard Mark Lewis of Tekrocket (formerly of EMC) discuss the need for a data hypervisor to provide software defined storage.

I now believe what we really need for true storage virtualization is a renewed focus on data hypervisor functionality.  The data hypervisor would need both a control plane and a data plane in order to function properly.   Ideally the control plane would set up the interface and routing for the data plane hardware and the server and/or backend storage would be none the wiser.

DMs everywhere

I envision a scenario where a customer's application data is packaged with a data hypervisor which runs on commodity data switch hardware with data plane and control plane software running on it, sort of creating (virtual) data machines, or DMs.

All enterprise and nowadays most midrange storage provide most of the functionality of a storage control plane such as defining units of storage, setting up physical to logical storage mapping, incorporating monitoring, and management of the physical storage layer, etc.  So control planes are pervasive in today’s storage but proprietary.

In addition most storage systems have data plane functionality which operates to connect a host IO request to the actual data which resides in backend storage or internal cache.  But again although data planes are everywhere in storage today they are all proprietary to a specific vendor’s storage system.

Data switch needed

But in order to utilize a data hypervisor and create a more general purpose control plane layer, we need a more generic data plane layer that operates on commodity hardware. This is different from today's SAN storage switches or DCB switches, but similar in some ways.

The functions of the data switch/data plane layer would be to take routing instructions from the control plane layer and direct the server IO request to the proper storage unit using the data plane layer.  Somewhere in this world view, probably at the data plane level it would introduce data protection services like RAID or other erasure coding schemes, point in time copy/clone services and replication services and other advanced storage features needed by enterprise storage today.

Also it would need to provide some automated storage movement across and within tiers of physical storage and it would connect server storage interfaces at the front end to storage interfaces at the backend.  Not unlike SAN or DCB switches but with much more advanced functionality.

Ideally, data switch storage interfaces could attach to dedicated JBODs and flash arrays as well as systems using DAS storage. In addition, it would be nice if the data switch could talk to real storage arrays on SAN, IP/SAN or NFS & CIFS/SMB storage systems.

The other thing one would like out of a data switch is support for a universal translator that would map one protocol to another, such as iSCSI to SAS, NFS to FC, or FC to NFS and any other combination, depending on the needs of the server and the storage in the configuration.

Now if the data switch were built on top of commodity x86 hardware and software, with the data switch as just a specialized application, that would create the underpinnings for a true data hypervisor with a control plane and data plane that could be independent and use anybody's storage.
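To make that control plane/data plane split concrete, here's a toy sketch (mine, not any vendor's API) in which the control plane holds the mapping from virtual volumes to backend targets and protocols, and the data plane simply consults that mapping to route and translate each IO:

```python
# Toy control plane / data plane split for a "data hypervisor"; my sketch, not a product API.
from dataclasses import dataclass

@dataclass
class Backend:
    name: str         # e.g. "jbod-rack3" or "netapp-a" (hypothetical targets)
    protocol: str     # e.g. "SAS", "iSCSI", "NFS"
    address: str

class ControlPlane:
    """Defines virtual volumes: their backend placement and front-end protocol."""
    def __init__(self):
        self.routes = {}
    def provision(self, vvol, backend, frontend_protocol):
        self.routes[vvol] = (backend, frontend_protocol)

class DataPlane:
    """Routes each IO using the control plane's table; a real one would also
    apply RAID/erasure coding, snapshots, replication and tiering here."""
    def __init__(self, control_plane):
        self.control_plane = control_plane
    def io(self, vvol, op, offset, length):
        backend, fe_proto = self.control_plane.routes[vvol]
        # universal translation, e.g. an NFS front end mapped to a SAS back end
        return (f"{op} {length}B @ {offset} on {vvol}: "
                f"{fe_proto} -> {backend.protocol}://{backend.address}/{backend.name}")

cp = ControlPlane()
cp.provision("vvol-app1", Backend("jbod-rack3", "SAS", "10.0.0.7"), "NFS")
dp = DataPlane(cp)
print(dp.io("vvol-app1", "READ", offset=4096, length=8192))
```

Repurposing storage then becomes a control plane operation (change the route) while the data plane keeps moving IOs, which is exactly the property that makes server hypervisors so flexible.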

Data hypervisor

Assuming all this were available then we would have true storage virtualization.  With these capabilities, storage could be repurposed on the fly, added to, subtracted from, and in general be a fungible commodity not unlike server processing MIPs under VMware or Hyper-V.

Application data would then need to be packaged into a data machine, which would offer all the host services required to support host data access. The data hypervisor would handle the linkages required to interface with the control and data layers.

Applications could be configured to utilize available storage with ease, and storage could grow, shrink or move to accommodate the required workload just as easily as VMs can be deployed today.

How we get there

Aside from the VMware, Citrix and Microsoft thrusts towards virtual storage, there are plenty of storage virtualization solutions that can control most backend enterprise SAN storage. However, the problem with these solutions is that in general they execute only on a specific vendor's hardware and don't necessarily talk to DAS or JBOD storage.

In addition, not all of the current generation storage virtualization solutions are unified. That is most of these today only talk FC, FCoE or iSCSI and don’t support NFS or CIFS/SMB.

These don’t appear to be insurmountable obstacles and with proper allocation of R&D funding, could all be solved.

However, the more problematic issue is that none of these solutions operates on commodity hardware or commodity software.

The hardware is probably the easiest to deal with. Today many enterprise storage systems are built on top of x86 processor storage controllers, albeit sometimes incorporating specialized packaging for redundancy and high availability.

The harder problem may be commodity software. Although the genesis of a few storage virtualization systems might have been BSD or other "commodity" operating systems, they have been modified over the years to the point where they no longer resemble anything that can run on standard off-the-shelf operating systems.

Then again some storage virtualization systems started out with special home grown hardware and software. As such, converting these over to something more commodity oriented would be a major transition.

But the challenge is how to get there from here, and whether anyone would want to take this on. The other problem is that the value add that storage vendors currently supply would be somewhat eroded, not unlike what happened to proprietary Unix systems with the advent of VMware.

But this will not take place overnight and the company that takes this on and makes a go at it can have a significant software monopoly that would be hard to crack.

Perhaps it will take a startup to do this but I believe the main enterprise storage vendors are best positioned to take this on.

Comments?

No-power sensors surface due to computational energy efficiency trends

Koomeys_law_graph,_made_by_Koomey (cc) (from wikipedia.org)

Read an article, The computing trend that will change everything, in MIT's TechReview today about the trend in energy consumption per unit of computation.

Along with Moore's law, dictating that transistor density doubles every 18 to 24 months, there is Koomey's law, which states that computational energy efficiency, or computations per unit of energy, doubles every ~1.57 years.

Koomey's law has made today's smart phones and tablets possible. If your current laptop were computing at the power efficiency of 1991 computers, its battery would last ~2.5 seconds.
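The arithmetic behind that claim works out if you take the article's 2012 vantage point and assume a ~7-8 hour battery life (both assumptions mine):

```python
# Koomey's law arithmetic; the 2012 vantage point and 7.5h battery life are assumptions.
DOUBLING_YEARS = 1.57
years_since_1991 = 2012 - 1991
efficiency_gain = 2 ** (years_since_1991 / DOUBLING_YEARS)    # ~10,000x

battery_hours_today = 7.5
battery_at_1991_efficiency_s = battery_hours_today * 3600 / efficiency_gain
print(f"efficiency gain since 1991: ~{efficiency_gain:,.0f}x")
print(f"battery life at 1991 efficiency: ~{battery_at_1991_efficiency_s:.1f} seconds")
```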

No-power sensors?!

But this computing efficiency trend is giving rise to no-power sensors/devices, or computational sensors without batteries.  These new sensors gather electrical energy from “ambient radio waves” in the air, and by doing so harvest enough electricity to power computations and as such, don’t need batteries.

Such devices can gather ~50μwatts of power from a TV transmitter just 2.5 miles away. Most calculators only use ~5μwatts and digital thermometers around 1μwatt, so 50μwatts is enough to do a reasonable amount of sensing work.

But the exciting part is that as Koomey's law continues, the amount of work that 50μwatts or even 5μwatts supports doubles again every ~1.6 years. For example, the computational power of today's laptops will consume only infinitesimal amounts of power in ~two decades' time. Thus, the no-power sensors of 2034 will be very smart indeed.

“Any sufficiently advanced technology is indistinguishable from magic”, Arthur C. Clarke

Data transmission efficiency not keeping up

Nonetheless, the fact that computational efficiency is doubling every 1.6 years doesn’t mean the data transmission efficiency is doing the same.  Which means that for the foreseeable future, data transmission may remain a crucial bottleneck for no-power sensors.

However, computational increases can somewhat compensate for data transmission limitations by more efficient encoding, compression, etc. But there are limits as to what can be accomplished within any data transmission technology.

Nanodata

Thus, for the foreseeable future, although sensors will be able to do lots more computation, what they transmit to the outside world may remain limited, giving rise to smart, no-power sensors providing very minuscule data packages.

One term coined to describe such limited external data transmission from no-power, computationally intense sensors is nanodata. Because of their ability to exist outside the power grid, it is very likely that the future sensor cloud or internet-of-things will be primarily composed of such nanodata devices.

~~~~
I was at SNW last week and there was some discussion of “little data” or data in corporate databases, in contrast with big data.  But nanodata is something I had never heard of before today.

So now we have big data, little data, and nanodata. Seems like we're missing a few steps here…