Commodity hardware loses again …

It seemed only a couple of years back that everyone was touting how hardware engineering no longer mattered.

What with Intel and others playing out Moore’s law, why invest in hardware engineering when the real smarts were all in software?

I said then that hardware-engineered solutions still had a significant role to play, but few believed me (see my posts: Commodity hardware always loses and Commodity hardware debate heats up).

Well, hardware’s back …

A few examples:

  1. EMC DSSD – at EMCWorld2015 a couple of weeks back, EMC demoed a new rack-scale flash storage system targeted at extremely high IOPS and very low latency. DSSD is a classic study in how proprietary hardware can enable new levels of performance. The solution connected to servers over a PCIe switched network, which didn’t really exist before, and used hardware-engineered flash modules that were extremely dense, extremely fast and extremely reliable. (See my EMCWorld post on DSSD and our Greybeards on Storage (GBoS) podcast with Chad Sakac for more info on DSSD.)
  2. Diablo Memory Channel Storage (MCS)/SanDisk UltraDIMMs – Diablo’s MCS is coming out in SanDisk’s UltraDIMM NAND storage, which plugs into DRAM slots and provides memory-paged access to NAND storage. The key is that the hardware logic keeps NAND access overhead to ~50 μsec. (We’ve written about MCS and UltraDIMMs here.)
  3. Hitachi VSP G1000 storage and its Hitachi Accelerated Flash (HAF) – recent SPC-1 results showed that a G1000 outfitted with HAF modules could generate over 2M IOPS at very low latency (~220 μsec); there’s a quick back-of-envelope on these numbers just after this list. (See our announcement summary on the Hitachi G1000 here.)
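
To put the latency and IOPS figures above in rough perspective, here’s a back-of-envelope sketch in Python. The ~50 μsec overhead and the 2M IOPS at ~220 μsec come straight from the items above; the 5 msec disk latency and the Little’s Law concurrency estimate are my own assumptions and arithmetic, not vendor figures.

    # Rough comparison of the figures quoted above (Python 3).
    # Only the first three numbers come from the post; the disk latency is an
    # assumed typical value and the concurrency estimate is my own arithmetic.

    ULTRADIMM_OVERHEAD_S = 50e-6      # ~50 μsec MCS/UltraDIMM access overhead
    G1000_LATENCY_S      = 220e-6     # ~220 μsec latency in the G1000 SPC-1 run
    G1000_IOPS           = 2_000_000  # ~2M IOPS in the same SPC-1 run
    DISK_LATENCY_S       = 5e-3       # assumed typical spinning-disk latency

    print(f"MCS/UltraDIMM vs disk: ~{DISK_LATENCY_S / ULTRADIMM_OVERHEAD_S:.0f}x lower latency")
    print(f"G1000+HAF vs disk:     ~{DISK_LATENCY_S / G1000_LATENCY_S:.0f}x lower latency")

    # Little's Law: IOs in flight = arrival rate x time in system.
    outstanding_ios = G1000_IOPS * G1000_LATENCY_S
    print(f"IOs the G1000 must keep in flight to sustain 2M IOPS: ~{outstanding_ios:.0f}")

Roughly 440 IOs in flight at all times – the kind of parallelism that’s hard to reach without purpose-built hardware.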

Diablo ran into some legal problems, but that’s all behind them now, so the way forward is clear of any extraneous hurdles.

There are other examples of proprietary hardware engineering from IBM FlashSystems, networking companies, PCIe flash vendors and others, but these will suffice to make my point.

My point is that if you want to gain orders of magnitude better performance, you need to seriously consider engaging in some proprietary hardware engineering. Proprietary hardware may take longer to develop than software-only solutions (although that’s somewhat a function of the resources you throw at it), but the performance gains are sometimes unobtainable any other way.

~~~~

Chad made an interesting point on our GBoS podcast: hardware innovation is somewhat cyclical. For a period of time, commodity hardware is much better than any storage solution really needs, so innovation swings to the software arena. But over time, software functionality catches up and maxes out the hardware that’s available, and then you need more hardware innovation to take performance to the next level. Then the cycle swings back to hardware engineering. And the cycle will swing back and forth many more times before storage is ever through as an IT technology.

Today, when it seems there’s a new software-defined storage solution coming out every month, we look very close to peak software innovation, with little left in the way of performance gains. But there’s still plenty left if we open our eyes to proprietary hardware.

Welcome to the start of the next hardware innovation cycle – take that, commodity hardware.

Comments?

EMCWorld2015 Day 2&3 news

Some additional news from EMCWorld2015 this week:

EMC announced directed availability for DSSD, their rack-scale shared flash storage solution using a PCIe3 (switched) fabric with 36 dual-ported flash modules, which hold 512 NAND chips for 144TB of NAND flash storage. On the stage floor they had a demonstration pitting a 40-node Hadoop cluster with DAS against a 15-node Hadoop cluster using the DSSD, both running Hive and working on the same query. By the time the 40-node/DAS solution got to about 2% of query completion, the 15-node/DSSD cluster had finished the query without breaking a sweat. They then ran an even more complex query and it, too, took almost no time at all.

They also simulated a copy of a 4TB file (~32K-128K IOs) from memory to memory, which took literally seconds; then they copied it to SSD, which took considerably longer (I didn’t catch how long, but much longer than memory); and then they showed the same file copy to DSSD, which again took only seconds – just a smidgen slower than the memory-to-memory copy.

They said the PCIe fabric (no indication what the driver was) provided so much more parallelism to the dual-ported flash storage that the system was almost able to complete the 4TB copy at memory-to-memory speeds. It was all pretty impressive, albeit a simulation of the real thing.
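
I didn’t catch the elapsed time of the DSSD copy, but a quick sketch shows why a multi-second 4TB copy is impressive. The elapsed times below are assumptions for illustration only; EMC didn’t publish the actual numbers.

    # Implied sustained throughput if a 4TB copy finishes in a handful of seconds.
    # The elapsed times are assumed for illustration, not measured demo results.
    FILE_SIZE_TB = 4
    for seconds in (2, 5, 10):
        gb_per_sec = FILE_SIZE_TB * 1000 / seconds
        print(f"4TB in {seconds:>2}s  ->  ~{gb_per_sec:,.0f} GB/s sustained")

Even the slowest of those assumed times works out to hundreds of GB/s of sustained bandwidth, far beyond what you’d expect from a conventional shared array today.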

EMC indicated that they designed the flash modules themselves and expect to double the capacity of the DSSD to 288TB shortly. They showed the controller board, which had a mezzanine board over part of it; together the two boards carried 12 major chips, which I assume have something to do with the PCIe fabric. They said there are two controllers in the system for high availability, and the 144TB DSSD is deployed in 5U of space.
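
Doing the arithmetic on those module counts (my division, not an EMC spec):

    # Per-module capacity implied by the figures above (my arithmetic, not EMC's).
    MODULES      = 36
    TOTAL_TB_NOW = 144
    TOTAL_TB_2X  = 288   # the promised capacity doubling
    print(f"Today:          {TOTAL_TB_NOW / MODULES:.0f} TB per flash module")
    print(f"After doubling: {TOTAL_TB_2X / MODULES:.0f} TB per flash module")

That is, about 4TB per flash module today, going to roughly 8TB per module when the capacity doubles.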

I can see how this would play well for real-time analytics, high-frequency trading and HPC environments, but there’s more to shared storage than just speed. Cost wasn’t mentioned and neither was the software driver, but given the ease with which it handled the Hive query, I can only assume that at some level it must look something like a DAS device, just with memory access times… NVMe anyone?

Project CoprHD was announced, which open sources EMC’s ViPR Controller software. Many ViPR customers had been asking EMC to open source the ViPR Controller; apparently they’re listening. Hopefully this will enable some participation from non-EMC storage vendors, allowing their storage to be brought under the management of ViPR Controller. I believe the intent is to have an EMC-hardened/supported version (ViPR Controller) coexist with the open source Project CoprHD version, which anyone can download and modify for themselves.

A non-production, downloadable version of ScaleIO was also announced. The test/dev version is a free download with unlimited capacity and full functionality, available for an unlimited time but only for non-production use. Another of the demos onstage this morning was Chad configuring storage across a ScaleIO cluster and using its QoS services to limit the impact of a specific workload. There was talk that ScaleIO had been available previously as a free download, but it took a bunch of effort to find and download it. They have removed all these prior hindrances, and soon, if not today, it will be freely available to anyone. ScaleIO runs on VMware and other hypervisors (maybe bare metal as well). So if you want to get your feet wet with software-defined storage, this sounds like the perfect opportunity.

ECS is being added to EMC’s Data Lake foundation. I’m not exactly sure what all the components of the Data Lake solution are, but previously the only Data Lake storage was Isilon based. This week EMC added Elastic Cloud Storage to the picture. Recall that Elastic Cloud Storage comes as either a software-only or a hardware-appliance deployment and provides object storage.

I missed Project Liberty before, but it’s a virtual VNX appliance – a software-only version of VNX. I assume this is intended for ROBO deployments or very low-end business environments. Presumably it runs on VMware and has some sort of storage limitations. It seems more and more EMC products are coming out in virtual appliance versions.

Project Falcon was also announced, a virtual Data Domain appliance – a software-only solution targeted at ROBO environments and other small enterprises. The intent is to have an onramp for Data Domain backup storage. I assume it runs under VMware.

Project Caspian – rolling out CloudScaling orchestration/automation for OpenStack deployments. On the big stage today, Chad and Jeremy demonstrated Project Caspian on a VCE VxRACK, deploying racks of servers under OpenStack control. Within a couple of clicks they were able to define and deploy OpenStack on bare-metal hardware and deploy applications to the OpenStack servers. They had a monitoring screen that showed OpenStack server activity (transactions) in real time, showed an overcommit of the rack, and showed how easy it was to add a new rack with more servers. All this seemed to take but a few clicks. The intent is not to create another OpenStack distribution but to provide an orchestration/automation/monitoring layer of software on top of OpenStack, to “industrialize OpenStack” for enterprise users. Looked pretty impressive to me.

I would have to say the DSSD box was the most impressive. It would have been interesting to get an up-close look at the box and some more specifications, but they didn’t have one on the Expo floor.