Hitachi and the coming IoT gold rush

Earlier this week I attended Hitachi Summit 2016, along with a number of other analysts and Hitachi executives, where Hitachi discussed their current and ongoing focus on the IoT (Internet of Things) business.

We have discussed IoT before (see QoM1608: The coming IoT tsunami or not, Extremely low power transistors … new IoT applications). Analysts and companies predict ~200B IoT devices by 2020 (my QoM prediction is 72.1B, with 0.7 probability). In any case, there’s a lot of IoT activity going to come online very shortly. Hitachi is already active in IoT and, if anything, wants it to grow significantly.

Hitachi’s current IoT business

Hitachi is uniquely positioned to take on the IoT business over the coming decades, having a number of current businesses in industrial processes, transportation, energy production, water management, etc. Over time, all these industries and more are becoming much more data driven and smarter as IoT rolls out.

Some metrics indicating the scale of Hitachi’s current IoT business include:

  • Hitachi is #79 in the Fortune Global 500;
  • Hitachi generated $5.4B (FY15) in IoT revenue;
  • Hitachi IoT R&D investment is $2.3B (over 3 years);
  • Hitachi has 15K customers worldwide and 1400+ partners; and
  • Hitachi spends ~$3B in R&D annually and has 119K patents.

Hitachi has been in the OT (Operational [industrial] Technology) business for over a century now. Hitachi has also had a very successful and ongoing IT business (Hitachi Data Systems) for decades now. Their main competitors in this IoT business are GE and Siemens, but neither has the extensive history in IT that Hitachi has had. Both are working hard to catch up.

Hitachi Rail-as-a-Service

For one example of what Hitachi is doing in IoT, they have recently won a 27.5-year Rail-as-a-Service contract to upgrade, ticket, maintain and manage all new trains for UK Rail. This entails upgrading all train rolling stock and providing upgraded rail signaling, traffic management systems, depot and station equipment and ticketing services for all of UK Rail.

The success and profitability of this Hitachi service offering hinges on their ability to provide more cost efficient rail transport. A key capability they plan to deliver is predictive maintenance.

Today, in the UK and most other major rail systems, high train availability is often supplied by spare rolling stock that’s pre-positioned and available to call into service when needed. With Hitachi’s new predictive maintenance capabilities, the plan is to reduce, if not totally eliminate, the need for spare rolling stock inventory and keep the new trains running 7X24.

Hitachi said their new trains capture 48K data items and generate ~25GB/train/day. All this data will be fed into their new Hitachi Insight Group Lumada platform, which includes Pentaho, HSDP (Hitachi Streaming Data Platform) and their Content Analytics, to analyze train data and determine how best to keep the trains running. Behind all this analytical power will no doubt be the HDS HCP object store used to keep track of all the train sensor data and other information, Hitachi UCP servers to process it all, and other Hitachi software and hardware to glue it all together.

The new trains and services will be rolled out over time, but there’s a pretty impressive timetable. For instance, Hitachi will add 120 new high speed trains to UK Rail by 2018. About the only thing that Hitachi is not directly responsible for in this Rail-as-a-Service offering is the communications network for the trains.
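To get a feel for the data volumes involved, here is a rough back-of-the-envelope sketch using only the two figures quoted above (~25GB/train/day and 120 new trains). It assumes every train reports every day of the year and uses decimal units; the function name and constants are mine, not Hitachi’s.

```python
# Rough sizing of the train sensor data, from the figures quoted in the post.
GB_PER_TRAIN_PER_DAY = 25   # "~25GB/train/day", as quoted by Hitachi
NEW_TRAINS = 120            # new high speed trains planned by 2018
DAYS_PER_YEAR = 365         # assumption: every train reports every day


def fleet_data_volume(trains=NEW_TRAINS, gb_per_day=GB_PER_TRAIN_PER_DAY):
    """Return (GB per day, TB per year) for the whole fleet, decimal units."""
    gb_daily = trains * gb_per_day
    tb_yearly = gb_daily * DAYS_PER_YEAR / 1000
    return gb_daily, tb_yearly


if __name__ == "__main__":
    daily, yearly = fleet_data_volume()
    print(f"{daily} GB/day, roughly {yearly:.0f} TB/year")
```

Under those assumptions the 120 new trains alone would produce about 3TB a day, or a little over a petabyte a year, before any replication or retention policy is applied.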

Hitachi’s other IoT offerings

Hitachi is actively seeking other customers for their Rail-as-a-Service IoT offering. But it doesn’t stop there; they would also like to offer smart-water-as-a-service, smart-city-as-a-service, digital-energy-as-a-service, etc.

There’s almost nothing that Hitachi currently supplies as industrial products that they wouldn’t consider offering in an X-as-a-service solution. With HDS Lumada Analytics, HCP and HDS storage systems, Hitachi UCP converged infrastructure, Hitachi industrial products, and Hitachi consulting services all working together, Hitachi is primed to take over the IoT-industrial products/services market.

Welcome to the new Hitachi IoT world.


HDS buys BlueArc

wall o' storage (fisheye) by ChrisDag (cc) (From Flickr)

Yesterday, HDS announced that they had closed on the purchase of BlueArc, their NAS supplier for the past 5 years or so. Many commentators mentioned that this was a logical evolution of their ongoing OEM agreement and that the timing was right, and speculated on what the purchase price might have been. If you are interested in these aspects of the acquisition, I would refer you to the excellent post by David Vellante from Wikibon on the HDS BlueArc deal.

Hardware as a key differentiator

In contrast, I would like to concentrate here on another view of the purchase, specifically on how HDS and Hitachi, Ltd. have both been working to increase their product differentiation through advanced and specialized hardware (see my post on Hitachi’s VSP vs VMAX and for more on hardware vs. software check out Commodity hardware always loses).

Similarly, BlueArc shared this philosophy and was one of the few NAS vendors to develop special purpose hardware for their Titan and Mercury systems, specifically to speed up NFS and CIFS processing. Most other NAS systems use more general purpose hardware and, as a result, a majority of their R&D investment focuses on software functionality.

But not BlueArc. Their performance advantage was highly dependent on specifically designed FPGAs and other hardware. As such, they have a significant hardware R&D budget to maintain and leverage their unique hardware advantage.

From my perspective, this follows what HDS and Hitachi, Ltd., have been doing all along with the USP, USP-V, and now their latest entrant, the VSP. If you look under the covers at these products you find a plethora of special purpose ASICs, FPGAs and other hardware that help accelerate IO performance.

BlueArc and HDS/Hitachi, Ltd. seem to be some of the last vendors standing that still believe that hardware specialization can bring value to data storage. From that standpoint, it makes an awful lot of sense to me to have HDS purchase them.

But others aren’t standing still

In the meantime, scale out NAS products continue to move forward on a number of fronts. As readers of my newsletter know, currently the SPECsfs2008 overall performance winner is a scale out NAS solution using 144 nodes from EMC Isilon (newsletter signup is above right or can also be found here).

The fact that now HDS/Hitachi, Ltd. can bring their considerable hardware development skills and resources to bear on helping BlueArc develop and deploy their next generation of hardware is a good sign.

Another interesting tidbit was HDS’s previous purchase of ParaScale, which seems to have some scale out NAS capabilities of its own. How this all gets pulled together within HDS’s product line remains to be seen.

In any event, all this means that the battle for NAS isn’t over and is just moving to a higher level.



When will disks become extinct?

A head assembly on a Seagate disk drive by Robert Scoble (cc) (from flickr)

Yesterday, it was announced that Hitachi General Storage Technologies (HGST) is being sold to Western Digital for $4.3B, and after that there was much discussion in the tweeterverse about the end of enterprise disk as we know it. Also, last week I was at a dinner at an analyst meeting with Hitachi, where the conversation turned to when disks will no longer be available. This discussion was between Mr. Takashi Oeda of Hitachi RSD, Mr. John Webster of Evaluator Group and myself.

Why SSDs will replace disks

John was of the opinion that disks would stop being economically viable in about 5 years’ time and would no longer be shipping in volume, mainly due to energy costs. Oeda-san said that Hitachi had predicted that NAND pricing on a $/GB basis would cross over (become less expensive than) 15Krpm disk pricing sometime around 2013. Later he said that NAND pricing had not come down as fast as projected and that it was going to take longer than anticipated. Note that Oeda-san mentioned the price cross over only for 15Krpm disk, not 7200rpm disk. In all honesty, he said SATA disk would take longer, but he did not predict when.

I think both arguments are flawed:

  • Energy costs for disk drives drop on a Watts/GB basis every time disk density increases. So the energy it takes to run a 600GB drive today will likely be able to run a 1.2TB drive tomorrow. I don’t think energy costs are going to be the main factor that drives disks out of the enterprise.
  • Density costs for NAND storage are certainly declining but cost/GB is not the only factor in technology adoption. Disk storage has cost more than tape capacity since the ’50s, yet they continue to coexist in the enterprise. I contend that disks will remain viable for at least the next 15-20 years over SSDs, primarily because disks have unique functional advantages which are vital to enterprise storage.
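The energy argument above is simple arithmetic, sketched here. The 8W per-drive power draw is a hypothetical figure chosen only for illustration; the 600GB and 1.2TB capacities come from the text, and the sketch assumes the denser drive draws the same power as today’s drive.

```python
def watts_per_gb(drive_watts, capacity_gb):
    """Power cost of capacity: watts drawn per GB stored."""
    return drive_watts / capacity_gb


# Hypothetical per-drive power draw, assumed unchanged between generations.
DRIVE_WATTS = 8.0

today = watts_per_gb(DRIVE_WATTS, 600)       # 600GB drive
tomorrow = watts_per_gb(DRIVE_WATTS, 1200)   # 1.2TB drive, same power
ratio = tomorrow / today

if __name__ == "__main__":
    print(f"today: {today:.4f} W/GB, tomorrow: {tomorrow:.4f} W/GB")
    print(f"tomorrow's drive needs {ratio:.0%} of today's W/GB")
```

Whatever the actual wattage, doubling capacity at constant power halves Watts/GB, which is why energy alone seems unlikely to push disks out.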

Most analysts would say I am wrong, but I disagree. I believe disks will continue to play an important role in the storage hierarchy of future enterprise data centers.

NAND/SSD flaws from an enterprise storage perspective

All costs aside, NAND based SSDs have serious disadvantages when it comes to:

  • Data retention (write endurance) – the problem with NAND data cells is that they can only be written so many times before they fail. And as NAND cells become smaller, this limit seems to be going the wrong way, i.e., today’s NAND technology can support 100K writes before failure but tomorrow’s NAND technology may only support 15K writes before failure. This is not a beneficial trend if one is going to depend on NAND technology for the storage of tomorrow.
  • Sequential access – although NAND SSDs perform much better than disk when it comes to random reads and, less so, random writes, their performance advantage for sequential access is not that dramatic. NAND sequential access can be sped up by deploying multiple parallel channels, but then it starts looking like an internal form of wide striping across multiple disk drives.
  • Unbalanced performance – with NAND technology, reads operate quicker than writes, sometimes 10X faster. Such unbalanced performance can make dealing with this technology more difficult and less advantageous than the disk drives of today, with their much more balanced performance.

None of these problems will halt SSD use in the enterprise. They can all be dealt with through more complexity in the SSD or in the storage controller managing the SSDs, e.g., wear leveling to try to prolong data retention, multi-data channels for sequential access, etc. But all this additional complexity increases SSD cost, and time to market.

SSD vendors would respond that yes, it’s more complex, but such complexity is a one time charge and mostly a one time delay, and once done, incremental costs are minimal. And when you come down to it, today’s disk drives are not that simple either, with defect skipping, fault handling, etc.

So why won’t disk drives go away soon? I think the other major concern in NAND/SSD ascendancy is the fact that the bulk NAND market is moving away from SLC (single level cell, or one bit/cell) NAND to MLC (multi-level cell) NAND due to its cost advantage. When SLC NAND is no longer the main technology being manufactured, its price will not drop as fast and its availability will become more limited.

Some vendors also counter this trend by incorporating MLC technology into enterprise SSDs. However, all the problems discussed earlier become an order of magnitude more severe with MLC NAND. For example, rather than 100K write operations to failure with SLC NAND today, it’s more like 10K write operations to failure on current MLC NAND. The fact that you get 2 to 3 times more storage per cell with MLC doesn’t help that much when one gets 10X fewer writes per cell. And the next generation of MLC is 10X worse, maybe getting on the order of 1000 writes/cell prior to failure. Similar issues occur for write performance: MLC writes are much slower than SLC writes.
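The trade-off between bits per cell and writes per cell can be put in numbers. The sketch below uses the write counts quoted above (100K for SLC, 10K for current MLC, on the order of 1000 for next generation MLC). The bits-per-cell values are my assumption, based on the "2 to 3 times more storage per cell" figure, and "lifetime bits per cell" is just an illustrative metric, not an industry one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NandType:
    name: str
    bits_per_cell: int       # assumption for MLC: 2 or 3 bits per cell
    writes_to_failure: int   # write counts as quoted in the post

    def lifetime_bits_per_cell(self) -> int:
        """Total bits one cell can absorb before it wears out."""
        return self.bits_per_cell * self.writes_to_failure


SLC = NandType("SLC", 1, 100_000)
MLC_2BIT = NandType("MLC, 2 bits/cell", 2, 10_000)
MLC_3BIT = NandType("MLC, 3 bits/cell", 3, 10_000)
MLC_NEXT = NandType("next-gen MLC, 3 bits/cell", 3, 1_000)


def relative_to_slc(nand: NandType) -> float:
    """Lifetime bits per cell as a fraction of SLC's."""
    return nand.lifetime_bits_per_cell() / SLC.lifetime_bits_per_cell()


if __name__ == "__main__":
    for nand in (SLC, MLC_2BIT, MLC_3BIT, MLC_NEXT):
        print(f"{nand.name}: {nand.lifetime_bits_per_cell()} bits "
              f"({relative_to_slc(nand):.0%} of SLC)")
```

Even at 3 bits per cell, current MLC absorbs less than a third of the lifetime writes of an SLC cell on these numbers, and the next generation only a few percent.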

So yes, raw NAND may become cheaper than 15Krpm disks on a $/GB basis someday, but the complexity to deal with such technology is also going up at an alarming rate.

Why disks will persist

Now something similar can be said for disk density, what with the transition to thermally assisted recording heads/media and the rise of bit-patterned media, all of which are making disk drives more complex with each generation that comes out. So what allows disks to persist long after $/GB is cheaper for NAND than disk?

  • Current infrastructure supports disk technology well in enterprise storage. Disks have been around so long that storage controllers and server applications have all been designed around them. This legacy provides an advantage that will be difficult and time consuming to overcome. All this will delay NAND/SSD adoption in the enterprise for some time, at least until this infrastructural bias towards disk is neutralized.
  • Disk technology is not standing still. It’s essentially a race to see who will win the next generation’s storage. There is enough of an eco-system around disk that will keep pushing media, heads and mechanisms ever forward into higher densities, better throughput, and more economical storage.

However, any infrastructural advantage can be overcome in time. What will make this go away even quicker is the existence of a significant advantage over current disk technology in one or more dimensions. Cheaper and faster storage can make this a reality.

Moreover, as for the ecosystem discussion, arguably the NAND ecosystem is even larger than disk’s. I don’t have the figures, but if one includes SSD drive producers as well as NAND semiconductor manufacturers, the amount of capital investment in R&D is at least the size of that in disk technology, if not orders of magnitude larger.

Disks will go extinct someday

So will disks become extinct? Yes, someday undoubtedly, but when is harder to nail down. Earlier in my career there was talk of the superparamagnetic effect that would limit how much data could be stored on a disk. Advances in heads and media moved that limit out of the way. However, there will come a time when it becomes impossible (or more likely too expensive) to increase magnetic recording density.

I was at a meeting a few years back where a magnetic head researcher predicted that such an end point to density increase would come in 25 years’ time for disk and 30 years for tape. When this occurs, disk density will stand still, and then it’s a certainty that some other technology will take over, because as we all know, data storage requirements will never stop increasing.

I think the other major unknown is other, non-NAND semiconductor storage technologies still under research. They have the potential for unlimited data retention, balanced performance and sequential performance orders of magnitude faster than disk, and could become a much more functional equivalent of disk storage. Such technologies are not commercially available today in sufficient densities and at low enough cost to even threaten NAND, let alone disk devices.


So when do disks go extinct? I would say in 15 to 20 years’ time we may see the last disks in enterprise storage. That would give disks almost an 80-year dominance over storage technology.

But in any event I don’t see disks going away anytime soon in enterprise storage.


Hitachi’s VSP vs. VMAX

Today’s announcement of Hitachi’s VSP brings another round to the competition between EMC and Hitachi/HDS in the enterprise. VSP’s introduction, which is GA and orderable today, takes the rivalry to a whole new level.

I was on SiliconANGLE’s live TV feed earlier today discussing the merits of the two architectures with David Floyer and Dave Vellante from Wikibon. In essence, there seems to be a religious war going on between the two.

VMAX is obviously built around a concept of standalone nodes which all have cache, frontend, backend and processing components built in. Scaling the VMAX, aside from storage and perhaps cache, involves adding more VMAX nodes to the system. VMAX nodes talk to one another via an external switching fabric (RapidIO currently). The hardware, although it has sophisticated packaging, IO connection technology and other internal stuff, looks very much like a 2U server one could purchase from any number of vendors.

On the other hand, Hitachi’s VSP is a specially built storage engine (or storage computer, as Hu Yoshida says). While the architecture is not a radical revision of USP-V, it’s a major upleveling of all component technology, from the 5th generation cross bar switch and the new ASIC-driven front-end and back-end directors to the shared control L2 cache memory and the use of quad-core Intel Xeon processors. Much of this hardware is unique, sophistication abounds, and the result looks very much like a blade system for the storage controller community.

The VSP and VMAX comparison is sort of like an open source vs. closed source discussion. VMAX plays the role of open source champion that largely depends on commodity hardware and sophisticated packaging, but with minimal ASIC technology. As evidence of the commodity hardware, VPLEX, EMC’s storage virtualization engine, reportedly runs on VMAX hardware. Commodity hardware lets EMC ride the technology curve as it advances for other applications.

Hitachi VSP plays the role of closed source champion. Its functionality is locked inside proprietary hardware architecture, ASICs and interfaces. The functionality it provides is tightly coupled with their internal architecture, and Hitachi probably believes that by doing so they can provide better performance and more tightly integrated functionality to the enterprise.

Perhaps this doesn’t do justice to either development team. There is plenty of unique proprietary hardware and sophisticated packaging in VMAX, but they have taken the approach of separate but equal nodes. Hitachi, in contrast, has distributed this functionality out to various components like front-end directors (FEDs), back-end directors (BEDs), cache adaptors (CAs) and virtual storage directors (VSDs), each of which can scale independently, i.e., it doesn’t require more BEDs to add FEDs or CAs. Ditto for VSDs. Each can be scaled separately up to the maximum that can fit inside a controller chassis, and then, if needed, you can add a whole other controller chassis.
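The difference between the two scaling models can be shown with a toy sketch. This is a deliberately simplified model of the description above, not of either product: the class names and unit counts are hypothetical, and real systems have chassis limits and configuration rules that are ignored here.

```python
from dataclasses import dataclass


@dataclass
class NodeScaledArray:
    """VMAX-style: each node bundles front-end, back-end, cache and CPU."""
    nodes: int = 1

    def add_front_end(self):
        # More front-end capacity means adding a whole node.
        self.nodes += 1

    def resources(self):
        return {"front_end": self.nodes, "back_end": self.nodes,
                "cache": self.nodes, "processors": self.nodes}


@dataclass
class ComponentScaledArray:
    """VSP-style: FEDs, BEDs, CAs and VSDs scale independently."""
    feds: int = 1
    beds: int = 1
    cas: int = 1
    vsds: int = 1

    def add_front_end(self):
        # Only the front-end director count changes.
        self.feds += 1

    def resources(self):
        return {"front_end": self.feds, "back_end": self.beds,
                "cache": self.cas, "processors": self.vsds}


if __name__ == "__main__":
    for array in (NodeScaledArray(), ComponentScaledArray()):
        array.add_front_end()
        print(type(array).__name__, array.resources())
```

In this toy model, adding front-end capacity to the node-scaled array also brings along back-end, cache and processing units, while the component-scaled array grows only the piece that was asked for.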

One has an internal switching infrastructure (the VSP cross bar switch) and the other uses external switching infrastructure (the VMAX RapidIO). The promise of external switching, like that of commodity hardware, is that you can share the R&D funding to enhance this technology with other users. But the disadvantage is that architecturally you may have more latency to propagate an IO to other nodes for handling.

With VSP’s cross bar switch, you may still need to move IO activity between VSDs, but this can be done much faster, and any VSD can access any CA, BED or FED resource required to perform the IO, so the need to move IO is reduced considerably. This provides a global pool of resources that any IO can take advantage of.

In the end, blade systems like VSP or separate server systems like VMAX can all work their magic. Both systems have their place today and in the foreseeable future. Where blade servers shine is in dense packaging, high power cooling efficiency and bringing a lot of horse power to a small package. On the other hand, server systems are simple to deploy and connect together, with minimal limitations on the number of servers that can be brought together.

In a small space blade systems probably can bring more compute (storage IO) power to bear within the same volume than multiple server systems but the hardware is much more proprietary and costs lots of R&D $s to maintain leading edge capabilities.

Typed this out after the show, hopefully I characterized the two products properly. If I am missing anything please let me know.

[Edited for readability, grammar and numerous misspellings – last time I do this on an iPhone. Thanks to Jay Livens (@SEPATONJay) and others for catching my errors.]