Google releases new Cloud TPU & Machine Learning supercomputer in the cloud

Last year about this time Google released their 1st generation TPU chip to the world (see my TPU and HW vs. SW … post for more info).

This year they are releasing a new version of their hardware, called the Cloud TPU, and making it available in a cluster on Google Cloud. Cloud TPU is in Alpha testing now. As I understand it, access to the Cloud TPU will eventually be free to researchers who promise to freely publish their research, and available at a price for everyone else.

What’s different between TPU v1 and Cloud TPU v2

The differences between version 1 and 2 mostly seem to be tied to training Machine Learning Models.

TPU v1 didn’t have any real ability to train machine learning (ML) models. It was a relatively dumb (8 bit ALU) chip, but if you had, say, an ML model already created to do something like understand speech, you could load that model onto the TPU v1 board and have it executed very fast. The TPU v1 chip was also mounted on a separate PCIe card (I think), connected to normal x86 CPUs as a sort of CPU accelerator. The advantage of TPU v1 over GPUs or normal x86 CPUs was mostly in power consumption and speed of ML model execution.

Cloud TPU v2 looks to be a standalone multi-processor device that’s connected to others via what look like Ethernet connections. One thing that Google seems to be highlighting is the Cloud TPU’s floating point performance. A Cloud TPU device (board) is capable of 180 TeraFLOPS (trillion or 10^12 floating point operations per second). A 64 Cloud TPU device pod can theoretically execute 11.5 PetaFLOPS (10^15 FLOPS).
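The pod number follows directly from the per-device number (my arithmetic, not Google’s spec sheet):

```latex
64 \text{ devices} \times 180 \text{ TFLOPS/device} = 11{,}520 \text{ TFLOPS} \approx 11.5 \text{ PFLOPS}
```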

TPU v1 had no floating point capabilities whatsoever. So Cloud TPU is intended to speed up the training part of ML models, which requires extensive floating point calculations. Presumably, they have also improved the ML model execution processing in Cloud TPU vs. TPU v1. More information on their Cloud TPU chips is available here.

So how do you code a TPU?

Both TPU v1 and Cloud TPU are programmed by Google’s open source TensorFlow. TensorFlow is a set of software libraries to facilitate numerical computation via data flow graph programming.

Apparently with data flow programming you have many nodes and many more connections between them. When a connection fires between nodes, it transfers a multi-dimensional matrix (tensor) to the receiving node. I guess the node takes this multidimensional array, does some (floating point) calculations on it, and then determines which of its outgoing connections to fire and what tensor to send across those connections.
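As a concrete (if trivial) illustration of the data flow graph idea, here’s a minimal sketch using the TensorFlow 1.x Python API; the tensor values are made up for illustration. The graph nodes are the matmul and add operations, the edges carry tensors, and nothing executes until the session runs the graph.

```python
import numpy as np
import tensorflow as tf  # TensorFlow 1.x style API

# Build the data flow graph: nodes are operations, edges carry tensors
a = tf.placeholder(tf.float32, shape=(2, 2), name="a")           # input tensor
b = tf.constant(np.ones((2, 2), dtype=np.float32), name="b")     # constant tensor
product = tf.matmul(a, b)      # node: matrix multiply
result = product + 1.0         # node: element-wise add

# Nothing has executed yet; the session "fires" the graph with real data
with tf.Session() as sess:
    out = sess.run(result, feed_dict={a: [[1.0, 2.0], [3.0, 4.0]]})
    print(out)
```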

Apparently, TensorFlow works with x86 servers, GPU chips, TPU v1 or Cloud TPU. Google TensorFlow 1.2.0 is now available. Google says that TensorFlow is in use in over 6000 open source projects. TensorFlow uses Python, and 1.2.0 runs on Linux, Mac, & Windows. More information on TensorFlow can be found here.

So where can I get some Cloud TPUs?

Google is releasing their new Cloud TPU in the TensorFlow Research Cloud (TFRC). The TFRC has 1000 Cloud TPU devices connected together, which can be used by any organization to train and execute machine learning algorithms.

I signed up (here) to be an alpha tester. During the signup process the site asked me: what hardware (GPUs, CPUs) and platforms I was currently using to train my ML models; how long my ML models take to train; how large a training (data) set I use (ranging from 10GB to >1PB); as well as other ML model oriented questions. I guess they’re trying to understand what the market requirements are outside of Google’s own use.

Google’s been using more ML and other AI technologies in many of their products and this will no doubt accelerate with the introduction of the Cloud TPU. Making it available to others is an interesting play, but it would be one way to amortize the cost of creating the chip. Another way would be to sell the Cloud TPU directly to businesses, government agencies, non-governmental organizations, etc.

I have no real idea what I am going to do with alpha access to the TFRC, but I was thinking maybe I could feed it all my blog posts and train an ML model to start writing blog posts for me. If anyone has any other ideas, please let me know.

Comments?

Photo credit(s): From Google’s website on the new Cloud TPU


Know Fortran, optimize NASA code, make money

Read a number of articles this past week about NASA offering a Fortran optimization contest, the High Performance Fast Computing Contest (HPFCC), for their computational fluid dynamics (CFD) program. They want to speed up CFD by 10X to 1000X and are willing to pay for it.

The contest is being run through HeroX and TopCoder, and they are offering $55K across the various levels of the contest to the winners.

The FUN3D CFD code (manual) runs on NASA’s Pleiades Linux supercomputer complex, which sports over 245K cores. Even when running on the supercomputer complex, a typical FUN3D CFD run takes thousands to millions of core hours!

The program(s)

FUN3D does a hypersonic fluid analysis over a (fixed) surface which includes a “simulation of mixtures of thermally perfect gases in thermo-chemical equilibrium and non-equilibrium. The routines in PHYSICS_DEPS enable coupling of the new gas modules to the existing FUN3D infrastructure. These algorithms also address challenges in simulation of shocks and boundary layers on tetrahedral grids in hypersonic flows.”

Not sure what all that means, but I am certain there’s a number of iterations on multiple Fortran modules, and it does this over a 3D grid of points which corresponds to both the surface being modeled and the gas mixture it’s running through at hypersonic speeds. Sounds easy enough.

The contest(s)

There are two levels to the contest: an Ideation phase (at HeroX) and an Architecture phase (at TopCoder). The $55K is split between the HeroX Ideation phase, which awards a total of $20K ($10K for the winner and two $5K runner-up prizes), and the TopCoder Architecture phase, which awards a total of $35K ($15K for the winner, $10K for 2nd place and another $10K for a “Qualified improvement candidate”).

The (HeroX) Ideation phase looks for specific new or faster algorithms that could replace current ones in FUN3D which include “exploiting algorithmic developments in such areas as grid adaptation, higher-order methods and efficient solution techniques for high performance computing hardware.”

The (TopCoder) Architecture phase looks at specifically speeding up actual FUN3D code execution. “Ideal submission(s) may include algorithm optimization of the existing code base,  Inter-node dispatch optimization or a combination of the two.  Unlike the Ideation challenge, which is highly strategic, this challenge focuses on measurable improvements of the existing FUN3d suite and is highly tactical.”

Sounds to me like the Ideation phase is selecting algorithm designs and the Architecture phase is implementing the new algorithms or just, in general, speeding up FUN3D code execution.

The equation(s)

There’s a Navier-Stokes equation algorithm that gets called maybe a trillion times during a run, until the flow settles down, and any minor improvement here would obviously be significant. Perhaps there are algorithmic changes that can be used, if you’re an aeronautical engineer, or perhaps there are compiler speedups that can be found, if you’re a Fortran expert. Both approaches can be validated/debugged/proved out on a desktop computer.
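For reference, this is my sketch of the general compressible momentum conservation equation that this class of solver iterates on, not something taken from the FUN3D documentation:

```latex
\frac{\partial (\rho \mathbf{u})}{\partial t} + \nabla \cdot (\rho \mathbf{u} \otimes \mathbf{u})
  = -\nabla p + \nabla \cdot \boldsymbol{\tau} + \rho \mathbf{f}
```

where ρ is density, u the velocity field, p pressure, τ the viscous stress tensor and f any body forces. Discretize that over a tetrahedral grid and iterate to convergence, and you can see where the core hours go.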

You have to be a US citizen to access the code and you can apply here. You will receive an email to verify your email address, and then once you’re validated and back on the website, you need to approve the software use agreement. NASA will verify your physical address by sending a letter to you with a passcode to use to finally access the code. The process may take weeks to complete, so if you’re interested in the contest, best to start now.

The Fortran(s)

I learned Fortran 66 a long time ago and may have dabbled with Fortran 77, but that’s the last time I touched Fortran. Still, it’s like riding a bike: once you’ve done it, it’s easy to do it again.

As I understand it, FUN3D uses Fortran 2003, and NASA suggests you use the GNU Fortran (gfortran) compiler, as the Intel one has some bugs in it. There appears to be a Fortran 2015 standard coming, but it’s not in mainstream use just yet.

A million core hours, just amazing. If you could shave a millisecond off a routine called a trillion times, you’d save a billion seconds of compute time, or ~280K core hours.
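The back-of-the-envelope arithmetic (mine, not NASA’s):

```latex
10^{-3}\,\text{s/call} \times 10^{12}\,\text{calls} = 10^{9}\,\text{s}
\qquad
\frac{10^{9}\,\text{s}}{3600\,\text{s/hr}} \approx 2.8 \times 10^{5}\ \text{core hours}
```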

Coders, start your engines…


Crowdsourced vision for the visually impaired

Read an article the other day in the Christian Science Monitor (CSM) on the Be My Eyes app. The app is from BeMyEyes.com and is available for iPhone and Android smartphones.

Essentially there are two groups of people that use the app:

  • Visually helpful volunteers – these people sign up for the app, and when a visually impaired person needs help, they provide visual aid by speaking to the person on the other end.
  • Visually impaired individuals – these people sign up for the app, and when they have problems understanding what they are (or are not) looking at, they can turn on their phone’s camera, take video, and have it sent to a volunteer; they can then ask the volunteer for help in deciding what they are looking at.

So, the visually impaired ask questions about the scenes they are shooting with their phone camera and volunteers will provide an answer.

It’s easy to register as Sighted and, I assume, as Blind. I downloaded the app, registered and tried a test call in minutes. You have to enable notifications, microphone access and camera access on your phone to use the app. The camera access is required to display the scene/video on your phone.

According to the app, there are 492K sighted individuals and 34.1K blind individuals registered, and the blind have been helped 214K times.

Sounds like an easy way to help the world.

There was no request to identify a language to use, so it may only work for English speakers. And there was no way to disable/enable it for a period of time when you don’t want to be disturbed, but maybe you would just close the app.

But other than that it was simple to use and seemed effective.

Now if only there were an app that would provide the same service for the hearing impaired, supplying captions or a “filtered” audio feed to ear buds.

The world needs more apps like this…

Comments?

There’s a new cluster filesystem on the block, Elastifile

At SFD12 last month we talked with the team from Elastifile. They are a new startup out of Israel working on a better cluster file system.

Elastifile was designed to support 1000s of nodes, 100,000s of users/clients and 1000s of data containers (file systems/mount points), together with an effectively infinite (64 bit) number of files and directories and up to Exabytes (10^18 bytes) in capacity. They also offer a 100% SSD file store capability. I encourage you to view the videos of their presentations at SFD12 to learn more.

Elastifile features

Elastifile supports data compression and, optionally, deduplication, with NAND/Flash (e.g., low-/high-endurance) storage tiering, cloud storage tiering and multi-site storage. They also provide NFSv3/v4, SMB, AWS S3 and HDFS as native access protocols for their file storage.

They also offer non-disruptive hardware/software upgrades, n-way (2- or 3-way) data and metadata redundancy, self-healing capabilities, snapshots, and synchronous/asynchronous data replication or mirroring. Further, they provide multi-tenancy and QoS support.

Elastifile can be used in a hyperconverged mode as well as a dedicated storage server mode. For backend storage, they support heterogeneous, physical (block, I think?) storage systems as well as direct access storage in cluster nodes.

Internals matter

Elastifile’s architecture supports accessor, owner and data nodes, but these can all be co-located on the same server or segregated across different servers.

Owner nodes own all the metadata objects for a file or directory and cache the metadata working set in their memory. Ownership of file or directory metadata may change in the case of hardware failures.

Elastifile supports a dynamic write data path, which means they determine, in real time, where to write file data rather than having the data locations identified beforehand. They call this distributed write-anywhere semantics.

Notably, they don’t do data caching (with NVMe it doesn’t make sense); however, as noted above, they do use metadata caching.

Internally, Elastifile uses variable length objects for both file data and metadata.

  • File data is composed of three object types: a file metadata (FileMD) object, mapping data objects, and file data objects. FileMDs hold the normal file metadata (name, file size, create/access/modify times, etc.) as well as pointers to all the mapping object OIDs. Mapping objects exist for each 0.5MB of file data and consist of a 128 element table, each element mapping 4KB of file address space to a data object (OID). Each data object holds the 4KB of compressed file data plus journal log entries.
  • Directory metadata is composed of a directory metadata (DirMD) object and directory listing objects. Directory listing objects map file/directory names to FileMD or DirMD OIDs. Directory listing objects are accessed via an extensible hash table and contain a list of the filenames/directory names within the directory (see the sketch after this list).
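To make that layout concrete, here’s a rough Python sketch of how those objects might be represented; the field names and types are my guesses based on the description above, not Elastifile’s actual data structures.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

OID = int  # object identifier (assumed representation)

@dataclass
class DataObject:
    oid: OID
    compressed_data: bytes          # up to 4KB of compressed file data
    journal_entries: List[bytes]    # journal log entries co-located with the data

@dataclass
class MappingObject:
    oid: OID
    # 128 slots, each mapping a 4KB slice of a 0.5MB file extent to a data object OID
    slots: List[Optional[OID]] = field(default_factory=lambda: [None] * 128)

@dataclass
class FileMD:
    oid: OID
    name: str
    size: int
    create_time: float
    access_time: float
    modify_time: float
    mapping_oids: List[OID] = field(default_factory=list)  # one per 0.5MB of file data

@dataclass
class DirMD:
    oid: OID
    name: str
    # directory listing: file/directory name -> FileMD or DirMD OID
    listing: Dict[str, OID] = field(default_factory=dict)
```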

The Elastifile software architecture consists of three layers:

  • A protocol layer which terminates file system access protocols and translates requests into internal requests. The hashing and data compression of file data occur at this level.
  • A metadata layer which provides file system/directory name mapping to objects for owned files/directories and maintains file/directory metadata updates/journals/checkpoints.
  • A data layer which provides transaction consistency and an n-way redundant persistent data store for (file or metadata) objects.

Metadata operations are persisted via journaled transactions, which are distributed across the cluster. For instance, the journal entries for a mapping data object update are written to the same data object (OID) as the actual file data, i.e., the 4KB compressed data object.

There’s plenty of discussion on how they manage consistency for their metadata across cluster nodes. Elastifile invented and uses Bizur, a key-value, consensus-based DB. Their chief architect, Ezra Hoch (@EzraHoch), did a blog post and paper on Bizur with more information.

~~~~

New file systems generally take many years to mature and get out into the market, cluster file systems even longer. Elastifile, started in 2013 by some very smart engineers, is already on the market just 4 years later. That’s impressive enough, but with their list of advanced functionality, plus cloud storage tiering and multi-site operations, all shipping in the current product, it’s mind-blowing.

One lingering question is, does a market exist for another cluster file system? All flash is interesting, but most of the current CFSs support this and ship it today. Cloud storage tiering is interesting and a long term need, but some CFSs already have this and others are no doubt implementing it as we speak. A CFS’s use of objects for internal data and metadata management is not new and may make internals cleaner, but it doesn’t really provide a lot of customer benefit.

Exascale raw capacity, support for 100K users, 1000s of nodes, 1000s of file systems and an effectively infinite # of files/directories is interesting. But most CFSs claim this level of support already, although it’s more aspirational for some. And proving support at this scale is difficult, if not impossible.

On the other hand, Bizur is really neat. Its primary benefit is during recovery from hardware failures. For a CFS with 1000s of nodes, failures likely occur quite often. So Bizur’s advantage here may pay significant customer dividends.

Is that enough to market a new CFS?

To see what other SFD12 bloggers have written on Elastifile, please see:

Ethereum enters the enterprise

Read an article the other day in the NYT (Business Giants Announce Creation of … Ethereum).

In case you don’t know, Ethereum is an open source blockchain solution that’s different from the software behind Bitcoin and IBM’s Hyperledger (for more on Hyperledger see our Blockchains at IBM post or our GreyBeardsOnStorage podcast with Donna Dillinger, IBM Fellow).

Blockchains are a software based, permanent ledger which can be used to record anything. Bitcoin, Ethereum and Hyperledger are all based on blockchains that provide similar digital information services with varying security, programmability, consensus characteristics, etc.

Blockchains represent an entirely new way of doing business in the digital world and have the potential to take over many financial services and other contracting activities that are done today between organizations.

Blockchain services provide the decentralized recording of transactions into an immutable ledger.  The decentralized nature of blockchains makes it difficult (if not impossible) to game the system to record an invalid transaction.

Miners

Ethereum supports an Ethereum Virtual Machine (EVM) application environment, which offers customers and users a more programmable blockchain. That is, rather than just updating accounts with monetary transactions like Bitcoin does, one can implement specialized transaction processing for updating the immutable ledger. It’s this programmability that allows for the creation of “smart contracts” which can be programmatically verified and executed.

Ethereum miner nodes are responsible for validating transactions and the state transition(s) that update the ledger. Transactions are grouped into blocks by miners.

Miners are responsible for validating the transaction block and performing a hard mathematical computation, or proof of work (PoW), which is used to validate the block of transactions. Once the PoW computation is complete, the block is packaged up and the miner node updates its database (ledger) and communicates its result to all the other nodes on the network, which update their transaction ledgers as well. This constitutes one state transition of the Ethereum ledger.
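To illustrate the proof-of-work idea in general terms, here’s a simplified hash-based PoW sketch in Python; this is not Ethereum’s actual Ethash algorithm, and the transaction data and difficulty are made up.

```python
import hashlib
import json

def mine_block(transactions, prev_hash, difficulty_bits=20):
    """Search for a nonce such that the block hash falls below a difficulty target.
    Simplified hash-based PoW for illustration only; Ethash is far more involved."""
    target = 2 ** (256 - difficulty_bits)
    nonce = 0
    while True:
        header = json.dumps(
            {"prev": prev_hash, "txs": transactions, "nonce": nonce},
            sort_keys=True).encode()
        block_hash = hashlib.sha256(header).hexdigest()
        if int(block_hash, 16) < target:   # proof of work found
            return nonce, block_hash
        nonce += 1

# Low difficulty so the example runs in well under a second
nonce, block_hash = mine_block(["Alice->Bob: 5"], prev_hash="00" * 32, difficulty_bits=16)
print(nonce, block_hash)
```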

Miners that validate Ethereum transactions get paid in Ethers, which are a form of currency throughout the Ethereum ecosystem.

Blockchain consensus

Ethereum ledger consensus is based on the miner nodes executing the PoW algorithm properly. The current Ethereum PoW algorithm is Ethash, which is an “ASIC resistant” algorithm. What this means is that standard GPUs and (less so) CPUs are already very well optimized to perform this algorithm, and any potential ASIC designer, if they could do better, would make more money selling their design to GPU and CPU designers than trying to game the system.

One problem with Bitcoin is that its PoW is more ASIC friendly, which has led some organizations to develop special purpose ASICs in an attempt to dominate Bitcoin mining. If they can dominate Bitcoin mining, this can be used to game the Bitcoin consensus system and potentially implement invalid transactions.

Ethereum Accounts

Ethereum has two types of accounts:

  • Contract accounts that are controlled by the EVM application code, or
  • Externally owned accounts (EOA) that are controlled by a set of private keys and represent external agents (miner nodes, people, transaction generating entities)

Contract accounts really are code and data which constitute the EVM bytecode (application). Contract account bytecode is also stored on the Ethereum ledger (when deployed?) and is associated with the EOA that initiated the contract account.

Contract functionality is written in Solidity, Serpent, Lisp Like Language (LLL) or other languages that can be compiled into EVM bytecode. Smart contracts use Ethereum Contract accounts to validate and execute contract actions.

Ethereum gas pricing

As EVM contract accounts can consume arbitrary amounts of computation, bandwidth and storage to process transactions, Ethereum uses a concept called “gas” to pay for their resource consumption.

When a contract account transaction is initiated, it identifies a gas price (in Ethers) and a maximum gas amount that it is willing to consume to process the transaction.

When a contract transaction takes place:

  • If the gas the transaction consumes is less than or equal to the maximum gas amount, then the transaction is executed and applied to the ledger. Any left over or remaining gas is credited back (in Ethers) to the EOA.
  • If the maximum gas amount is not enough to execute the transaction, then the transaction fails and no update occurs (a rough sketch of this accounting follows the list).
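Here’s a toy Python sketch of that gas accounting, with made-up operation names, per-op costs and prices; real Ethereum charges per EVM opcode and handles refunds and failure fees in more detail.

```python
def execute_with_gas(tx_ops, gas_limit, gas_price, sender_balance):
    """Toy gas accounting: each op has a fixed gas cost (hypothetical numbers)."""
    GAS_COST = {"add": 3, "store": 100, "call": 700}   # made-up per-op costs
    gas_used = 0
    for op in tx_ops:
        gas_used += GAS_COST.get(op, 10)
        if gas_used > gas_limit:
            # Out of gas: transaction fails, no state update; gas up to the limit is forfeited
            fee = gas_limit * gas_price
            return {"status": "failed", "fee": fee, "new_balance": sender_balance - fee}
    # Success: state update applied, only the gas actually used is charged
    fee = gas_used * gas_price
    return {"status": "ok", "fee": fee, "new_balance": sender_balance - fee}

print(execute_with_gas(["add", "store", "call"], gas_limit=1000, gas_price=2,
                       sender_balance=10_000))
```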

Enterprise Ethereum Alliance

What’s new to Ethereum is that Accenture, Bank of New York Mellon, BP, Credit Suisse, Intel, Microsoft, JP Morgan, UBS and many others have joined together to form the Enterprise Ethereum Alliance. The alliance intends to work to create a standard version of the Ethereum software that enterprise companies can use to manage smart contracts.

Microsoft has had an Azure Blockchain-as-a-Service offering online since 2015. This was based on an earlier version of Ethereum called Project Bletchley.

Ethereum seems to be an alternative to IBM Hyperledger, which offers another enterprise-class blockchain for smart contracts. As enterprise-class blockchains look like they will transform the way companies do business in the future, having multiple enterprise-class blockchain solutions seems smart to many companies.

Comments?

Photo Credit(s): Miner by Mark Callahan; Gas prices by Corpsman.com; File: Ether pharmecie.jpg by Wikimedia


Docker presents at Cloud Field Day 1 (CFD1)

Earlier this summer, Docker presented at Cloud Field Day 1 (CFD1) on some of their current technology and upcoming enhancements. (See the videos here.)

As you probably recall, Docker is an implementation of Linux containers, which is a way of packaging applications into micro-services that can be built, shipped and run across on-prem, private and public cloud infrastructure.

Docker containers and Docker Engine

Docker containers combine a base OS image, plus whatever other binaries are needed to run a micro-service, into a container which runs on top of a Docker Engine. Containers can then be run as a single instance or multiple instances on a Docker Engine.

Containers are not VMs; they have a fundamentally different architecture. For instance:

  • A VM includes a full OS plus app software. It often takes several minutes to boot up, and there is a hypervisor underneath it that emulates hardware and other critical services needed to run a VM, but there is no shared standard OS under the VM layer.
  • A Docker container relies on shared OS resources, which allows for a lighter weight application package, which in turn means that instantiation/booting up is much faster. There is no hypervisor, a container can run under Linux, Windows or Mac OSs, and containers provide for full stack portability.

In the Docker Hub (a repository for Docker containers) one can find a WordPress container that packages the whole LAMP + WordPress stack in a single container. To run WordPress you would also need a MySQL or compatible database, and there’s a MySQL container that can be used. You could easily run both the WordPress/LAMP container and the MySQL container on the same Docker Engine, connect the two together and connect the LAMP+WordPress container to the Internet to fire up a WordPress blog site.
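As a rough illustration of wiring those two containers together, here’s a sketch using the Docker SDK for Python; the image tags, container names, passwords and port mapping are placeholders I’ve assumed for the example.

```python
import docker

client = docker.from_env()   # talks to the local Docker Engine

# Private network so the WordPress container can reach MySQL by container name
client.networks.create("wp-net", driver="bridge")

db = client.containers.run(
    "mysql:5.7", name="wp-db", network="wp-net", detach=True,
    environment={"MYSQL_ROOT_PASSWORD": "example", "MYSQL_DATABASE": "wordpress"})

wp = client.containers.run(
    "wordpress:latest", name="wp-app", network="wp-net", detach=True,
    ports={"80/tcp": 8080},   # expose the blog on http://localhost:8080
    environment={"WORDPRESS_DB_HOST": "wp-db",
                 "WORDPRESS_DB_USER": "root",
                 "WORDPRESS_DB_PASSWORD": "example"})

print(db.short_id, wp.short_id)
```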

Docker compared VMs to houses and containers to apartments. Docker Engines can run as a VM or on bare metal hardware.

Running Docker containers on desktop, servers and in the cloud

If you want to experiment with Docker, you can download Docker for Mac or Docker for Windows, which can be used to install and run a native Docker Engine on your desktop.

Windows Server also supports native Docker containers. In VMware, one can run Docker containers under vSphere Integrated Containers, which supplies Docker API endpoints as standard ESX VMs, or you can run Docker containers under Project Photon, which is a streamlined, non-ESX hypervisor that also supplies Docker API endpoints.

You can run Docker containers in AWS and Azure as well, integrated with each public cloud’s compute, network and storage services.

Docker Swarm

So you have your Docker Engine running, with multiple containers sharing resources to create an application, but you’re out of compute, storage or networking power on your engine and need to bring on another server or two. What do you do? With Docker 1.12, you can now use Docker Swarm, which supports multiple Docker Engines.

With Docker Swarm, you have management nodes and worker nodes. Management nodes provide HA services for Docker containers which run across multiple worker nodes. Worker nodes run Docker Engines with multiple containers.

A Docker Swarm orchestrates the operation of multiple Docker Engines running Docker Services.

A Docker Service is a Docker container running across multiple worker nodes (engines) in a Docker Swarm. Docker Services can be run globally (one instance on each worker node) or replicated (some number of Docker container instances run across one or more worker nodes). You specify which you want on the Docker Service command, and Swarm will ensure that the specifications selected are implemented across its worker nodes.
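In CLI terms this is roughly `docker swarm init` plus `docker service create`; here’s a sketch of the same thing via the Docker SDK for Python, where the service name, image, replica count and ingress port are made up for the example.

```python
import docker
from docker.types import EndpointSpec, ServiceMode

client = docker.from_env()

# Turn this engine into a Swarm manager (worker nodes would then join it)
client.swarm.init(advertise_addr="eth0")

# A replicated service: Swarm keeps 3 nginx container instances running
# somewhere across the worker nodes, restarting them if a node fails.
web = client.services.create(
    "nginx:alpine",
    name="web",
    mode=ServiceMode("replicated", replicas=3),
    endpoint_spec=EndpointSpec(ports={8080: 80}))  # routing-mesh ingress port 8080

print(web.name, web.attrs["Spec"]["Mode"])
```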

If a worker node goes down, Swarm will detect it and re-start the failed container instances on other worker nodes in the Swarm. Beware, if your container relies on persistent storage, that storage must also be available to all Swarm worker nodes.

Swarm provides a Routing Mesh. When you fire up a container service, you can identify a swarm-wide ingress port for the container. Every worker node will listen on that port to provide a container-aware routing service that routes app requests across the Swarm to wherever the containers are currently running.

You can have multiple Swarm management nodes which share the management of the Swarm. Swarm management nodes are either leaders or followers and use a Raft consensus model. If the leader node goes down, another management node will take on its leadership role and start managing the Swarm.

There are many other technologies underneath Docker Swarm that are worth a look but suffice it to say it provides a load-balancing, HA service for container execution across multiple engines.

Docker Datacenter

What could possibly be missing? We have Docker Engines that can run multiple containers and Docker Swarms that can run multiple Docker Engines and containers in an HA manner. But we really need something that supports multiple Docker Swarms, and throw in a private, secure container repository and enterprise support options while you’re at it.

Earlier this year Docker introduced Docker Datacenter, a priced service offering which does just that. It provides Containers-as-a-Service (CaaS) across multiple Docker Swarms, with commercial support options and a Docker Trusted Registry, and integrates it all with enterprise services like LDAP/AD to provide audit logs and other monitoring capabilities for container service execution.

Using Docker Datacenter, developers can have their own multiple development Swarms to support engineering activities, shipping and storing their container images in a secure, private repository, and operations can have multiple Swarms which all run the same Docker container apps in an HA manner.

From an app developer standpoint, it all looks like container instances are running in the same Docker Engine environment across all those implementations. Operations sees a centralized management console (plane) that provides a way to monitor and manage multiple Docker Swarms running everywhere.

Well that’s about it for the update on Docker. There wasn’t much at the sessions on how containers access persistent storage but there’s a Flocker service that offers plugin support for EMC, NetApp and other enterprise SAN storage for Container apps. And there seem to be others out there and available.

You can read/hear more about Docker from these other CFD1 participants:

Comments?

Full disclosure: Docker gave us a very nice/very long scarf, and two t-shirts decorated with Docker logo and tagline and a number of stickers and pins.

Hitachi and the coming IoT gold rush

Earlier this week I attended Hitachi Summit 2016, along with a number of other analysts and Hitachi executives, where Hitachi discussed their current and ongoing focus on the IoT (Internet of Things) business.

We have discussed IoT before (see QoM1608: The coming IoT tsunami or not, Extremely low power transistors … new IoT applications). Analysts and companies predict ~200B IoT devices by 2020 (my QoM prediction is 72.1B, with 0.7 probability). But in any case there’s a lot of IoT activity going to come online very shortly. Hitachi is already active in IoT and, if anything, wants it to grow, significantly.

Hitachi’s current IoT business

Hitachi is uniquely positioned to take on the IoT business over the coming decades, having a number of current businesses in industrial processes, transportation, energy production, water management, etc. Over time, all these industries and more are becoming much more data driven and smarter as IoT rolls out.

Some metrics indicating the scale of Hitachi’s current IoT business, include:

  • Hitachi is #79 in the Fortune Global 500;
  • Hitachi’s generated $5.4B (FY15) in IoT revenue;
  • Hitachi IoT R&D investment is $2.3B (over 3 years);
  • Hitachi has 15K customers worldwide and 1400+ partners; and
  • Hitachi spends ~$3B in R&D annually and has 119K patents

Hitachi has been in the OT (Operational [industrial] Technology) business for over a century now. Hitachi has also had a very successful and ongoing IT business (Hitachi Data Systems) for decades now. Their main competitors in this IoT business are GE and Siemens, but neither has the extensive history in IT that Hitachi has had. Both, however, are working hard to catch up.

Hitachi Rail-as-a-Service

For one example of what Hitachi is doing in IoT, they have recently won a 27.5 year Rail-as-a-Service contract to upgrade, ticket, maintain and manage all new trains for UK Rail. This entails upgrading all train rolling stock and providing upgraded rail signaling, traffic management systems, depot and station equipment, and ticketing services for all of UK Rail.

The success and profitability of this Hitachi service offering hinges on their ability to provide more cost efficient rail transport. A key capability they plan to deliver is predictive maintenance.

Today, in the UK and most other major rail systems, train high availability is often supplied by using spare rolling stock that’s pre-positioned and available to call into service when needed. With Hitachi’s new predictive maintenance capabilities, the plan is to reduce, if not totally eliminate, the need for spare rolling stock inventory and keep the new trains running 7X24.

Hitachi said their new trains capture 48K data items and generate ~25GB/train/day. All this data will be fed into their new Hitachi Insight Group Lumada platform, which includes Pentaho, HSDP (Hitachi Streaming Data Platform) and their Content Analytics, to analyze train data and determine how best to keep the trains running. Behind all this analytical power will no doubt be an HDS HCP object store used to keep track of all the train sensor data and other information, Hitachi UCP servers to process it all, and other Hitachi software and hardware to glue it all together.

The new trains and services will be rolled out over time, but there’s a pretty impressive timetable. For instance, Hitachi will add 120 new high speed trains to UK Rail by 2018. About the only thing that Hitachi is not directly responsible for in this Rail-as-a-Service offering is the communications network for the trains.

Hitachi’s other IoT offerings

Hitachi is actively seeking other customers for their Rail-as-a-Service IoT offering. But it doesn’t stop there; they would like to offer smart-water-as-a-service, smart-city-as-a-service, digital-energy-as-a-service, etc.

There’s almost nothing that Hitachi currently supplies as industrial products that they wouldn’t consider offering in an X-as-a-service solution. With HDS Lumada Analytics, HCP and HDS storage systems, Hitachi UCP converged infrastructure, Hitachi industrial products, and Hitachi consulting services, together they are primed to take over the IoT-industrial products/services market.

Welcome to the new Hitachi IoT world.

Comments?

Blockchains at IBM

I attended IBM Edge 2016 (videos available here, login required) this past week and there was a lot of talk about their new blockchain service available on z Systems (LinuxONE).

IBM’s blockchain software/service is based on the open source Hyperledger (formerly Open Ledger) project.

Blockchains explained

We have discussed blockchain before (see my post on BlockStack). Blockchains can be used to implement an immutable ledger useful for smart contracts, electronic asset tracking, secured financial transactions, etc.

BlockStack was being used to implement a Public Key Infrastructure and a worldwide, distributed file system.

IBM’s Blockchain-as-a-Service offering has a plugin-based consensus that can use supermajority rules (2/3 + 1 of the members of a blockchain must agree to ledger contents) or can use consensus based on parties to a transaction (e.g., supplier and user of a component).
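As a trivial illustration of that supermajority rule (my sketch of the arithmetic, not IBM’s actual consensus plugin code):

```python
def supermajority_reached(votes_for, total_members):
    """2/3 + 1 of blockchain members must agree to the ledger contents."""
    required = (2 * total_members) // 3 + 1
    return votes_for >= required

# e.g., with 9 members, 7 agreeing votes are required
print(supermajority_reached(7, 9), supermajority_reached(6, 9))   # True False
```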

Bitcoin (an early form of blockchain) consensus used data miners (performing hard cryptographic calculations) to determine the shared state of a ledger.

There can be any number of blockchains in existence at any one time. Microsoft Azure also offers Blockchain as a service.

The potential for blockchains is enormous, and very disruptive to middlemen everywhere. Anywhere ledgers are used to keep track of assets, information, money, etc. that undergo transformations, transitions or transactions as they are further refined, produced and change hands, those items can be easily tracked in blockchains. The only question is whether these assets, information, currency, etc. can be digitally fingerprinted and whether that fingerprint can be read/verified. If so, then blockchains can be used to track them.
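Here’s a minimal Python sketch of the digital fingerprint idea, chaining each custody record to the previous one so any tampering breaks the chain; it’s an illustration only, not how Hyperledger or Everledger actually store records, and the asset/event fields are made up.

```python
import hashlib
import json

def fingerprint(record: dict) -> str:
    """Deterministic digital fingerprint of an asset/custody record."""
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

def append_entry(ledger, record):
    """Link the new entry to the previous entry's hash, then hash the whole entry."""
    prev_hash = ledger[-1]["hash"] if ledger else "0" * 64
    entry = {"record": record, "prev": prev_hash}
    entry["hash"] = fingerprint(entry)
    ledger.append(entry)
    return ledger

def verify(ledger):
    """Recompute every hash; any altered record or broken link fails verification."""
    prev = "0" * 64
    for entry in ledger:
        recomputed = fingerprint({"record": entry["record"], "prev": entry["prev"]})
        if entry["prev"] != prev or recomputed != entry["hash"]:
            return False
        prev = entry["hash"]
    return True

ledger = []
append_entry(ledger, {"asset": "component-123", "event": "shipped by supplier"})
append_entry(ledger, {"asset": "component-123", "event": "received at port"})
print(verify(ledger))   # True; edit any record and this becomes False
```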

New uses for Blockchain

IBM showed a demo of their new supply chain management service, based on the z Systems blockchain, in action. IBM component suppliers record when they ship component(s), shippers record when they receive the component(s), port authorities record when components arrive at port, and shippers record when parts clear customs and when they arrive at IBM facilities. Not sure if each of these transitions was recorded, but there were a number of records for each component shipment from supplier to IBM warehouse. This service is live and being used by IBM and its component suppliers right now.

Leanne Kemp, CEO of Everledger, presented another example at IBM Edge (presumably built on the z Systems Hyperledger service) used to track diamonds from mining, to cutter, to polishing, to wholesaler, to retailer, to purchaser, and beyond. Apparently the diamonds have a digital bar code/fingerprint/signature that’s imprinted microscopically on the diamond during processing and can be used to track the diamond throughout the processing chain, all the way to the end user. This diamond blockchain is used for fraud detection, verification of ownership and digital certification that the diamond was produced in accordance with the Kimberley Process.

Everledger can also be used to track any other asset that can be digitally fingerprinted as they flow from creation, to factory, to wholesaler, to retailer, to customer and after purchase.

Why z System blockchains

What makes z Systems a great way to implement blockchains is its secure, isolated partitioning and advanced cryptographic capabilities, such as hardware accelerated hashing, signing & securing and hardware based encryption, to speed up blockchain processing. z Systems also has FIPS 140 Level 4 certification, which can provide the highest security possible for blockchain and other security based operations.

From IBM’s perspective blockchains speak to the advantages of the mainframe environments. Blockchains are compute intensive, they require sophisticated cryptographic services and represent formal systems of record, all traditional strengths of z Systems.

Aside from the service offering, IBM has made numerous contributions to the Hyperledger project. I assume one could just download the z Systems code and run it on any LinuxONE processing environment you want. Also, since Hyperledger is Linux based, it could just as easily run in any OpenPower server running an appropriate version of Linux.

Blockchains will be used to maintain the system of record of the future just like mainframes maintained the systems of record of today and the past.

Comments?