Quantum computer programming

I was on a vendor call last week and they were discussing their recent technological advances in quantum computing. During the discussion they mentioned a number of ways to code for quantum computers. The most popular one currently is based on the QIS (Quantum Information Software) Kit.

I went looking for a principles of operation manual for quantum computers, something akin to the System 360 Principles of Operations Manual that explained how to code for an IBM 360 computer. But there was no such manual.

Instead, there is a paper on the Open Quantum Assembly Language (QASM) that describes the quantum computational environment and coding language used in the QIS Kit.

It appears that quantum computers can be considered a special computational co-processor engine, operated in parallel with normal digital computation. This co-processor happens to provide a quantum simulation.

QASM coding

One programs a quantum computer by creating a digital program that describes a quantum circuit, which uses qubits and quantum registers to perform some algorithm. The quantum circuit can be measured to provide a result, which more digital code can interpret and potentially use to create other quantum circuits, in a sort of loop.

There are four phases during the processing of a QIS Kit quantum algorithm.

  1. QASM compilation, which occurs solely on a digital computer. QASM source code describing the quantum circuit, together with compile-time parameters, is translated into a quantum PLUS digital intermediate representation.
  2. Circuit generation, which also occurs on a digital computer with access to the quantum co-processor. The intermediate language compiled above is combined with other parameters (available from the quantum computer environment) and together these are translated into specific quantum building blocks (circuits) and some classical digital code needed and used during quantum circuit execution.
  3. Execution, which takes place solely on the quantum computer. The system takes as input the collection of quantum circuits defined above and runtime control parameters, and transforms these, using a high-level quantum computer controller, into low-level, real-time instructions for the quantum computer that build the quantum circuits. These are then executed, and the results of the quantum circuit(s) execution create a result stream (measurements) that can be passed back to the digital program for further processing.
  4. Post-processing, which takes place on a digital computer and uses the results from the quantum circuit(s) execution and other intermediate results, processing these to either generate follow-on quantum circuits or output a final result for the quantum algorithm.

A qubit is a quantum bit. Because qubit coherence only lasts for a short while, results from one execution of a quantum circuit cannot be passed directly to another execution of quantum circuits. These results have to be passed through some digital computation before they can be used in subsequent quantum circuits.
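
To make this digital-plus-quantum loop concrete, here is a minimal sketch in QIS Kit-style Python. The API names follow QISKit conventions but should be treated as illustrative; exact calls vary between QIS Kit releases.

    # Minimal sketch of the digital/quantum loop described above (illustrative;
    # exact API names differ between QIS Kit releases).
    from qiskit import QuantumRegister, ClassicalRegister, QuantumCircuit, execute, Aer

    qreg = QuantumRegister(2, 'q')       # quantum register (qubits)
    creg = ClassicalRegister(2, 'c')     # classical (digital) register (bits)

    circuit = QuantumCircuit(qreg, creg)
    circuit.h(qreg[0])                   # put qubit 0 into superposition
    circuit.cx(qreg[0], qreg[1])         # CNOT: entangle qubits 0 and 1
    circuit.measure(qreg, creg)          # copy qubit state into the creg

    backend = Aer.get_backend('qasm_simulator')   # or a real quantum backend
    counts = execute(circuit, backend, shots=1024).result().get_counts()

    # Post-processing happens back on the digital computer; the counts could be
    # used to construct a follow-on quantum circuit, closing the loop.
    print(counts)    # e.g. {'00': 520, '11': 504}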

Quantum circuits don’t offer any branching as such.

Quantum circuits

The only storage QASM offers is classical (digital) registers (creg) and quantum registers (qreg), which are arrays of bits and qubits, respectively.

There are a limited number of built-in quantum operations that can be performed on qregs and qubits. One described in the QASM paper noted above is the CNOT operation, which conditionally flips a qubit, i.e., CNOT a,b will flip the corresponding qubit in b iff the qubit in a is on.

Quantum circuits are made up of one or more gate(s). Gates are invoked with a set of variable parameter names and quantum arguments (qargs). QASM gates can be construed as macros that are expanded at runtime. Gates are essentially lists of unitary quantum subroutines (other gate invocations), builtin quantum functions or barrier statements that are executed in sequence and operate on the quantum arguments (qargs) used in the gate invocation.
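
As a rough illustration of gates-as-macros (and of a barrier statement inside a gate body), here is an OpenQASM 2.0 fragment held in a Python string. QISKit's QuantumCircuit.from_qasm_str can parse source like this; the gate itself is just a made-up example.

    # Sketch of a QASM gate definition (a macro over other quantum operations),
    # including a barrier that keeps the optimizer from collapsing the two h's.
    from qiskit import QuantumCircuit

    qasm_source = """
    OPENQASM 2.0;
    include "qelib1.inc";

    // gate definition: a parameterized macro over other quantum operations
    gate mygate(theta) a, b {
      h a;
      barrier a, b;    // prevents optimization across this point
      h a;             // without the barrier, h a; h a; could be optimized away
      cx a, b;
      rz(theta) b;
    }

    qreg q[2];
    creg c[2];
    mygate(0.5) q[0], q[1];   // invoking the gate expands the macro
    measure q -> c;
    """

    circuit = QuantumCircuit.from_qasm_str(qasm_source)
    print(circuit)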

Opaque gates are quantum gates whose circuits (code) have yet to be defined. An opaque gate’s physical implementation may be possible, but its definition is left unspecified. Essentially these operate as placeholders, to be defined in a subsequent circuit execution, or perhaps something the quantum circuit creates in real time depending on gate execution (not really sure how this would work).

In addition to builtin quantum operations, there are other statements like the measure or reset statement. The reset statement sets a qubit or a qreg’s qubits to 0. The measure statement copies the state of a qubit or qreg into a digital bit or creg (digital register).

There is one conditional command in QASM, the if statement. The if statement compares a creg against an integer and, if they are equal, executes a quantum operation. The creg being tested acts as a “decision” register, interpreted as an integer. By chaining if statements, one can essentially construct a case statement, as in normal coding logic, to execute quantum circuit blocks.
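
Here is a hedged sketch of how chained if statements against a creg can act like a case statement, again as OpenQASM 2.0 source inside a Python string (illustrative only; register names are made up).

    # Sketch: chaining QASM if statements on a classical register to get a
    # case-statement style dispatch of quantum operations.
    qasm_case = """
    OPENQASM 2.0;
    include "qelib1.inc";

    qreg q[2];
    creg decision[2];    // the "decision" creg, compared as an integer

    h q[0];
    measure q[0] -> decision[0];

    // case-like dispatch: each if compares the whole creg to an integer
    if (decision == 0) x q[1];   // case 0: apply X
    if (decision == 1) h q[1];   // case 1: apply H
    if (decision == 2) z q[1];   // case 2: apply Z (unreachable here, shown for form)

    measure q[1] -> decision[1];
    """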

Quantum logic within a gate can be optimized during the compilation phase, so some operations may not actually be executed (e.g., if the same operation occurs twice in a row in a gate, the second occurrence would normally be optimized out), unless a barrier statement is encountered, which prevents optimization across it.

Quantum computer cloud

In 2016, IBM started offering quantum computers in its Bluemix cloud through the IBM Quantum (Q) Experience. The IBM Q Experience currently allows researchers access to 5- and 16-qubit quantum computers.

There are three pools of quantum computers: one pool, called IBMQX5, consists of 8 16-qubit computers, and two pools of 5 5-qubit computers each, IBMQX2 and IBMQX4. As I’m writing this, IBMQX5 and IBMQX2 are offline for maintenance but IBMQX4 is active.

Google has recently released OpenFermion as open source, which is another software development kit for quantum computation (I will review this in another post). Although Google also seems to have quantum computers and has provided researchers access to them, I couldn’t find much documentation on those machines.

Two other companies are working on quantum computation: D-Wave Systems and Rigetti Computing. Rigetti has their Forest 1.0 quantum computing full-stack programming and execution environment, but I couldn’t easily find anything on D-Wave Systems’ programming environment.

Last month, IBM announced they have constructed a 50-Qubit quantum computer prototype.

IBM has also released 20-Qubit quantum computers for customer use and plans to offer the new 50-Qubit computers to customers in the future.

Comments?

Picture Credit(s): Quantum Leap Supercomputer,  IBM What is Quantum Computing Website

QASM control flow, Open Quantum Assembly Language, by A. Cross, et al

IBM’s newly revealed 50-Qubit Quantum Processor …,  Softcares blog post

Scratch file use in HPC @ORNL, a statistical analysis

Attended SC17 (the Supercomputing Conference) this past week and I received a copy of the accompanying research proceedings. There are a number of interesting papers in the proceedings, and I came across one, Scientific User Behavior and Data Sharing Trends in a Peta Scale File System by Seung-Hwan Lim, et al. from Oak Ridge National Laboratory (ORNL), on the use of files at the Oak Ridge Leadership Computing Facility (OLCF), which was very interesting.

The paper statistically describes the use of scratch files in a multi-PB file system (Lustre) at OLCF from January 2015 to August 2016. The OLCF supports over 32PB of storage, has a peak aggregate bandwidth of over 1TB/s, and Spider II (the current Lustre file system) consists of 288 Lustre Object Storage Servers, all interconnected and connected to all the supercomputing clusters of servers via an InfiniBand network. Spider II supports all scratch storage requirements for active/queued jobs for Titan (#4 in the Top 500 [supercomputer clusters worldwide] list) and other clusters at ORNL.

ORNL uses an HPSS (High Performance Storage System) archive for permanent storage but uses the Spider II file system for all scratch files generated and used by supercomputing applications. ORNL is expecting Spider III (2018-2023) to host 10 billion files.

Scratch files are purged from Spider II after 90 days of no access. The paper is based on metadata analysis captured during the scratch purging process over 500 days of access.
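
As an aside, the 90-day purge policy itself is easy to picture: a periodic scan flags anything whose last access time is older than the threshold. Here is a minimal sketch in Python, assuming POSIX atime is what the purge keys on (the paper doesn’t describe the actual Spider II purge tooling), with a hypothetical scratch path.

    # Sketch of a 90-day no-access purge scan (illustrative only; the real
    # Spider II purge process and the metadata it captures are not shown here).
    import os
    import time

    PURGE_AGE_DAYS = 90

    def purge_candidates(root):
        """Yield files under root whose last access is older than PURGE_AGE_DAYS."""
        cutoff = time.time() - PURGE_AGE_DAYS * 24 * 3600
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    if os.stat(path).st_atime < cutoff:
                        yield path
                except OSError:
                    pass  # file may have been removed mid-scan

    # Hypothetical project scratch directory
    for path in purge_candidates('/lustre/scratch/projectX'):
        print(path)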

The paper displays a number of statistics and metrics on the use of Spider II:

  • Less than 3% of projects have a directory depth >15; the maximum directory depth recorded was 432, with most projects having a shallow (<10) directory depth.
  • A project typically has 10X the files that a specific researcher has; the median file count per researcher is 2,000 files, with the median project having 20,000 files.
  • Storage system performance is actively managed by many projects. For instance, 20 out of 35 science domains manually managed their Lustre cluster configuration to improve throughput.
  • File count continues to grow and reached a peak of 1B files during the time being analyzed.
  • On average, only 3% of files were accessed read-only, 10% of files were updated (read-write) and 76% of files were untouched during a given week. However, median and maximum file ages were 138 and 214 days respectively, which means that these scratch files can continue to be accessed over the course of 200+ days.

There was more information in the paper, but one item missing, which is a concern, is statistics on scratch file size distribution.

Nonetheless, it paints an interesting picture of scratch file use in HPC application/supercluster environments today.

Comments?

Crowdresearch, crowdsourced academic research

Read an article in Stanford Research, Crowdsourced research gives experience to global participants, that discussed an effort at Stanford and other top-tier research institutions to get global participation in academic research. The process is discussed more fully in a scientific paper (PDF here) by researchers from Stanford, MIT Media Lab, Cornell Tech and UC Santa Cruz.

They chose three projects:

  • An HCI (human computer interaction) project to design, engineer and build a new paid crowdsourcing marketplace (like Amazon’s Mechanical Turk).
  • A visual image recognition project to improve on current visual classification techniques/algorithms.
  • A data science project to design and build the world’s largest wisdom of the crowds experiment.

Why crowdsource academic research?

The intent of crowdsourced research is to provide top-tier academic research experience to people who have no access to top research organizations.

Participating universities obtain more technically diverse researchers, larger research teams, larger research projects, and a geographically dispersed research community.

Collaborators win valuable academic research experience, research community contacts, and potential authorship of research papers, as well as potential recommendation letters (for future work or academic placement).

How does crowdresearch work?

It’s almost like open source and agile development applied to academic research. The work week starts with the principal investigator (PI) and research assistants (RAs) going over last week’s milestone deliveries to see which to pursue further next week. The crowdresearch process uses Reddit-like posting and up/down voting to determine which milestone deliverables are most important. The PI and RAs review this prioritized list to select a few to continue to investigate over the next week.

The PI holds an hour-long video conference (using the Google Hangouts On Air YouTube live-stream service). On the conference call all collaborators can view the stream but only a select few are on camera. The PI and the researchers responsible for the important milestone research of the past week discuss their findings, and the rest of the collaborators on the team can participate over Slack. The video conference is archived and available to be watched offline.

At the end of the meeting, the PI identifies next week’s milestones and potentially directly responsible investigators (DRIs) to work on them.

The DRIs and other collaborators choose how to apportion the work for the next week and work commences. Collaboration can be fostered and monitored via Slack and if necessary, more Google live stream meetings.

If collaborators need help understanding some technology, technique, or tool, the PI, RAs or DRIs can provide a mini video course on the topic or can point to other information used to get the researchers up to speed. Collaborators can ask questions and receive answers through Slack.

When it’s time to write the paper, they use Google Docs with change tracking to manage the writing process.

The team also maintained a Wiki on the overall project to help new and current members get up to speed on what’s going on. The Wiki would also list the week’s milestones, video archives, project history/information, milestone deliverables, etc.

At the end of the week, researchers and DRIs would supply a mini post to describe their work and link to their milestone deliverables so that everyone could review their results.

Who gets credit for crowdresearch?

Each week, everyone on the project is allocated 100 credits and apportions these credits to other participants based on the week’s activities. The credits are used to drive a page-rank credit assignment algorithm to determine an aggregate credit score for each researcher on the project.

Check out the paper linked above for more information on the credit algorithm. They tried to defeat (credit) link rings and other obvious approaches to stealing credit.
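
To give a flavor of what a PageRank-style credit assignment could look like (this is my own rough sketch, not the paper’s actual algorithm), treat everyone’s weekly 100-credit allocations as weighted edges between participants and run a damped power iteration over that graph.

    # Sketch of a PageRank-style aggregate credit score (illustrative only;
    # the paper's real algorithm and its anti-gaming defenses differ).
    def credit_scores(allocations, damping=0.85, iters=50):
        """allocations: {giver: {receiver: credits_given}} summed over all weeks."""
        people = set(allocations)
        for given in allocations.values():
            people.update(given)
        n = len(people)
        score = {p: 1.0 / n for p in people}

        for _ in range(iters):
            new = {p: (1.0 - damping) / n for p in people}
            for giver, given in allocations.items():
                total = sum(given.values())
                if total == 0:
                    continue
                for receiver, credits in given.items():
                    # each giver passes score along in proportion to credits given
                    new[receiver] += damping * score[giver] * credits / total
            score = new
        return score

    # Hypothetical weekly allocations (names and numbers made up)
    allocations = {
        'alice': {'bob': 60, 'carol': 40},
        'bob':   {'alice': 70, 'carol': 30},
        'carol': {'alice': 50, 'bob': 50},
    }
    print(credit_scores(allocations))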

At the end of the project, the PI, DRIs and RAs determine a credit clip level for paper authorship. Paper authors are listed in credit order and the remaining, non-author collaborators are listed in an acknowledgements section of the paper.

The PIs can also use the credit level to determine how much of a recommendation letter to provide for researchers.

Tools for crowdresearch

The tools needed to collaborate on crowdresearch are cheap and readily available to anyone.

  • Google Docs, Hangouts, Gmail are all freely available, although you may need to purchase more Drive space to host the work on the project.
  • Wiki software is freely available as well from multiple sources, including MediaWiki (the software behind Wikipedia).
  • Slack is readily available for a low cost, but other open source alternatives exist, if that’s a problem.
  • A GitHub code repository is also readily available for a reasonable cost, but there may be alternatives that use Google Drive storage for the repo.
  • Web hosting is needed to host the online Wiki, media and other assets.

Initial projects were chosen in computer science, so outside of the above tools, they could depend on open source. Other projects will need to consider how much experimental apparatus is needed, how to fund these apparatus purchases, and how globally dispersed researchers can best make use of them.

My crowdresearch projects

Here are some potential commercial crowdresearch projects, where we could use the aggregate credit score and perhaps other measures of participation to apportion revenue, if any:

  • NVMe storage system using a lightweight storage server supporting NVMe over fabric access to hybrid NVMe SSD – capacity disk storage.
  • Proof of Stake (PoS) Ethereum pooling software using Linux servers to create a pool for PoS ETH mining.
  • Bipedal, dual armed, dual handed, five-fingered assisted care robot to supply assistance and care to elders and disabled people throughout the world.

And some non-commercial projects, where we would use the aggregate credit score to apportion attribution and any potential remuneration:

  • A fully (100%?) mechanical rover able to survive, rove around, perform scientific analysis, receive/transmit data and possibly effect repairs from within extreme environments such as the surface of Venus, Jupiter and the Chernobyl/Fukushima Daiichi reactor cores.
  • Zero-propellant interplanetary tug able to rapidly transport rovers, satellites, probes, etc. to any place within the solar system and deploy them properly.
  • A Venusian manned base habitat including the design, build process and ongoing support for the initial habitat and any expansion over time, such that the habitat can last 25 years.

Any collaborators across the world, interested in collaborating on any of these projects, do let me know, here via comments. Please supply some way to contact you and any skills you’re interested in developing or already have that can help the project(s).

I would be glad to take on the PI role for the most popular project(s), if I get sufficient response (no idea what this would be). And I’d be happy to purchase the Drive, GitHub, Slack and web hosting accounts needed to start up the most popular project(s) and carry them to fruition. And if there are any more domain-experienced PIs interested in taking on any of these projects, do let me know.

Comments?

Picture Credit(s): Crowd by Espen Sundve;

Videoblogger Video Conference by Markus Sandy;

Researchers Night 2014 by Department of Computer Science, NTNU;

Materials science rescues civilization, again

Read a bunch of articles this past week from MIT Technology Review, How materials science will determine the future of human civilization, from Stanford University, New ultra thin semiconductor materials…, and Wired, This battery breakthrough could change everything.

The message varied a bit between articles but there was an underlying theme to all of them: materials science is taking off, unlike it ever has before. Let’s take them on, one by one, last in first out.

New battery materials

I have not reported on new battery structures or materials in the past but it seems that every week or so I run across another article or two on the latest battery technology that will change everything. Yet this one just might do that.

I am no material scientist but Bill Joy has been investing in a company, Ionic Materials, for a while now (both in his job as a VC partner and as an independent investor) that has been working on a solid battery material that could be used to create rechargeable batteries.

The problems with Li(thium)-Ion batteries today are that they are a safety risk (lithium is highly flammable) and they use an awful lot of a relatively scarce mineral (lithium is mined in Chile, Argentina, Australia, China and other countries, with little mined in the USA). Yet electric cars would not be possible today without Li-Ion batteries.

Ionic Materials claims to have designed a solid polymer electrolyte that can combine the properties of the familiar, ultra-safe alkaline batteries we use every day with the rechargeability of the Li-Ion batteries used in phones and cars today. This would make a cheap, safe rechargeable battery that could work anywhere. The polymer just happens to also be fire retardant.

The historic problem with alkaline batteries (essentially zinc and manganese dioxide) is that they can’t be recharged many times before they short out. But with the new polymer, these batteries could essentially be recharged as many times as Li-Ion batteries today.

Currently, the new material doesn’t have as many recharge cycles as they want but they are working on it. Joy calls the material ional.

New semiconductor materials

Moore’s law will eventually cease. It’s only a question of time and materials.

Silicon is increasingly looking long in the tooth. As researchers shrink silicon devices down to atomic scales, they start to break down and stop functioning.

The advantages of silicon are that it is extremely scalable (shrinkable) and easy to rust. Silicon rust, or silicon dioxide, was very important because it is used as an insulator. As an insulating layer, it could be patterned just like the silicon circuits themselves. That way everything (circuits, gates, switches and insulators) could all use the same, elemental material.

A couple of Stanford researchers, Eric Pop and Michal Mleczko, an electrical engineering professor and a post-doc researcher, have discovered two new materials that may just take Moore’s law through a couple more chip generations. They wrote about these new materials in their paper in Science Advances.

The new materials, hafnium diselenide and zirconium diselenide, have many properties similar to silicon. One is that they can easily be made to scale. But devices made with the new materials still function at smaller geometries, at just three atoms thick (0.67nm), and also happen to consume less power.

That’s good, but they also rust better. When the new materials rust, they form a high-K insulating material. With silicon, high-K insulators require additional materials and processing, more than just simple silicon rust. And the new materials also match silicon’s band gap.

Apparently the next step with these new materials is to create electrical contacts. And I am sure that, as with any new material introduced to chip fabrication, it will take quite a while to solve all the technical hurdles. But it’s comforting to know that Moore’s law will be around another decade or two to keep us humming away.

New multiferroic materials

But just maybe the endgame in chip fabrication materials, and possibly many other domains, is a new class of materials coming out of ETH Zurich in Switzerland.

There, a researcher, Nicola Spaldin, has described a new sort of material that has both ferro-electric and ferro-magnetic properties.

Spaldin starts her paper off by discussing how civilization evolved mainly due to materials science.

Way in the past, fibers and rosin allowed humans to attach stone blades and other material to poles/arrows/axe handles to hunt and farm better. Later, the discovery of smelting and basic metallurgy led to the casting of bronze in the bronze age, and later iron, which could also be hammered, led to the iron age. The discovery of the electron led to the vacuum tube. Pure silicon came out during World War II and led to silicon transistors and the chip fabrication technology we have today.

Spaldin talks about the other major problem with silicon, it consumes lots of energy. At current trends, almost half of all worldwide energy production will be used to power silicon electronics in a couple of decades.

Spaldin’s solution to the energy consumption problem is multiferroic materials. These materials offer both ferro-electric and ferro-magnetic properties in the same material.

Historically, materials were either ferro-electric or ferro-magnetic, but never both. However, Spaldin discovered there was nothing in nature prohibiting the two from co-existing in the same material. Then she and her compatriots designed new multiferroic materials that could do just that.

As I understand it, ferro-electric materials allow electrons to form chemical structures which create electric dipoles or electric fields. Similarly, ferro-magnetic materials allow chemical structures to create magnetic dipoles or magnetic fields.

That is, multiferroic materials can be used to create both magnetic and electric fields. And the surprising part was that the boundaries between multiferroic magnetic fields (domains) form nano-scale, conducting channels which can be moved around using electric fields.

Seems to me that if this were all possible, one could fabricate a substrate using multiferroics and write (program) any electronic circuit you want just by creating a precise magnetic and electrical field on top of it. And with today’s disk and tape devices, precise magnetic fields are readily available for circular and linear media. And it would seem just as easy to use multiferroic material for persistent data storage.

Spaldin goes on to say that replacing magnetic fields in today’s magnetism-centric information/storage industry with electrical fields should lead to reduced energy consumption.

Welcome to the multiferroic age.

Photo Credit(s): Battery Recycling by Heather Kennedy;

AMD Quad Core backside by Don Scansen;  and

Magnetic Field – 14 by Windell Oskay

Collaboration as a function of proximity vs. heterogeneity, MIT research

Read an article the other week in MIT news on how Proximity boosts collaboration on MIT campus. Using MIT patents and papers published between 2004-2014, researchers determined how collaboration varied based on proximity or physical distance.

What they found was that distance matters. The closer you are to a person, the more likely you are to collaborate with him or her (on papers and patents at least).

Paper results

In looking at the PLOS research paper (An exploration of collaborative scientific production at MIT …), one can see that the relative frequency of collaboration decays as distance increases (Graph A shows frequency of collaboration vs. proximity for papers and Graph B shows a similar relationship for patents).

 

Other paper results

The two sets of charts below show the buildings where research (papers and patents) was generated. Building heterogeneity, crowdedness (lab space/researcher) and number of papers and patents per building is displayed using the color of the building.

The number of papers and patents per building is self evident.

The heterogeneity of a building is a function of the number of different departments that use the building. The crowdedness of a building is an indication of how much lab space per faculty member a building has. So the more crowded buildings are lighter in color and less crowded buildings are darker in color.

I would like to point out Building 32. It seems to have a high heterogeneity, moderate crowdedness and a high paper production but a relatively low patent production. Conversely, Building 68 has a low heterogeneity, low crowdedness and a high production of papers and a relatively low production of patents. So similar results have been obtained from buildings that have different crowdedness and different heterogeneity.

The paper specifically cites buildings 3 & 32 as being the most diverse on campus and as “hubs on campus” for research activity. The paper states that these buildings were outliers in research production on a per-person basis.

And yet there’s no global correlation between heterogeneity, or crowdedness for that matter, and (paper/patent) research production. I view crowdedness as a substitute for researcher proximity. That is, the more crowded a building is, the closer researchers should be. Such buildings should theoretically be hotbeds of collaboration. But it doesn’t seem like they have any more papers than non-crowded buildings.

Also heterogeneity is often cited as a generator of research. Steven Johnson’s Where Good Ideas Come From, frequently mentions that good research often derives from collaboration outside your area of speciality. And yet, high heterogeneity buildings don’t seem to have a high production of research, at least for patents.

So I am perplexed and unsatisfied with the research. Yes, proximity leads to more collaboration, but it doesn’t necessarily lead to more papers or patents. The paper shows other information on the number of papers and patents by discipline which may be confounding results in this regard.

Telecommuting and productivity

So what does this tell us about the plight of telecommuters in today’s business and R&D environments? While the paper has shown that collaboration goes down as a function of distance, it doesn’t show that an increase in collaboration leads to more research or productivity.

This last chart from the paper shows how collaboration on papers is trending down and on patents is trending up. For both papers and patents, inter-departmental collaboration is more important than inter-building collaboration. Indeed, the sidebars seem to show that the MIT faculty participation in papers and patents is flat over the whole time period even though the number of authors (for papers) and inventors (for patents) is going up.

So I, as a one-person company, can be considered an extreme telecommuter for any organization I work with. I am often concerned that my lack of proximity to others adversely limits my productivity. Thankfully the research is inconclusive at best on this and, if anything, tells me that this is not a significant factor in research productivity.

And yet, many companies (Yahoo, IBM, and others) have recently instituted policies restricting telecommuting because, they believe, it reduces productivity. This research does not show that.

So IBM and Yahoo I think what you are doing to concentrate your employee population and reduce or outright eliminate telecommuting is wrong.

Picture credit(s): All charts and figures are from the PLOS paper. 

 

New chip architecture with CPU, storage & sensors in one package

Read an article the other day in MIT news (3D chip combines computing and data storage) about a new 3D chip out of Stanford and MIT research, which includes CPU, RRAM (resistive RAM) storage class memories and sensors in one single package. Such a chip architecture vastly minimizes the off-chip bottleneck to access storage and sensors.

Chip componentry

The chip’s sensors are based on carbon nanotubes. Aside from a layer of silicon at the bottom, all the rest of the transistors used in the chip are also based on carbon nanotube FETs (field effect transistors).

The RRAM storage class memory is based on a dielectric material which uses electrical resistance to store non-volatile data.

The bottom layer is a silicon based CPU. On top of the silicon is a carbon nanotube layer. Next comes the RRAM and the top layer is more carbon nanotubes making up the sensor array.

Architectural benefits

One obvious benefit of having data storage directly accessible to the CPU is that there’s no longer a need to go off chip to access data. The second major advantage of the chip architecture is that the sensor array can write directly to RRAM storage, so there’s no off-chip delay for sensor readout and storage.

Another advantage to using carbon nanotube FETs is that they can be an order of magnitude more energy efficient than silicon transistors. Moreover, RRAM has the potential to be much denser than DRAM.

Finally, another major advantage is that this can all be built in one 3D chip because carbon nanotube and RRAM fabrication can be done at relatively cooler temperatures (~200C) vs. silicon fabrication which requires relatively high temperatures (1000C). Silicon cannot be readily fabricated in multiple layers because of the high temperatures required which will harm lower layers. But you could fabricate the lowest layer in silicon and then the rest as either carbon nanotube FETs or RRAM without harming the silicon layer.

Transistor/RRAM counts

The chip as fabricated has a million RRAM cells (bits?) and 2 million nanotube FETs. In contrast, in 2014, Intel’s 15-core Xeon Ivy Bridge EX had 4.3B transistors and current DRAM chips offer 64Gb. So there’s a ways to go before carbon nanotube and RRAM densities can get to a level available from silicon today.

However, as they have a bottom layer of silicon they can have all the CPU complexity of an Intel processor and still build RRAM and carbon nanotubes FETs on top of that. Which makes this chip architecture compatible with current CMOS fabrication techniques and a very interesting addition to current CPU architectures.

~~~~

Unclear to me why they stopped at 4 layers (1 silicon FET, 1 carbon nanotube FET, 1 RRAM and 1 carbon nanotube FET [sensor array]). If they can do 4, why not do 5 or more? That way they could pack in even more RRAM storage and perhaps more sensor layers.

Also, not sure what the bottom most layer of carbon nanotubes is doing. If I had to hazard a guess, it’s being used for RRAM control logic. But I could be wrong.

I could see how these chips could be used for very specialized sensor applications, with a limited need for data storage. The researchers claim many types of sensors can be created using carbon nanotubes. If that’s the case, maybe we might see these sorts of chips showing up all over the place.

Comments?

Photo Credit(s): Three dimensional integration of nanotechnologies for computing and data storage on a single chip, Nature magazine. 

Hitachi and the coming IoT gold rush

Earlier this week I attended Hitachi Summit 2016 along with a number of other analysts and Hitachi executives where Hitachi discussed their current and ongoing focus on the IoT (Internet of Things) business.

We have discussed IoT before (see QoM1608: The coming IoT tsunami or not, Extremely low power transistors … new IoT applications). Analysts and companies predict ~200B IoT devices by 2020 (my QoM prediction is 72.1B with 0.7 probability). But in any case there’s a lot of IoT activity going to come online, very shortly. Hitachi is already active in IoT and if anything, wants it to grow, significantly.

Hitachi’s current IoT business

Hitachi is uniquely positioned to take on the IoT business over the coming decades, having a number of current businesses in industrial processes, transportation, energy production, water management, etc. Over time, all these industries and more are becoming much more data driven and smarter as IoT rolls out.

Some metrics indicating the scale of Hitachi’s current IoT business, include:

  • Hitachi is #79 in the Fortune Global 500;
  • Hitachi’s generated $5.4B (FY15) in IoT revenue;
  • Hitachi IoT R&D investment is $2.3B (over 3 years);
  • Hitachi has 15K customers Worldwide and 1400+ partners; and
  • Hitachi spends ~$3B in R&D annually and has 119K patents.

Hitachi has been in the OT (Operational [industrial] Technology) business for over a century now. Hitachi has also had a very successful and ongoing IT business (Hitachi Data Systems) for decades now. Their main competitors in this IoT business are GE and Siemens, but neither has the extensive history in IT that Hitachi has had. But both are working hard to catch up.

Hitachi Rail-as-a-Service

For one example of what Hitachi is doing in IoT, they have recently won a 27.5-year Rail-as-a-Service contract to upgrade, ticket, maintain and manage all new trains for UK Rail. This entails upgrading all train rolling stock and providing upgraded rail signaling, traffic management systems, depot and station equipment and ticketing services for all of UK Rail.

The success and profitability of this Hitachi service offering hinges on their ability to provide more cost-efficient rail transport. A key capability they plan to deliver is predictive maintenance.

Today, in UK and most other major rail systems, train high availability is often supplied by using spare rolling stock, that’s pre-positioned and available to call into service, when needed. With Hitachi’s new predictive maintenance capabilities, the plan is to reduce, if not totally eliminate the need for spare rolling stock inventory and keep the new trains running 7X24.

Hitachi said their new trains capture 48K data items and generate over 25GB/train/day. All this data will be fed into their new Hitachi Insight Group Lumada platform, which includes Pentaho, HSDP (Hitachi Streaming Data Platform) and their Content Analytics, to analyze train data and determine how best to keep the trains running. Behind all this analytical power will no doubt be the HDS HCP object store used to keep track of all the train sensor data and other information, Hitachi UCP servers to process it all, and other Hitachi software and hardware to glue it all together.

The new trains and services will be rolled out over time, but there’s a pretty impressive timetable. For instance, Hitachi will add 120 new high-speed trains to UK Rail by 2018. About the only thing that Hitachi is not directly responsible for in this Rail-as-a-Service offering is the communications network for the trains.

Hitachi other IoT offerings

Hitachi is actively seeking other customers for their Rail-as-a-service IoT service offering. But it doesn’t stop there, they would like to offer smart-water-as-a-service, smart-city-as-a-service, digital-energy-as-a-service, etc.

There’s almost nothing that Hitachi currently supplies as industrial products that they wouldn’t consider offering in an X-as-a-service solution. With HDS Lumada Analytics, HCP and HDS storage systems, Hitachi UCP converged infrastructure, Hitachi industrial products, and Hitachi consulting services, together they are primed to take over the IoT-industrial products/services market.

Welcome to the new Hitachi IoT world.

Comments?

Scality’s Open Source S3 Driver

The view from Scality’s conference room

We were at Scality last week for Cloud Field Day 1 (CFD1) and one of the items they discussed was their open source S3 driver. (Videos available here).

Scality was on the 25th floor of a downtown San Francisco office tower. And the view outside the conference room was great. Giorgio Regni, CTO, Scality, said on the two days a year it wasn’t foggy out, you could even see Golden Gate Bridge from their conference room.

Scality

As you may recall, Scality is an object storage solution that came out of the telecom and consumer networking industry to provide Google/Facebook-like storage services to other customers.

Scality RING is software-defined object storage that supports a full complement of legacy and advanced interface protocols including NFS, CIFS/SMB, Linux FUSE, RESTful native, SWIFT, CDMI and Amazon Web Services (AWS) S3. Scality also supports replication and erasure coding based on object size.

RING 6.0 brings AWS IAM style authentication to Scality object storage. Scality pricing is based on usable storage and you bring your own hardware.

Giorgio also gave a session on the RING’s durability (reliability), which showed they support 13 nines of data availability. He flashed up the math on this but it was too fast for me to take down. :)

Scality has been on the market since 2010 and has been having a lot of success lately, having grown 150% in revenue this past year. In the media and entertainment space, Scality has won a lot of business with their S3 support. But their other interface protocols are also very popular.

Why S3?

It looks as if AWS S3 is becoming the de facto standard for object storage. AWS S3 is the largest current repository of objects. As such, other vendors and solution providers now offer support for S3 services whenever they need an object/bulk storage tier behind their appliances/applications/solutions.

This has driven every object storage vendor to also offer S3 “compatible” services to entice these users to move to their object storage solution. In essence, the object storage industry, like it or not, is standardizing on S3 because everyone is using it.

But how can you tell if a vendor’s S3 solution is any good? You could always try it out to see if it worked properly with your S3 application, but that involves a lot of heavy lifting.

However, there is another way. Take an S3 Driver and run your application against that. Assuming your vendor supports all the functionality used in the S3 Driver, it should all work with the real object storage solution.

Open source S3 driver

Scality open sourced their S3 driver just to make this process easier. Now, one could just download their S3server driver (available from Scality’s GitHub) and start it up.

Scality’s S3 driver runs on top of a Docker Engine, so to run it on your desktop you would need to install Docker Toolbox for older Mac or Windows systems or run Docker for Mac or Docker for Windows for newer systems. (We also talked with Docker at CFD1).

Firing up the S3server on my Mac

I used Docker for Mac but I assume the terminal CLI is the same for both. Downloading and installing Docker for Mac was pretty straightforward. Starting it up took just a double click on the Docker application, which generates a toolbar Docker icon. You do need to enter your login password to run Docker for Mac, but once that was done, you have Docker running on your Mac.

Open up a terminal window and you have the full Docker CLI at your disposal. You can download the latest S3server from Scality’s Docker hub by executing a pull command (docker pull scality/s3server). To fire it up, you need to define a new container (docker run -d --name s3server -p 8000:8000 scality/s3server) and then start it (docker start s3server).

It’s that simple to have an S3server running on your Mac. The toolbox approach for older Macs and PCs is a bit more complicated but seems simple enough.

The data is stored in the container and persists until you stop/delete the container. However, there’s an option to store the data elsewhere as well.
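
For anyone who would rather exercise the S3server from code, here is a minimal boto3 sketch pointed at the local endpoint. The endpoint matches the docker run port mapping above; the access keys are what I believe the container ships with as defaults, so treat them as an assumption and check the S3server docs if they don’t work.

    # Sketch: exercising the local S3server with boto3 (pip install boto3).
    # Endpoint matches the -p 8000:8000 mapping above; the access keys are
    # assumed defaults -- check the S3server docs if they differ.
    import boto3

    s3 = boto3.client(
        's3',
        endpoint_url='http://localhost:8000',
        aws_access_key_id='accessKey1',
        aws_secret_access_key='verySecretKey1',
    )

    s3.create_bucket(Bucket='test-bucket')
    s3.put_object(Bucket='test-bucket', Key='hello.txt', Body=b'hello S3server')

    for obj in s3.list_objects_v2(Bucket='test-bucket').get('Contents', []):
        print(obj['Key'], obj['Size'])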

I tried to use CyberDuck to load some objects into my Mac’s S3server but couldn’t get it to connect properly. I wrote up a ticket to the S3server community. It seemed to be talking to the right port, but maybe I needed to do an S3cmd to initialize the bucket first – I think.

[Update 2016Sep19: Turns out the S3 server getting started doc said you should download an S3 profile for Cyberduck. I didn’t do that originally because I had already been using S3 with Cyberduck. But did that just now and it now works just like it’s supposed to. My mistake]

~~~~

Anyways, it all seemed pretty straightforward to run S3server on my Mac. If I was an application developer, it would make a lot of sense to try S3 this way before I did anything on the real AWS S3. And some day, when I grew tired of paying AWS, I could always migrate to Scality RING S3 object storage – or at least that’s the idea.

Comments?