Learning machine learning – part 3

Image of the cover of the book Deep Learning with Python

Decided to take the plunge and purchase the Deep Learning with Python book and see what it has to offer. In prior posts (see Learning machine learning – part 1 & part 2) we were working with the cloud tutorials. This Part one is based on the book

It has a great introduction into deep learning which is a subset of machine learning. After what I know today, the Microsoft Azure session was more on traditional (statistical) machine learning and not deep learning.


Installing deep learning

In order to use the book, you need access to Keras, Python, Jupyter and a Keras backend (TensorFlow, Microsoft CNTK or Theanno).

I decided not to use any cloud solutions and rather install Python, Jupiter, TensorFlow and Keras on my MacBook. Although it probably would have been much easier (and more costly) to use any cloud solution.

I followed the directions on the installing TensorFlow website for the PIP install (you have to install a “virtual environment” and “PIP” first). The MacBook didn’t have a NVIDIA GPU so I needed to install the CPU version of TensorFlow.

But I had the hardest time running any of the book examples. Whenever I changed any command cell in a Jupyter notebook with Keras functionality in them (like adding a space to the end of an “import Keras” command line), it would throw a (module not found) error.

After days of web searching for what path is used for Jupyter notebook-iPython/Python imports (sys.path and PYTHONPATH) and where I should be importing Keras from (it’s not “~/ .keras”), I got nowhere closer to running anything.

I finally saw that I could directly install Keras (again, when I installed Tensorflow, it installed Keras as well) into my VENV. After I did that, everything worked. (I probably have one too many Keras environments, but who cares).

Finally getting the environment correct, I could now execute any command cells in a Jupyter notebook (with Keras functionality properly, well most of them anyways).

Jupyter notebooks for dummies…

It took me a while to figure out that the way you run a Jupyter notebook server is by issuing the command “jupyter notebook” (nowhere in the command’s help file, but can be found in Jupyter tutorials). That’s when I started to see the problems in the installation section above with my Keras installation.

Understanding Jupyter notebooks is non-trivial. Yes, I know it’s an interactive code and documentation environment. It’s sort of like BASIC on steroids with WORD functionality built in/escapeable into at any time.

First thing to understand is that when you open up a jupyter notebook, you haven’t executed anything yet. YES there are output lines in the notebook you just opened but NO, they aren’t from executing them under your client-server environment.

The output lines you see in the notebook is output from someone else’s execution run. So while they may look like they worked fine but they haven’t executed in your installation environment yet..

Also, when executing Jupyter notebook command cells, pay special attention to the In [?]: that’s shown to the right of every command cell.

When the ‘?’ is a number, like In:[12] that tells you what sequence (12th in the sequence) that (multi-line command cell) has been executed in and when the ‘?’ a “*”, like In[*], it says that the Jupyter notebook server is executing that command cell. 

Some command cells generate Out [?]: lines and others do not. So can’t use this to tell if something’s been executed or not. The only way to tell if some command cell has been executed is by seeing the In [n]: integer as n be incremented from the last command cell you executed. Of course you can execute command cells out of sequence if you wish.

Jupyter notebook coding/executing was weird as one who is more used to C, Java, and other coding languages and IDEs. A video tutorial on Jupiter notebooks would probably have helped here, but I couldn’t find one.

Running the examples

You can download all of the books current examples from the book’s website.

The book suggests you add model layers, subtract model layers and change the parameters of the number of nodes in a model as examples for you to try at home.

In general, doing so (once the environment was setup properly) seemed to work as desired. Adding layers didn’t seem to change the accuracy of the models, if anything it degraded it, and deleting layers didn’t help either. Ditto for adding or reducing node counts within a layer.

There’s a bunch of datasets that comes with Keras install used in the examples. Many examples have a first step where you modify this data so as to be more amenable to deep learning modeling.

For example, there’s a IMDB dataset that has film reviews. The film reviews are text files. But deep learning doesn’t work on text strings so you need to convert the text files into lists of integers. You do this by looking up each word in a word dictionary and substituting the index for each word in the review, generating an array (list) of integers.

This is all done through the NumPy package. It’s worth the time to become familiar with Python and probably NumPy. I took the verbal Python tutorial (but did nothing to learn NumPy).

Another example is a real estate prediction model that has 13 different parameters across 500 or so neighborhoods. The parameters are all different, some are distances, some %s, some pricing differences, etc. In order to perform deep learning on them, the example normalizes all of them, using distance from mean, in units of standard deviation.

There are other examples of data transformations as well. It seems that transforming your data into something amenable to deep learning is one part of the magic of deep learning.

Back to the book

Getting through chapter 3 of the book i- fairly straightforward when everything is set up properly. I found a iPad app (Juno) that could be used to connect to the Jupyter Server and it seemed to work once I found the proper command to use to start Jupyter (jupyter notebook –ip=”*”) and the proper Jupyter configuration parameters to use.

The examples are pretty self-documenting so you should be able to try out any of them on your own. The book adds great explanations on machine learning, deep learning and and the overall flow of how to approach a deep learning project.

Once you finish chapter 4 of the book you have all the tools one needs to tackle any deep learning project that you want to attack. You may need to read up on how to transform your data and you will probably be using one of the modeling techniques in one of the examples but it seems easy enough.

The rest of the book’s chapters (which I have yet to complete) deal with deep learning in practice and it’s in these chapters that you can learn some of the art of deep learning data science and model science..


I ended up having fun with Jupyter notebooks, once I got them running with the iPad client in one hand and the book in the other. At the end of chapter 4, I startedto see some applications to my consulting business that might be interesting to model.

Using the Mac CPU was fast enough for the examples but I may have to tear down the crypto mine and use it as an AI server for my home network if I plan to tackle something with more data.

Wish me luck…

Learning Machine Learning – part 2

In Learning Machine Learning – part 1, we covered AWS and GCP tutorials on machine learning within each of their clouds. In part 2 we cover Microsoft tutorial(s) on machine learning in Azure.

I found Machine Learning Jump Start in Microsoft Visual Academy with instructors, Buck Woody and Seayoung Rhee. This is a series of 4 video tutorials on Azure ML Studio. ML studio seems similar to AWS SageMaker as it’s a framework to perform machine learning.

Azure (and probably AWS & GCP) have a number of methods to perform machine learning. ML studio happens to be the one that I found but there are many others worth examining.

Azure’s ML Studio tutorial videos were a better than AWS but not as good as GCP IMHO for learning machine learning.  There are four videos in the series. I watched  the first (~45 minutes), the second  (~45 minutes) and most of the third (only 25 of ~45 minutes).

Video 1 Concepts and setting up a ML Studio account

In the first video, the instructors took a long time to get going and then when  got someplace interesting, it was all play acting (human as a machine learner) to teach concepts.

The tutorials do distinguish between Supervised learning and Unsupervised learning. Both of which can apply to prediction or classification types of problems or outcomes. These are discussed as classic machine learning characteristics.


In the last 1/3 of the first video they discuss Azure ML Studio. It provides a common place to work and collaborate across team members. It also provides a graphical approach to machine learning. ML Studio also supports a programmable API, but I never got to that section in my viewing.

Some Azure ML Studio strengths:

  1. It provides industry recognized  data sets and data science algorithms that can be used as a black box, such as recommendation engines.
  2. It allows you to publish and consume machine learning solutions.

On the Azure portal there’s a machine learning studio icon (it’s now buried under the 100+ services link, in the AI + Machine Learning section). You use this to create a new ML studio workspace.

Inside a workspace you can use Azure ML studio services.  In the workspace you can review all your experiments (these are algorithms or predictive models being worked). 

In the Experiments page you can create a new experiment which is sort of a graphical workflow of the machine learning task.

There you will find a list of Azure sample data sets and sample algorithms that can be used in your experiment. The first video didn’t go into much detail on any of this other than showing you how to get started and create a ML studio workspace.

Video 2 how to use ML studio

Video 2 takes your ML studio workspace and runs a rudimentary experiment with it. In this video they walk you through selecting a data set, selecting algorithms to use and how to connect them into a machine learning workflow.

Creating an ML Studio experiment is almost like flowcharting your workflow. You select the data you want and drop it into the workflow. Next select an extraction engine you want to use and drop it into the work flow and connect it to the data. Then. you identify what you want to do with the data (like training) and drop that algorithm into the workflow and so on.  In the end you have defined a sequence of actions to perform on data.

In their example, dataset they use a user movie ratings dataset. They connect this to a bayesian learning model and to a IMDB database to extract movie titles. The tutorial experiment is a movie recommendation engine.Although it wasn’t a neural net many of the same techniques apply.


ML Studio uses an intuitive graphical approach to defining a machine learning workflow.

Video 3 publishing your ML Studio web service

Video 3 shows you how to publish (on Azure’s Marketplace) the recommendation engine created in video 2 as an OData web service.

I stopped watching the 3rd video after about 25 minutes as it was setting up various aspect of the OData web service to be deployed on Azure marketplace.

Using Azure ML studio seemed pretty straightforward. But it was much more data science/data analytics activity than neural network training.

The Azure MVA ML Studio tutorial was created in 2014 so some of the concepts are a bit dated but most still apply.

Looking today on the Azure Portal, I was still able to find the ML studio workspaces under one of the 10 AI + Machine Learning services.  Again I would have to say the GCP tutorial was a better fit for what I wanted which was how do I create a neural  net and get it trained.

Other ML approaches under Azure

There are other Azure approaches to machine learning and tutorials that support them. For example, there’s a quick start tutorial to understand how to use Python and Jupyter notebooks under Azure, which is probably closer to the neural net training in GCP.

I found myself skipping ahead a lot in video 1 as it was mainly about concepts and not much technical detail. Video 2 was a good intro into ML studio and Video 3 showed you how to publish a ML studio web service in Azure but it was more details than I wanted to know. I never got to video 4, which probably talked about ML Studio’s programable API.

If I had to do it over again, I probably would have viewed the quick start tutorial with Python and Jupyter notebooks, which sounded more like the GCP tutorials in the part 1 post.

On the other hand, Azure ML Studio tutorials supplied a good complement to the GCP tutorial, as a different (more graphical) way to do ML. It would probably be worthwhile to view before taking the AWS Sagemaker tutorials as it’s a bit higher level and quicker introduction into the workflow of AI and machine learning.


Picture credit(s): Screen shots of Videos 1, 2 and 3 in the MVA series, (c) Microsoft 

Learning machine learning – part 1

Saw an article this past week from AWS Re:Invent that they just released their Machine Learning curriculum and materials  free to the public. Google (Cloud Platform and elsewhere) TensorFlow,  (Facebook’s) PyTorch, and Microsoft Azure CNTK frameworks  education is also available and has been for awhile now.

My money is on PyTorch and Tensorflow as being the two frameworks most likely to succeed. However all the above use many open source facilities and there seems to be a lot of cross breeding across them. Both AWS ML solutions and Microsoft CNTK offer PyTorch and TensorFlow frameworks/APIs as one option among many others.  

AWS Machine Learning

I spent about an hour plus looking over the AWS SageMaker tutorial videos in the developer section of AWS machine learning curriculum. Signing up was fairly easy but I already had an AWS login. You also had to enroll/register for the course on your AWS login  but once that was through, you could access courses.

In the comments on the AWS blog post there were a number of entries indicating broken links and other problems but I didn’t have any issues. Then again, I didn’t start at the beginning, only looked at over one series of courses, and was using the websites one week after they were announced at Re:Invent.

Amazon SageMaker is an overarching framework that can be used to perform machine learning on AWS, all the way from gathering, analyzing and modifying the dataset(s), to training the model, to creating a inference engine available as an endpoint that can be used to perform the inferencing.

Amazon also has special purpose API based tools that allow customers to embed intelligence (inferencing) directly into their application, without needing to perform the ML training. These include:

  • Amazon Recognition which provides image (facial and other tagging) recognition services
  • Amazon Polly which provides text to speech services in multilple languages, and
  • Amazon Lex which provides speech recognition technology (used by Alexa) and together with Polly helps embed conversational interfaces into customer applications.

TensorFlow Machine Learning

In the past I looked over the TensorFlow tutorials and recently rechecked them out. I found them much easier to follow this time.


The Google IO 2018 video on TensorFlowGetting Started With TensorFlow High Level APIs, takes you through a brief introduction to the Colab(oratory),  a GCP solution that uses TensorFlow and how to use Tensorflow Keras, tf.data and TensorFlow Eager Execution to create machine learning models and perform machine learning.

 Keras on TensorFlow seems to be the easiest approach to  use machine learning technologies. The video spends most of the time discussing a Colab Keras code element,  ~9 lines, that loads a image classification dataset, defines a 1 level (one standard layer and one output layer), trains it, validates it and uses it to perform  inferencing.

The video also touches a bit on tf.data and TensorFlow Eager Execution but the main portion discusses the 9 line TensorFlow Keras machine learning example.

Both Colab and AWS Sagemaker use and discuss Jupyter Notebooks. These appear to be an open source approach to documenting and creating a workflow and executing Python code automatically.

GCP Colab is essentially a GCP-Google Drive based Jupyter notebook execution engine. With Colab you create a Jupyter notebook on google drive and interactively execute it under Colab. You can download your Juyiter notebook files and essentially execute them anywhere else that supports TensorFlow (that supports TensorFlow v1.7 or above, with Keras API support).

In the video, the Google IO   instructors (Josh Gordon and Lawrence Moroney) walk you through building a model to recognize handwritten digits and outputs a classification (0..9) of what the handwritten digit represents.

It uses a standard labeled handwriting to digits labeled data set, called the MNIST database of handwritten digits that’s already been broken up into a training set and a validation set. Josh calls this the “Hello World” of machine learning.

The instructor in the video walks you through the (Jupyter Notebook – Eager Execution-Keras) code that inputs the data set (line 2), builds a 1 level (really two layer, one neural net layer and one output layer) neural network model (lines 3-6), trains the model (line 7), tests/validates the model (line 8) and then uses it to perform an inference (line 9).

Josh spends a little time discussing neural networks and model optimizations and some of the other parameters used in the code above. He has a few visualizations of what this all means but for the most part, the code uses a simple way to build a neural net model and some standard optimization techniques for the network.

He then goes on to discuss tf.data which is an API that can be used to create machine learning datasets and provide this data to the neural net for training or inferencing.  Apparently tf.data has a number of nifty features that allow you to take raw data and transform it into something that can be used to feed neural nets. For example, separating the data into batches, shuffling (randomizing) the batches of data, pre-fetching it so as to not starve the GPU matrix multipliers, etc.

Then it goes into how machine learning is different than regular coding. And show how TensorFlow Eager Execution is really just like Python execution. They go through another example (larger) of machine learning, this one distinguishes between cats and dogs. While they use an open source Python IDE ,  PyCharm, to test and walk through their TF Eager Execution code, setting breakpoints and examining data along the way.

At the end of the video they show a link to a Google crash course on TensorFlow machine learning and they refer to a book Deep Learning with Python by Francois Chollet. They also mention a browser version of TensorFlow which uses Java Script and  your browser to develop, train and perform inferences using TensorFlow Keras machine learning.


Never got around to Microsoft’s Azure training other than previewing some websites but plan to look over that soon.

I would have to say that the Google IO session on using TensorFlow high level APIs was a lot more enjoyable (~40 minutes) than the AWS multiple tutorial videos (>>40 minutes) that I watched to learn about SageMaker.

Not a fair comparison as one was a Google IO intro session on TensorFlow high level APIs and the other was a series of actual training videos on Amazon SageMaker and the AWS services you can use to take advantage of it.

But the GCP session left me thinking I can handle learning more and using machine learning (via TensorFlow, Keras, Eager Execution, & tf.data) to actually do something while the SageMaker sessions left me thinking, how much AWS facilities and AWS infrastructure services,  I would need to understand and use to ever get to actually developing a machine learning model.

I suppose one was more of an (AWS SageMaker) infrastructure tutorial  and the other was more of an intro into machine learning using TensorFlow wherever you wanted to execute it.

I think I’m almost ready to start creating and feeding a TensorFlow model with my handwriting and seeing if it can properly interpret it into searchable text. If it can do that, I would be a happy camper


Photo credits: 

Screenshos from AWS Sagemaker series of tutorial video 1, 2, 3, 4 & 5, you may need a signin to view them

Screenshots from the Getting Started with TensorFlow High Level APIs YouTube video 

New GraphCore GC2 chips with 2PFlop performance in a Dell Server

I was at Dell EMC Analyst summit this past week and at the show they had a series of sessions describing some of Dell Venture capital investments. One of the sessions was about GraphCore, a UK design firm that’s working on a new AI chip.

Their new GC2 chip is now out and available for customers to use. The new chip offers unprecedented performance for AI NN computations.


GraphCore’s new Colossus GC2 chip holds 1216 IPU-Cores™. Each IPU runs at 100GFlops and is capable of running 7 threads. The GC2 chip supports 300MB of memory, with an aggregate of 30TB/s of memory bandwidth.  Each IPU supports low precision floating point arithmetic in completely parallel/concrrent execution. The GC2 chip has 23.6B transistors.

Each GC2 chip supports 80 IPU-Links™ to connect to other GC2 chips with 2.5tbps of chip to chip bandwidth. Further, the chip includes a PCIe Gen 4 x16 link (31.5GB/s) to host processors. And each chip supports up to 8TB/s IPU-Exchange™ on the chip bandwidth for inter chip, IPU to IPU communications.

The GC2 chip is available on a PCIe accelerator board that includes 2 GC2 chips. It’s also available in a Dell server configuration with 8 of their PCIe accelerator boards. In the server, with 2 GC2 chips  per board, it has ~19.5K IPUs with ~2.0PFlops in total of IPU processing power.


GC2 IPUs support GraphCore’s Poplar® software and API’s that allows users to code in many of their favorite AI framework, such as PyTorch and TensorFlow.

At the NIPS 2017 conference GraphCore showed some AI ResNet-50, DeepBench LSTM RNN, and DeepVoice WaveNet performance benchmark results with their GC2 accelerator cards..

The chart above shows DeepBench LSTMN RNN runs comparing their  GC2 accelerator card against an Nvidia P100 GPU board (longer is better).

DeepBench is intended to support a set of workloads that mimic or simulate typical deep neural net types of operations and is used to compare NN hardware systems. The chart above compares DeepBench RNN inference operations with  GC2 accelerator card  vs. Nvidia P100 cards at three levels of response times (<2msec, <5msec. and <7msec.).

As can be seen in the chart, the GraphCore GC2 accelerator card performed significantly (from 182X to 242X) better than the Nvidia accelerator card executing NN inferencing at <5msec and <7msec latency. And was able to perform ~42K Inferences at <2msec latency where Nvidia P100 was unable to do at all.

The GC2 chip, accelerator card and Dell EMC servers that run them look to be a significant advance in AI NN computations. We didn’t see any technical spec’s for the server but we assume it comes in a 4U configuration and uses less power than 8 GPUs.

However, at the moment, the servers are sold out. No information on the GC2 accelerator cards but our guess is that they are sold out as well, and probably ditto for the chips. Dell didn’t quote us any pricing on the servers, so its hard to know whether we could afford one, even if they weren’t sold out.

Who wouldn’t want to own a 4U server with 2PFlops performance for their AI apps?


Photo Credit(s): Photos taken during Dell EMC Analyst Summit GraphCore presentation

Photos from GraphCore NIPS 2017 presentations

University of Manchester fires up world’s largest neuromorphic computer

Read an article the other day about SpiNNaker, the University of Manchester’s neuromorphic supercomputer (see Live Science Article: Worlds largest supercomputer brain…). There’s also a wikipedia page on SpiNNaker and a SpiNNaker project page.


SpiNNaker is part of the European Union Human Brain Project (HBP), Brain Simulation Program.

SpiNNaker supercomputing hardware

(Most of the following information is from the SpiNNaker project home page and SpiNNaker architectural overview page.)

The system has 1 million ARM9 (968) cores and ~7TB of memory, with each core emulating 1000 spiking neurons. With this amount of computing power, it should be able to emulate a 1B (1 billion, 10^9) neuron brain (region).

The system will consist of 1200 PCBs with each PCB containing a 48 chip array and associated networking hardware & memory. Each node contains a SpiNNaker chip with its 18 ARM9 cores.

Each node has two chips stitch bonded together on top of one another. The bottom consists of the 18-ARM9 cores and the top the double DDR memory and networking layer.

Total bisectional networking bandwidth is 5 B packets/second with each packet consisting of 5 or 9 bytes of data.

SpiNNaker operates on 1W per chip or 90KW of power to run the entire machine. Given that each chip is 18 cores and each core is 1000 neurons, this means each neuron simulation takes about 55.5µW of power to run.

You can deploy a single board as IoT solution but @ ~48W per board it may be be too energy consumptive for IoT.

SpiNNaker supercomputing software

According to the home page and the Live Science article, SpiNNaker is intended to be used to model critical segments of the human brain such as the  basal ganglia brain area for the EU HBP brain simulation program.

The system architecture has three tiers, a host machine (layer) which communicates with the monitor layer to start and monitor application execution and uses “ybug” to communicate,  a monitor core (layer) which interacts with ybug at the host and uses “scamp” to communicate with the application processors, and the application processors (layer) consisting of the ARM cores, memory and packet networking hardware which runs the  SpiNNaker Application Runtime Kernel (sark).

Applications which run on sark can consist of spiking neural networks or multi-layer perceptrons (MLP), classical deep learning neural networks.

  • MLP applications use back propagation and a training and inference phases, familiar to any deep learning application and uses a fixed neural network topology.
  • Spiking neural network applications use ongoing learning so there’s no training or inference phases (it’s always learning), use a variable network topology (reconfiguring the ARM core-packet network) and currently supports the PyNN spiking neural network simulator.

Unfortunately most of the links in the SpiNNaker project pages referring to PyNN spiking networking applications are broken. But PyNN is a Python based spiking neural network simulator that can run on a number of different hardware platforms (including sark/SpiNNaker).

Most of the AI groups I’ve talked with mention PyTorch or TensorFlow as AI frameworks of choice these days. But it’s unclear to me whether these two support spiking neural network generation/simulation.

If you want to learn more about programming SpiNNaker please check out their Software for SpiNNaker wiki page.


As you may recall, a homo sapiens brain has an estimated 16B to 86B neurons in its average cerebral cortex (see wikipedia “animals listed by neuron count” article, for low estimate, EU’s HBP Brain Simulation page, for high estimate), which puts SpiNNaker today, at about the equivalent of less than a average tufted capuchin cerebral cortex (@1.2B neurons).

Given the above and with SpiNNaker @1B neurons, we are only  4 to 7 generations away from human equivalence. That means we have at most ~14 years left before a 128B spiking neuronal simulation machine is available.

But SpiNNaker today is based on ARM9 cores and ARM11 cores already exist. So, if they redesigned/reimplemented the chip today, it would already be 2X the core count. aake that human equivalence is only a max of 12 years away.

The mean estimate for AGI (artificial general intelligence) seems to be 2040-2050 (see wikipedia Technological Singularity article). But given what University of Manchester’s SpiNNaker is capable of doing today, I don’t think we have that long to wait.

Photo Credits: All photos/charts above are from the SpiNNaker Project pages at the University of Manchester website

Low power spiking AI-Arm edge processing from voltage scaling on ETA TENSAI chip

Read an article in IEEE Spectrum (ETA Compute debuts spiking NN chip for edge AI) about a company producing a new AI IOT chip based on ARM microprocessors with DSPs for dedicated matrix computations.

The new ETA Compute TENSAI chip  technology supports spiking neural networks (NN) as well as more normal, convolutional NN  depending on  edge AI requirements. More  information can be found in an Embedded Computing article (Micropower intelligence for edge devices) and a EE News article (ETA adds spiking NN support to MCU)

We have discussed spiking NN  in prior posts  latest post: IBM using PCM for better AI -round 6). Any spiking NN more closely mimics real biological neurons present in the brains of humans and other life.

ETA also claims that spiking NN perform better unsupervised learning. They included some examples of this in videos. In one video, ETA trains a spiking NN to do the same job as a convolutional NN with 1/10th the pixel data.

In addition, spiking NN only use neural weight values of 0 or 1, whereas normal convolutional NN operations require 8 to 16 bit numbers. So spiking NN arithmetic really only uses addition while convolutional NN need 8-16 bit multiplicative arithmetic to determine resultant weights.

Voltage scaling saves power

The other claim is that the TENSAI chip can perform computations on the order of 10 microwatts of power per megahertz (~10 μW/MHz) using the ARM/DSP combination with voltage scaling

The new TENSAI chip is their 3rd generation with multiple ARM M3 cores and NXP digital signal processors. These cores are implemented in a sub-threshold, asynchronous mode process that allows them to operate at much lower voltage, ~0.2V and at varying clock frequencies. This is called voltage scaling electronics. Doing this took analog design, which ETA considers part of their special IP.

ETA claims the TENSAI chip can operate in listen mode (“Ok Google?”) and only consume 50 microwatts of power and once a key word is discovered, operate in full computational mode with 500 microwatts of power.

I couldn’t find a data sheet for the TENSAI chip, so was unable to see how many TOPS/W it could perform as discussed in prior posts (see: AI processing at the edge post). But it looks to be even more power efficient.

The TENSAI chip was announced at ARM Tech Con(terence) and won the Design Innovation of the Year award at the conference.


Photo Credits: Photo and caption from Figure 12 in  A brain inspired architecture…,  AIP Journal of Applied Physics article

Photo from EE News ETA adds spiking NN support to MCU article

Chart from Embedded Computing  Micropower intelligence for edge devices article

Board photo from IEEE Spectrum ETA Compute debuts spiking NN chip for edge AI article

IBM using PCM to implement better AI – round 6

Saw a recent article that discussed IBM’s research into new computing architectures that are inspired by brain computational techniques (see A new brain inspired architecture … ). The article reports on research done by IBM R&D into using Phase Change Memory (PCM) technology to implement various versions of computer architectures for AI (see Tutorial: Brain inspired computation using PCM, in the AIP Journal of Applied Physics).

As you may recall, we have been reporting on IBM Research into different computing architectures to support AI processing for quite awhile now, (see: Parts 1, 2, 3, 4, & 5). In our last post, More power efficient deep learning through IBM and PCM, we reported on a unique hybrid PCM-silicon solution to deep learning computation.

Readers should also be familiar with PCM as well as it’s been discussed at length in a number of our posts (see The end of NAND is near, maybe; The future of data storage is MRAM; and New chip architectures with CPU, storage & sensors …). MRAM, ReRAM and current 3D XPoint seem to be all different forms of PCM (I think).

In the current research, IBM discusses three different approaches to support AI  utilizing PCM devices. All three approaches stem from the physical characteristics of PCM.

(Some) PCM physics

FIG. 2. (a) Phase-change memory is based on the rapid and reversible phase transition of certain types of materials between crystalline and amorphous phases by the application of suitable electrical pulses. (b) Transmission electron micrograph of a mushroom-type PCM device in a RESET state. It can be seen that the bottom electrode is blocked by the amorphous phase.

It turns out that PCM devices have many  characteristics that lend themselves to be useful for specialized computation. PCM devices crystalize and melt in order to change state. The properties associated with melting and crystallization of the PCM media cell can be used to support unique forms of computation. Some of these PCM characteristics include::

  • Analog, not digital memory – PCM devices are, at the core, an analog memory device. We mean that they don’t record just a 0 or 1 (actually resistant or conductive) state, but rather a continuum of values between those two.
  • PCM devices have an accumulation capability –   each PCM cell actually  accumulates a level of activation. This means that one cell can be more or less likely to change state depending on prior activity.
  • PCM devices are noisy – PCM cells arenot perfect recorders of state chang signals  but rather have a well known, random noise which impacts the state level attained, that can be used to introduce randomness into processing.

The other major advantage of PCM devices is that they take a lot less power than a GPU-CPU to work.

Three ways to use PCM for AI learning

FIG. 4. “In-memory computing,” computation is performed in place by exploiting the physical attributes of memory devices organized as a “computational memory” unit. For example, if data A is stored in a computational memory unit and if we would like to perform f(A), then it is not required to bring A to the processing unit. This saves energy and time that would have to be spent in the case of conventional computing system and memory unit. Adapted from Ref. 19.

The Applied Physics article describes three ways to use PCM devices in AI learning. These three include:

  1. Computational storage – which uses the analog capabilities of PCM to perform  arithmetic and learning computations. In a sort of combined compute and storage device.
  2. AI co-processor – which uses PCM devices, in an “all PCM nodes connected to all other PCM nodes” operation that could be used to perform neural network learning. In an AI co-processor there would be multiple all connected PCM modules, each emulating a neural network layer.
  3. Spiking neural networks –  which uses PCM activation accumulation characteristics & inherent randomness to mimic, biological spiking neuron activation.

FIG. 11.
A proposed chip architecture for a co-processor for deep learning based on PCM arrays.28

It’s the last approach that intrigues me.

Spiking neural nets (SNN)

FIG. 12. (a) Schematic illustration of a synaptic connection and the corresponding pre- and post-synaptic neurons. The synaptic connection strengthens or weakens based on the spike activity of these neurons; a process referred to as synaptic plasticity. (b) A well-known plasticity mechanism is spike-time-dependent plasticity (STDP), leading to weight changes that depend on the relative timing between the pre- and post-synaptic neuronal spike activities. Adapted from Ref. 31.

Biological neurons accumulate charge from all input (connected) neurons and when they reach some input threshold, generate an output signal or spike. This spike is then used to start the process with another neuron up stream from it

Biological neurons also exhibit randomness in their threshold-spiking process.

Emulating spiking neurons, n today’s neural nets, takes computation.  Also randomness takes more.

But with PCM SNN, both the spiking process and its randomness, comes from device physics. Using PCM to create SNN seems a logical progression.

PCM as storage, as memory, as compute or all the above

In the storage business, we look at Optane (see our 3D Xpoint post) SSDs as blazingly fast storage. Intel has also announced that they will use 3D Xpoint in a memory form factor which should provide sadly slower, but larger memory devices.

But using PCM for compute, is a radical departure from the von Neumann computer architectures we know and love today. HPE has been discussing another new computing architecture with their memristor technology, but only in prototype form.

It seems IBM, is also prototyping hardware done this path.

Welcome to the next computing revolution.

Photo & Caption Credit(s): Photo and caption from Figure 2 in AIP Journal of Applied Physics article

Photo and caption from Figure 4 in AIP Journal of Applied Physics article

Photo and caption from Figure 11 in AIP Journal of Applied Physics article

Photo and caption from Figure 12 in AIP Journal of Applied Physics article



Data banks, data deposits & data withdrawals in the data economy – part 1

perspective by anomalous4 (cc) (from Flickr)

Big data visualization, Facebook friend connections
Facebook friend carrousel by antjeverena (cc) (from flickr)

Read an interesting article this week in The Atlantic, Why Technology Favors Tyranny by Yuvai Noah Harari, about the inevitable future of technology and how the use of data will drive it.

At the end of the article Harari talks about the need to take back ownership of our data in order to gain some control over the tech giants that currently control our data.

In part 3, Harari discusses the coming AI revolution and the impact on humanity. Yes there will still be jobs, but early on less jobs for unskilled labor and over time less jobs for skilled labor.

Yet, our data continues to be valuable. AI neural net (NN) accuracy increases as a function of the amount of data used to train it. As a result,  he has the most data creates the best AI NN. This means our data has value and can be used over and over again to train other AI NNs. This all sounds like data is just another form of capital, at least for AI NN training.

If only we could own our data, then there would still be value from people’s (digital) exertions (labor), regardless of how much AI has taken over the reigns of production or reduced the need for human work.Safe by cjc4454 (cc) (from flickr)

Safe by cjc4454 (cc) (from flickr)What we need is data (savings) banks. These banks would hold people’s data, gathered from social media likes/dislikes,  cell phone metadata, app/web history, search history, credit history, purchase history,  photo/video streams, email streams, lab work, X-rays, wearables info, etc. Probably many more categories need to be identified but ultimately ALL the digital data we generate today would need to be owned by people and deposited in their digital bank accounts.

Data deposits?

Social media companies, telecom, search companies, financial services app companies, internet  providers, etc. anywhere you do business should supply a copy of the digital data they gather for a person back to that persons data bank account.

There are many technical problems to overcome here but it could be as simple as an object storage bucket, assigned to each person that each digital business deposits (XML versions of) our  digital data they create for everyone that uses their service. They would do this as compensation for using our data in their business activities.

How to change data ownership?

Today, we all sign user agreements which essentially gives a company the rights to our data in perpetuity. That needs to change. I see a few ways that this change could come about

  1. Countries could enact laws to insure personal data ownership resides in the person generating it and enforce periodic distribution of this data
  2. Market dynamics could impel data distribution, e.g. if some search firm supplied data to us, we would be more likely to use them.
  3. Societal changes, as AI becomes more important to profit making activities and reduces the need for human work, and as data continues to be an important factor in AI success, data ownership becomes essential to retaining the value of human labor in society.

Probably, all of the above and maybe more would be required to change the ownership structure of data.

How to profit from data?

Technical entities needing data to train AI NNs could solicit data contributions through an Initial Data Offering (IDO). IDO’s would specify types of data required and a proportion of AI NN ownership, they would cede to all  data providers. Data providers would be apportioned ownership based on the % identified and the number of IDO data subscribers.

perspective by anomalous4 (cc) (from Flickr)
perspective by anomalous4 (cc) (from Flickr)

Data banks would extract the data requested by the IDO and supply it to the IDO entity for use. For IDOs, just like ICO’s or IPO’s, some would fail and others would succeed. But the data used in them would represent an ownership share sort of like a  stock (data) certificate in the AI NN.

Data bank responsibilities

Data banks would have various responsibilities and would need to collect fees to perform them. For example, data banks would be responsible for:

  1. Protecting data deposits – to insure data deposits are never lost, are never accessed without permission, are always trackable as to how they are used..
  2. Performing data deposits – to verify that data is deposited from proper digital entities, to validate that data deposits are in a usable form and to properly store the data in a customers object storage bucket.
  3. Performing data withdrawals – upon customer request, to extract all the appropriate data requested by an IDO,  anonymize it, secure it, package it and send it to the IDO originator.
  4. Reconciling data accounts – to track data transactions, data banks would supply a monthly statement that identifies all data deposits and data withdrawals, data revenues and data expenses/fees.
  5. Enforcing data withdrawal types – to enforce data withdrawal types, as data  withdrawals can have many different characteristics, such as exclusivity, expiration, geographic bounds, etc. Data banks would need to enforce withdrawal characteristics, at least to the extent they can
  6. Auditing data transactions – to insure that data is used properly, a consortium of data banks or possibly data accountancies would need to audit AI training data sets to verify that only data that has been properly withdrawn is used in trying the NN. .

AI NN, tools and framework responsibilities

In order for personal data ownership to work well, AI NNs, tools and frameworks used today would need to change to account for data ownership.

  1. Generate, maintain and supply immutable data ownership digests – data ownership digests would be a sort of stock registry for the data used in training the AI NN. They would need to be a part of any AI NN and be viewable by proper data authorities
  2. Track data use – any and all data used in AI NN training should be traceable so that proper data ownership can be guaranteed.
  3. Identify AI NN revenues – NN revenues would need to be isolated, identified and accounted for so that data owners could be rewarded.
  4. Identify AI NN data expenses – NN data costs would need to somehow be isolated, identified and accounted for so that data expenses could be properly deducted from data owner awards. .

At some point there’s a need for almost a data profit and loss statement as well as a data balance sheet for at an AI NN level. The information supplied above should make auditing data ownership, use and rewards much more feasible. But it all starts with identifying data ownership and the data used in training the AI.


There are a thousand more questions that come to mind. For example

  • Who owns earth sensing satellite, IoT sensors, weather sensors, car sensors etc. data? Everyone in the world (or country) being monitored is laboring to create the environment sensed by these devices. Shouldn’t this sensor data be apportioned to the people of the world or country where these sensors operate.
  • Who pays data bank fees? The generators/extractors of the data could pay in addition to providing data deposits for the privilege to use our data. I could also see the people paying.  Having the company pay would give them an incentive to make the data load be as efficient and complete as possible. Having the people pay would induce them to use their data more productively.
  • What’s a decent data expiration period? Given application time frames these days, 7-15 years would make sense. But what happens to the AI NN when data expires. Some way would need to be created to extract data from a NN, or the AI NN would need to cease being used and a new one would  need to be created with new data.
  • Can data deposits be rented/sold to data aggregators? Sort of like a AI VC partnership only using data deposits rather than money to fund AI startups.
  • What happens to data deposits when a person dies? Can one inherit a data deposits, would a data deposit inheritance be taxable as part of an estate transfer?

In the end, as data is required to train better AI, ownership of our data makes us all be capitalist (datalists) in the creation of new AI NNs and the subsequent advancement of society. And that’s a good thing.