AI processing at the edge

Read a couple of articles over the past few weeks (TechCrunch: Google is making a fast, specialized TPU chip for edge devices … and IEEE Spectrum: Two startups use processing in flash for AI at the edge) about chips for AI at the IoT edge.

The two startups, Syntiant and Mythic, are moving to analog only or analog-digital solutions to provide AI processing needed at the edge while Google is taking their TPU technology to the edge.  We have written about Google’s TPU before (see: TPU and hardware vs. software  innovation (round 3) post).

The major challenge in AI processing at the edge is power consumption. Both  startups attack the power problem by using flash and other analog circuitry to provide power efficient compute.

Google attacked the power problem with their original TPU by reducing computational precision from 64 bits to 8 bits. Narrower arithmetic needs fewer transistors, and by reducing transistor counts, they lowered power requirements proportionally.
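To make the reduced precision idea concrete, here is a minimal sketch of squeezing floating point weights into 8-bit integers. It assumes a simple symmetric linear scheme with one shared scale factor; the articles don't say which scheme Google's TPU actually uses, and the function names and sample weights are made up for illustration.

```python
def quantize_int8(values):
    """Map floats to signed 8-bit integers using one shared scale factor.

    Assumption: symmetric linear quantization, chosen for simplicity.
    """
    max_abs = max(abs(v) for v in values) or 1.0  # avoid a zero scale
    scale = max_abs / 127.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the 8-bit integers."""
    return [q * scale for q in quantized]


# Hypothetical weights, not taken from any real network.
weights = [0.5, -1.27, 0.01, 1.0]
q_weights, q_scale = quantize_int8(weights)
restored = dequantize(q_weights, q_scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight now fits in one byte rather than eight, and the rounding error per weight is at most half of the scale factor.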

AI today is based on neural networks (NN) that connect simulated neurons via simulated synapses, with weights attached to indicate whether to boost or decrease the signal being transmitted. AI learning is done by setting those weights and creating the connections between simulated neurons and the synapses.  So learning is setting weights and establishing connections. Actual inferencing (using AI to do something) is a process of exciting input simulated neurons/synapses and letting the signal flow through the NN, with each weight being used to determine the output(s).
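The inference process described above can be sketched in a few lines. This is a toy, fully connected network with made-up layer sizes and weights, and it assumes a ReLU activation, which the articles don't specify. The point is only that every synapse costs one multiply-accumulate (MAC).

```python
def relu(x):
    """Simple activation: pass positive signals, block negative ones."""
    return x if x > 0.0 else 0.0


def layer_forward(inputs, weights, biases):
    """Compute one fully connected layer and count the MACs it took."""
    outputs = []
    macs = 0
    for neuron_weights, bias in zip(weights, biases):
        acc = bias
        for x, w in zip(inputs, neuron_weights):
            acc += x * w  # one multiply-accumulate per synapse
            macs += 1
        outputs.append(relu(acc))
    return outputs, macs


# Hypothetical 3-input, 2-hidden-neuron, 1-output network.
inputs = [1.0, 0.5, -1.0]
hidden_w = [[0.2, -0.4, 0.1], [0.5, 0.5, 0.5]]
hidden_b = [0.0, 0.1]
out_w = [[1.0, 1.0]]
out_b = [0.0]

hidden, hidden_macs = layer_forward(inputs, hidden_w, hidden_b)
output, output_macs = layer_forward(hidden, out_w, out_b)
total_macs = hidden_macs + output_macs
```

Even this tiny network needs 8 MACs per inference. Real networks have thousands to millions of synapses, so the MAC count (and the power bill) grows accordingly.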

AI with standard compute

The problem with doing AI learning or inferencing with normal CPUs or even GPUs (CUDA cores) is that the NN does thousands if not millions of multiplication-accumulation operations, one at each simulated synapse-neuron connection. Doing all these multiplication-accumulations takes power. CPUs and GPUs can do these sorts of operations on 32 or 64 bit numbers or even floating point but it still takes power.

AI processing power

AI processing power is measured in trillions of (multiply-accumulate) operations per second per watt (TOPS/W). Mythic believes it can perform 4 TOPS/W and Syntiant says it can do 20 TOPS/W. In comparison, the NVIDIA Volta V100 can do about 0.4 TOPS/W (according to the article), although comparing Syntiant-Mythic TOPS to NVIDIA TOPS is a little like comparing apples to oranges.

A current Intel Xeon Platinum 8180M (2.5GHz, 28 cores, 205W) can probably do (assuming one multiplication-accumulation per clock cycle per core) about 2.5 billion X 28 cores = 70 billion ops/second. At 205W, that works out to about 0.3 GOPS/W (source: Platinum 8180M data sheet).

As for Google’s TPU TOPS/W, TPU2 is rated at 45 TFLOPS/chip and best guess for power consumption is between 160W and 200W, let’s say 180W. With power at that level, TPU2 should hit 0.25 TFLOPS/W.  TPU3 is coming out with 8X the performance but it uses water cooling (read LOTS MORE POWER).

Nonetheless, it appears that Mythic and Syntiant are one to two orders of magnitude better than the best that NVIDIA and TPU2 can do today and many orders of magnitude better than Intel X86.
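For anyone who wants to check the arithmetic, here is a rough sketch using only the figures quoted above that are stated in ops per second or derived from clock rates. The one-MAC-per-clock-per-core figure for the Xeon is this post's own assumption, and the variable names are just for illustration.

```python
import math

# Xeon Platinum 8180M: 2.5 GHz x 28 cores, one MAC per clock per core (assumed).
xeon_ops_per_sec = 2.5e9 * 28
xeon_watts = 205.0
xeon_gops_per_watt = xeon_ops_per_sec / xeon_watts / 1e9

# TOPS/W figures as quoted in the articles.
tops_per_watt = {
    "Mythic": 4.0,
    "Syntiant": 20.0,
    "NVIDIA V100": 0.4,
    "Xeon 8180M": xeon_gops_per_watt / 1000.0,  # GOPS/W -> TOPS/W
}


def orders_of_magnitude(better, worse):
    """How many powers of ten separate two efficiency figures."""
    return math.log10(better / worse)


for startup in ("Mythic", "Syntiant"):
    for incumbent in ("NVIDIA V100", "Xeon 8180M"):
        gap = orders_of_magnitude(tops_per_watt[startup], tops_per_watt[incumbent])
        print(f"{startup} vs {incumbent}: about {gap:.1f} orders of magnitude")
```

Under these assumptions the startups come out one to two orders of magnitude ahead of the V100 and a bit over four orders ahead of the Xeon.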

Improving TOPS/W

Using NAND flash as an analog memory to read, write and hold NN weights is an easy way to reduce power consumption. Combine that with analog circuitry that can do multiplication and addition with those flash values and you have an AI NN processor. This way you reduce the need to hold weights in memory and do compute in registers by collapsing both compute and memory into the same componentry.
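Here is an idealized sketch of how compute and memory can collapse into one component. It assumes a generic textbook model of analog in-memory compute, not Mythic's or Syntiant's actual circuits: each flash cell holds a weight as a conductance, inputs arrive as voltages, each cell passes a current of voltage times conductance (Ohm's law), and the currents on a shared wire simply add up (Kirchhoff's current law). The function names, the 256 levels and the sample values are all hypothetical.

```python
def program_cell(weight, levels=256, w_max=1.0):
    """Store a weight as one of `levels` discrete conductance steps.

    Assumption: weights are clamped to [0, w_max]. Real designs need some
    way to represent negative weights, which this sketch ignores.
    """
    step = w_max / (levels - 1)
    clamped = min(max(weight, 0.0), w_max)
    return round(clamped / step) * step


def column_current(voltages, conductances):
    """Sum of per-cell currents on one shared wire: a dot product for free."""
    return sum(v * g for v, g in zip(voltages, conductances))


# Hypothetical weights and input signals.
weights = [0.25, 0.5, 1.0]
input_voltages = [1.0, 2.0, 0.5]

cells = [program_cell(w) for w in weights]
analog_result = column_current(input_voltages, cells)
exact_result = sum(v * w for v, w in zip(input_voltages, weights))
```

The analog result lands close to the exact dot product, off only by the small error from storing weights at a limited number of levels, and no weight ever had to be fetched from a separate memory.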

The major difference between Syntiant and Mythic seems to be the amount of analog circuitry they use. Mythic seems to relegate the analog circuitry to an accelerator while Syntiant has a more extensive use of analog circuitry throughout their chip. That’s probably why it can perform 5X the TOPS/W of Mythic’s IPU.

IBM and others have been working on neuromorphic chips some of which are analog based and others which are all digital based. We’ve written extensively on IBM and some on MIT’s approaches (for the latest on IBM see: More power efficient deep learning through IBM and PCM, and for MIT see: MIT builds an analog synapse chip) and follow the links there to learn more.

~~~~

Special purpose AI hardware is emerging from the labs and finally reaching reality. IBM R&D has been playing with it for a long time. Google is working on TPU3 so there’s no stopping them. And startups are seeing an opening and are taking everyone on. Stay tuned, we’re in for a good long ride before someone rises above the crowd and becomes the next chip giant.

Comments?

 

Photo Credit(s): TechCrunch  Google is making a fast, specialized TPU chip for edge devices … article

Introduction to Digital Design Verification at Mythic, Medium.com Article

Images from Google Cloud Platform Blog on the TPU

Two startups use processing in flash for AI at the edge, IEEE Spectrum article courtesy of Mythic

Information flows everywhere – part 1

Read an article today from Scientific American (Sewage is helping cities flush out the opioid crisis) about how chemical analysis of wastewater can be used to assess the extent of the opioid crisis in a city.

Wastewater information highway

There’s a lab at ASU (Arizona State University) that chemically analyzes samples of wastewater to determine the amount of drugs that a city’s population excretes. They can provide a near real-time assessment of the proportion of drugs in city sewage and thereby, in a city’s population.

The problem with public drug use surveys and hospital data gathering is that they take time.  Moreover, surveys and hospital data typically come long after drug use has become a serious problem in a city’s population.

Wastewater sample drug analysis can be done in a matter of days and can be redone as often as needed. Such data could be used to track intervention activities and see if they have a real impact (positive or negative) on drug use in a population.

Neighborhood health

In addition, by sampling sewage at a neighborhood level, one can gain an assessment of drug problems at any sub-division of a city that’s needed.

The above article talks about an MIT program with Cary, NC (from Biobot.io)  that is designing robots to traverse sewer pipes and analyze wastewater chemical makeup in real time, reporting this back to ground stations around the city.

With such an approach, one could almost zero in (depending on sewer pipe networks) on any neighborhood in a city, target specific interventions at that level and measure impact in (digestion delayed) real time. Doing so, cities or states for that matter, could  experiment with different interventions on a neighborhood by neighborhood basis and gain statistical evidence on drug problem intervention effectiveness.

But, you can analyze wastewater for any number of variables, such as viruses, bacteria, enzymes, etc. Any of which can lead to a better understanding of a population’s health.

~~~~

Two things I want to leave you with:

First, public health has had a major impact on human health and has doubled our lifespan in 200 years. All modern cities have water treatment plants today to ensure water quality and have thereby reduced the incidence of cholera and other waterborne epidemics. Wastewater analysis has the potential for significant improvements in population health monitoring. Just like water treatment, wastewater analysis will someday become common public health practice in modern cities throughout the world.

Second, I was at a conference this week (Pure//Accelerate 2018) which presented a slide saying that there was no cold data anymore. This was in reference to the fact that re-analyzing old, cold data can often lead to insights and process improvements that were not obvious at first glance.

But it’s not just data anymore. Any activity done by man needs to be analyzed for (inherent & invisible) information flows that could be extracted to make the world a better place.

Analog neural simulation or digital neuromorphic computing vs. AI

DSC_9051 by Greg Gorman (cc) (from Flickr)

At last week’s IBM Smarter Computing Forum we had a session on Watson, IBM’s artificial intelligence machine which won Jeopardy last year and another session on IBM sponsored research helping to create the SyNAPSE digital neuromorphic computing chip.

Putting “Watson to work”

Apparently, IBM is taking Watson’s smarts and applying it to health care and other information intensive verticals (intelligence, financial services, etc.).  At the conference IBM had Manoj Saxena, senior director Watson Solutions, and Dr. Herbert Chase, a professor of clinical medicine from Columbia School of Medicine, come up and talk about Watson in healthcare.

Mr. Saxena contended, and Dr. Chase concurred, that Watson can play an important part in helping healthcare apply current knowledge.  Watson’s core capability is the ability to ingest and make sense of information and then be able to apply that knowledge.  In this case, using medical research knowledge to help diagnose patient problems.

Dr. Chase had been struck at a young age by one patient that had what appeared to be an incurable and unusual disease.  He was an intern at the time and was given the task to diagnose her issue.  Eventually, he was able to provide a proper diagnosis but it irked him that it took so long and so many doctors to get there.

So as a test of Watson’s capabilities, Dr. Chase input this person’s medical symptoms into Watson and it was able to provide a list of potential diagnoses.  Sure enough, Watson did list the medical problem the patient actually had those many years ago.

At the time, I mentioned to another analyst that Watson seemed to represent the end game of artificial intelligence. Almost a final culmination and accumulation of 60 years in AI research, creating a comprehensive service offering for a number of verticals.

That’s all great, but it’s time to move on.

SyNAPSE is born

In the next session IBM had Dr. Dharmendra Modha come up and talk about their latest SyNAPSE chip, a new neuromorphic digital silicon chip that mimics the brain to model neurological processes.

We are quite a ways away from productization of the SyNAPSE chip.  Dr. Modha showed us a real-time exhibition of the SyNAPSE chip in action (connected to his laptop) with it interpreting a handwritten numeral into its numerical representation.  I would say it’s a bit early yet to see putting “SyNAPSE to work”.

Digital vs. analog redux

I have written about the SyNAPSE neuromorphic chip and a competing technology, the direct analog simulation of neural processes before (see IBM introduces SyNAPSE chip and MIT builds analog synapse chip).  In the MIT brain chip post I discussed the differences between the two approaches focusing on the digital vs. analog divide.

It seems that IBM research is betting on digital neuromorphic computing.  At the Forum last week, I had a discussion with a senior exec in IBM’s STG group, who said that the history of electronic computing over the last half century or so has been mostly about the migration from analog to digital technologies.

Yes, but that doesn’t mean that digital is better, just easier to produce.

On that topic, I asked Dr. Modha what he thought of MIT’s analog brain chip.  He said:

  • MIT’s brain chip was built on 180nm fabrication processes whereas his is on 45nm, or 4X finer. Perhaps the fact that IBM has some of the best fabs in the world may have something to do with this.
  • The digital SyNAPSE chip can potentially operate at 5.67GHz and will be absolutely faster than any analog brain simulation.   Yes, but each analog simulated neuron is actually one of a parallel processing complex, and with a thousand or a million of them operating, even at 1000X or a million X slower, it should be able to keep up.
  • The digital SyNAPSE chip was carefully designed to be complementary to current digital technology.   As I look at IT today we are surrounded by analog devices that interface very well with the digital computing environment, so I don’t think this will be a problem when we are ready to use it.

Analog still surrounds us and defines the real world.  Someday the computing industry will awaken from its digital hobby horse and somehow see the truth in that statement.

~~~~

In any case, if it takes another 60 years to productize one of these technologies then the Singularity is farther away than I thought, somewhere around 2071 should about do it.

Comments?

MIT builds analog synapse chip

2011 Wikimedia commons (400px-Synapse_Illustration_unlabeled.svg)

Recently MIT announced a new brain chip, a breakthrough device that simulates a single brain synapse with an analog chip.

We have discussed before the digital neuromorphic chip activity going on (see my IBM introducing their SyNAPSE chip and Electro-human interface posts). However, both those were digital, whereas this new MIT chip is analog.  The chip uses ~400 transistors and was fabricated using VLSI processing.

Analog, what’s that?

Given that the world has gone digital, analog devices may be foreign to most of us.  But analog dominated the way electronics worked for the first half of last century and was still pretty prominent during the last half.

Nowadays, such devices are used primarily in signal processing and where streams of data are transformed from one mode to another (serializers/deserializers).   An analog signal has theoretically infinite resolution (Wikipedia), which should make it closer to real life and may be why some stereophiles prefer records to CDs.

Neurons are analog devices

That being said, it’s a treat to see some new analog technology come out that’s better than digital implementations.  One would have to say that neural activity is by definition analog and as such, should make simulating brain activity much easier.

The advantage of analog can be seen in that the neural synapse is the connection between two neurons.  Information is transferred between the two neurons by the uptake of ions.  In the case of the MIT synapse chip, the same sort of process occurs but in this case information flows based on gradients of electronic potential.

In testament to the capabilities of the new synapse chip, they were able to resolve a long standing debate in neuro-biology. The question was how long term potentiation (LTP) and long term depression (LTD), which enhance or depress the information transfer across the synapse, are accomplished in real neurons.  Previously, it had been postulated that LTP and LTD would depend on two different mechanisms in real cells. But there was one theory that said with a specific type of receptor, both LTP and LTD could be performed in a single way.

MIT researchers were able to configure their synapse-chip to mimic that new receptor and were able to show how LTP and LTD could work with this single receptor in the brain.

Onto the brain

Of course a single synapse is not much considering the brain has 100B neurons each with many 100’s if not 1000’s of synapses. But it’s a start.

Naturally, considering it’s built out of transistors using CMOS technology, it should follow Moore’s law and after 18 months or so we should have a chip with two synapses on it. After another 40 or so doublings (~60 years from now in 2071), if Moore’s law holds, we can have a brain-chip with 100B neurons and 100T synapses on it.

Of course, this being a prototype, I suppose with today’s fabrication capable of  creating 40M transistors/chip, we may already be able to simulate 100K synapses and 100 neurons. Which means we should have a brain’s level of neurons and synapses in 30 doublings or ~2056.
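The second estimate can be checked with a few lines of arithmetic. This sketch uses the numbers quoted above (400 transistors per synapse, 40M transistors per chip, 100T synapses in a brain, one doubling every 18 months) and assumes a starting year of 2011, which is implied by the "~60 years from now in 2071" remark. The names are just for illustration.

```python
import math

BASE_YEAR = 2011                 # assumed from "~60 years from now in 2071"
YEARS_PER_DOUBLING = 1.5         # Moore's law at 18 months per doubling
TRANSISTORS_PER_SYNAPSE = 400
TRANSISTORS_PER_CHIP = 40e6
BRAIN_SYNAPSES = 100e12          # 100T synapses


def doublings_needed(start, target):
    """Whole number of doublings to grow from `start` to at least `target`."""
    return math.ceil(math.log2(target / start))


synapses_today = TRANSISTORS_PER_CHIP / TRANSISTORS_PER_SYNAPSE
doublings = doublings_needed(synapses_today, BRAIN_SYNAPSES)
target_year = BASE_YEAR + doublings * YEARS_PER_DOUBLING

print(f"{synapses_today:,.0f} synapses today, {doublings} doublings, ~{target_year:.0f}")
```

Going from 100K synapses to 100T is a factor of a billion, which is just under 2 to the 30th power, hence the 30 doublings, the 45 years and the ~2056 date.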

Analog is better than biological

The other nice thing about analog logic and transistors, is that information processing in the brain-chip should be orders of magnitude faster than the brain’s biological processing.  Which is probably even more frightening.

The IBM SyNAPSE chip mentioned earlier was an all digital creation and had two chip cores, one provided “learning synapses” and the other “programmable synapses”.  This was probably an attempt to mimic neural processing in digital logic.

The analog brain-chip that MIT has invented has no such distinction, supplying all synapse functionality in 400 transistors.   Nonetheless, any accurate simulation of neural processes can help us to understand how to mimic it better. The fact that we have an analog simulation of neural processes should help us improve the digital simulation to more closely match the brain.

~~~~

Not sure what we should call this chip. It’s certainly not neuromorphic, because it’s a real simulation of analog neural synapses, not a digital approximation.  I would use synapse-chip but it’s already in use.  I kind of like the brain-chip but that may be stretching it a bit. Maybe the neuron-chip is best for now.

Now that we know the date for the singularity, hopefully we can be ready to deal with whatever happens then.

Comments?