University of Manchester fires up world’s largest neuromorphic computer

Read an article the other day about SpiNNaker, the University of Manchester’s neuromorphic supercomputer (see the Live Science article: World’s largest supercomputer brain…). There’s also a Wikipedia page on SpiNNaker and a SpiNNaker project page.

 

SpiNNaker is part of the European Union Human Brain Project (HBP), Brain Simulation Program.

SpiNNaker supercomputing hardware

(Most of the following information is from the SpiNNaker project home page and SpiNNaker architectural overview page.)

The system has 1 million ARM9 (968) cores and ~7TB of memory, with each core emulating 1000 spiking neurons. With this amount of computing power, it should be able to emulate a 1B (1 billion, 10^9) neuron brain (region).

The system will consist of 1200 PCBs, with each PCB containing a 48-chip array and associated networking hardware & memory. Each node contains a SpiNNaker chip with its 18 ARM9 cores.

Each node has two dies stitch-bonded one on top of the other: the bottom die holds the 18 ARM9 cores and the top die the DDR memory and networking layer.

Total bisection networking bandwidth is 5B (billion) packets/second, with each packet carrying 5 or 9 bytes of data.

SpiNNaker operates at about 1W per chip, or ~90kW of power to run the entire machine. Given that each chip has 18 cores and each core simulates 1000 neurons, each simulated neuron takes about 55.5µW of power to run.
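For what it’s worth, that per-neuron figure falls straight out of the numbers quoted above; a quick check in Python:

```python
# Back-of-the-envelope check on SpiNNaker's per-neuron power budget,
# using only the figures quoted above (1W/chip, 18 cores/chip, 1000 neurons/core).
watts_per_chip = 1.0
cores_per_chip = 18
neurons_per_core = 1000

watts_per_neuron = watts_per_chip / (cores_per_chip * neurons_per_core)
print(f"{watts_per_neuron * 1e6:.1f} µW per simulated neuron")  # ~55.6 µW
```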

You can deploy a single board as an IoT solution, but at ~48W per board it may be too power hungry for most IoT uses.

SpiNNaker supercomputing software

According to the home page and the Live Science article, SpiNNaker is intended to be used to model critical segments of the human brain, such as the basal ganglia, for the EU HBP brain simulation program.

The system architecture has three tiers:

  • A host machine (layer), which starts and monitors application execution and uses “ybug” to communicate with the monitor layer;
  • A monitor core (layer), which interacts with ybug at the host and uses “scamp” to communicate with the application processors; and
  • The application processors (layer), consisting of the ARM cores, memory and packet networking hardware, which run the SpiNNaker Application Runtime Kernel (sark).

Applications which run on sark can be spiking neural networks or multi-layer perceptrons (MLPs), i.e., classical deep learning neural networks.

  • MLP applications use back propagation, with separate training and inference phases familiar from any deep learning application, and use a fixed neural network topology.
  • Spiking neural network applications learn continuously, so there are no separate training and inference phases (they are always learning), use a variable network topology (reconfiguring the ARM core-packet network), and are currently supported via the PyNN spiking neural network simulator.

Unfortunately, most of the links in the SpiNNaker project pages referring to PyNN spiking network applications are broken. But PyNN is a Python-based API for describing spiking neural networks that can run on a number of different simulators and hardware platforms (including sark/SpiNNaker).
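To give a feel for what a PyNN application looks like, here’s a minimal sketch using the standard PyNN API. I’m assuming the NEST backend is installed (on SpiNNaker hardware you’d import its own sPyNNaker backend instead), and the population sizes and cell parameters below are made up purely for illustration:

```python
import pyNN.nest as sim  # swap in the sPyNNaker backend on SpiNNaker hardware

sim.setup(timestep=1.0)  # 1 ms simulation timestep

# 100 leaky integrate-and-fire neurons driven by a Poisson spike source
source = sim.Population(100, sim.SpikeSourcePoisson(rate=20.0))
neurons = sim.Population(100, sim.IF_curr_exp(tau_m=20.0, v_thresh=-50.0))

# Connect source to neurons with 10% random connectivity and static synapses
sim.Projection(source, neurons,
               sim.FixedProbabilityConnector(p_connect=0.1),
               synapse_type=sim.StaticSynapse(weight=0.5, delay=1.0))

neurons.record("spikes")
sim.run(1000.0)           # simulate 1 second of biological time

spikes = neurons.get_data("spikes")
sim.end()
```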

Most of the AI groups I’ve talked with mention PyTorch or TensorFlow as AI frameworks of choice these days. But it’s unclear to me whether these two support spiking neural network generation/simulation.

If you want to learn more about programming SpiNNaker please check out their Software for SpiNNaker wiki page.

~~~~

As you may recall, a Homo sapiens brain has an estimated 16B to 86B neurons in its average cerebral cortex (see the Wikipedia “List of animals by number of neurons” article for the low estimate and the EU HBP Brain Simulation page for the high estimate), which puts SpiNNaker today at less than the equivalent of an average tufted capuchin cerebral cortex (~1.2B neurons).

Given the above, and with SpiNNaker at ~1B neurons, we are only 4 to 7 core-count doublings (generations) away from human equivalence. Assuming a doubling roughly every two years, that means we have at most ~14 years left before a 128B-neuron spiking neuronal simulation machine is available.
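The generation math is simple enough to write down; the two-year-per-doubling cadence is my assumption, not something the SpiNNaker project has committed to:

```python
import math

# Doublings needed to scale a 1B-neuron machine up to human-scale targets,
# assuming (my assumption) one capacity doubling roughly every two years.
current_neurons = 1e9
targets = {"16B neurons (low human cortex estimate)": 16e9,
           "128B neurons (covers the high estimate)": 128e9}

for label, target in targets.items():
    doublings = math.ceil(math.log2(target / current_neurons))
    print(f"{label}: {doublings} doublings, ~{doublings * 2} years")
# -> 4 doublings (~8 years) and 7 doublings (~14 years)
```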

But SpiNNaker today is based on ARM9 cores, and ARM11 cores already exist. So, if they redesigned/reimplemented the chip today, it could already have 2X the core count, making human equivalence at most ~12 years away.

The mean estimate for AGI (artificial general intelligence) seems to be 2040-2050 (see the Wikipedia Technological Singularity article). But given what University of Manchester’s SpiNNaker is capable of doing today, I don’t think we have that long to wait.

Photo Credits: All photos/charts above are from the SpiNNaker Project pages at the University of Manchester website

Low power spiking AI-Arm edge processing from voltage scaling on ETA TENSAI chip

Read an article in IEEE Spectrum (ETA Compute debuts spiking NN chip for edge AI) about a company producing a new AI IoT chip based on ARM microprocessors with DSPs for dedicated matrix computations.

The new ETA Compute TENSAI chip technology supports spiking neural networks (NN) as well as more conventional convolutional NN, depending on edge AI requirements. More information can be found in an Embedded Computing article (Micropower intelligence for edge devices) and an EE News article (ETA adds spiking NN support to MCU).

We have discussed spiking NN in prior posts (latest post: IBM using PCM to implement better AI – round 6). Spiking NN more closely mimic the real biological neurons present in the brains of humans and other animals.

ETA also claims that spiking NN perform better at unsupervised learning. They included some examples of this in videos. In one video, ETA trains a spiking NN to do the same job as a convolutional NN with 1/10th the pixel data.

In addition, spiking NN only use neural weight values of 0 or 1, whereas normal convolutional NN operations require 8 to 16 bit numbers. So spiking NN arithmetic really only requires addition, while convolutional NN need 8-16 bit multiply-accumulate arithmetic to compute their resultant weights.
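A toy illustration of that difference (my own sketch, not ETA’s implementation): with 0/1 spikes and weights, the per-connection work collapses to “add it or don’t”, whereas a conventional layer needs a full multiply-accumulate per connection:

```python
import numpy as np

rng = np.random.default_rng(0)
inputs_real = rng.random(256)                    # 8-16 bit style activations
weights_real = rng.random(256)
spikes = (inputs_real > 0.5).astype(np.uint8)    # binary spike train (0/1)
weights_bin = (weights_real > 0.5).astype(np.uint8)

# Conventional layer: one multiply plus one add per connection
mac_output = np.dot(inputs_real, weights_real)

# Spiking/binary layer: multiplying by 0 or 1 is just conditional addition
spike_output = sum(1 for s, w in zip(spikes, weights_bin) if s and w)

print(mac_output, spike_output)
```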

Voltage scaling saves power

The other claim is that the TENSAI chip can perform computations on the order of 10 microwatts of power per megahertz (~10 μW/MHz), using the ARM/DSP combination with voltage scaling.

The new TENSAI chip is their 3rd generation, with multiple ARM Cortex-M3 cores and NXP digital signal processors. These cores are implemented in a sub-threshold, asynchronous-mode process that allows them to operate at a much lower voltage (~0.2V) and at varying clock frequencies. This is called voltage scaling electronics. Doing this took analog design, which ETA considers part of their special IP.

ETA claims the TENSAI chip can operate in a listen mode (“OK Google?”) consuming only 50 microwatts of power and, once a key word is detected, switch to full computational mode at 500 microwatts.
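Two bits of arithmetic help put these numbers together. The clock rates below are my assumptions, chosen only because they line up with the quoted figures, and the V² factor is just the standard CMOS dynamic power relation (P ≈ αCV²f):

```python
# Power from the quoted ~10 µW/MHz efficiency (the clock rates are assumptions)
uW_per_MHz = 10.0
print(uW_per_MHz * 5)    # a 5 MHz listen mode  -> ~50 µW
print(uW_per_MHz * 50)   # a 50 MHz full mode   -> ~500 µW

# Why sub-threshold voltage scaling matters: dynamic switching power scales
# with V^2, so dropping the supply from a nominal ~1.2V (assumed) to ~0.2V
# cuts switching power ~36x at the same clock, before any other savings.
v_nominal, v_scaled = 1.2, 0.2
print((v_nominal / v_scaled) ** 2)   # -> 36.0
```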

I couldn’t find a data sheet for the TENSAI chip, so was unable to see how many TOPS/W it could perform as discussed in prior posts (see: AI processing at the edge post). But it looks to be even more power efficient.

The TENSAI chip was announced at Arm TechCon(ference) and won the Design Innovation of the Year award at the conference.

Comments?

Photo Credits: Photo and caption from Figure 12 in  A brain inspired architecture…,  AIP Journal of Applied Physics article

Photo from EE News ETA adds spiking NN support to MCU article

Chart from Embedded Computing  Micropower intelligence for edge devices article

Board photo from IEEE Spectrum ETA Compute debuts spiking NN chip for edge AI article

IBM using PCM to implement better AI – round 6

Saw a recent article that discussed IBM’s research into new computing architectures that are inspired by brain computational techniques (see A new brain inspired architecture … ). The article reports on research done by IBM R&D into using Phase Change Memory (PCM) technology to implement various versions of computer architectures for AI (see Tutorial: Brain inspired computation using PCM, in the AIP Journal of Applied Physics).

As you may recall, we have been reporting on IBM Research into different computing architectures to support AI processing for quite a while now (see: Parts 1, 2, 3, 4, & 5). In our last post, More power efficient deep learning through IBM and PCM, we reported on a unique hybrid PCM-silicon solution to deep learning computation.

Readers should also be familiar with PCM, as it’s been discussed at length in a number of our posts (see The end of NAND is near, maybe; The future of data storage is MRAM; and New chip architectures with CPU, storage & sensors …). MRAM, ReRAM and current 3D XPoint are all related persistent memory technologies, though of the three only 3D XPoint is generally thought to be PCM based (I think).

In the current research, IBM discusses three different approaches to support AI  utilizing PCM devices. All three approaches stem from the physical characteristics of PCM.

(Some) PCM physics

FIG. 2. (a) Phase-change memory is based on the rapid and reversible phase transition of certain types of materials between crystalline and amorphous phases by the application of suitable electrical pulses. (b) Transmission electron micrograph of a mushroom-type PCM device in a RESET state. It can be seen that the bottom electrode is blocked by the amorphous phase.

It turns out that PCM devices have many characteristics that lend themselves to specialized computation. PCM devices crystalize and melt in order to change state, and the properties associated with melting and crystallization of the PCM media cell can be used to support unique forms of computation. Some of these PCM characteristics include:

  • Analog, not digital memory – PCM devices are, at their core, analog memory devices. By this we mean they don’t record just a 0 or 1 (actually a high-resistance or low-resistance) state, but rather a continuum of values between those two.
  • PCM devices have an accumulation capability – each PCM cell accumulates a level of activation, meaning one cell can be more or less likely to change state depending on prior activity.
  • PCM devices are noisy – PCM cells are not perfect recorders of state-change signals, but rather have a well-known, random noise which impacts the state level attained and can be used to introduce randomness into processing.

The other major advantage of PCM devices is that they take a lot less power to do this work than a GPU or CPU.
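Here’s a toy model (mine, not IBM’s) of those last two characteristics, treating a PCM cell’s conductance as something that builds up with each programming pulse and picks up device noise along the way:

```python
import numpy as np

rng = np.random.default_rng(42)

class ToyPCMCell:
    """Crude model of a PCM cell: conductance accumulates with each
    programming pulse, with per-pulse random variation (device noise)."""
    def __init__(self, g_max=1.0, step=0.05, noise=0.02):
        self.g, self.g_max, self.step, self.noise = 0.0, g_max, step, noise

    def pulse(self):
        # Each partial-SET pulse nudges conductance up by a noisy increment
        self.g = min(self.g_max, self.g + self.step + rng.normal(0, self.noise))
        return self.g

cell = ToyPCMCell()
trace = [cell.pulse() for _ in range(30)]
print([round(g, 3) for g in trace[:5]], "...", round(trace[-1], 3))
```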

Three ways to use PCM for AI learning

FIG. 4. “In-memory computing,” computation is performed in place by exploiting the physical attributes of memory devices organized as a “computational memory” unit. For example, if data A is stored in a computational memory unit and if we would like to perform f(A), then it is not required to bring A to the processing unit. This saves energy and time that would have to be spent in the case of conventional computing system and memory unit. Adapted from Ref. 19.

The Applied Physics article describes three ways to use PCM devices in AI learning. These three include:

  1. Computational storage – which uses the analog capabilities of PCM to perform arithmetic and learning computations, in a sort of combined compute and storage device.
  2. AI co-processor – which uses PCM devices in an “all PCM nodes connected to all other PCM nodes” arrangement that could be used to perform neural network learning. In an AI co-processor there would be multiple fully connected PCM modules, each emulating a neural network layer.
  3. Spiking neural networks – which use PCM activation accumulation characteristics & inherent randomness to mimic biological spiking neuron activation.
FIG. 11. A proposed chip architecture for a co-processor for deep learning based on PCM arrays (Ref. 28).
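The co-processor idea boils down to in-memory matrix-vector multiplication: store the weights as PCM cell conductances G, apply the inputs as voltages V, and Ohm’s law plus Kirchhoff’s current law deliver the products and sums essentially for free. A purely numerical sketch of that idea (mine, not IBM’s circuit):

```python
import numpy as np

rng = np.random.default_rng(1)

# Weights stored as conductances in a 4x8 PCM crossbar (sizes are arbitrary)
G = rng.random((4, 8))          # one cell conductance per weight
V = rng.random(8)               # input activations encoded as voltages

# Ideal crossbar: each output line collects I[i] = sum_j G[i, j] * V[j]
I_ideal = G @ V

# Real cells are analog and noisy, so the computed result is approximate
G_noisy = G * (1 + rng.normal(0, 0.05, G.shape))
I_actual = G_noisy @ V

print(I_ideal.round(3))
print(I_actual.round(3))
```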

It’s the last approach that intrigues me.

Spiking neural nets (SNN)

FIG. 12. (a) Schematic illustration of a synaptic connection and the corresponding pre- and post-synaptic neurons. The synaptic connection strengthens or weakens based on the spike activity of these neurons; a process referred to as synaptic plasticity. (b) A well-known plasticity mechanism is spike-time-dependent plasticity (STDP), leading to weight changes that depend on the relative timing between the pre- and post-synaptic neuronal spike activities. Adapted from Ref. 31.

Biological neurons accumulate charge from all input (connected) neurons and, when they reach some input threshold, generate an output signal or spike. This spike then starts the same process in the next neuron downstream from it.

Biological neurons also exhibit randomness in their threshold-spiking process.

Emulating spiking neurons in today’s (digital) neural nets takes computation, and adding randomness takes even more.

But with PCM SNNs, both the spiking process and its randomness come from device physics. Using PCM to create SNNs seems a logical progression.
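For the curious, here’s a minimal, illustrative sketch of the integrate-and-fire behavior described above, with a noisy threshold standing in for the randomness that a PCM device would provide from its physics:

```python
import numpy as np

rng = np.random.default_rng(7)

def integrate_and_fire(input_currents, threshold=1.0, leak=0.95, noise=0.05):
    """Accumulate input, leak a little each step, and emit a spike
    whenever the (noisy) threshold is crossed, then reset."""
    v, spikes = 0.0, []
    for i in input_currents:
        v = v * leak + i
        if v >= threshold + rng.normal(0, noise):   # stochastic threshold
            spikes.append(1)
            v = 0.0                                  # reset after spiking
        else:
            spikes.append(0)
    return spikes

inputs = rng.random(50) * 0.3
print(integrate_and_fire(inputs))
```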

PCM as storage, as memory, as compute or all the above

In the storage business, we look at Optane (see our 3D XPoint post) SSDs as blazingly fast storage. Intel has also announced that they will use 3D XPoint in a memory (DIMM) form factor, which should provide slower (than DRAM) but larger memory devices.

But using PCM for compute, is a radical departure from the von Neumann computer architectures we know and love today. HPE has been discussing another new computing architecture with their memristor technology, but only in prototype form.

It seems IBM is also prototyping hardware down this path.

Welcome to the next computing revolution.

Photo & Caption Credit(s): Photo and caption from Figure 2 in AIP Journal of Applied Physics article

Photo and caption from Figure 4 in AIP Journal of Applied Physics article

Photo and caption from Figure 11 in AIP Journal of Applied Physics article

Photo and caption from Figure 12 in AIP Journal of Applied Physics article

 

 

Data banks, data deposits & data withdrawals in the data economy – part 1

perspective by anomalous4 (cc) (from Flickr)
Big data visualization, Facebook friend connections
Facebook friend carrousel by antjeverena (cc) (from flickr)

Read an interesting article this week in The Atlantic, Why Technology Favors Tyranny by Yuval Noah Harari, about the inevitable future of technology and how the use of data will drive it.

At the end of the article Harari talks about the need to take back ownership of our data in order to gain some control over the tech giants that currently control our data.

In part 3, Harari discusses the coming AI revolution and its impact on humanity. Yes, there will still be jobs, but early on fewer jobs for unskilled labor and, over time, fewer jobs for skilled labor.

Yet, our data continues to be valuable. AI neural net (NN) accuracy increases as a function of the amount of data used to train it. As a result, whoever has the most data creates the best AI NN. This means our data has value and can be used over and over again to train other AI NNs. This all sounds like data is just another form of capital, at least for AI NN training.

If only we could own our data, then there would still be value from people’s (digital) exertions (labor), regardless of how much AI has taken over the reins of production or reduced the need for human work.

Safe by cjc4454 (cc) (from flickr)

What we need is data (savings) banks. These banks would hold people’s data, gathered from social media likes/dislikes, cell phone metadata, app/web history, search history, credit history, purchase history, photo/video streams, email streams, lab work, X-rays, wearables info, etc. Probably many more categories need to be identified, but ultimately ALL the digital data we generate today would need to be owned by people and deposited in their digital bank accounts.

Data deposits?

Social media companies, telecoms, search companies, financial services app companies, internet providers, etc. (anywhere you do business) should supply a copy of the digital data they gather about a person back to that person’s data bank account.

There are many technical problems to overcome here, but it could be as simple as an object storage bucket assigned to each person, into which each digital business deposits (XML versions of) the digital data it creates for everyone that uses its service. They would do this as compensation for using our data in their business activities.
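Purely as an illustration of how mundane the mechanics could be (the bucket-per-person scheme, the names and the helper below are all hypothetical, not any existing service), a deposit could look like a routine object store upload:

```python
import boto3
from datetime import datetime, timezone

# Hypothetical: each person has their own data-bank bucket; each business
# deposits an XML record of the data it generated about that person.
s3 = boto3.client("s3")

def deposit(person_bucket: str, business: str, xml_payload: bytes) -> str:
    key = f"{business}/{datetime.now(timezone.utc).isoformat()}.xml"
    s3.put_object(Bucket=person_bucket, Key=key, Body=xml_payload,
                  ContentType="application/xml")
    return key

# e.g. deposit("data-bank-jane-doe", "some-search-co", b"<activity>...</activity>")
```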

How to change data ownership?

Today, we all sign user agreements which essentially give a company the rights to our data in perpetuity. That needs to change. I see a few ways that this change could come about:

  1. Countries could enact laws to ensure personal data ownership resides with the person generating it and to enforce periodic distribution of this data.
  2. Market dynamics could impel data distribution, e.g. if some search firm supplied data to us, we would be more likely to use them.
  3. Societal changes, as AI becomes more important to profit making activities and reduces the need for human work, and as data continues to be an important factor in AI success, data ownership becomes essential to retaining the value of human labor in society.

Probably, all of the above and maybe more would be required to change the ownership structure of data.

How to profit from data?

Technical entities needing data to train AI NNs could solicit data contributions through an Initial Data Offering (IDO). IDOs would specify the types of data required and the proportion of AI NN ownership they would cede to all data providers. Data providers would be apportioned ownership based on the % identified and the number of IDO data subscribers.

perspective by anomalous4 (cc) (from Flickr)

Data banks would extract the data requested by the IDO and supply it to the IDO entity for use. For IDOs, just like ICOs or IPOs, some would fail and others would succeed. But the data used in them would represent an ownership share, sort of like a stock (data) certificate in the AI NN.

Data bank responsibilities

Data banks would have various responsibilities and would need to collect fees to perform them. For example, data banks would be responsible for:

  1. Protecting data deposits – to ensure data deposits are never lost, are never accessed without permission, and are always trackable as to how they are used.
  2. Performing data deposits – to verify that data is deposited from proper digital entities, to validate that data deposits are in a usable form and to properly store the data in a customer’s object storage bucket.
  3. Performing data withdrawals – upon customer request, to extract all the appropriate data requested by an IDO, anonymize it, secure it, package it and send it to the IDO originator.
  4. Reconciling data accounts – to track data transactions, data banks would supply a monthly statement that identifies all data deposits and data withdrawals, data revenues and data expenses/fees.
  5. Enforcing data withdrawal terms – data withdrawals can have many different characteristics, such as exclusivity, expiration, geographic bounds, etc. Data banks would need to enforce these withdrawal characteristics, at least to the extent they can.
  6. Auditing data transactions – to ensure that data is used properly, a consortium of data banks, or possibly data accountancies, would need to audit AI training data sets to verify that only data that has been properly withdrawn is used in training the NN.

AI NN, tools and framework responsibilities

In order for personal data ownership to work well, AI NNs, tools and frameworks used today would need to change to account for data ownership.

  1. Generate, maintain and supply immutable data ownership digests – data ownership digests would be a sort of stock registry for the data used in training the AI NN. They would need to be a part of any AI NN and be viewable by proper data authorities.
  2. Track data use – any and all data used in AI NN training should be traceable, so that proper data ownership can be guaranteed.
  3. Identify AI NN revenues – NN revenues would need to be isolated, identified and accounted for so that data owners could be rewarded.
  4. Identify AI NN data expenses – NN data costs would need to somehow be isolated, identified and accounted for so that data expenses could be properly deducted from data owner awards.

At some point there’s a need for almost a data profit and loss statement, as well as a data balance sheet, at an AI NN level. The information supplied above should make auditing data ownership, use and rewards much more feasible. But it all starts with identifying data ownership and the data used in training the AI.

~~~~

There are a thousand more questions that come to mind. For example:

  • Who owns earth-sensing satellite, IoT sensor, weather sensor, car sensor, etc. data? Everyone in the world (or country) being monitored is laboring to create the environment sensed by these devices. Shouldn’t this sensor data be apportioned to the people of the world or the country where these sensors operate?
  • Who pays data bank fees? The generators/extractors of the data could pay, in addition to providing data deposits, for the privilege of using our data. I could also see the people paying. Having the company pay would give it an incentive to make the data load as efficient and complete as possible. Having the people pay would induce them to use their data more productively.
  • What’s a decent data expiration period? Given application time frames these days, 7-15 years would make sense. But what happens to the AI NN when data expires? Some way would need to be created to extract data from a NN, or the AI NN would need to cease being used and a new one would need to be created with new data.
  • Can data deposits be rented/sold to data aggregators? Sort of like an AI VC partnership, only using data deposits rather than money to fund AI startups.
  • What happens to data deposits when a person dies? Can one inherit a data deposit? Would a data deposit inheritance be taxable as part of an estate transfer?

In the end, as data is required to train better AI, ownership of our data makes us all capitalists (datalists) in the creation of new AI NNs and the subsequent advancement of society. And that’s a good thing.

Comments?

 

 

AI processing at the edge

Read a couple of articles over the past few weeks (TechCrunch: Google is making a fast, specialized TPU chip for edge devices … and IEEE Spectrum: Two startups use processing in flash for AI at the edge) about chips for AI at the IoT edge.

The two startups, Syntiant and Mythic, are moving to analog only or analog-digital solutions to provide AI processing needed at the edge while Google is taking their TPU technology to the edge.  We have written about Google’s TPU before (see: TPU and hardware vs. software  innovation (round 3) post).

The major challenge in AI processing at the edge is power consumption. Both startups attack the power problem by using flash and other analog circuitry to provide power-efficient compute.

Google attacked the power problem with their original TPU by reducing computational precision, quantizing 32-bit floating point down to 8-bit integers. The smaller arithmetic units need far fewer transistors, which lowered power requirements roughly proportionally.

AI today is based on neural networks (NNs), which connect simulated neurons via simulated synapses, with weights attached to indicate whether to boost or decrease the signal being transmitted. AI learning is done by setting those weights and creating the connections between simulated neurons and synapses. So learning is setting weights and establishing connections. Actual inference (using AI to do something) is the process of exciting input simulated neurons/synapses and letting the signal flow through the NN, with each weight being used to determine the output(s).
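To make “inference” concrete, a single NN layer is just a multiply-accumulate per synapse followed by a nonlinearity; a toy sketch (sizes and values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)

# "Learning" would mean choosing these weights; here we just make some up.
weights = rng.normal(size=(4, 8))   # 8 inputs -> 4 outputs
bias = rng.normal(size=4)

def layer_forward(x):
    # One multiply-accumulate per synapse, then a simple ReLU nonlinearity
    return np.maximum(0.0, weights @ x + bias)

inputs = rng.normal(size=8)          # "exciting" the input neurons
print(layer_forward(inputs))         # the signal after one layer
```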

AI with standard compute

The problem with doing AI learning or inferencing with normal CPUs, or even GPUs (CUDA cores), is that the NN does thousands if not millions of multiply-accumulate operations across its simulated synapse-neuron connections. CPUs and GPUs can do these sorts of operations on 32- or 64-bit integer or floating point numbers, but doing all these multiply-accumulates still takes power.

AI processing power

AI processing power is measured in trillions of (multiply-accumulate) operations per second per watt (TOPS/W). Mythic believes it can perform 4 TOPS/W and Syntiant says it can do 20 TOPS/W. In comparison, the NVIDIA Volta V100 can do about 0.4 TOPS/W (according to the article), although comparing Syntiant-Mythic TOPS to NVIDIA TOPS is a little like comparing apples to oranges.

A current Intel Xeon Platinum 8180M (2.5GHz, 28 cores, 205W) can probably do (assuming one multiply-accumulate per core per clock) about 2.5 billion X 28 cores = 70 billion ops/second ÷ 205W, or ~0.3 GOPS/W (source: Platinum 8180M data sheet).

As for Google’s TPU TOPS/W, TPU2 is rated at 45 TFLOPS/chip and the best guess for power consumption is between 160W and 200W, let’s say 180W. With power at that level, TPU2 should hit ~0.25 TFLOPS/W. TPU3 is coming out with 8X the performance, but it uses water cooling (read LOTS MORE POWER).
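Pulling the quoted and derived numbers into one place (the Xeon and TPU2 entries are the rough estimates worked out above, not vendor TOPS ratings):

```python
# Rough efficiency comparison in (T)OPS per watt, using the figures above.
tops_per_watt = {
    "Syntiant (claimed)": 20.0,
    "Mythic (claimed)": 4.0,
    "NVIDIA V100 (per the article)": 0.4,
    "Google TPU2 (45 TFLOPS / ~180W)": 45.0 / 180,                      # ~0.25
    "Xeon 8180M (2.5GHz * 28 cores / 205W)": (2.5e9 * 28) / 205 / 1e12, # ~0.0003
}
for name, value in sorted(tops_per_watt.items(), key=lambda kv: -kv[1]):
    print(f"{name:40s} {value:.4f} TOPS/W")
```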

Nonetheless, it appears that Mythic and Syntiant are one to two orders of magnitude better than the best that NVIDIA and TPU2 can do today and many orders of magnitude better than Intel X86.

Improving TOPS/W

Using NAND as an analog memory to read, write and hold NN weights is an easy way to reduce power consumption. Combine that with analog circuitry that can do multiplication and addition with those flash values and you have an AI NN processor. This way you reduce the need to hold weights in memory and do compute in registers, by collapsing both compute and memory into the same componentry.

The major difference between Syntiant and Mythic seems to be the amount of analog circuitry they use. Mythic seems to relegate the analog circuitry to an accelerator, while Syntiant makes more extensive use of analog circuitry throughout their chip. That’s probably why it can perform at 5X the TOPS/W of Mythic’s IPU.

IBM and others have been working on neuromorphic chips some of which are analog based and others which are all digital based. We’ve written extensively on IBM and some on MIT’s approaches (for the latest on IBM see: More power efficient deep learning through IBM and PCM, and for MIT see: MIT builds an analog synapse chip) and follow the links there to learn more.

~~~~

Special purpose AI hardware is emerging from the labs and finally reaching reality. IBM R&D has been playing with it for a long time. Google is working on TPU3, so there’s no stopping them. And startups are seeing an opening and are taking everyone on. Stay tuned, we’re in for a good long ride before someone rises above the crowd and becomes the next chip giant.

Comments?

 

Photo Credit(s): TechCrunch  Google is making a fast, specialized TPU chip for edge devices … article

Introduction to Digital Design Verification at Mythic, Medium.com Article

Images from Google Cloud Platform Blog on the TPU

Two startups use processing in flash for AI at the edge, IEEE Spectrum article courtesy of Mythic

MIT’s new Navion chip for better Nano drone navigation

Read an article this week in Science Daily (Chip upgrade helps bee-sized drones navigate) about a recent chip created by MIT, called Navion, that reduces the size and power consumption of the electronics used in drone navigation. The chip is also documented on MIT’s Navion project homepage and in a technical paper describing the new VIO (Visual-Inertial Odometry) Navion chip.

The Navion chip can perform inertial measurement at 52kHz as well as process video streams of 752×480 stereo images at 171 frames per second, all in a 20 mm² package consuming only 24mW of power. The chip was fabricated on a 65nm CMOS process line.
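To put that power number in perspective, here’s the quick arithmetic on the figures above:

```python
# Energy per processed stereo frame, using the 24 mW and 171 fps figures above.
power_watts = 24e-3
frames_per_second = 171

energy_per_frame_joules = power_watts / frames_per_second
print(f"~{energy_per_frame_joules * 1e6:.0f} µJ per stereo frame")   # ~140 µJ
```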

Navion is the result of a collaborative design process which optimized electronics required to perform  drone navigation processing. By placing all the memory required for inertial measurement and image analysis and all the processing hardware on the same chip, they have substantially reduced power consumption and space requirements for drone navigation.

Navion architecture

Navion uses a state-of-the-art, non-linear factor graph optimization algorithm to navigate in space. It doesn’t sound like DL neural net image recognition, but more like a statistical/probabilistic approach to image mapping and place estimation. The chip uses image compression, two-stage memory, and sparse linear solver memory to reduce image processing memory requirements from 3.5MB to less than 1MB.

The chip uses 3 inputs: two images (right & left) and IMU (inertial measurement unit) sensor data, and has one (complex) output: its estimate of the current state of where it is on the map.

Navion processing creates and maintains a 3D map using stereo images and provides navigational support to move through that space. According to the paper, the Navion chip updates the state(s) and sparse 3D map at a KF (Kalman filter) rate of between 16 and 90 fps. Navion also offers configuration options to maximize accuracy, throughput or energy efficiency.

Navion compares well to other navigation electronics

The table shows comparisons of the Navion chip against other traditional navigational systems that use Xeon, ARM or FPGA chips. As far as I can tell it’s either much better or at least on a par with these other larger, more complex, power hungry systems.

Nano drones are coming to our space, sooner than anyone expects.

Comments?

Photo credit(s): System overview from Navion project page (c) 2018 MIT;

Picture of chip with layout  from Navion project page (c) 2018 MIT;

Navion: A Fully Integrated Energy-Efficient Visual-Inertial Odometry Accelerator for Autonomous Navigation of Nano Drones (c) 2018 MIT

A new way to compute

I read an article the other day on using random pulses rather than digital numbers to compute with (see Computing with random pulses promises to simplify circuitry and save power, in IEEE Spectrum). Essentially, they encode a number as a probability in a random string of bits and then use simple logic to compute with it. This approach was invented in the early days of digital logic and was called stochastic computing.

Stochastic numbers?

It’s pretty easy to understand how such logic can work for fractions. For example, to represent 1/4, you would construct a bit stream in which, on average, one out of every four bits is a 1 and the rest are 0s. Any random string of bits with that average will do.

A nice result of such a numerical representation is that it easily results in more precision as you increase the length of the bit stream. The paper calls this progressive precision.

Progressive precision helps stochastic computing be more fault tolerant than standard digital logic. That is, if the string has one bit changed it’s not going to make that much of a difference from the original string and computing with an erroneous number like this will probably result in similar results to the correct number.  To have anything like this in digital computation requires parity bits, ECC, CRC and other error correction mechanisms and the logic required to implement these is extensive.

Stochastic computing

2 bit multiplier

Another advantage of stochastic computation, and of using a probability rather than a binary (or decimal) digital representation, is that most arithmetic functions are much simpler to implement.

 

They discuss two examples in the original paper:

  • Multiplication (AND gate) – multiplying two probabilistic bit streams together is as simple as ANDing the two strings (see the sketch below).

  • Addition (2-input stream multiplexer) – adding two probabilistic bit streams together just requires a multiplexer, but you end up with a bit stream that represents the sum of the two divided by two.
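Here’s a small simulation (my own) of both operations; note that the multiplexer’s select line is itself a 50/50 random stream, which is where the divide-by-two comes from:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000   # stream length (longer = more precision)

def encode(p):
    return (rng.random(N) < p).astype(np.uint8)

a, b = encode(0.5), encode(0.25)

# Multiplication: bitwise AND of two independent streams -> p_a * p_b
product = a & b
print("0.5 * 0.25 ≈", round(float(product.mean()), 3))     # ~0.125

# (Scaled) addition: a multiplexer randomly selects a or b each cycle,
# yielding a stream that represents (p_a + p_b) / 2
select = encode(0.5)
summed = np.where(select == 1, a, b)
print("(0.5 + 0.25)/2 ≈", round(float(summed.mean()), 3))  # ~0.375
```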

What about other numbers?

I see a few problems with stochastic computing:

  • How do you represent an irrational number, such as the square root of 2?
  • How do you represent integers or, for that matter, any value greater than 1.0 in a probabilistic bit stream?
  • How do you represent negative values in a bit stream?

I suppose irrational numbers could be represented by taking a near-by, close approximation of the irrational number. For instance, using 1.4 for the square root of two, or 1.41, or 1.414, …. And this way you could get whatever (progressive) precision that was needed.

As for integers greater than 1.0, perhaps they could use a floating point representation, with two defined bit strings, one representing the mantissa (fractional part) and the other an exponent. We would assume that the exponent rather than being a probability from 0..1.0, would be inverted and represent 1.0…∞.

Negative numbers are a different problem. One way to supply negative numbers is to use something akin to a complementary representation. For example, rather than the probabilistic bit stream representing 0.0 to 1.0, have it represent -0.5 to 0.5. Then progressive precision would work for negative numbers as well as positive numbers.

One major downside to stochastic numbers and computation is that high precision arithmetic is very difficult to achieve. To perform 32-bit precision arithmetic would require bit streams that were 2³² bits long; 64-bit precision would require streams that were 2⁶⁴ bits long.

Good uses for stochastic computing

One advantage of the simplified logic used in stochastic computing is that it needs a lot less power to compute. One example the paper gives for stochastic computers is a retinal sensor for in-the-body visual augmentation. They developed a neural net that did edge detection and used a stochastic front end to simplify the logic and cut down on power requirements.

Other areas where stochastic computing might help are IoT applications. There’s been a lot of interest in IoT sensors being embedded in streets, parking lots, buildings, bridges, trucks, cars, etc. Most have a need to perform a modest amount of edge computing and then send information up to the cloud or some intermediate edge consolidator.

Many of these embedded devices lack access to power, so they will need to make do with whatever they can find.  One approach is to siphon power from ambient radio (see this  Electricity harvesting… article), temperature differences (see this MIT … power from daily temperature swings article), footsteps (see Pavegen) or other mechanisms.

The other use for stochastic computing is to mimic the brain. It appears that the brain encodes information in pulses of electric potential. Computation in the brain happens across excitatory and inhibitory circuits that all seem to interact together. Stochastic computing might be an effective, low-power way to simulate the brain at a much finer granularity than what’s available today using standard digital computation.

~~~~

Not sure it’s all there yet, but there are definitely some advantages to stochastic computing. I could see it being especially useful for in-body sensors and many IoT devices.

Comments?

Photo Credit(s):  The logic of random pulses

2 bit by 2 bit multiplier, By Sodaboy1138 (talk) (Uploads) – Own work, CC BY-SA 3.0, wikimedia

AND ANSI Labelled, By Inductiveload – Own work, Public Domain, wikimedia

2 Input multiplexor

A battery free implantable neural sensor, MIT Technology Review article

Integrating neural signal and embedded system for controlling a small motor, an IntechOpen article

AI reaches a crossroads

There’s been a lot of talk about the extendability of current AI this past week, and it appears that while we may have a good deal of runway left with machine learning/deep learning/pattern recognition, there’s something ahead that we don’t understand.

Let’s start with MIT IQ (Intelligence Quest),  which is essentially a moon shot project to understand and replicate human intelligence. The Quest is attempting to answer “How does human intelligence work, in engineering terms? And how can we use that deep grasp of human intelligence to build wiser and more useful machines, to the benefit of society?“.

Where’s HAL?

The problem with AI’s deep learning today is that it’s fine for pattern recognition, but it doesn’t appear to develop any basic understanding of the world beyond recognition.

Some AI scientists concede that there’s more to human/mammalian intelligence than just pattern recognition expertise, while others disagree. MIT IQ is trying to determine what’s beyond pattern recognition.

There’s a great article in Wired about the limits of deep learning, Greedy, Brittle, Opaque and Shallow: the Downsides to Deep Learning. The article says deep learning is greedy because it needs lots of data (training sets) to work, it’s brittle because if you step one inch beyond what it’s been trained to do it falls down, and it’s opaque because there’s no way to understand how it came to label something the way it did. Deep learning is great for pattern recognition of known patterns, but outside of that there must be more to intelligence.

The limited steps taken so far using unsupervised learning don’t show a lot of hope, yet.

“Pattern recognition” all the way down…

There’s a case to be made that all mammalian intelligence is based on hierarchies of pattern recognition capabilities.

That is, at the bottom level, human intelligence consists of pattern recognition: vision, hearing, touch, balance, taste, etc. systems which are just sophisticated pattern recognition algorithms that label what we are hearing as Beethoven’s Ninth Symphony, what we are tasting as grandma’s pasta sauce, and what we are seeing as the Grand Canyon.

Then, at the next level there’s another pattern recognition(-like) system that takes all these labels and somehow recognizes this scene as danger, romance, school,  etc.

Then, at the next level, human intelligence just looks up what to do in this scene. Almost as if we have a defined list of action templates that prescribe what we do when we are in danger (fight or flight), in romance (kiss, cuddle or ?), in school (answer, study, view, hide, …), etc. Almost like a simple lookup table with procedural logic behind each entry.

One question for this view is how these action templates are defined and how many there are. If, as it seems, there’s an almost infinite number of them, how are they selected (perhaps by some finer level of granularity in scene labeling – romance, but only flirting …)?

No, it’s not …

But to other scientists, there appears to be more than just pattern recognition(-like) algorithms and lookup and act algorithms, going on inside our brains.

For example, once I interpret a scene surrounding me as in danger, romance, school, etc.,  I believe I start to generate possible action lists which I could take in this domain, and then somehow I select the one to do which makes the most sense in this situation or rather gets me closer to my current goal (whatever that is) in this situation.

This is beyond just procedural logic and involves some sort of memory system, action generative system, goal generative/recollection system, weighing of possible action scripts, etc.

And what to make of the brain’s seemingly infinite capability to explain itself…

Baby intelligence

Most babies understand their parents’ language(s) and learn to crawl within months after birth. But they haven’t listened to thousands of hours of people talking or crawled thousands of miles. And yet, deep learning requires even larger training sets in order to label language properly or to learn how to crawl on four appendages. And of course, understanding language and speaking it are two different capabilities. Ditto for crawling and walking.

How does a baby learn to recognize these patterns without TBs of data and millions of reinforcements (“Smile for Mommy”, say “Daddy”)? And what to make of the seemingly impossible-to-contain wanderlust of any baby given free rein of an area?

These questions are just scratching the surface in what it really means to engineer human intelligence.

~~~~

MIT IQ is one attempt to answer the question: assuming we understand how pattern recognition can be made to work well on today’s computers, what else do we need to do to build a more general purpose intelligence?

There are obvious ethical questions on whether we want to engineer a human level of intelligence (see my Existential risks… post). Our main concern is what it does (to humanity) once we achieve it.

But assuming we can somehow contain it for the benefit of humanity, we ought to take another look at just what it entails.

 

Photo Credits:  Tech trends for 2017: more AI …., the Next Silicon Valley website. 

HAL from 2001 a Space Odyssey 

Design software test labeling… 

Exploration in toddlers…, Science Daily website