IBM using PCM to implement better AI – round 6

Saw a recent article that discussed IBM’s research into new computing architectures inspired by the brain’s computational techniques (see A new brain inspired architecture …). The article reports on research done by IBM R&D into using Phase Change Memory (PCM) technology to implement various versions of computer architectures for AI (see Tutorial: Brain inspired computation using PCM, in the AIP Journal of Applied Physics).

As you may recall, we have been reporting on IBM Research into different computing architectures to support AI processing for quite a while now (see: Parts 1, 2, 3, 4, & 5). In our last post, More power efficient deep learning through IBM and PCM, we reported on a unique hybrid PCM-silicon solution to deep learning computation.

Readers should also be familiar with PCM, as it’s been discussed at length in a number of our posts (see The end of NAND is near, maybe; The future of data storage is MRAM; and New chip architectures with CPU, storage & sensors …). 3D XPoint appears to be a form of PCM (I think), while MRAM and ReRAM are related, but distinct, non-volatile memory technologies.

In the current research, IBM discusses three different approaches to supporting AI using PCM devices. All three approaches stem from the physical characteristics of PCM.

(Some) PCM physics

FIG. 2. (a) Phase-change memory is based on the rapid and reversible phase transition of certain types of materials between crystalline and amorphous phases by the application of suitable electrical pulses. (b) Transmission electron micrograph of a mushroom-type PCM device in a RESET state. It can be seen that the bottom electrode is blocked by the amorphous phase.

It turns out that PCM devices have many characteristics that lend themselves to specialized computation. PCM devices crystallize and melt in order to change state. The properties associated with melting and crystallization of the PCM media cell can be used to support unique forms of computation. Some of these PCM characteristics include:

  • Analog, not digital memory – PCM devices are, at their core, analog memory devices. By this we mean they don’t record just a 0 or 1 (actually a resistive or conductive) state, but rather a continuum of values between those two.
  • PCM devices have an accumulation capability – each PCM cell accumulates a level of activation. This means that one cell can be more or less likely to change state depending on prior activity.
  • PCM devices are noisy – PCM cells are not perfect recorders of state-change signals, but rather exhibit well-known, random noise that affects the state level attained, which can be used to introduce randomness into processing.

The other major advantage of PCM devices is that they require a lot less power to operate than a GPU-CPU combination.
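To make these characteristics concrete, here’s a minimal Python sketch of a single PCM cell, showing the analog conductance continuum, accumulation across SET pulses, and update noise. The class name, parameter values, and noise model are all illustrative assumptions, not IBM’s device model.

```python
import random

class PCMCell:
    """Toy model of a single PCM cell (illustrative only, not IBM's device model).

    Conductance is analog: it can sit anywhere between g_min (amorphous/RESET)
    and g_max (crystalline/SET), not just at 0 or 1.
    """
    def __init__(self, g_min=0.1, g_max=1.0, step=0.05, noise_sigma=0.02):
        self.g_min, self.g_max = g_min, g_max
        self.step = step                # mean conductance gain per SET pulse (accumulation)
        self.noise_sigma = noise_sigma  # pulse-to-pulse randomness (device noise)
        self.g = g_min                  # start fully RESET (amorphous)

    def set_pulse(self):
        """Partial-SET pulse: crystallization accumulates across pulses, noisily."""
        dg = self.step + random.gauss(0.0, self.noise_sigma)
        self.g = min(self.g_max, max(self.g_min, self.g + dg))
        return self.g

    def reset_pulse(self):
        """RESET pulse: melt-quench back to the amorphous (low conductance) state."""
        self.g = self.g_min
        return self.g

cell = PCMCell()
print([round(cell.set_pulse(), 3) for _ in range(8)])  # conductance creeps up, with noise
```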

Three ways to use PCM for AI learning

FIG. 4. “In-memory computing,” computation is performed in place by exploiting the physical attributes of memory devices organized as a “computational memory” unit. For example, if data A is stored in a computational memory unit and if we would like to perform f(A), then it is not required to bring A to the processing unit. This saves energy and time that would have to be spent in the case of conventional computing system and memory unit. Adapted from Ref. 19.

The Applied Physics article describes three ways to use PCM devices in AI learning. These three include:

  1. Computational storage – which uses the analog capabilities of PCM to perform arithmetic and learning computations, in a sort of combined compute and storage device.
  2. AI co-processor – which uses PCM devices in an “all PCM nodes connected to all other PCM nodes” configuration to perform neural network learning. In an AI co-processor there would be multiple, fully connected PCM modules, each emulating a neural network layer.
  3. Spiking neural networks – which use PCM’s activation accumulation characteristics and inherent randomness to mimic biological spiking neuron activation.
FIG. 11. A proposed chip architecture for a co-processor for deep learning based on PCM arrays. Adapted from Ref. 28.
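To illustrate the in-memory computing idea behind the first two approaches, here’s a small Python sketch of how a crossbar of PCM conductances can perform a matrix-vector multiply in place (Ohm’s law per cell, Kirchhoff’s current law per column). The differential G+/G- encoding and all values are my own illustrative assumptions; the code merely emulates the analog computation with NumPy.

```python
import numpy as np

# A crossbar of PCM conductances performs a matrix-vector multiply "in place":
# each output current is the sum of (conductance x input voltage) down a column.
# This NumPy code only emulates that; real devices add noise and nonlinearity.

rng = np.random.default_rng(0)

weights = rng.uniform(-1, 1, size=(4, 3))  # desired weights: 4 inputs -> 3 outputs

# Conductances can't be negative, so signed weights are split across two
# arrays (G+ and G-) and read out differentially (an assumption here).
g_pos = np.clip(weights, 0, None)
g_neg = np.clip(-weights, 0, None)

x = rng.uniform(0, 1, size=4)              # input activations applied as voltages

i_pos = x @ g_pos                          # column currents from the G+ array
i_neg = x @ g_neg                          # column currents from the G- array
y = i_pos - i_neg                          # differential read-out gives signed results

print(np.allclose(y, x @ weights))         # True: same answer as a digital MVM
```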

It’s the last approach that intrigues me.

Spiking neural nets (SNN)

FIG. 12. (a) Schematic illustration of a synaptic connection and the corresponding pre- and post-synaptic neurons. The synaptic connection strengthens or weakens based on the spike activity of these neurons; a process referred to as synaptic plasticity. (b) A well-known plasticity mechanism is spike-time-dependent plasticity (STDP), leading to weight changes that depend on the relative timing between the pre- and post-synaptic neuronal spike activities. Adapted from Ref. 31.

Biological neurons accumulate charge from all input (connected) neurons and, when they reach some input threshold, generate an output signal or spike. This spike is then used to start the process in another neuron downstream from it.

Biological neurons also exhibit randomness in their threshold-spiking process.

Emulating spiking neurons in today’s neural nets takes computation. Emulating their randomness takes even more.

But with PCM SNNs, both the spiking process and its randomness come from device physics. Using PCM to create SNNs seems a logical progression.
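Here’s a rough Python sketch of what a PCM-based spiking (integrate-and-fire) neuron might look like: input spikes accumulate partial crystallization, and the neuron fires and RESETs once a noisy threshold is crossed. In a real device the accumulation and the randomness come from the physics for free; here both have to be simulated, and all parameter values are made-up assumptions.

```python
import random

def pcm_spiking_neuron(input_spikes, step=0.08, noise_sigma=0.02,
                       threshold=1.0, threshold_jitter=0.05):
    """Toy PCM-based integrate-and-fire neuron (illustrative sketch only).

    Each incoming spike partially crystallizes the cell (accumulation); when
    the accumulated state crosses a noisy threshold, the neuron 'fires' and
    the cell is RESET back to the amorphous state.
    """
    g = 0.0
    output = []
    for spike in input_spikes:
        if spike:
            g += step + random.gauss(0.0, noise_sigma)      # accumulate, noisily
        if g >= threshold + random.gauss(0.0, threshold_jitter):
            output.append(1)      # fire an output spike
            g = 0.0               # RESET (melt-quench) back to amorphous
        else:
            output.append(0)
    return output

print(pcm_spiking_neuron([1] * 30))   # fires roughly every ~12 input spikes, with jitter
```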

PCM as storage, as memory, as compute or all the above

In the storage business, we look at Optane (see our 3D XPoint post) SSDs as blazingly fast storage. Intel has also announced that they will use 3D XPoint in a memory form factor, which should sadly provide slower, but larger, memory devices.

But using PCM for compute is a radical departure from the von Neumann computer architectures we know and love today. HPE has been discussing another new computing architecture based on their memristor technology, but only in prototype form.

It seems IBM is also prototyping hardware down this path.

Welcome to the next computing revolution.

Photo & Caption Credit(s): Photo and caption from Figure 2 in AIP Journal of Applied Physics article

Photo and caption from Figure 4 in AIP Journal of Applied Physics article

Photo and caption from Figure 11 in AIP Journal of Applied Physics article

Photo and caption from Figure 12 in AIP Journal of Applied Physics article

 

 

More power efficient deep learning through IBM and PCM

Read an article today from MIT Technology Review (TR) (AI could get 100 times more efficient with IBM’s new artificial synapses) discussing the power efficiency of a new analog approach to neural nets and deep learning.

We have talked about IBM’s TrueNorth and Synapse neuromorphic devices  and PCM neural nets before (see: Parts 1, 2, 3, & 4).

The paper in Nature (Equivalent-accuracy accelerated neural-network training using analogue memory) referred to by the TR article is behind a paywall. However, another Ars Technica (Ars) article (Training a neural network in phase change memory beats GPUs) on the new research was a bit more informative.

Both articles discuss a new analog approach, using phase change memory (PCM), which has significantly better power/training efficiency when compared to today’s standard GPU AI processors. Both the TR and Ars articles report on IBM developments simulating a new (PCM based) neuromorphic device that reduces training power consumption AND training time by a factor of 100. But the Nature paper abstract says it reduces both power consumption and computational space (computations per sq mm) by a factor of 100, which is not exactly the same thing.

Why PCM

PCM is a nonvolatile memory technology (see part 4 above for more info) that uses electronically induced phase changes in a material to establish a 1 or 0 state for a PCM bit.

However, another property of PCM is that it can also take on states between 0 and 1. This is bad for data memory/storage but good for neural nets.

For a PCM based neural net, you could have a layer of PCM (neuron) structures and standard wiring that connects all the PCM neurons to the next layer down, for however many layers are required for your neural net. The PCM value would indicate the strength of the connection between neurons (synapses).

But the problem with a PCM neural net is that PCM states don’t provide enough gradations of values between 0 and 1 to fully map today’s neural net weights.
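A quick way to see the resolution problem is to quantize typical neural net weights onto a small number of evenly spaced conductance levels and measure the rounding error. The level counts and weight distribution below are assumptions for illustration, not measured PCM device characteristics.

```python
import numpy as np

# Quantize small, roughly Gaussian neural-net weights onto a handful of evenly
# spaced conductance levels and see how much precision is lost.

rng = np.random.default_rng(1)
weights = rng.normal(0, 0.1, size=100_000)      # stand-in for trained network weights

def quantize(w, levels, w_min=-0.5, w_max=0.5):
    """Snap each weight to the nearest of `levels` evenly spaced states."""
    w = np.clip(w, w_min, w_max)
    step = (w_max - w_min) / (levels - 1)
    return np.round((w - w_min) / step) * step + w_min

for levels in (8, 64, 4096):
    err = np.abs(quantize(weights, levels) - weights).mean()
    print(f"{levels:5d} levels -> mean absolute weight error {err:.5f}")
```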

IBM’s latest design has two different tiers of neural nets

According to the Ars article, IBM’s latest design takes a two tier approach to using PCM in its neural net. The first, top tier uses a PCM structure and the second, lower tier uses a more traditional, silicon based structure; together they implement the neural net.

The Ars article describes the new two tier design as providing two digit resolution for the weight between neurons. The structure implemented in PCM determines the higher order digit, and the more traditional, silicon based, neural net segment determines the lower order digit of the two digit neural net weight.

With this approach, training occurs mostly in the more traditional, silicon layer neural net, but every 100 or so training events (epochs), that information is used to modify the PCM structure as well. In this fashion, the PCM-silicon neural net is fine-tuned using 1 out of every 100 or so training events to correct the PCM layer and the other 99 or so to modify the silicon layer.

In addition, the silicon layer is apparently implemented with capacitors and transistors in order to mimic the analog behavior of the PCM layer.
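Here’s a minimal sketch of how I read the two tier scheme from the Ars article (not IBM’s actual implementation): a coarse PCM tier plus a fine, frequently updated silicon tier, with the fine tier folded back into the PCM tier every ~100 steps. All names, the transfer rule, and parameter values are my own assumptions.

```python
import numpy as np

# Two-tier weight sketch: the PCM tier holds a coarse, rarely updated portion of
# each weight; the silicon tier holds a fine, frequently updated portion. Every
# TRANSFER_EVERY steps the accumulated fine value is rounded to what PCM can
# represent and programmed into the PCM tier.

rng = np.random.default_rng(0)

n_weights = 1000
pcm_coarse = np.zeros(n_weights)      # PCM tier: coarse resolution, updated rarely
si_fine = np.zeros(n_weights)         # silicon tier: updated every training step
PCM_STEP = 0.01                       # smallest change the PCM tier can hold (assumed)
TRANSFER_EVERY = 100                  # fold fine tier into PCM tier every ~100 steps

def effective_weight():
    # The weight seen by the network is the sum of both tiers.
    return pcm_coarse + si_fine

for step in range(1, 1001):
    grad = rng.normal(0, 0.001, size=n_weights)   # stand-in for real backprop gradients
    si_fine -= grad                               # cheap, fast updates land in silicon
    if step % TRANSFER_EVERY == 0:
        # Round the accumulated fine value to the PCM resolution, program it into
        # the PCM tier, and keep only the rounding remainder in silicon.
        transfer = np.round(si_fine / PCM_STEP) * PCM_STEP
        pcm_coarse += transfer
        si_fine -= transfer

print(effective_weight()[:5])
```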

~~~~

I wonder why they didn’t just use two tiers of PCM to do the same thing, but it’s possible that training the silicon layer is more power efficient, faster, or both, compared to the PCM layer.

The TR and Ars articles seem to make a point of saying this is analogue computing. And I would guess that because the PCM and silicon layers can take on many values between 0 and 1, it’s not digital.

Much of the article is based on combined hardware (built using 90nm technology) and software simulations of the new PCM-silicon neuromorphic device. However, simulations like this are a standard step in the ASIC design process, and if successful, we would expect a chip to emerge from the foundry within 6-12 months from now.

The Nature paper’s abstract indicated that they simulated the device using standard (MNIST, MNIST-backrand, CIFAR-10 and CIFAR-100) training datasets for handwritten digit recognition and color image classification/recognition. The new device was able to come within 1% of the accuracy of a software trained neural net, with 1% of the power and (when updated to the latest foundry technologies) in 1% of the space.

Furthermore, the abstract said that the current device supports ~205K synapses. The previous generation, IBM TrueNorth (see part 2 above), had the “equivalent of 1M neurons” and the earlier IBM SYNAPSE (see part 1 above) chip had “256K programmable synapses” and 256 computational elements. But I believe both of those were single tier devices.

I’d also be very interested in whether the neuromorphic device is compatible with, and could be programmed with, PyTorch or TensorFlow, but I didn’t see any information on how the devices were programmed.

Comments?

Photo Credit(s): neuron by mararie 

3D CrossPoint graphic, taken from Intel-Micron session at FMS16

brain-neurons by Fotis Bobolas

IBM Research creates PCM synapses – cognitive computing, round 4

Flaming Lotus Girls Neuron by SanFranAnnie (cc) (from Flickr)

Last year we reported on IBM’s progress in taking PCM (phase change memory) and using it to create a new, neuromorphic computing architecture (see Phase Change Memory (PCM) based neuromorphic processors). And earlier we discussed IBM’s (2nd generation) TrueNorth chip and IBM’s (1st generation) Synapse chip.

This past week IBM made another cognitive computing announcement. This time they have taken their neuromorphic technologies another step closer to precise emulation of neurological processing of the brain.

Their research paper was not directly available, but IBM Research has summarized its contents in a short web article with a video (see IBM Scientists imitate the functionality of neurons with Phase-Change device).