AI processing at the edge

Read a couple of articles over the past few weeks (TechCrunch: Google is making a fast, specialized TPU chip for edge devices … and IEEE Spectrum: Two startups use processing in flash for AI at the edge) about chips for AI at the IoT edge.

The two startups, Syntiant and Mythic, are moving to analog only or analog-digital solutions to provide AI processing needed at the edge while Google is taking their TPU technology to the edge.  We have written about Google’s TPU before (see: TPU and hardware vs. software  innovation (round 3) post).

The major challenge in AI processing at the edge is power consumption. Both  startups attack the power problem by using flash and other analog circuitry to provide power efficient compute.

Google attacked the power problem with their original TPU by reducing computational precision from 64- to 8-bits. By reducing transistor counts, they lowered power requirements proportionally.

AI today is based on neural networks (NN), that connect simulated neurons via simulated synapses with weights attached to indicate whether to boost or decrease the signal being transmitted. AI learning is done by setting those weights and creating the connections between simulated neurons and the synapses.  So learning is setting weights and establishing connections. Actual inferences (using AI to do something) is a process of exciting input simulated neurons/synapses and letting the signal flow through the NN with each weight being used to determine output(s).

AI with standard compute

The problem with doing AI learning or inferencing with normal CPUs or even CUDAs is that the NN does thousands if not millions of  multiplication-accumulation actions at each simulated synapse-neuron connection. Doing all these multiplication-accumulation takes power. CPUs and CUDAs can do these sorts of operations on 32 or 64 bit numbers or even floating point but it still takes power.

AI processing power

AI processing power is measured in trillions of (accumulate-multiply) operations per second per watt (TOPS/W). Mythic believes it can perform 4 TOPS/W and Syntiant says it can do 20 TOPS/W. In comparison, the NVIDIA Volta V100 can do about 0.4 TOPS/W (according to the article). Although  comparing Syntiant-Mythic TOPS to NVIDIA TOPS is a little like comparing apples to oranges.

A current Intel Xeon Platinum 8180M (2.5Ghz, 28 Core processors, 205 W) can probably do (assuming one multiplication-accumulation per hertz) about 2.5 Billion X 28 Cores = 70 Billion Ops Second/205 W or 0.3 GOPS/W (source: Platinum 8180M Data sheet).

As for Google’s TPU TOPS/W, TPU2 is rated at 45 GFLOPS/chip and best guess for power consumption is between 160W and 200W, let’s say 180W. With power at that level, TPU2 should hit 0.25 GFLOPS/W.  TPU3 is coming out with 8X the power but it uses water cooling (read LOTS MORE POWER).

Nonetheless, it appears that Mythic and Syntiant are one to two orders of magnitude better than the best that NVIDIA and TPU2 can do today and many orders of magnitude better than Intel X86.

Improving TOPS/W

Use NAND, as an analog memory to read, write and hold  NN weights is an easy way to reduce power consumption. Combine that with  analog circuitry that can do multiplication and addition with those flash values and you have a AI NN processor. This way you reduce the need to hold weights in memory and do compute in registers by collapsing both compute and memory into the same componentry.

The major difference between Syntiant and Mythic seems to be the amount of analog circuitry they use. Mythic seems to relegate the analog circuitry to an accelerator while Syntiant has a more extensive use of analog circuitry throughout their chip. Probably why it can perform 5X the TOPS/W of Mythic’s IPU.

IBM and others have been working on neuromorphic chips some of which are analog based and others which are all digital based. We’ve written extensively on IBM and some on MIT’s approaches (for the latest on IBM see: More power efficient deep learning through IBM and PCM, and for MIT see: MIT builds an analog synapse chip) and follow the links there to learn more.

~~~~

Special purpose AI hardware is emerging from the labs and finally reaching reality. IBM R&D has been playing with it for a long time. Google is working on TPU3 so there’s no stopping them. And startups are seeing an opening and are taking everyone on. Stay tuned, were in for a good long ride before the someone rises above the crowd and becomes the next chip giant.

Comments?

 

Photo Credit(s): TechCrunch  Google is making a fast, specialized TPU chip for edge devices … article

Introduction to Digital Design Verification at Mythic, Medium.com Article

Images from Google Cloud Platform Blog on the TPU

Two startups use processing in flash for AI at the edge, IEEE Spectrum article courtesy of Mythic

MIT’s new Navion chip for better Nano drone navigation

Read an article this week in Science Daily (Chip upgrade help’s bee-sized drones navigate) about a recent chip created by MIT, called Navion, that reduces size and power consumption for electronics used in drone navigation. The chip is also documented on MIT’s Navion project homepage and in a technical  paper describing the new VIO (Visual-Inertial Odometry ) Navion chip.

The Navion chip can perform inertial measurement at 52Khz as well as process video streams of 752×480 stereo images at 171 frames per second in a 20 sqmm package consuming only 24mW of power. The chip was fabricated on a 65nm CMOS process line.

Navion is the result of a collaborative design process which optimized electronics required to perform  drone navigation processing. By placing all the memory required for inertial measurement and image analysis and all the processing hardware on the same chip, they have substantially reduced power consumption and space requirements for drone navigation.

Navion architecture

Navion uses a state of the art, non-linear factor graph optimization algorithm to navigate in space.  It doesn’t sound like  DL neural net image recognition but more like a statistical/probabilistic approach to image mapping and place estimation. The chip uses image compression, two stage memory, and sparse linear solver memory to reduce image processing memory requirements from 3.5MB to less than 1MB.

The chip uses 3 inputs: two images (right &  left image) and IMU (inertial management unit sensor) and has one (complex output), its estimate of the current state of where it is on the map.

Navion processing creates and maintains a 3D map using stereo images and provides navigational support to move through that space.  According to the paper, the Navion chip updates the state(s) and sparse 3D map at a KF (Kalman filter) rate of between 16 and 90 fps. Navion also offers configurations options to maximize accuracy, throughput or energy efficiency.

Navion compares well to other navigation electronics

The table shows comparisons of the Navion chip against other traditional navigational systems that use Xeon, ARM or FPGA chips. As far as I can tell it’s either much better or at least on a par with these other larger, more complex, power hungry systems.

Nano drones are coming to our space, sooner than anyone expects.

Comments?

Photo credit(s): System overview from Navion project page (c) 2018 MIT;

Picture of chip with layout  from Navion project page (c) 2018 MIT;

Navion: A Fully Integrated Energy-Efficient Visual-Inertial Odometry Accelerator for Autonomous Navigation of Nano Drones (c) 2018 MIT

More power efficient deep learning through IBM and PCM

Read an article today from MIT Technical Review (TR) (AI could get 100 times more efficient with IBM’s new artificial synapses). Discussing the power efficiency of a new analog approach to neural nets and deep learning.

We have talked about IBM’s TrueNorth and Synapse neuromorphic devices  and PCM neural nets before (see: Parts 1, 2, 3, & 4).

The paper in Nature (Equivalent accuracy accelerated neural training using analogue memory ) referred to by the TR article is behind a pay wall. However, another ArsTechnica (Ars) article (Training a neural network in phase change memory beats GPUs) on the new research was a bit more informative.

Both articles discuss a new analog approach, using phase change memory (PCM) which has significant power/training efficiency when compared to today’s standard GPU AI processor. Both the TR and Ars papers report on IBM developments simulating a new (PCM based) neuromorphic device that reduces training  power consumption AND training time by a factor of 100.   But the Nature paper abstract says it reduces both power consumption and computational space (computations per sq mm) by a factor of 100, not exactly the same.

Why PCM

PCM is a nonvolatile memory technology (see part 4 above for more info) that uses electronically induced phase changes in a material to establish a 1’s or 0’s state for a PCM bit.

However, another advantage of PCM is that it also can take on a state between 0 and 1. This is bad for data memory/storage but good for neural nets.

For a PCM based neural net you could have a layer of PCM (neuron) structures and standard wiring that wires all the PCM neurons to the next layer down, for however many layers required for your neural net. The PCM value would indicate the strength of the connection between neurons (synapses).

But, the problem with a PCM neural net is that PCM states don’t provide enough graduations of values between 0 and 1 to fully map today’s neural net weights.

IBM’s latest design has two different tiers of neural nets

According to Ars article, IBM’s latest design has a two tier approach to using PCM in its neural net. The first, top tier uses a PCM structure and the second lower tier uses a more traditional, silicon based structure and together they implement the neural net.

The Ars article speaks of the new two tier design as providing two digit resolution for the weight between  neuron. The structure implemented in PCM determines the higher order digit and the more traditional, silicon based, neural net segment determines the lower order digit in the two digit neural net weight.

With this approach, training occurs mostly in the more traditional, silicon layer neural net, but every 100 or so training events (epochs),  information is used to modify the PCM structure as well. In this fashion, the PCM-silicon neural net is fine tuned using 1 out of 100 or so training events to correct the PCM layer and the other 99 or so training events to modify the silicon layer.

In addition, the silicon layer is apparently implemented in silicon to mimic the PCM layer, using capacitors and transistors.

~~~~

I wonder why not just use two tiers of PCM to do the same thing but it’s possible that training the silicon layer is more power efficient, speedy or both than the PCM layer.

The TR and Ars articles seem to make a point of saying this is analogue computing. And I would guess because the PCM and the silicon layer can take on many values between 0 and 1 that means it’s not digital.

Much of the article is based on combined hardware (built using 90nm technology) and software simulations of the new PCM-silicon neuromorphic device. However, simulations like this are a standard step in ASIC design process, and if successful, we would expect an chip to emerge from foundry within 6-12 months from now.

The Nature paper’s abstract indicated that they simulated the device using standard (MNIST, MNIST-backrand, CIFAR-10 and CIFAR-100) training datasets for handwritten digit recognition and color image classification/recognition. The new device was able to approach within 1% accuracy of software trained neural net with 1% the power and (when updated to latest foundry technologies) in 1% the space.

Furthermore, the abstract said that the current device supports ~205K synapses. The previous generation, IBM TrueNorth (see part 2 above) had the “equivalent of 1M neurons” and their earlier IBM SYNAPSE (see part 1 above) chip had “256K programable synapses” and 256 computational elements. But I believe both of those were single tier devices.

I’d also be very interested in whether the neuromorphic device is compatible with and could be programmed with PyTorch or TensorFlow but I didn’t see any information on how the devices were programmed.

Comments?

Photo Credit(s): neuron by mararie 

3D CrossPoint graphic, taken from Intel-Micron session at FMS16

brain-neurons by Fotis Bobolas

A new way to compute

I read an article the other day on using using random pulses rather than digital numbers to compute with, see Computing with random pulses promises to simplify circuitry and save power, in IEEE Spectrum. Essentially they encode a number as a probability in a random string of bits and then use simple logic to compute with. This approach was invented in the early days of digital logic and was called stochastic computing.

Stochastic numbers?

It’s pretty easy to understand how such logic can work for fractions. For example to represent 1/4, you would construct a bit stream that had one out of every four bits, on average, as a 1 and the rest 0’s. This could easily be a random string of bits which have an average of 1 out of every 4 bits as a one.

A nice result of such a numerical representation is that it easily results in more precision as you increase the length of the bit stream. The paper calls this progressive precision.

Progressive precision helps stochastic computing be more fault tolerant than standard digital logic. That is, if the string has one bit changed it’s not going to make that much of a difference from the original string and computing with an erroneous number like this will probably result in similar results to the correct number.  To have anything like this in digital computation requires parity bits, ECC, CRC and other error correction mechanisms and the logic required to implement these is extensive.

Stochastic computing

2 bit multiplier

Another advantage of stochastic computation and using a probability  rather than binary (or decimal) digital representation, is that most arithmetic functions are much simpler to implement.

 

They discuss two examples in the original paper:

  • AND gate

    Multiplication – Multiplying two probabilistic bit streams together is as simple as ANDing the two strings.

  • 2 input stream multiplexer

    Addition – Adding two probabilistic bit strings together just requires a multiplexer, but you end up with a bit string that is the sum of the two divided by two.

What about other numbers?

I see a couple of problems with stochastic computing:,

  • How do you represent  an irrational number, such as the square root of 2;
  • How do you represent integers or for that matter any value greater than 1.0 in a probabilistic bit stream; and
  • How do you represent negative values in a bit stream.

I suppose irrational numbers could be represented by taking a near-by, close approximation of the irrational number. For instance, using 1.4 for the square root of two, or 1.41, or 1.414, …. And this way you could get whatever (progressive) precision that was needed.

As for integers greater than 1.0, perhaps they could use a floating point representation, with two defined bit strings, one representing the mantissa (fractional part) and the other an exponent. We would assume that the exponent rather than being a probability from 0..1.0, would be inverted and represent 1.0…∞.

Negative numbers are a different problem. One way to supply negative numbers is to use something akin to complemetary representation. For example, rather than the probabilistic bit stream representing 0.0 to 1.0 have it represent -0.5 to 0.5. Then progressive precision would work for negative numbers as well a positive numbers.

One major downside to stochastic numbers and computation is that high precision arithmetic is very difficult to achieve.  To perform 32 bit precision arithmetic would require a bit streams that were  2³² bits long. 64 bit precision would require streams that were  2**64th bits long.

Good uses for stochastic computing

One advantage of simplified logic used in stochastic computing is it needs a lot less power to compute. One example in the paper they use for stochastic computers is as a retinal sensor for in the body visual augmentation. They developed a neural net that did edge detection that used a stochastic front end to simplify the logic and cut down on power requirements.

Other areas where stochastic computing might help is for IoT applications. There’s been a lot of interest in IoT sensors being embedded in streets, parking lots, buildings, bridges, trucks, cars etc. Most have a need to perform a modest amount of edge computing and then send information up to the cloud or some edge consolidator intermediate

Many of these embedded devices lack access to power, so they will need to make do with whatever they can find.  One approach is to siphon power from ambient radio (see this  Electricity harvesting… article), temperature differences (see this MIT … power from daily temperature swings article), footsteps (see Pavegen) or other mechanisms.

The other use for stochastic computing is to mimic the brain. It appears that the brain encodes information in pulses of electric potential. Computation in the brain happens across exhibitory and inhibitory circuits that all seem to interact together.  Stochastic computing might be an effective way, low power way to simulate the brain at a much finer granularity than what’s available today using standard digital computation.

~~~~

Not sure it’s all there yet, but there’s definitely some advantages to stochastic computing. I could see it being especially useful for in body sensors and many IoT devices.

Comments?

Photo Credit(s):  The logic of random pulses

2 bit by 2 bit multiplier, By Sodaboy1138 (talk) (Uploads) – Own work, CC BY-SA 3.0, wikimedia

AND ANSI Labelled, By Inductiveload – Own work, Public Domain, wikimedia

2 Input multiplexor

A battery free implantable neural sensor, MIT Technology Review article

Integrating neural signal and embedded system for controlling a small motor, an IntechOpen article

AI reaches a crossroads

There’s been a lot of talk on the extendability of current AI this past week and it appears that while we may have a good deal of runway left on the machine learning/deep learning/pattern recognition, there’s something ahead that we don’t understand.

Let’s start with MIT IQ (Intelligence Quest),  which is essentially a moon shot project to understand and replicate human intelligence. The Quest is attempting to answer “How does human intelligence work, in engineering terms? And how can we use that deep grasp of human intelligence to build wiser and more useful machines, to the benefit of society?“.

Where’s HAL?

The problem with AI’s deep learning today is that it’s fine for pattern recognition, but it doesn’t appear to develop any basic understanding of the world beyond recognition.

Some AI scientists concede that there’s more to human/mamalian intelligence than just pattern recognition expertise, while others’ disagree. MIT IQ is trying to determine, what’s beyond pattern recognition.

There’s a great article in Wired about the limits of deep learning,  Greedy, Brittle, Opaque and Shallow: the Downsides to Deep Learning. The article says deep learning is greedy because it needs lots of data (training sets) to work, it’s brittle because step one inch beyond what’s it’s been trained  to do and it falls down, and it’s opaque because there’s no way to understand how it came to label something the way it did. Deep learning is great for pattern recognition of known patterns but outside of that, there must be more to intelligence.

The limited steps using unsupervised learning don’t show a lot of hope, yet

“Pattern recognition” all the way down…

There’s a case to be made that all mammalian intelligence is based on hierarchies of pattern recognition capabilities.

That is, at a bottom level  human intelligence consists of pattern recognition, such as vision, hearing, touch, balance, taste, etc. systems which are just sophisticated pattern recognition algorithms that label what we are hearing as Bethovan’s Ninth Symphony, tasting as grandma’s pasta sauce, and seeing as the Grand Canyon.

Then, at the next level there’s another pattern recognition(-like) system that takes all these labels and somehow recognizes this scene as danger, romance, school,  etc.

Then, at the next level, human intelligence just looks up what to do in this scene.  Almost as if we have a defined list of action templates that are what we do when we are in danger (fight or flight), in romance (kiss, cuddle or ?), in school (answer, study, view, hide, …), etc.  Almost like a simple lookup table with procedural logic behind each entry

One question for this view is how are these action templates defined and  how many are there. If, as it seems, there’s almost an infinite number of them, how are they selected (some finer level of granularity in scene labeling – romance but only flirting …).

No, it’s not …

But to other scientists, there appears to be more than just pattern recognition(-like) algorithms and lookup and act algorithms, going on inside our brains.

For example, once I interpret a scene surrounding me as in danger, romance, school, etc.,  I believe I start to generate possible action lists which I could take in this domain, and then somehow I select the one to do which makes the most sense in this situation or rather gets me closer to my current goal (whatever that is) in this situation.

This is beyond just procedural logic and involves some sort of memory system, action generative system, goal generative/recollection system, weighing of possible action scripts, etc.

And what to make of the brain’s seemingly infinite capability to explain itself…

Baby intelligence

Most babies understand their parents language(s) and learn to crawl within months after birth. But they haven’t listened to thousands of hours of people talking or crawled thousands of miles.  And yet, deep learning requires even more learning sets in order to label language properly or  learning how to crawl on four appendages. And of course, understanding language and speaking it are two different capabilities. Ditto for crawling and walking.

How does a baby learn to recognize these patterns without TB of data and millions of reinforcements (“Smile for Mommy”, say “Daddy”). And what to make of the, seemingly impossible to contain wanderlust, of any baby given free reign of an area.

These questions are just scratching the surface in what it really means to engineer human intelligence.

~~~~

MIT IQ is one attempt to try to answer the question that: assuming we understand how to pattern recognition can be made to work well on today’s computers what else do we need to do to build a more general purpose intelligence.

There are obvious ethical questions on whether we want to engineer a human level of intelligence (see my Existential risks… post). Our main concern is what it does (to humanity) once we achieve it.

But assuming we can somehow contain it for the benefit of humanity, we ought to take another look at just what it entails.

 

Photo Credits:  Tech trends for 2017: more AI …., the Next Silicon Valley website. 

HAL from 2001 a Space Odyssey 

Design software test labeling… 

Exploration in toddlers…, Science Daily website

Atomristors, a new single (atomic) layer memristor

Read an article the other day about the “Atomristor: non-volatile resistance  switching in atomic sheets of transition metal dichalcogenides” (TMDs), an ACS publication. The article shows research that discovered an atomic sheet level version of a memristor. The device is an atomic sheet of TMD that is sandwiched between two (gold, silver or graphene) electrodes.

They refer to the device switching non-volatile resistance (NVR) from low to high or vice versa but from our perspective it could just as easily be considered a non-volatile device usable for memory, storage, or electronic circuitry.

Prior to this research, it was believed that such resistance switching could not be accomplished with single atomic, sub-nanometre (0.7nm) sized, sheet of material.

NVR atomristor technological properties

The researchers discovered that NVR switching can occur at different device temperatures, sheet areas, compliance current, voltage sweep rate, and layer thickness. In all five degrees of freedom were tested to show that  TMD atomristors had wide applicability and allowed for significant environmental and electronic variability.

Not only was the effect extremely versatile, the researchers identified multiple materials which could be used for the atomic sheet. In fact, TMD are a class of materials and they showed 4 different TMD materials that had the NVR effect.

Surprisingly, some TMD materials exhibited the NVR effect using unipolar voltages and some using bipolar voltages, and some could use both.

The researchers went a long way to showing that the NVR was due to the atomic sheet. In one instance they specifically used non-lithographic measures to fabricate the devices. This process utilized graphene manufacturing like methods to produce an atomic sheet ontop of gold foil and depositing another gold layer ontop of that.

But they also used standard fabrication techniques to build the atomristor devices as well. Using these different fabrication methods, they were able to show the NVR effect using different electrodes types, testing gold, silver, and graphene, all of which worked similarly.

The paper talked of using atomristors in a software defined radio, as a electronic circuit/cross bar switch, or as a memory/storage device. But they also indicated that it could easily be used in a neuromorphic computer as well, effectively simulating neuron like computations.

There’s much more information in the ACS article.

How does it compare to flash?

As compared to flash, atomristors NVR devices should be able to provide higher levels (bits per mm) of density. And due to the lower current (~1v) required for (bipolar) NVR setting, reading and resetting, there’s a lower probability of leakage of stored charges as they’re scaled down to nm sizes.

And of course it comes in 2d sheets, so it’s just as amenable to 3D arrays as NAND and 3DX is today. That means that as fabs start scaling 3D NAND up in layers, atomristor NVR devices should be able to follow their technology roadmap to be scaled up just as high.

Atomristor computers, storage or switch devices

Going from the “lab” to an IT shop is a multifaceted endeavour that takes a lot of time. There are many steps to needed to get to commercialization and many lab breakthroughs never make it that far because of complexity, economics, and other factors.

For instance, memristors were first proposed in 1971 and HP(E) researchers first discovered material that could produce the memristor effect in 2008. In March 2012, HRL fabricated the first memristor chip on CMOS. In Dec. 2017, >9 years later, at their Discover Conference, HPE showed off “The Machine”, a prototype of a memristor based computer to the public. But we are still waiting to see one on the market for sale.

That being said, memristor technologies didn’t exist before 2008, so the use of these devices in a computer took sometime to be understood. The fact that atomristors are “just” an extremely, thinner version of memristors should help it be get to market faster that original memristor technologies. But how much faster than 9-12 years is anyone’s guess.

~~~~

Comments?

Picture Credit(s): All graphics and pictures are from the article in ACS

GPU growth and the compute changeover

Attended SC17 last month in Denver and Nvidia had almost as big a presence as Intel. Their VR display was very nice as compared to some of the others at the show.

GPU past

GPU’s were originally designed to support visualization and the computation to render a specific scene quickly and efficiently. In order to do this they were designed with 100s to now 1000s of arithmetically intensive (floating point) compute engines where each engine could be given an individual pixel or segment of an image and compute all the light rays and visual aspects pertinent to that scene in a very short amount of time. This created a quick and efficient multi-core engine to render textures and map polygons of an image.

Image rendering required highly parallel computations and as such more compute engines meant faster scene throughput. This led to todays GPUs that have 1000s of cores. In contrast, standard microprocessor CPUs have 10-60 compute cores today.

GPUs today 

Funny thing, there are lots of other applications for many core engines. For example, GPUs also have a place to play in the development and mining of crypto currencies because of their ability to perform many cryptographic operations a second, all in parallel

Another significant driver of GPU sales and usage today seems to be AI, especially machine learning. For instance, at SC17, visual image recognition was on display at dozens of booths besides Intel and Nvidia. Such image recognition  AI requires a lot of floating point computation to perform well.

I saw one article that said GPUs can speed up Machine Learning (ML) by a factor of 250 over standard CPUs. There’s a highly entertaining video clip at the bottom of the Nvidia post that shows how parallel compute works as compared to standard CPUs.

GPU’s play an important role in speech recognition and image recognition (through ML) as well. So we find that they are being used in self-driving cars, face recognition, and other image processing/speech recognition tasks.

The latest Apple X iPhone has a Neural Engine which my best guess is just another version of a GPU. And the iPhone 8 has a custom GPU.

Tesla is also working on a custom AI engine for its self driving cars.

So, over time, GPUs will have an increasing role to play in the future of AI and crypto currency and as always, image rendering.

 

Photo Credit(s): SC17 logo, SC17 website;

GTX1070(GP104) vs. GTX1060(GP106) by Fritzchens Fritz;

Intel 2nd Generation core microprocessor codenamed Sandy Bridge wafer by Intel Free Press

Magnonics for configurable electronics

Read an article today in ScienceDaily on [a] New way to write magnetic info … that discusses research done at Imperial College Of London that used a magnetic force microscope (small magnetic probe) to write magnetic fields onto a dense array of nanowires.

Frustrated metamaterials needed

The original research is written up in a Nature article Realization of ground state in artificial kagome spin ice via topological defect driven magnetic writing  (paywall). Unclear what that means but the paper abstract discusses geometrically frustrated magnetic metamaterials.  This is where the physical size or geometrical properties of the materials at the nanometer scale restricts or limits the magnetic states that material can exhibit.

Magnetic storage deals with magnetic material but there are a number of unique interactions of magnetic material when in close (nm) proximity to one another and the way nanowire geometrically frustrated magnetic metamaterials can be magnetized to different magnetic moments which can be exploited for other uses.  These interactions and magnetic moments can be combined to provide electronic circuitry and data storage.

I believe the research provides a proof point that such materials can be written, in close proximity to one another using a magnetic force microscope.

Why it’s important

The key is the potential to create  magnonic circuitry based on the pattern of moments writen into an array of nanowires. In doing so, one can fabricate any electrical circuit. It’s almost like photolithography but without fabs, chemicals, or laser scanners.

At first I thought this could be a denser storage device, but the potential is much greater if electronic circuitry could be constructed without having to fabricate semiconductors. It would seem ideal for testing out circuitry before manufacturing. And ultimately if it could be scaled up, the manufacture/fabrication of electronic circuitry itself could be done using these techniques.

Speed, endurance, write limits?

There was no information in the public article about the speed of writing the “frustrated magnetic metamaterials”. But an atomic force microscope can scan 150×150 micrometers in several minutes. If we assume that a typical chip size today is 150×150 mm, then this would take 1E6 times several minutes, or ~2K days. With multiple scanning force microscopes operating concurrently we could cut this down by a factor of 10 or 100 and maybe someday 1000. 2 days to write any electronic circuit on the order of todays 23nm devices with nanowires and magnetic force microscopes would be a significant advance

Also there was no mention of endurance, write limits or other characteristics we have learned to love with Flash storage. But the assumption is that it can be written multiple times and that the pattern stays around for some amount of time.

How magnetics generate electronic circuits

Neither Wikipedia page, the public article or the paywall articles’ abstract describes how Magnonics can supply electronic circuitry. However both the abstract and the public article discuss applications for this new technology in hardware based neural networks using arrays of densely packed nanowires.

Presumably, by writing different magnetic patterns in these nanowire metamaterials, such patterns can be used to simulate hardware connected neurons. This means that the magnetic information can be overwritten because it can be trained. Also, such magnetic circuits can be constructed to: a) can create different path for electrons to flow through the material; b) can restrict or enhance this electronic flow, and c) can integrate across a number of inputs and determine how electronic flow will proceed from a simulated neuron.

If magnonics can do all that,  it’s very similar to electronic gates today in CPU, GPUs and other electronic circuitry. Maybe it cannot simulate every gate or electronic device that’s found in todays CPUs but it’s a step in the right direction. And magnonics is relatively new. Silicon transistors are over 70 years old and the integrated circuit is almost 60 years old. So in time, magnonics could very well become the next generation of chip technology.

Writing speed is a problem. Maybe if they spun the nanowire array around the magnetic force microscope…

Comments?

Photo Credits:  Real space observation of emergent magnetic monopoles … Nature article

Realization of ground state in artificial kagome spin ice via topological defect driven magnetic writing, Nature article