MIT’s new Navion chip for better Nano drone navigation

Read an article this week in Science Daily (Chip upgrade help’s bee-sized drones navigate) about a recent chip created by MIT, called Navion, that reduces size and power consumption for electronics used in drone navigation. The chip is also documented on MIT’s Navion project homepage and in a technical  paper describing the new VIO (Visual-Inertial Odometry ) Navion chip.

The Navion chip can perform inertial measurement at 52Khz as well as process video streams of 752×480 stereo images at 171 frames per second in a 20 sqmm package consuming only 24mW of power. The chip was fabricated on a 65nm CMOS process line.

Navion is the result of a collaborative design process which optimized electronics required to perform  drone navigation processing. By placing all the memory required for inertial measurement and image analysis and all the processing hardware on the same chip, they have substantially reduced power consumption and space requirements for drone navigation.

Navion architecture

Navion uses a state of the art, non-linear factor graph optimization algorithm to navigate in space.  It doesn’t sound like  DL neural net image recognition but more like a statistical/probabilistic approach to image mapping and place estimation. The chip uses image compression, two stage memory, and sparse linear solver memory to reduce image processing memory requirements from 3.5MB to less than 1MB.

The chip uses 3 inputs: two images (right &  left image) and IMU (inertial management unit sensor) and has one (complex output), its estimate of the current state of where it is on the map.

Navion processing creates and maintains a 3D map using stereo images and provides navigational support to move through that space.  According to the paper, the Navion chip updates the state(s) and sparse 3D map at a KF (Kalman filter) rate of between 16 and 90 fps. Navion also offers configurations options to maximize accuracy, throughput or energy efficiency.

Navion compares well to other navigation electronics

The table shows comparisons of the Navion chip against other traditional navigational systems that use Xeon, ARM or FPGA chips. As far as I can tell it’s either much better or at least on a par with these other larger, more complex, power hungry systems.

Nano drones are coming to our space, sooner than anyone expects.

Comments?

Photo credit(s): System overview from Navion project page (c) 2018 MIT;

Picture of chip with layout  from Navion project page (c) 2018 MIT;

Navion: A Fully Integrated Energy-Efficient Visual-Inertial Odometry Accelerator for Autonomous Navigation of Nano Drones (c) 2018 MIT

More power efficient deep learning through IBM and PCM

Read an article today from MIT Technical Review (TR) (AI could get 100 times more efficient with IBM’s new artificial synapses). Discussing the power efficiency of a new analog approach to neural nets and deep learning.

We have talked about IBM’s TrueNorth and Synapse neuromorphic devices  and PCM neural nets before (see: Parts 1, 2, 3, & 4).

The paper in Nature (Equivalent accuracy accelerated neural training using analogue memory ) referred to by the TR article is behind a pay wall. However, another ArsTechnica (Ars) article (Training a neural network in phase change memory beats GPUs) on the new research was a bit more informative.

Both articles discuss a new analog approach, using phase change memory (PCM) which has significant power/training efficiency when compared to today’s standard GPU AI processor. Both the TR and Ars papers report on IBM developments simulating a new (PCM based) neuromorphic device that reduces training  power consumption AND training time by a factor of 100.   But the Nature paper abstract says it reduces both power consumption and computational space (computations per sq mm) by a factor of 100, not exactly the same.

Why PCM

PCM is a nonvolatile memory technology (see part 4 above for more info) that uses electronically induced phase changes in a material to establish a 1’s or 0’s state for a PCM bit.

However, another advantage of PCM is that it also can take on a state between 0 and 1. This is bad for data memory/storage but good for neural nets.

For a PCM based neural net you could have a layer of PCM (neuron) structures and standard wiring that wires all the PCM neurons to the next layer down, for however many layers required for your neural net. The PCM value would indicate the strength of the connection between neurons (synapses).

But, the problem with a PCM neural net is that PCM states don’t provide enough graduations of values between 0 and 1 to fully map today’s neural net weights.

IBM’s latest design has two different tiers of neural nets

According to Ars article, IBM’s latest design has a two tier approach to using PCM in its neural net. The first, top tier uses a PCM structure and the second lower tier uses a more traditional, silicon based structure and together they implement the neural net.

The Ars article speaks of the new two tier design as providing two digit resolution for the weight between  neuron. The structure implemented in PCM determines the higher order digit and the more traditional, silicon based, neural net segment determines the lower order digit in the two digit neural net weight.

With this approach, training occurs mostly in the more traditional, silicon layer neural net, but every 100 or so training events (epochs),  information is used to modify the PCM structure as well. In this fashion, the PCM-silicon neural net is fine tuned using 1 out of 100 or so training events to correct the PCM layer and the other 99 or so training events to modify the silicon layer.

In addition, the silicon layer is apparently implemented in silicon to mimic the PCM layer, using capacitors and transistors.

~~~~

I wonder why not just use two tiers of PCM to do the same thing but it’s possible that training the silicon layer is more power efficient, speedy or both than the PCM layer.

The TR and Ars articles seem to make a point of saying this is analogue computing. And I would guess because the PCM and the silicon layer can take on many values between 0 and 1 that means it’s not digital.

Much of the article is based on combined hardware (built using 90nm technology) and software simulations of the new PCM-silicon neuromorphic device. However, simulations like this are a standard step in ASIC design process, and if successful, we would expect an chip to emerge from foundry within 6-12 months from now.

The Nature paper’s abstract indicated that they simulated the device using standard (MNIST, MNIST-backrand, CIFAR-10 and CIFAR-100) training datasets for handwritten digit recognition and color image classification/recognition. The new device was able to approach within 1% accuracy of software trained neural net with 1% the power and (when updated to latest foundry technologies) in 1% the space.

Furthermore, the abstract said that the current device supports ~205K synapses. The previous generation, IBM TrueNorth (see part 2 above) had the “equivalent of 1M neurons” and their earlier IBM SYNAPSE (see part 1 above) chip had “256K programable synapses” and 256 computational elements. But I believe both of those were single tier devices.

I’d also be very interested in whether the neuromorphic device is compatible with and could be programmed with PyTorch or TensorFlow but I didn’t see any information on how the devices were programmed.

Comments?

Photo Credit(s): neuron by mararie 

3D CrossPoint graphic, taken from Intel-Micron session at FMS16

brain-neurons by Fotis Bobolas

Information flows everywhere – part 1

Read an article today from Scientific American (Sewage is helping cities flush out the opioid crisis) about how using chemical analysis of wastewater can be used to assess the extent of the opioid crisis in their city.

Wastewater information highway

There’s a lab at ASU (Arizona State University) that chemically analyzes samples of wastewater to determine the amount of drugs that a city’s population excretes. They can provide a near real-time assessment of the proportion of drugs in city sewage and thereby, in a city’s population.

The problem with public drug use surveys and hospital data gathering is that they take time.  Moreover, surveys and hospital data gathering typically come long after drugs problem have become a serious problem in a city’s population.

Wastewater sample drug analysis can be done in a matter of days and can be redone as often as needed. Such data could be used to track intervention activities and see if they have a real impact (positive or negative) on drug use in a population.

Neighborhood health

In addition, by sampling sewage at a neighborhood level, one can gain an assessment of drug problems at any sub-division of a city that’s needed.

The above article talks about an MIT program with Cary, NC (from Biobot.io)  that is designing robots to traverse sewer pipes and analyze wastewater chemical makeup in real time, reporting this back to ground stations around the city.

With such an approach, one could almost zero in (depending on sewer pipe networks) on any neighborhood in a city, target specific interventions at that level and measure impact in (digestion delayed) real time. Doing so, cities or states for that matter, could  experiment with different interventions on a neighborhood by neighborhood basis and gain statistical evidence on drug problem intervention effectiveness.

But, you can analyze wastewater for any number of variables, such as viruses, bacteria, enzymes, etc. Any of which can lead to a better understanding of a population’s health.

~~~~

Two things I want to leave you with:

First, public health has had a major impact on human health and has doubled our lifespan in 200 years. All modern cities have water treatment plants today to insure water quality and thereby, have reduced the incidence of cholera and other waterborne epidemics in their cities. Wastewater analysis has the potential for significant improvements in population health monitoring. Just like water treatment, wastewater analysis will someday become common public health practice in modern cities throughout the world.

Second, I was at a conference this week which presented a slide that there was no cold data anymore (Pure//Accelerate 2018). This was in reference to  re-analyzing old, cold data can often lead to insights and process improvements that were not obvious at first glance.

But it’s not just data anymore. Any activity done by man needs to be analyzed for (inherent & invisible) information flows that could be extracted to make the world a better place.

Photo Credit(s):

NetApp’s new NVMeoF/FC AFF & Cloud Data Volumes for every cloud

We attended a NetApp analyst event in their CA HQ last week and they had some interesting announcements as well other information to share. 1st up new faster ONTAP storage.

NVMeoF AFF

NetApp announced this week that their latest generation AFF (All Flash FAS) systems will support FC NVMeoF. We asked if this was just for NVMe SSDs or did it apply to all AFF media. The answer was it’s just another host interface which the customer can license for NVMe SSDs (available only on AFF F800) or SAS SSDs (A700S, A700, and A300). The only AFF not supporting the new host interface is their lowend AFF A220.

As for which NVMeoF, they only support FC at the moment, and it’s our belief that the FC NVMeoF spec is most well defined these days and the FC switch hardware (Brocade-Broadcom since Gen 5, now shipping Gen 6, Cisco not sure) already has NVMeoF support.

NetApp also mentioned support for 100GbE (A800 & A700S only) and 32Gbs FC hardware (all AFF systems but A220). So, presumably they offer NVMeoF for both 32Gbps and 16Gbps FC.

No word on when this will be available for Ethernet FCoE or iSCSI (iNVMe?) but with all the major storage vendors bar one, moving to NVMe SSDs it’s only a matter of time before they also support Ethernet NVMeoF.

As for AFF NVMeoF performance, the answer wasn’t entirely satisfactory. The indication was that the interface reduced response time by 10 usecs or so for NVMe SSDs over SAS SSDs. But I didn’t see any other performance information to substantiate that.

We did see on their AFF datasheet that with NVMe SSDs and NVMeoF FC, the AFF A800 response time was sub 200usec with throughput of 300GB/s (in a 24 node cluster, 12 HA pairs). This means they add only about 100usec for ONTAP data services, a decent trade off from our perspective. Later in their datasheet they say the A800 is capable of 1.3M IOPS and sub-500usec latencies. Unsure why they quoted both numbers.

Cloud Data Volumes

NetApp is taking storage to the cloud. They just announced that NetApp Cloud Data Volumes will be available as a native service under Google Cloud Platform (GCP). NetApp Cloud Data Volume is a storage-as-a-service offering that provides on demand ONTAP file services in the cloud.

For GCP,  both Google and NetApp will be offering the service. Dianne Green, GCP VP said Cloud Data Volumes are a bit like Kubernetes, disruption without disrupting. Customers can easily migrate their onprem file based applications to the cloud without having to worry about the performance of their data or data protection for that matter.

Getting the data there is another matter, but NetApp has other services like CloudSync and someday (maybe for Cloud Data Volumes), SnapMirror, which can help customers move data to and from the cloud.

Currently Cloud Data Volumes are in public preview as an Microsoft Azure Enterprise NFS (and SMB) service. It’s also in beta (I think) in AWS marketplace. And availability on GCP is still restricted. There’s a lot of emphasis at NetApp events on Cloud Data Volumes given its current status on public cloud providers but we think they are trying to gain some experience before they roll it out to the rest of the world.

However,  Jean English, NetApp CMO mentioned that NetApp’s Cloud Data Service business unit has over 1800 customers and currently supports a multi-PB storage footprint in various clouds. Note, this is not just Cloud Data Volumes but comprises all NetApp Cloud Data Services, which includes ONTAP Cloud, NPS, CloudSync, AltaVault, etc. Nonetheless, it’s an impressive indicator of just how far they have come in applying their storage magic to the public cloud in a short time. The hyperscalers (read public cloud providers) say NetApp is 2 or more years ahead of all the other competition and from what we can see, it’s true.

One of the key differentiators between NetApp Cloud Data Volumes and ONTAP Cloud is performance SLAs. Cloud Data Volume customers can select and purchase a specified performance SLA. We believe it comes at three levels and is normally purchased on a pay as you go, consumption based, service offering. However, it’s also available to be billed periodically, other purchase options may be available as well.

When asked what storage was behind the service, the only thing NetApp would confirm was that it was ONTAP storage, present in public cloud data centers in various regions. So Cloud Data Volumes is available in only specific regions but I would expect that to expand over time.

Data Visualization Center

They also christened their new Data Visualization Center (DVC) and we had a multi-course meal at the Bistro at the center. The DVC had a wrap around, 1.5 floor tall screen which showed some of NetApp customer success stories. Inside the screen was a more immersive setting and there was plenty of VR equipment in work spaces alongside customer conference rooms.

Full Disclosure: NetApp paid for all our travel, hotel and food during the analyst event and gave us all Google Home Minis as going away presents and NetApp is a long time customer of my firm.

Random access, DNA object storage system

Read a couple of articles this week Inching closer to a DNA-based file system in ArsTechnica and DNA storage gets random access in IEEE Spectrum. Both of these seem to be citing an article in Nature, Random access in large-scale DNA storage (paywall).

We’ve known for some time now that we can encode data into DNA strings (see my DNA as storage … and Genomic informatics takes off posts).

However, accessing DNA data has been sequential and reading and writing DNA data has been glacial. Researchers have started to attack the sequentiality of DNA data access. The prize, DNA can store 215PB of data in one gram and DNA data can conceivably last millions of years.

Researchers at Microsoft and the University of Washington have come up with a solution to the sequential access limitation. They have used polymerase chain reaction (PCR) primers as a unique identifier for files. They can construct a complementary PCR primer that can be used to extract just DNA segments that match this primer and amplify (replicate) all DNA sequences matching this primer tag that exist in the cell.

DNA data format

The researchers used a Reed-Solomon (R-S) erasure coding mechanism for data protection and encode the DNA data into many DNA strings, each with multiple (metadata) tags on them. One of tags is the PCR primer tag header, another tag indicates the position of the DNA data segment in the file and an end of data tag that is the same PCR primer tag.

The PCR primer tag was used as sort of a file address. They could configure a complementary PCR tag to match the primer tag of the file they wanted to access and then use the PCR process to replicate (amplify) only those DNA segments that matched the searched for primer tag.

Apparently the researchers chunk file data into a block of 150 base pairs. As there are 2 complementary base pairs, I assume one bit to one base pair mapping. As such, 150 base pairs or bits of data per segment means ~18 bytes of data per segment. Presumably this is to allow for more efficient/effective encoding of data into DNA strings.

DNA strings don’t work well with replicated sequences of base pairs, such as all zeros. So the researchers created a random sequence of 150 base pairs and XOR the file DNA data with this random sequence to determine the actual DNA sequence to use to encode the data. Reading the DNA data back they need to XOR the data segment with the random string again to reconstruct the actual file data segment.

Not clear how PCR replicated DNA segments are isolated and where they are originally decoded (with a read head). But presumably once you have thousands to millions of copies of a DNA segment,  it’s pretty straightforward to decode them.

Once decoded and XORed, they use the R-S erasure coding scheme to ensure that the all the DNA data segments represent the actual data that was encoded in them. They can then use the position of the DNA data segment tag to indicate how to put the file data back together again.

What’s missing?

I am assuming the cellular data storage system has multiple distinct cells of data, which are clustered together into some sort of organism.

Each cell in the cellular data storage system would hold unique file data and could be extracted and a file read out individually from the cell and then the cell could be placed back in the organism. Cells of data could be replicated within an organism or to other organisms.

To be a true storage system, I would think we need to add:

  • DNA data parity – inside each DNA data segment, every eighth base pair would be a parity for the eight preceding base pairs, used to indicate when a particular base pair in eight has mutated.
  • DNA data segment (block) and file checksums –  standard data checksums, used to verify and correct for double and triple base pair (bit) corruption in DNA data segments and in the whole file.
  • Cell directory – used to indicate the unique Cell ID of the cell, a file [name] to PCR primer tag mapping table, a version of DNA file metadata tags, a version of the DNA file XOR string, a DNA file data R-S version/level, the DNA file length or number of DNA data segments, the DNA data creation data time stamp, the DNA last access date-time stamp,and DNA data modification data-time stamp (these last two could be omited)
  • Organism directory – used to indicate unique organism ID, organism metadata version number, organism unique cell count,  unique cell ID to file list mapping, cell ID creation data-time stamp and cell ID replication count.

The problem with an organism cell-ID file list is that this could be quite long. It might be better to somehow indicate a range or list of ranges of PCR primer tags that are in the cell-ID. I can see other alternatives using a segmented organism directory or indirect organism cell to file lists b-tree, which could hold file name lists to cell-ID mapping.

It’s unclear whether DNA data storage should support a multi-level hierarchy, like file system  directories structures or a flat hierarchy like object storage data, which just has buckets of objects data. Considering the cellular structure of DNA data it appears to me more like buckets and the glacial access seems to be more useful to archive systems. So I would lean to a flat hierarchy and an object storage structure.

Is DNA data is WORM or modifiable? Given the effort required to encode and create DNA data segment storage, it would seem it’s more WORM like than modifiable storage.

How will the DNA data storage system persist or be kept alive, if that’s the right word for it. There must be some standard internal cell mechanisms to maintain its existence. Perhaps, the researchers have just inserted file data DNA into a standard cell as sort of junk DNA.

If this were the case, you’d almost want to create a separate, data  nucleus inside a cell, that would just hold file data and wouldn’t interfere with normal cellular operations.

But doesn’t the PCR primer tag approach lend itself better to a  key-value store data base?

Photo Credit(s): Cell structure National Cancer Institute

Prentice Hall textbook

Guide to Open VMS file applications

Unix Inodes CSE410 Washington.edu

Key Value Databases, Wikipedia By ClescopOwn work, CC BY-SA 4.0, Link

Blockchains go mainstream…

 

I read an article a while back on Finland’s use of blockchain technology to provide bank accounts and identity services to immigrants (see  MIT TechReview article about Finland).

Blockchains were originally invented as a way of supporting financial transactions outside the current, government monitored, financial marketplace. With Finland’s experiment, the government is starting to use blockchains to support the unbanked and monitoring their financial activity – go figure.

Debit cards on blockchain

Finland’s using a Helsinki based startup MONI, to assign a MONI card, essentially a prepaid MasterCard, to all immigrants. An immigrant can use their MONI card to pay for anything online or in real life, use it as a direct deposit account or to receive and track the use of government assistance.

Underlying the MONI card is public blockchain technology. That is MONI  is not using normal credit card services to support it’s bank accounts, MONI money transfers are done through the use of public blockchains.

MONI accounts are essentially (crypto currency) wallets but used as a debit card. The user merely enters a series of numbers into web forms or uses their MONI card at a credit card terminals throughout Europe. Transferring money between MONI users anywhere in the World is also free and instantaneous.

Finland also sees an immutable record of all immigrant financial transactions,  that can be monitored to track immigrant (financial) integration into the country.

MONI is intending to make this service more broadly available. A MONI card account costs €2/month and MONI take’s a small cut out of each monetary transaction.

IDs on blockchain

I read another article the other day “Microsoft to implement blockchain-based ID system” in CoinTelegraph about using blockchains as a universal digital ID.

India has over the last decade, implemented a digital government ID using biometrics (see Aadhaar wikipedia article). Other countries have been moving to e-government where use of government services is implemented over the Internet (see EU article on eGovernment in Lithuania). Such eGovernment services depend on a digitized population registry.

Although it’s unclear whether Aadhaar and Lithuania make use of blockchain technology for their ID services, Microsoft’s definitely looking to blockchains to provide unique accounts/digital IDs to it’s population of users.

User signon’s has been a prevalent problem of the web for years. Each and every web and mobile App requires a person to signon to personalize their App. Nowadays, many Apps support using Google ID or Facebook ID for a single signon and there are other technologies being offered that provide similar services. Using a blockchain ID could easily support a single signon service.

The blockchain ID (wallet) public key could easily be used to encrypt an authentication transaction, identifying the App and the user. This authentication transaction would be processed by the blockchain digital ID service would use the private key to decrypt the transaction and use a backend ID App repository for the user to check to see that the user loging in, is the person that opened the account, acting as a sort of “proof of who you are”

Storage on blockchain

Filecoin and StorJ are storage providers that use blockchain services to allow others to use your local (or networked) storage to provide storage to the world.

A while back I had written about (free) peer to peer storage and compute services  (see my Free P2P cloud storage … post). But the problem was how do people benefit from hosting the P2P storage or compute. Filecoin and Storj solved this by paying in cryptocurrencies for storage hosted on your hardware.

Filecoin offers a storage auction and hosting service that anyone worldwide can log into and use. The data stored is encrypted end-to-end so that no one can see what’s being stored and the data is also erasure coded so that it  is protected and accessible even with having one or more hosting sites be offline.

Filecoin uses “proofs of storage“, “proofs of space”, “proofs of data possession“, and “proofs of retrievability” as a way to guarantee their storage service works properly. They also use chained “proofs of replication” as “proofs of spacetime” as service validation checks. Proofs of Replication are a way of insuring that storage providers are not deduplicating data copies and charging for non-deduped storage. (See Filecoin’s Proof of Replication paper for more info).

Storj looks somewhat similar to Filecoin, but without as much sophistication behind it.

Compute on blockchain

Ethereum was invented to support smart contracts that run on blockchain technology. IBM’s HyperLegder OpenLedger project (see our GreyBeardsOnStorga Podcast and RayOnStorage post on Hyperledger) is another example.

Smart contracts are essentially applications that run in a blockchains virtualized environment. Blockchain services are used to run an application and validate that’s it’s run only once. In some cases smart contracts use  external oracles to query as a way to verify something or some action has occurred outside the blockchain. Other oracles can be entirely digital entities that check on a particular commodity price, weather pattern, account value, etc. The oracle becomes a critical step in determining the go no go status of a smartcontract.

Advertisements vs. crypto mining

Salon, a news providing website, offers readers an option to see advertisements or to allow Salon to use their computer (browser) to mine crypto coins. (See Salon offers… article in CoinDesk).

I believe this offer is made when the website detects a viewer is using  ad blockers.

~~~~

Tthe trend is clear, people, organizations and even governments are looking at blockchain technology to provide basic and advanced services around the world.

If anyone would is interested in providing a pre-paid Visa card via blockchains, please contact me. I’d like to help.

Now if I could just find my GPU’s at a decent price somewhere…

Speaking of advertising… RayOnStorage doesn’t use advertising. But blogging like this takes time and money. If anyone’s interested in helping fund this blog, please consider sending some BTC our way, even 0.0001 BTC would help.

Our BTC wallet address is:

1MqBbAvMo6QbCVD6ZwtbLaPxmcUZGj9Ghw

Photo Credit(s): Blockchain and the public sector on OpenGovAsia.com

Unleash your design teams with single signon on Unifilabs.com

Understanding the difference between P2P and Client-server networks on LinkedIN

Blockgeek’s guide to smart contracts

AI reaches a crossroads

There’s been a lot of talk on the extendability of current AI this past week and it appears that while we may have a good deal of runway left on the machine learning/deep learning/pattern recognition, there’s something ahead that we don’t understand.

Let’s start with MIT IQ (Intelligence Quest),  which is essentially a moon shot project to understand and replicate human intelligence. The Quest is attempting to answer “How does human intelligence work, in engineering terms? And how can we use that deep grasp of human intelligence to build wiser and more useful machines, to the benefit of society?“.

Where’s HAL?

The problem with AI’s deep learning today is that it’s fine for pattern recognition, but it doesn’t appear to develop any basic understanding of the world beyond recognition.

Some AI scientists concede that there’s more to human/mamalian intelligence than just pattern recognition expertise, while others’ disagree. MIT IQ is trying to determine, what’s beyond pattern recognition.

There’s a great article in Wired about the limits of deep learning,  Greedy, Brittle, Opaque and Shallow: the Downsides to Deep Learning. The article says deep learning is greedy because it needs lots of data (training sets) to work, it’s brittle because step one inch beyond what’s it’s been trained  to do and it falls down, and it’s opaque because there’s no way to understand how it came to label something the way it did. Deep learning is great for pattern recognition of known patterns but outside of that, there must be more to intelligence.

The limited steps using unsupervised learning don’t show a lot of hope, yet

“Pattern recognition” all the way down…

There’s a case to be made that all mammalian intelligence is based on hierarchies of pattern recognition capabilities.

That is, at a bottom level  human intelligence consists of pattern recognition, such as vision, hearing, touch, balance, taste, etc. systems which are just sophisticated pattern recognition algorithms that label what we are hearing as Bethovan’s Ninth Symphony, tasting as grandma’s pasta sauce, and seeing as the Grand Canyon.

Then, at the next level there’s another pattern recognition(-like) system that takes all these labels and somehow recognizes this scene as danger, romance, school,  etc.

Then, at the next level, human intelligence just looks up what to do in this scene.  Almost as if we have a defined list of action templates that are what we do when we are in danger (fight or flight), in romance (kiss, cuddle or ?), in school (answer, study, view, hide, …), etc.  Almost like a simple lookup table with procedural logic behind each entry

One question for this view is how are these action templates defined and  how many are there. If, as it seems, there’s almost an infinite number of them, how are they selected (some finer level of granularity in scene labeling – romance but only flirting …).

No, it’s not …

But to other scientists, there appears to be more than just pattern recognition(-like) algorithms and lookup and act algorithms, going on inside our brains.

For example, once I interpret a scene surrounding me as in danger, romance, school, etc.,  I believe I start to generate possible action lists which I could take in this domain, and then somehow I select the one to do which makes the most sense in this situation or rather gets me closer to my current goal (whatever that is) in this situation.

This is beyond just procedural logic and involves some sort of memory system, action generative system, goal generative/recollection system, weighing of possible action scripts, etc.

And what to make of the brain’s seemingly infinite capability to explain itself…

Baby intelligence

Most babies understand their parents language(s) and learn to crawl within months after birth. But they haven’t listened to thousands of hours of people talking or crawled thousands of miles.  And yet, deep learning requires even more learning sets in order to label language properly or  learning how to crawl on four appendages. And of course, understanding language and speaking it are two different capabilities. Ditto for crawling and walking.

How does a baby learn to recognize these patterns without TB of data and millions of reinforcements (“Smile for Mommy”, say “Daddy”). And what to make of the, seemingly impossible to contain wanderlust, of any baby given free reign of an area.

These questions are just scratching the surface in what it really means to engineer human intelligence.

~~~~

MIT IQ is one attempt to try to answer the question that: assuming we understand how to pattern recognition can be made to work well on today’s computers what else do we need to do to build a more general purpose intelligence.

There are obvious ethical questions on whether we want to engineer a human level of intelligence (see my Existential risks… post). Our main concern is what it does (to humanity) once we achieve it.

But assuming we can somehow contain it for the benefit of humanity, we ought to take another look at just what it entails.

 

Photo Credits:  Tech trends for 2017: more AI …., the Next Silicon Valley website. 

HAL from 2001 a Space Odyssey 

Design software test labeling… 

Exploration in toddlers…, Science Daily website

Atomristors, a new single (atomic) layer memristor

Read an article the other day about the “Atomristor: non-volatile resistance  switching in atomic sheets of transition metal dichalcogenides” (TMDs), an ACS publication. The article shows research that discovered an atomic sheet level version of a memristor. The device is an atomic sheet of TMD that is sandwiched between two (gold, silver or graphene) electrodes.

They refer to the device switching non-volatile resistance (NVR) from low to high or vice versa but from our perspective it could just as easily be considered a non-volatile device usable for memory, storage, or electronic circuitry.

Prior to this research, it was believed that such resistance switching could not be accomplished with single atomic, sub-nanometre (0.7nm) sized, sheet of material.

NVR atomristor technological properties

The researchers discovered that NVR switching can occur at different device temperatures, sheet areas, compliance current, voltage sweep rate, and layer thickness. In all five degrees of freedom were tested to show that  TMD atomristors had wide applicability and allowed for significant environmental and electronic variability.

Not only was the effect extremely versatile, the researchers identified multiple materials which could be used for the atomic sheet. In fact, TMD are a class of materials and they showed 4 different TMD materials that had the NVR effect.

Surprisingly, some TMD materials exhibited the NVR effect using unipolar voltages and some using bipolar voltages, and some could use both.

The researchers went a long way to showing that the NVR was due to the atomic sheet. In one instance they specifically used non-lithographic measures to fabricate the devices. This process utilized graphene manufacturing like methods to produce an atomic sheet ontop of gold foil and depositing another gold layer ontop of that.

But they also used standard fabrication techniques to build the atomristor devices as well. Using these different fabrication methods, they were able to show the NVR effect using different electrodes types, testing gold, silver, and graphene, all of which worked similarly.

The paper talked of using atomristors in a software defined radio, as a electronic circuit/cross bar switch, or as a memory/storage device. But they also indicated that it could easily be used in a neuromorphic computer as well, effectively simulating neuron like computations.

There’s much more information in the ACS article.

How does it compare to flash?

As compared to flash, atomristors NVR devices should be able to provide higher levels (bits per mm) of density. And due to the lower current (~1v) required for (bipolar) NVR setting, reading and resetting, there’s a lower probability of leakage of stored charges as they’re scaled down to nm sizes.

And of course it comes in 2d sheets, so it’s just as amenable to 3D arrays as NAND and 3DX is today. That means that as fabs start scaling 3D NAND up in layers, atomristor NVR devices should be able to follow their technology roadmap to be scaled up just as high.

Atomristor computers, storage or switch devices

Going from the “lab” to an IT shop is a multifaceted endeavour that takes a lot of time. There are many steps to needed to get to commercialization and many lab breakthroughs never make it that far because of complexity, economics, and other factors.

For instance, memristors were first proposed in 1971 and HP(E) researchers first discovered material that could produce the memristor effect in 2008. In March 2012, HRL fabricated the first memristor chip on CMOS. In Dec. 2017, >9 years later, at their Discover Conference, HPE showed off “The Machine”, a prototype of a memristor based computer to the public. But we are still waiting to see one on the market for sale.

That being said, memristor technologies didn’t exist before 2008, so the use of these devices in a computer took sometime to be understood. The fact that atomristors are “just” an extremely, thinner version of memristors should help it be get to market faster that original memristor technologies. But how much faster than 9-12 years is anyone’s guess.

~~~~

Comments?

Picture Credit(s): All graphics and pictures are from the article in ACS