UK Biobank & the data economy – part 2

A couple of weeks back I wrote a post about repositories for all the data that users generate these days and what to do with it.  (See our post on Data banks, deposits, … data economy – part 1).

This past week I read an article (see ScienceDaily Genetics of brain structure … article) which partially exemplifies what that post talked about. The research used publicly available genetic information to tease out brain structure hereditary characteristics.  The Science Daily article was a summary of research done at the University of Oxford using information provided from the UK Biobank.

Biobank as a data bank

The Biobank has recruited 500K participants from the UK,  aged 40-69,  between 2006-2010, to share their anonymized health data with researchers and scientist around the world. The Biobank is set up as a Scottish charity, funded by various health organizations in UK both gov’t and private. 

In addition to information collected during the baseline assessment: 

  • 100K participants have worn a 24 hour health monitoring device for a week and 20K have signed up to repeat this activity.
  • 500K participants are providing have been genotyped (DNA sequencing to determine hereditary genes)
  • 100K participants will be medically scanned (brain, heart, abdomen, bones, carotid artery) with images stored in the Biobank
  • 100K participants have signed up to receive questionnaires asking  about diet, exercise, work history, digestive health and other medical indicators..

There’s more. Biobank is linking to electronic health records (EHR) of participants to track their health over time. The Biobank is also starting to provide blood analysis and other detailed medical measures of subjects in the study.

UK Biobank (data bank) information uses

“UK Biobank is an open access resource. The Resource is open to bona fide scientists, undertaking health-related research that is in the public good. Approved scientists from the UK and overseas and from academia, government, charity and commercial companies can use the Resource. ….” (from UK Biobank scientists page).

Somewhat like open source code, the Biobank resource is made available to anyone (academia as well as industry), that can make valid use of its data BUT any research derived from its data must be published and made freely available to the Biobank and the world.

Biobank’s papers page documents some of the research that has already been published using their data. It lists the paper on genetics of brain study mentioned above and dozens more.

Differences from Data Banks

In the original data bank post:

  1. We thought data was only needed by  AI/deep learning. That seems naive now. The Biobank shows that AI/deep learning is not the only application/research that needs data.
  2. We thought data would be collected by only by hyper-scalars and other big web firms during normal user web activity. But their data is not the only data that matters.
  3. We thought data would be gathered for free. Good data can take many forms, and some may cost money.
  4. We thought profits from selling data would be split between the bank and users and could fund data bank operations. But in the Biobank, funding came from charitable contributions and data is available for free (to valid researchers).

Data banks can be an invaluable resource and may take many forms. Data that’s difficult to find can be gathered by charities and others that use funding to create, operate and gather the specific information needed for targeted research.

Comments?

Photo Credit(s): Bank on it by Alan Levine

Latest MRI – two screws in the kneecap by Becky Stern

Other graphics from the Genetics of brain structure… paper

 

 

IBM using PCM to implement better AI – round 6

Saw a recent article that discussed IBM’s research into new computing architectures that are inspired by brain computational techniques (see A new brain inspired architecture … ). The article reports on research done by IBM R&D into using Phase Change Memory (PCM) technology to implement various versions of computer architectures for AI (see Tutorial: Brain inspired computation using PCM, in the AIP Journal of Applied Physics).

As you may recall, we have been reporting on IBM Research into different computing architectures to support AI processing for quite awhile now, (see: Parts 1, 2, 3, 4, & 5). In our last post, More power efficient deep learning through IBM and PCM, we reported on a unique hybrid PCM-silicon solution to deep learning computation.

Readers should also be familiar with PCM as well as it’s been discussed at length in a number of our posts (see The end of NAND is near, maybe; The future of data storage is MRAM; and New chip architectures with CPU, storage & sensors …). MRAM, ReRAM and current 3D XPoint seem to be all different forms of PCM (I think).

In the current research, IBM discusses three different approaches to support AI  utilizing PCM devices. All three approaches stem from the physical characteristics of PCM.

(Some) PCM physics

FIG. 2. (a) Phase-change memory is based on the rapid and reversible phase transition of certain types of materials between crystalline and amorphous phases by the application of suitable electrical pulses. (b) Transmission electron micrograph of a mushroom-type PCM device in a RESET state. It can be seen that the bottom electrode is blocked by the amorphous phase.

It turns out that PCM devices have many  characteristics that lend themselves to be useful for specialized computation. PCM devices crystalize and melt in order to change state. The properties associated with melting and crystallization of the PCM media cell can be used to support unique forms of computation. Some of these PCM characteristics include::

  • Analog, not digital memory – PCM devices are, at the core, an analog memory device. We mean that they don’t record just a 0 or 1 (actually resistant or conductive) state, but rather a continuum of values between those two.
  • PCM devices have an accumulation capability –   each PCM cell actually  accumulates a level of activation. This means that one cell can be more or less likely to change state depending on prior activity.
  • PCM devices are noisy – PCM cells arenot perfect recorders of state chang signals  but rather have a well known, random noise which impacts the state level attained, that can be used to introduce randomness into processing.

The other major advantage of PCM devices is that they take a lot less power than a GPU-CPU to work.

Three ways to use PCM for AI learning

FIG. 4. “In-memory computing,” computation is performed in place by exploiting the physical attributes of memory devices organized as a “computational memory” unit. For example, if data A is stored in a computational memory unit and if we would like to perform f(A), then it is not required to bring A to the processing unit. This saves energy and time that would have to be spent in the case of conventional computing system and memory unit. Adapted from Ref. 19.

The Applied Physics article describes three ways to use PCM devices in AI learning. These three include:

  1. Computational storage – which uses the analog capabilities of PCM to perform  arithmetic and learning computations. In a sort of combined compute and storage device.
  2. AI co-processor – which uses PCM devices, in an “all PCM nodes connected to all other PCM nodes” operation that could be used to perform neural network learning. In an AI co-processor there would be multiple all connected PCM modules, each emulating a neural network layer.
  3. Spiking neural networks –  which uses PCM activation accumulation characteristics & inherent randomness to mimic, biological spiking neuron activation.
FIG. 11.
A proposed chip architecture for a co-processor for deep learning based on PCM arrays.28

It’s the last approach that intrigues me.

Spiking neural nets (SNN)

FIG. 12. (a) Schematic illustration of a synaptic connection and the corresponding pre- and post-synaptic neurons. The synaptic connection strengthens or weakens based on the spike activity of these neurons; a process referred to as synaptic plasticity. (b) A well-known plasticity mechanism is spike-time-dependent plasticity (STDP), leading to weight changes that depend on the relative timing between the pre- and post-synaptic neuronal spike activities. Adapted from Ref. 31.

Biological neurons accumulate charge from all input (connected) neurons and when they reach some input threshold, generate an output signal or spike. This spike is then used to start the process with another neuron up stream from it

Biological neurons also exhibit randomness in their threshold-spiking process.

Emulating spiking neurons, n today’s neural nets, takes computation.  Also randomness takes more.

But with PCM SNN, both the spiking process and its randomness, comes from device physics. Using PCM to create SNN seems a logical progression.

PCM as storage, as memory, as compute or all the above

In the storage business, we look at Optane (see our 3D Xpoint post) SSDs as blazingly fast storage. Intel has also announced that they will use 3D Xpoint in a memory form factor which should provide sadly slower, but larger memory devices.

But using PCM for compute, is a radical departure from the von Neumann computer architectures we know and love today. HPE has been discussing another new computing architecture with their memristor technology, but only in prototype form.

It seems IBM, is also prototyping hardware done this path.

Welcome to the next computing revolution.

Photo & Caption Credit(s): Photo and caption from Figure 2 in AIP Journal of Applied Physics article

Photo and caption from Figure 4 in AIP Journal of Applied Physics article

Photo and caption from Figure 11 in AIP Journal of Applied Physics article

Photo and caption from Figure 12 in AIP Journal of Applied Physics article

 

 

Screaming IOP performance with StarWind’s new NVMeoF software & Optane SSDs

Was at SFD17 last week in San Jose and we heard from StarWind SAN (@starwindsan) and their latest NVMeoF storage system that they have been working on. Videos of their presentation are available here. Starwind is this amazing company from the Ukraine that have been developing software defined storage.

They have developed their own NVMe SPDK for Windows Server. Intel doesn’t currently offer SPDK for Windows today, so they developed their own. They also developed their own initiator (CentOS Linux) for NVMeoF. The target system was a multicore server running Windows Server with a single Optane SSD that they used to test their software.

Extreme IOP performance consumes cores

During their development activity they tested various configurations. At the start of their development they used a Windows Server with their NVMeoF target device driver. With this configuration and on a bare metal server, they found that they could max out the Optane SSD at 550K 4K random write IOPs at 0.6msec to a single Optane drive.

When they moved this code directly to run under a Hyper-V environment, they were able to come close to this performance at 518K 4K write IOPS at 0.6msec. However, this level of IO activity pegged 100% of 8 cores on their 40 core server.

More IOPs/core performance in user mode

Next they decided to optimize their driver code and move as much as possible into user space and out of kernel space, They continued to use Hyper-V. With this level off code, they were able to achieve the same performance as bare metal or ~551K 4K random write IOP performance at the 0.6msec RT and 2.26 GB/sec level. However, they were now able to perform only pegging 2 cores. They expect to release this initiator and target software in mid October 2018!

They converted this functionality to run under ESX/VMware and were able to see much the same 2 cores pegged, ~551K 4K random write IOPS at 0.6msec RT and 2.26 GB/sec. They will have the ESXi version of their target driver code available sometime later this year.

Their initiator was running CentOS on another server. When they decided to test how far they could push their initiator, they were able to drive 4 Optane SSDs at up to ~1.9M 4K random write IOP performance.

At SFD17, I asked what they could have done at 100 usec RT and Max said about 450K IOPs. This is still surprisingly good performance. With 4 Optane SSDs and consuming ~8 cores, you could achieve 1.8M IOPS and ~7.4GB/sec. Doubling the Optane SSDs one could achieve ~3.6M IOPS, with sufficient initiators and target cores with ~14.8GB/sec.

Optane based super computer?

ORNL Summit super computer, the current number one supercomputer in the world, has a sustained throughput of 2.5 TB/sec over 18.7K server nodes. You could do much the same with 337 CentOS initiator nodes, 337 Windows server nodes and ~1350 Optane SSDs.

This would assumes that Starwind’s initiator and target NVMeoF systems can scale but they’ve already shown they can do 1.8M IOPS across 4 Optane SSDs on a single initiator server. Aand I assume a single target server with 4 Optane SSDs and at least 8 cores to service the IO. Multiplying this by 4 or 400 shouldn’t be much of a concern except for the increasing networking bandwidth.

Of course, with Starwind’s Virtual SAN, there’s no data management, no data protection and probably very little in the way of logical volume management. And the ORNL Summit supercomputer is accessing data as files in a massive file system. The StarWind Virtual SAN is a block device.

But if I wanted to rule the supercomputing world, in a somewhat smallish data center, I might be tempted to put together 400 of StarWind NVMeoF target storage nodes with 4 Optane SSDs each. And convert their initiator code to work on IBM Spectrum Scale nodes and let her rip.

Comments?

New website monetization approaches

Historically, websites have made money by selling wares, services or advertising. In the last two weeks it seems like two new business models are starting to emerge. One more publicly supported and the other less publicly supported.

Europe’s new copyright law

According to an article I read recently (This newly approved European copyright law might break the Internet), Article 11 of Europe’s new Copyright Directive (not quite law yet) will require search engines, news aggregators and other users of Internet content to pay a “link tax” to copyright holders of anything they link to. As a long time blogger, podcaster and content provider, I find this new copyright policy very intriguing.

The article proposes that this will bankrupt small publishers as larger ones will charge less for the traffic. But presently, I get nothing for links to my content. And, I’d be delighted to get any amount – in fact I’d match any large publishers link tax amount that the market demands.

But my main concern is the impact this might have on site traffic. If aggregators pay a link tax, why would they want to use content that charges any tax. Yes at some point aggregators need content. But there are many websites full of content, certainly there would be some willing to forego tax fees for more traffic.

I also happen to be a copyright user. Most of my blog posts are from articles I read on the web. I usually link to an article in the 1st one or two paragraphs (see above and below) of a post and may refer (and link) to more that go deeper into a subject. Will I have to pay a link tax to the content owner?

How much of a link tax is anyone’s guess. I’m not sure it would amount to much. But a link tax, if done judiciously might even raise the quality of the content on the web.

Browser’s of the world, lay down your blockchains

The second article was a recent research paper (Digging into browser based crypto mining). Researchers at RWTH Aachen University had developed a new method to associate mined blocks to mining pools as a way to unearth browser-based mined crypto coins. With this technique they estimated that 1.8% of all Monero coins were mined by CoinHive using participant browsers to mine the coin or ~$250K/month from browser mining.

I see this as steeling compute power. But with that much coin being generated, it might be a reasonable way for an honest website to make some cash from people browsing their web pages. The browsing party would need to be informed of the mining operation in the page’s information, sort of like “we use cookies” today.

Just think, someone creates a WP plugin to do ETH mining and when activated, a WP website pops up a message that says “We mine coins while you browse – OK?”.

In another twist perhaps the websites could share the ETH mined on their browser with the person doing the browsing, similar to airline/hotel travel awards. Today most travel is done on corporate dime, but awards go to the person doing the traveling. Similarly, employees could browse using corporate computers but they would keep a portion of the ETH that’s mined while they browse away… Sounds like a deal.

Other monetization approaches

We’ve tried Google AdSense and other advertising but it only generated pennies a month. So, it wasn’t worth it.

We also sell research and occasionally someone buys some (see SCI Research Shop). And I do sell services but not through my website.

~~~

Not sure a link tax will fly. It would be a race to the bottom and anyone that charged a tax would suffer from less links until they decided to charge a $0 link tax.

Maybe if every link had a tax associated with it, whether the site owner wanted it or not there could be a level playing field. Recording, paying/receiving and accounting for all these link tax micro payments would be another nightmare altogether.

But a WP plugin, that announces and mines crypto coins with a user’s approval and splits the profit with them might work. Corporate wouldn’t like it but employees would just be browsing websites, where’s the harm in that.

Browse a website and share the mined crypto coin with site owner. Sounds fine to me.

Photo Credit(s): Strasburg – European Parliament|Giorgio Barlocco

Crypto News Daily – Telegram cancels ICO…

Photo of Bitcoin, Etherium and Litecoin|QuoteInspector

Data banks, data deposits & data withdrawals in the data economy – part 1

perspective by anomalous4 (cc) (from Flickr)
Big data visualization, Facebook friend connections
Facebook friend carrousel by antjeverena (cc) (from flickr)

Read an interesting article this week in The Atlantic, Why Technology Favors Tyranny by Yuvai Noah Harari, about the inevitable future of technology and how the use of data will drive it.

At the end of the article Harari talks about the need to take back ownership of our data in order to gain some control over the tech giants that currently control our data.

In part 3, Harari discusses the coming AI revolution and the impact on humanity. Yes there will still be jobs, but early on less jobs for unskilled labor and over time less jobs for skilled labor.

Yet, our data continues to be valuable. AI neural net (NN) accuracy increases as a function of the amount of data used to train it. As a result,  he has the most data creates the best AI NN. This means our data has value and can be used over and over again to train other AI NNs. This all sounds like data is just another form of capital, at least for AI NN training.

If only we could own our data, then there would still be value from people’s (digital) exertions (labor), regardless of how much AI has taken over the reigns of production or reduced the need for human work.Safe by cjc4454 (cc) (from flickr)

Safe by cjc4454 (cc) (from flickr)What we need is data (savings) banks. These banks would hold people’s data, gathered from social media likes/dislikes,  cell phone metadata, app/web history, search history, credit history, purchase history,  photo/video streams, email streams, lab work, X-rays, wearables info, etc. Probably many more categories need to be identified but ultimately ALL the digital data we generate today would need to be owned by people and deposited in their digital bank accounts.

Data deposits?

Social media companies, telecom, search companies, financial services app companies, internet  providers, etc. anywhere you do business should supply a copy of the digital data they gather for a person back to that persons data bank account.

There are many technical problems to overcome here but it could be as simple as an object storage bucket, assigned to each person that each digital business deposits (XML versions of) our  digital data they create for everyone that uses their service. They would do this as compensation for using our data in their business activities.

How to change data ownership?

Today, we all sign user agreements which essentially gives a company the rights to our data in perpetuity. That needs to change. I see a few ways that this change could come about

  1. Countries could enact laws to insure personal data ownership resides in the person generating it and enforce periodic distribution of this data
  2. Market dynamics could impel data distribution, e.g. if some search firm supplied data to us, we would be more likely to use them.
  3. Societal changes, as AI becomes more important to profit making activities and reduces the need for human work, and as data continues to be an important factor in AI success, data ownership becomes essential to retaining the value of human labor in society.

Probably, all of the above and maybe more would be required to change the ownership structure of data.

How to profit from data?

Technical entities needing data to train AI NNs could solicit data contributions through an Initial Data Offering (IDO). IDO’s would specify types of data required and a proportion of AI NN ownership, they would cede to all  data providers. Data providers would be apportioned ownership based on the % identified and the number of IDO data subscribers.

perspective by anomalous4 (cc) (from Flickr)
perspective by anomalous4 (cc) (from Flickr)

Data banks would extract the data requested by the IDO and supply it to the IDO entity for use. For IDOs, just like ICO’s or IPO’s, some would fail and others would succeed. But the data used in them would represent an ownership share sort of like a  stock (data) certificate in the AI NN.

Data bank responsibilities

Data banks would have various responsibilities and would need to collect fees to perform them. For example, data banks would be responsible for:

  1. Protecting data deposits – to insure data deposits are never lost, are never accessed without permission, are always trackable as to how they are used..
  2. Performing data deposits – to verify that data is deposited from proper digital entities, to validate that data deposits are in a usable form and to properly store the data in a customers object storage bucket.
  3. Performing data withdrawals – upon customer request, to extract all the appropriate data requested by an IDO,  anonymize it, secure it, package it and send it to the IDO originator.
  4. Reconciling data accounts – to track data transactions, data banks would supply a monthly statement that identifies all data deposits and data withdrawals, data revenues and data expenses/fees.
  5. Enforcing data withdrawal types – to enforce data withdrawal types, as data  withdrawals can have many different characteristics, such as exclusivity, expiration, geographic bounds, etc. Data banks would need to enforce withdrawal characteristics, at least to the extent they can
  6. Auditing data transactions – to insure that data is used properly, a consortium of data banks or possibly data accountancies would need to audit AI training data sets to verify that only data that has been properly withdrawn is used in trying the NN. .

AI NN, tools and framework responsibilities

In order for personal data ownership to work well, AI NNs, tools and frameworks used today would need to change to account for data ownership.

  1. Generate, maintain and supply immutable data ownership digests – data ownership digests would be a sort of stock registry for the data used in training the AI NN. They would need to be a part of any AI NN and be viewable by proper data authorities
  2. Track data use – any and all data used in AI NN training should be traceable so that proper data ownership can be guaranteed.
  3. Identify AI NN revenues – NN revenues would need to be isolated, identified and accounted for so that data owners could be rewarded.
  4. Identify AI NN data expenses – NN data costs would need to somehow be isolated, identified and accounted for so that data expenses could be properly deducted from data owner awards. .

At some point there’s a need for almost a data profit and loss statement as well as a data balance sheet for at an AI NN level. The information supplied above should make auditing data ownership, use and rewards much more feasible. But it all starts with identifying data ownership and the data used in training the AI.

~~~~

There are a thousand more questions that come to mind. For example

  • Who owns earth sensing satellite, IoT sensors, weather sensors, car sensors etc. data? Everyone in the world (or country) being monitored is laboring to create the environment sensed by these devices. Shouldn’t this sensor data be apportioned to the people of the world or country where these sensors operate.
  • Who pays data bank fees? The generators/extractors of the data could pay in addition to providing data deposits for the privilege to use our data. I could also see the people paying.  Having the company pay would give them an incentive to make the data load be as efficient and complete as possible. Having the people pay would induce them to use their data more productively.
  • What’s a decent data expiration period? Given application time frames these days, 7-15 years would make sense. But what happens to the AI NN when data expires. Some way would need to be created to extract data from a NN, or the AI NN would need to cease being used and a new one would  need to be created with new data.
  • Can data deposits be rented/sold to data aggregators? Sort of like a AI VC partnership only using data deposits rather than money to fund AI startups.
  • What happens to data deposits when a person dies? Can one inherit a data deposits, would a data deposit inheritance be taxable as part of an estate transfer?

In the end, as data is required to train better AI, ownership of our data makes us all be capitalist (datalists) in the creation of new AI NNs and the subsequent advancement of society. And that’s a good thing.

Comments?

 

 

Marketing meet Big Data, call records, credit card purchases & demographics

Read an article in Science Daily (Understanding urban issues through credit cards) that talked about a study published in Nature (Sequences of purchases in credit card data reveals lifestyles of urban populations) that applies big data to B2C marketing.

The researchers examined call data records (CDRs), credit card transactions records (CCRs) and demographic (age, sex, residential zip code, wage level, etc.) data and did a cross table between them to identify sequences of purchases. They then used these sequences to identify different lifestyle groups in the urban area.

Marketing 2.0

The analyzed data from Mexico City, Mexico. The CCR data was collected for 10 weeks across 150K users. The had CDR data for 1/10th of the users for 6 months surrounding the 10 weeks duration. Credit card adoption is still low in Mexico (18%), so the analysis may be biased.  When thy matched CCR expenditures against median wages in a district and they found their participants came from higher wage populations. Their data also spanned all districts within the city.

The analysis identified sequences of purchase categories as well as expenditures.  They characterized purchase sequences as “words”.

 

 

 

Using the word data and further statistical analysis they were able to split the population up into 5 distinct lifestyle groups. 

The loops of icons above represent major purchase categories derived from the CCR data merchant category codes (MCC).  Each of the rings in “a” above show the same 12 major MCC purchase categories. If you look at each ring, one can identify a central or core node that seems to have the most incoming or outgoing arks. These seem to be the central purchases made by that lifestyle group after which they branch out to other purchase categories.

There are five different lifestyle categories (they also show the city average) delineated in the data:

  • Commuter – generally they have to pay tolls, have longer travel between home and work and have a diverse sequence of purchase that occurs after purchases from the toll category.
  • Household – purchases seem to center on grocery stores/supermarkets and then branch off from there.
  • Young – purchases seem to center on the taxicab category and then go to computer-networking, restaurants, grocery stores/supermarkets.
  • Hi-Tech – purchases seem to center on computer-networking,  then go to gas stations, grocery stores/supermarkets, restaurants, and telecomm.
  • Average – seems to have two focuses grocery stores/supermarkets and restaurants and then goes out from there to gas stations, specialty food stores and department stores.
  • Dinner-out – purchases seem to center on restaurants and then branch out fro there to computer-networking, gas stations, supermarkets, fast food, etc.

In “b”  breakout above, you can see the socio-demographic characteristics of each lifestyle group as compared with the median user. And in “c” one can see some population histograms of the demographic data.

They were then able to use the CDR data to construct a map of which lifestyle called which other life style to identify call correlation data. Most calls were contacts between the same groups but the second most active call was calls to householders.

They took this same analysis to another city in Mexico and came up with six  lifestyle categories, five of the same and a different one.

~~~~

When I went to Uni (a long long time ago), I attended an urban geography class that was much more scientific and mathematical than any other geography class I had ever attended. I remember asking the professor when did geography become an exact science. As best as I can recall, he laughed and said over the last decade.

Analysis like the above could make B2C marketing, almost an exact science.

Big Data meet Marketing – Buyer beware.

Comments?

Photo Credit(s):  All charts/photos are from the Nature article Sequences of purchase in credit card data reveal lifestyles in urban populations

AI processing at the edge

Read a couple of articles over the past few weeks (TechCrunch: Google is making a fast, specialized TPU chip for edge devices … and IEEE Spectrum: Two startups use processing in flash for AI at the edge) about chips for AI at the IoT edge.

The two startups, Syntiant and Mythic, are moving to analog only or analog-digital solutions to provide AI processing needed at the edge while Google is taking their TPU technology to the edge.  We have written about Google’s TPU before (see: TPU and hardware vs. software  innovation (round 3) post).

The major challenge in AI processing at the edge is power consumption. Both  startups attack the power problem by using flash and other analog circuitry to provide power efficient compute.

Google attacked the power problem with their original TPU by reducing computational precision from 64- to 8-bits. By reducing transistor counts, they lowered power requirements proportionally.

AI today is based on neural networks (NN), that connect simulated neurons via simulated synapses with weights attached to indicate whether to boost or decrease the signal being transmitted. AI learning is done by setting those weights and creating the connections between simulated neurons and the synapses.  So learning is setting weights and establishing connections. Actual inferences (using AI to do something) is a process of exciting input simulated neurons/synapses and letting the signal flow through the NN with each weight being used to determine output(s).

AI with standard compute

The problem with doing AI learning or inferencing with normal CPUs or even CUDAs is that the NN does thousands if not millions of  multiplication-accumulation actions at each simulated synapse-neuron connection. Doing all these multiplication-accumulation takes power. CPUs and CUDAs can do these sorts of operations on 32 or 64 bit numbers or even floating point but it still takes power.

AI processing power

AI processing power is measured in trillions of (accumulate-multiply) operations per second per watt (TOPS/W). Mythic believes it can perform 4 TOPS/W and Syntiant says it can do 20 TOPS/W. In comparison, the NVIDIA Volta V100 can do about 0.4 TOPS/W (according to the article). Although  comparing Syntiant-Mythic TOPS to NVIDIA TOPS is a little like comparing apples to oranges.

A current Intel Xeon Platinum 8180M (2.5Ghz, 28 Core processors, 205 W) can probably do (assuming one multiplication-accumulation per hertz) about 2.5 Billion X 28 Cores = 70 Billion Ops Second/205 W or 0.3 GOPS/W (source: Platinum 8180M Data sheet).

As for Google’s TPU TOPS/W, TPU2 is rated at 45 GFLOPS/chip and best guess for power consumption is between 160W and 200W, let’s say 180W. With power at that level, TPU2 should hit 0.25 GFLOPS/W.  TPU3 is coming out with 8X the power but it uses water cooling (read LOTS MORE POWER).

Nonetheless, it appears that Mythic and Syntiant are one to two orders of magnitude better than the best that NVIDIA and TPU2 can do today and many orders of magnitude better than Intel X86.

Improving TOPS/W

Use NAND, as an analog memory to read, write and hold  NN weights is an easy way to reduce power consumption. Combine that with  analog circuitry that can do multiplication and addition with those flash values and you have a AI NN processor. This way you reduce the need to hold weights in memory and do compute in registers by collapsing both compute and memory into the same componentry.

The major difference between Syntiant and Mythic seems to be the amount of analog circuitry they use. Mythic seems to relegate the analog circuitry to an accelerator while Syntiant has a more extensive use of analog circuitry throughout their chip. Probably why it can perform 5X the TOPS/W of Mythic’s IPU.

IBM and others have been working on neuromorphic chips some of which are analog based and others which are all digital based. We’ve written extensively on IBM and some on MIT’s approaches (for the latest on IBM see: More power efficient deep learning through IBM and PCM, and for MIT see: MIT builds an analog synapse chip) and follow the links there to learn more.

~~~~

Special purpose AI hardware is emerging from the labs and finally reaching reality. IBM R&D has been playing with it for a long time. Google is working on TPU3 so there’s no stopping them. And startups are seeing an opening and are taking everyone on. Stay tuned, were in for a good long ride before the someone rises above the crowd and becomes the next chip giant.

Comments?

 

Photo Credit(s): TechCrunch  Google is making a fast, specialized TPU chip for edge devices … article

Introduction to Digital Design Verification at Mythic, Medium.com Article

Images from Google Cloud Platform Blog on the TPU

Two startups use processing in flash for AI at the edge, IEEE Spectrum article courtesy of Mythic

Skyrmion and chiral bobber solitons for racetrack storage

Read an article this week in Science Daily (Magnetic skyrmions: Not the only one of their class; …) about new magnetic structures that could lend themselves to creating a new type of moving, non-volatile storage.  (There’s more information in the press release and the Nature paper [DOI: 10.1038/s41565-018-0093-3], behind a paywall).

Skyrmions and chiral bobbers are both considered magnetic solitons, types of magnetic structures only 10’s of nm wide, that can move around, in sort of a race track configuration.

Delay line memories

Early in computing history, there was a type of memory called a delay line memory which used various mechanisms (mercury, magneto-resistence, capacitors, etc.) arranged along a circular line such as a wire, and had moving pulses of memory that raced around it. .

One problem with delay line memory was that it was accessed sequentially rather than core which could be accessed randomly. When using delay lines to change a bit, one had to wait until the bit came under the read/write head . It usually took microseconds for a bit to rotate around the memory line and delay line memories had a capacity of a few thousand bits 256-512 bytes per line,  in today’s vernacular.

Delay lines predate computers and had been used for decades to delay any electronic or acoustic signal before retransmission.

A new racetrack

Solitons are being investigated to be used in a new form of delay line memory, called racetrack memory. Skyrmions had been discovered a while ago but the existence of chiral bobbers was only theoretical until researchers discovered them in their lab.

Previously, the thought was that one would encode digital data with only skyrmions and spaces. But the discovery of chiral bobbers and the fact that they can co-exist with skyrmions, means that chiral bobbers and skyrmions can be used together in a racetrack fashion to record digital data.  And the fact that both can move or migrate through a material makes them ideal for racetrack storage.

Unclear whether chiral bobbers and skyrmions only have two states or more but the more the merrier for storage. I am assuming that bit density or reliability is increased by having chiral bobbers in the chain rather than spaces.

Unlike disk devices with both rotating media and moving read-write heads, the motion of skyrmion-chiral bobber racetrack storage is controlled by a very weak pulse of current and requires no moving/mechanical parts prone to wear/tear. Moreover, as a solid state devices, racetrack memory is not sensitive to induced/organic vibration or shock,  So, theoretically these devices should have higher reliability than disk devices.

There was no information comparing the new racetrack memory reliability to NAND or 3D Crosspoint/PCM SSDs, but there may be some advantage here as well. I suppose one would need to understand how to miniaturize the read-erase-write head to the right form factor for nm racetracks to understand how it compares.

And I didn’t see anything describing how long it takes to rotate through bits on a skyrmion-chiral bobber racetrack. Of course, this would depend on the number of bits on a racetrack, but some indication of how long it takes one bit to move, one postition on the racetrack would be helpful to see what its rotational latency might be.

~~~~

At the moment, reading and writing skyrmions and the newly discovered chiral bobbers takes a lot of advanced equipment and is only done in major labs. As such, I don’t see a skyrmion-chiral bobber racetrack storage device arriving on my desktop anytime soon. But the fact that there’s a long way to go before, we run out of magnetic storage options, even if it is on a chip rather than magnetic media,  is comforting to know. Even if we don’t ever come up with an economical way to produce it.

I wonder if you could synchronize rotational timing across a number of racetrack devices, at least that way you could be reading/erasing/writing a whole byte, word, double word etc, at a time, rather than a single bit.

Comments?

Photo Credit(s): From Experimental observation of chiral magnetic bobbers in B20 Type FeGe paper

From Experimental observation of chiral magnetic bobbers in B20 Type FeGe paper

From Timeline of computer history Magnetoresistive delay lines

From Experimental observation of chiral magnetic bobbers in B20 Type FeGe paper