IT in space

Read an article last week about all the startup activity that’s taking place in space systems and infrastructure (see: As rocket companies proliferate … new tech emerges leading to a new space race). This is a consequence of cheap(er) launch systems from SpaceX, Blue Origin, Rocket Lab and others.

SpaceBelt, storage in space

One startup that caught my eye was SpaceBelt from Cloud Constellation Corporation, which is planning to put a petabyte (~4X the Library of Congress) of data storage into a constellation of LEO satellites.

The LEO storage pool will be populated by multiple nodes (satellites), with a set of geosynchronous access points fronting it. Customers use ground-based secure terminals to talk to the geosynchronous access satellites, which in turn communicate with the LEO storage nodes to access data.

Their main selling points appear to be data security and availability. The only way to access the data is through secured satellite uplinks/downlinks, and even then you only reach the geosynchronous satellites. From there, those satellites access the LEO storage cloud directly. Customers can't touch the storage cloud without going through the secured terminals and the geosynchronous layer first.

The problem with terrestrial data is that it's prone to security threats as well as natural disasters that can take out a data center or an entire region. With all your data residing in a space cloud, such concerns shouldn't be a problem. (However, an adversary gaining access to your ground stations is a whole different story.)

AWS and Lockheed-Martin supply new ground station service

The other company of interest is not a startup but a link-up between Amazon and Lockheed Martin (see: Amazon-Lockheed Martin …) that supplies a new cloud-based, satellite ground station as a service offering. The new service will use Lockheed Martin ground stations.

Currently, the service is limited to S-Band and antennas located in Denver, but plans are to expand to X-Band and locations throughout the world. The plan is to locate ground stations close to AWS data centers, so data center customers can have high speed access to satellite data.

There are other startups in the ground station as a service space, but none with the resources of Amazon-Lockheed. All of this competition is just getting off the ground, but a few have been leasing idle ground station resources to customers. The AWS service already has a few big customers, like DigitalGlobe.

One thing we have learned is that the appeal of cloud services is as much about the ecosystem that surrounds them as the service offering itself. So having satellite ground stations as a service is good, but having these services tied directly into other public cloud computing infrastructure is much, much better. Google, Microsoft, IBM: are you listening?

Data centers in space

Why stop at storage? Wouldn't it be better to support both storage and computation in space? That way access latencies wouldn't be a concern. And when terrestrial disasters occur, it's not just data at risk; ditto for security threats.

Whole data centers in orbit would represent a new stratum of cloud computing. It would also let IT implement space-native applications.

If Microsoft can run a data center under the ocean, I see no reason they couldn't do so in orbit, especially once human spaceflight returns via NASA/SpaceX. Just imagine admins and service techs as astronauts.

And yet, security and availability aren't the only threats one has to deal with. What happens to the space cloud when war breaks out and satellite killers are set loose?

Yes, space infrastructure is not subject to terrestrial disasters or internet-based security risks, but besides war there are other problems, such as solar storms and space debris clouds.

In the end, it's important to have multiple, non-overlapping risk profiles for your IT infrastructure. That is, each IT deployment may be subject to one set of risks, but those sets should be disjoint from the risks of other deployment options. IT in space, subject to solar storms, space debris, and satellite killers, is a nice complement to terrestrial cloud data centers, subject to natural disasters, internet security risks, and other earth-based, man-made disasters.

On the other hand, a large solar storm like the 1859 Carrington event could knock out every data system in the world or in orbit. As for those under the sea, it probably depends on how deep they're submerged!

Photo Credit(s): Screen shots from SpaceBelt youtube video (c) SpaceBelt

Screen shots from AWS Ground Station as a Service sign up page (c) Amazon-Lockheed

Screen shots from Microsoft’s Under the sea news feature (c) Microsoft

Scratch 3.0 is out

I’ve written on Scratch before (see my 10 years of Scratch and still counting post). It’s essentially an object oriented, visual programming language for kids. Nonetheless, it is pretty sophisticated. The team at MIT just released Scratch 3.0, with a number of new extensions and updates to make it easier to work with.

Google also has a visual, object oriented programming tool called Blockly. I’ve used a variant of Blockly to program an Android phone based robot controller. It’s OK, but Blockly lacks a good collaboration mode and editing large Blockly code modules is not as easy as it should be.

On the other hand, Scratch is made for collaboration. They have a web page with 1000s of collaborations listed; seems like there’s a bit for everyone on the collaboration list. And they have a number of starter Scratch projects that anyone can tackle to earn coding cards that will gently introduce you to Scratch and coding.

Using Scratch

When I first ran across Scratch I used it to create sounds based on key combinations. Then I moved to animating sprites (drawn characters, which you can draw yourself or use one of many they have). Then I moved to animating planes, then groups of planes, then created a game where one plane would be followed by others. And then added a way where one plane could shoot another and so on.

It didn’t take me very long to get to a point where I had fleets of planes moving around the screen fighting each other. I haven’t done anything big with Scratch before but I’ve done a number of mini games/animations with my kids and it was fun to toy with.

It used to be that you had to download and run Scratch locally on your PC/Mac. With later versions, there are Scratch Desktops one can download for Windows and MacOS.

Alternatively, one can use the web based version, which runs in any web browser.

The new desktop is more like a visual IDE than the old one I’m used to, and looks exactly like the web version. The first Scratch I used presented itself as a table top screen with various Scratch tools surrounding it. I’m sure it makes things easier for beginning coders not to be presented with a whole Scratch world of tools right off the bat, and just to have a sprite to play with. I suspect all these tools are now buried in the Scratch Tutorials.

Scratch 3.0 comes with a number of extensions

One of the extensions allows you to program LEGO robotics, another provides a way to interact with a Bluetooth micro:bit controller, and another allows you to interact with your web cam to animate objects based on vision detection. There are plenty more and I’m sure this isn’t the end of them. (NB: Scratch team, you need one for FIRST Robotics.)

I just added a few for sounds and the text to speech extension. It’s really easy to have Scratch 3.0 read out a text string for you. I suppose there would be a way to input a text file and have Scratch read it for you, but I didn’t get that far with it.

~~~~

I am a strong supporter of everyone learning how to code and solutions like MIT’s Scratch (and Google’s Blockly) are a great way to understand coding without having to deal with the pain/semantics of compilers, APIs or function libraries etc.

Just start coding and having fun. It’s amazing what one can accomplish. That’s what Scratch was made to do; enjoy.

Learning machine learning – part 1

Saw an article this past week from AWS re:Invent announcing that they just released their machine learning curriculum and materials free to the public. Education for Google’s (Cloud Platform and elsewhere) TensorFlow, Facebook’s PyTorch, and Microsoft Azure CNTK frameworks is also available, and has been for a while now.

My money is on PyTorch and TensorFlow as the two frameworks most likely to succeed. However, all of the above use many open source facilities and there seems to be a lot of cross breeding across them. Both AWS ML solutions and Microsoft CNTK offer PyTorch and TensorFlow frameworks/APIs as one option among many others.

AWS Machine Learning

I spent about an hour plus looking over the AWS SageMaker tutorial videos in the developer section of the AWS machine learning curriculum. Signing up was fairly easy, but I already had an AWS login. You also have to enroll/register for the course under your AWS login, but once that’s through, you can access the courses.

In the comments on the AWS blog post there were a number of entries indicating broken links and other problems but I didn’t have any issues. Then again, I didn’t start at the beginning, only looked at over one series of courses, and was using the websites one week after they were announced at Re:Invent.

Amazon SageMaker is an overarching framework that can be used to perform machine learning on AWS, all the way from gathering, analyzing and modifying the dataset(s), to training the model, to creating an inference engine available as an endpoint that can be used to perform the inferencing.
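For a sense of what that train-then-deploy workflow looks like in practice, here’s a minimal sketch using the SageMaker Python SDK. The training script name, S3 bucket and instance types are my own placeholders, and SDK parameter names change between versions, so treat this as an outline rather than copy-paste code:

```python
import sagemaker
from sagemaker.tensorflow import TensorFlow

role = sagemaker.get_execution_role()  # IAM role SageMaker assumes for the job

# Train a model from a user-supplied training script ('train.py' is hypothetical)
estimator = TensorFlow(entry_point='train.py',
                       role=role,
                       train_instance_count=1,
                       train_instance_type='ml.p2.xlarge',
                       framework_version='1.12',
                       py_version='py3')
estimator.fit('s3://my-bucket/training-data')  # hypothetical S3 dataset location

# Deploy the trained model behind an HTTPS inference endpoint
predictor = estimator.deploy(initial_instance_count=1,
                             instance_type='ml.m4.xlarge')
result = predictor.predict(some_input)  # some_input: data shaped as train.py expects
```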

Amazon also has special purpose, API-based tools that allow customers to embed intelligence (inferencing) directly into their applications without needing to perform the ML training themselves (a brief usage sketch follows the list). These include:

  • Amazon Rekognition, which provides image (facial and other tagging) recognition services,
  • Amazon Polly, which provides text to speech services in multiple languages, and
  • Amazon Lex, which provides speech recognition technology (used by Alexa) and, together with Polly, helps embed conversational interfaces into customer applications.
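To illustrate how little ML plumbing these services require, here’s a short boto3 sketch calling Polly and Rekognition; the region and file names are my own placeholders:

```python
import boto3

# Text to speech with Polly: returns an audio stream you can save or play
polly = boto3.client('polly', region_name='us-east-1')
speech = polly.synthesize_speech(Text='Hello from the space cloud',
                                 OutputFormat='mp3',
                                 VoiceId='Joanna')
with open('hello.mp3', 'wb') as f:
    f.write(speech['AudioStream'].read())

# Image tagging with Rekognition: no model training required
rekognition = boto3.client('rekognition', region_name='us-east-1')
with open('photo.jpg', 'rb') as f:            # 'photo.jpg' is a placeholder
    labels = rekognition.detect_labels(Image={'Bytes': f.read()},
                                       MaxLabels=5)
for label in labels['Labels']:
    print(label['Name'], label['Confidence'])
```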

TensorFlow Machine Learning

In the past I looked over the TensorFlow tutorials, and recently checked them out again. I found them much easier to follow this time.


The Google I/O 2018 video on TensorFlow, Getting Started With TensorFlow High Level APIs, takes you through a brief introduction to Colab(oratory), a GCP solution that uses TensorFlow, and shows how to use TensorFlow Keras, tf.data and TensorFlow Eager Execution to create machine learning models and perform machine learning.

Keras on TensorFlow seems to be the easiest approach to using machine learning technologies. The video spends most of its time discussing a Colab Keras code element, ~9 lines, that loads an image classification dataset, defines a one-level model (one standard layer and one output layer), trains it, validates it and uses it to perform inferencing.

The video also touches a bit on tf.data and TensorFlow Eager Execution but the main portion discusses the 9 line TensorFlow Keras machine learning example.

Both Colab and AWS SageMaker use and discuss Jupyter Notebooks. These are an open source way to combine documentation, a workflow and executable Python code in a single document.

GCP Colab is essentially a GCP/Google Drive based Jupyter notebook execution engine. With Colab you create a Jupyter notebook on Google Drive and interactively execute it under Colab. You can also download your Jupyter notebook files and execute them anywhere else that supports TensorFlow (v1.7 or above, with the Keras API).

In the video, the Google I/O instructors (Josh Gordon and Lawrence Moroney) walk you through building a model that recognizes handwritten digits and outputs a classification (0..9) of what each handwritten digit represents.

It uses a standard labeled handwritten digits data set, the MNIST database, that’s already been broken up into a training set and a validation set. Josh calls this the “Hello World” of machine learning.

The instructor in the video walks you through the (Jupyter Notebook, Eager Execution, Keras) code that inputs the data set (line 2), builds a one-level (really two layer: one neural net layer and one output layer) neural network model (lines 3-6), trains the model (line 7), tests/validates the model (line 8) and then uses it to perform an inference (line 9).
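For reference, here’s my own minimal reconstruction of that style of model in TensorFlow Keras. It’s not their exact Colab code, but it covers the same load/build/train/validate/infer steps:

```python
import tensorflow as tf

# Load the MNIST handwritten digit data set, pre-split into train/validation
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0   # scale pixels to [0, 1]

# One standard (hidden) layer plus one output layer
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5)      # train
model.evaluate(x_test, y_test)             # test/validate
prediction = model.predict(x_test[:1])     # inference: 10 class probabilities
```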

Josh spends a little time discussing neural networks and model optimizations and some of the other parameters used in the code above. He has a few visualizations of what this all means but for the most part, the code uses a simple way to build a neural net model and some standard optimization techniques for the network.

He then goes on to discuss tf.data, an API that can be used to create machine learning datasets and feed that data to the neural net for training or inferencing. Apparently tf.data has a number of nifty features that let you take raw data and transform it into something that can feed neural nets: separating the data into batches, shuffling (randomizing) the batches of data, pre-fetching data so as to not starve the GPU matrix multipliers, etc.
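As a small illustration of those features (reusing the x_train/y_train arrays and model from the sketch above):

```python
import tensorflow as tf

# Wrap raw arrays as a dataset, then transform it into a training feed
dataset = (tf.data.Dataset.from_tensor_slices((x_train, y_train))
           .shuffle(buffer_size=10000)   # randomize example order
           .batch(32)                    # group examples into batches
           .prefetch(1))                 # prepare the next batch while the GPU trains

model.fit(dataset, epochs=5)             # Keras accepts a tf.data.Dataset directly
```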

Then it goes into how machine learning is different from regular coding, and shows how TensorFlow Eager Execution is really just like normal Python execution. They go through another, larger machine learning example, one that distinguishes between cats and dogs, and use an open source Python IDE, PyCharm, to test and walk through their TF Eager Execution code, setting breakpoints and examining data along the way.

At the end of the video they show a link to a Google crash course on TensorFlow machine learning and refer to the book Deep Learning with Python by Francois Chollet. They also mention a browser version of TensorFlow which uses JavaScript and your browser to develop, train and perform inferences using TensorFlow Keras machine learning.

~~~~

I never got around to Microsoft’s Azure training, other than previewing some websites, but plan to look that over soon.

I would have to say that the Google IO session on using TensorFlow high level APIs was a lot more enjoyable (~40 minutes) than the AWS multiple tutorial videos (>>40 minutes) that I watched to learn about SageMaker.

Not a fair comparison as one was a Google IO intro session on TensorFlow high level APIs and the other was a series of actual training videos on Amazon SageMaker and the AWS services you can use to take advantage of it.

But the GCP session left me thinking I could handle learning more and actually use machine learning (via TensorFlow, Keras, Eager Execution and tf.data) to do something, while the SageMaker sessions left me wondering how many AWS facilities and infrastructure services I would need to understand and use before ever getting to actually develop a machine learning model.

I suppose one was more of an (AWS SageMaker) infrastructure tutorial and the other was more of an intro to machine learning using TensorFlow wherever you want to execute it.

I think I’m almost ready to start creating and feeding a TensorFlow model with my handwriting and seeing if it can properly interpret it into searchable text. If it can do that, I’ll be a happy camper.

Comments…

Photo credits: 

Screenshots from AWS SageMaker series of tutorial videos 1, 2, 3, 4 & 5; you may need a sign-in to view them

Screenshots from the Getting Started with TensorFlow High Level APIs YouTube video 

Crowdresearch, crowdsourced academic research

Read an article in Stanford Research, Crowdsourced research gives experience to global participants that discussed an activity in Stanford and other top tier research institutions to try to get global participation in academic research. The process is discussed more fully in a scientific paper (PDF here) by researchers from Stanford, MIT Media Lab, Cornell Tech and UC Santa Cruz.

They chose three projects:

  • A HCI (human computer interaction) project to design, engineer and build a new paid crowd sourcing marketplace (like Amazon’s Mechanical Turk).
  • A visual image recognition project to improve on current visual classification techniques/algorithms.
  • A data science project to design and build the world’s largest wisdom of the crowds experiment.

Why crowdsource academic research?

The intent of crowdsourced research is to provide top tier academic research experience to people who have no access to top research organizations.

Participating universities obtain more technically diverse researchers, larger research teams, larger research projects, and a geographically dispersed research community.

Collaborators win valuable academic research experience, research community contacts, and potential authorship of research papers, as well as potential recommendation letters (for future work or academic placement).

How does crowdresearch work?

It’s almost open source and agile development applied to academic research. The work week starts with the principal investigator (PI) and research assistants (RAs) going over last week’s milestone deliveries to see which to pursue further next week. Crowdresearch uses Reddit-like posting and up/down voting to determine which milestone deliverables are most important. The PI and RAs review this prioritized list to select a few to continue investigating over the next week.

The PI holds an hour long video conference (using Google Hangouts On Air’s YouTube live stream service). On the conference call all collaborators can view the stream, but only a select few are on camera. The PI and the researchers responsible for the past week’s important milestone research discuss their findings, and the rest of the collaborators on the team can participate over Slack. The video conference is archived and available to be watched offline.

At the end of the meeting, the PI identifies next week’s milestones and potentially the directly responsible investigators (DRIs) to work on them.

The DRIs and other collaborators choose how to apportion the work for the next week and work commences. Collaboration can be fostered and monitored via Slack and if necessary, more Google live stream meetings.

If collaborators need help understanding some technology, technique, or tool, the PI, RAs or DRIs can provide a mini video course on the topic or point to other information used to get the researchers up to speed. Collaborators can ask questions and receive answers through Slack.

When it’s time to write the paper, the team uses Google Docs with change tracking to manage the writing process.

The team also maintained a Wiki on the overall project to help new and current members get up to speed on what’s going on. The Wiki would also list the week’s milestones, video archives, project history/information, milestone deliverables, etc.

At the end of the week, researchers and DRIs would supply a mini post to describe their work and link to their milestone deliverables so that everyone could review their results.

Who gets credit for crowdresearch?

Each week, everyone on the project is allocated 100 credits and apportions these credits to other participants based on the week’s activities. The credits are used to drive a PageRank-like credit assignment algorithm that determines an aggregate credit score for each researcher on the project.
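Purely as a sketch of the PageRank-style idea (my own simplification, not the paper’s actual method, which also has anti-gaming measures):

```python
import numpy as np

def credit_scores(alloc, damping=0.85, iters=100):
    """alloc[i][j] = credits participant i gave participant j this week."""
    n = alloc.shape[0]
    # Normalize rows so each participant distributes 100% of their credits
    row_sums = alloc.sum(axis=1, keepdims=True)
    P = np.divide(alloc, row_sums,
                  out=np.full_like(alloc, 1.0 / n),
                  where=row_sums > 0)
    scores = np.full(n, 1.0 / n)
    for _ in range(iters):   # power iteration, as in PageRank
        scores = (1 - damping) / n + damping * (scores @ P)
    return scores

# Three participants, each allocating their 100 weekly credits
alloc = np.array([[0., 60., 40.],
                  [80., 0., 20.],
                  [50., 50., 0.]])
print(credit_scores(alloc))   # aggregate credit per participant
```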

Check out the paper linked above for more information on the credit algorithm. They tried to defeat (credit) link rings and other obvious approaches to stealing credit.

At the end of the project, the PI, DRIs and RAs determine a credit clip level for paper authorship. Paper authors are listed in credit order and the remaining, non-author collaborators are listed in an acknowledgements section of the paper.

The PIs can also use the credit level to determine how strong a recommendation letter to provide for researchers.

Tools for crowdresearch

The tools needed to collaborate on crowdresearch are cheap and readily available to anyone.

  • Google Docs, Hangouts, Gmail are all freely available, although you may need to purchase more Drive space to host the work on the project.
  • Wiki software is freely available as well from multiple sources including Wikipedia (MediaWiki).
  • Slack is readily available for a low cost, but other open source alternatives exist, if that’s a problem.
  • A GitHub code repository is also readily available for a reasonable cost, but there may be alternatives that use Google Drive storage for the repo.
  • Web hosting is needed to host the online Wiki, media and other assets.

The initial projects were chosen in computer science, so outside of the above tools, they could depend on open source software. Other projects will need to consider how much experimental apparatus they need, how to fund those purchases, and how globally dispersed researchers can best make use of them.

My crowdresearch projects

Here are some potential commercial crowdresearch projects, where we could use the aggregate credit score, and perhaps other measures of participation, to apportion revenue, if any:

  • An NVMe storage system using a lightweight storage server supporting NVMe over fabric access to hybrid NVMe SSD and capacity disk storage.
  • Proof of Stake (PoS) Ethereum pooling software using Linux servers to create a pool for PoS ETH mining.
  • Bipedal, dual armed, dual handed, five-fingered assisted care robot to supply assistance and care to elders and disabled people throughout the world.

And some non-commercial projects, where we would use the aggregate credit score to apportion attribution and any potential remuneration:

  • A fully (100%?) mechanical rover able to survive, rove around, perform scientific analysis, receive/transmit data and possibly effect repairs from within extreme environments such as the surface of Venus or Jupiter, or the Chernobyl/Fukushima Daiichi reactor cores.
  • A zero propellant interplanetary tug able to rapidly transport rovers, satellites, probes, etc. to any place within the solar system and deploy them properly.
  • A Venusian manned base habitat including the design, build process and ongoing support for the initial habitat and any expansion over time, such that the habitat can last 25 years.

Any collaborators across the world, interested in collaborating on any of these projects, do let me know, here via comments. Please supply some way to contact you and any skills you’re interested in developing or already have that can help the project(s).

I would be glad to take on the PI role for the most popular project(s), if I get sufficient response (no idea what that threshold would be). And I’d be happy to purchase the Drive, GitHub, Slack and web hosting accounts needed to start up and carry the most popular project(s) to fruition. And if there are any more domain-experienced PIs interested in taking on any of these projects, do let me know.

Comments?

Picture Credit(s): Crowd by Espen Sundve;

Videoblogger Video Conference by Markus Sandy;

Researchers Night 2014 by Department of Computer Science, NTNU;

A tale of two storage companies – NetApp and Vantara (HDS-Insight Grp-Pentaho)

It was the worst of times. The industry changes had been gathering for almost a decade and by this time were starting to hurt.

The cloud was taking over all new business and some of the old. Flash’s performance was making high performance easy and reducing storage requirements commensurately. Software defined was displacing low and midrange storage, which was fine for margins but injurious to revenues.

Both companies held user events in Vegas this past month: NetApp Insight 2017 last week, and the Hitachi NEXT 2017 conference two weeks ago.

As both companies respond to industry trends, they provide an interesting opportunity to watch companies in transition.

Company role

  • NetApp’s underlying theme is changing the world with data, and they want to help companies do this.
  • Vantara’s philosophy is that data and processing are ultimately moving into the Internet of Things (IoT), and they want to be wherever the data takes them.

Hitachi Vantara is a brand new company that combines Hitachi Data Systems, Hitachi Insight Group and Pentaho (an analytics acquisition) into one organization to go after the IoT market. Pentaho will continue as a separate brand/subsidiary, but HDS and Insight Group cease to exist as separate companies/subsidiaries and are now inside Vantara.

NetApp sees transitions occurring in the way IT conducts business but ultimately, a continuing and ongoing role for IT. NetApp’s ultimate role is as a data service provider to IT.

Customer problem

  • Vantara believes the main customer issue is the need to digitize the business. Because competition is emerging everywhere, the only way for a company to succeed against this interminable onslaught is to digitize everything: manufacturing/service production, sales, marketing, maintenance, any and all customer touch points, across the whole value chain, and to do it as rapidly as possible. If you don’t, your competition will.
  • NetApp sees customers today have three potential concerns: 1) how to modernize current infrastructure; 2) how to take advantage of (hybrid) cloud; and 3) how to build out the next generation data center. Modernization is needed to free capital and expense from traditional IT for use in Hybrid cloud and next generation data centers. Most organizations have all three going on concurrently.

Vantara sees the threat of startups, regional operators and more advanced digitized competitors as existential for today’s companies. The only way to keep your business alive under these onslaughts is to optimize your value delivery. And to do that, you have to digitize every step in that path.

NetApp views the threat to IT as originating from LoB/shadow IT applications born and grown in the cloud, or from other groups creating next gen applications using capabilities outside of IT.

Product direction

  • NetApp is looking mostly towards the cloud. At their conference they announced a new Azure NFS service powered by NetApp. They already had Cloud ONTAP and NPS, both current cloud offerings: software defined storage in the cloud and a co-lo hardware offering directly attached to public cloud (Azure & AWS), respectively.
  • Vantara is looking towards IoT. At their conference they announced Lumada 2.0, an Industrial IoT (IIoT) product framework using plenty of Hitachi software functionality and intended to bring data and analytics under one software umbrella.

NetApp is following a path laid down years past when they devised the data fabric. Now, they are integrating and implementing data fabric across their whole product line. With the ultimate goal that wherever your data goes, the data fabric will be there to help you with it.

Vantara is broadening their focus from IT products and solutions to IoT. It’s not so much abandoning present day IT as looking forward to the day when present day IT is just one cog in an ever expanding, completely integrated digital entity which the new organization becomes.

They both had other announcements, NetApp announced ONTAP 9.3, Active IQ (AI applied to predictive service) and FlexPod SF ([H]CI with SolidFire storage) and Vantara announced a new IoT turnkey appliance running Lumada and a smart data center (IoT) solution.

Who’s right?

They both are.

Digitization is the future, the sooner organizations realize and embrace this, the better for their long term health. Digitization will happen with or without organizations and when it does, it will result in a significant re-ordering of today’s competitive landscape. IoT is one component of organizational digitization, specifically outside of IT data centers, but using IT resources.

In the mean time, IT must become more effective and efficient. This means it has to modernize to free up resources to support (hybrid) cloud applications and supply the infrastructure needed for next gen applications.

One could argue that Vantara is positioning themselves for the long term and NetApp is positioning themselves for the short term. But that denies the possibility that IT will have a role in digitization. In the end both are correct and both can succeed if they deliver on their promise.

Comments?


Google releases new Cloud TPU & Machine Learning supercomputer in the cloud

Last year about this time Google released their 1st generation TPU chip to the world (see my TPU and HW vs. SW … post for more info).

This year they are releasing a new version of the hardware, called the Cloud TPU chip, and making it available in a cluster on Google Cloud. Cloud TPU is in alpha testing now. As I understand it, access to the Cloud TPU will eventually be free to researchers who promise to publish their research openly, and available at a price for everyone else.

What’s different between TPU v1 and Cloud TPU v2

The differences between version 1 and 2 mostly seem to be tied to training Machine Learning Models.

TPU v1 didn’t have any real ability to train machine learning (ML) models. It was a relatively dumb (8 bit ALU) chip, but if you had, say, an ML model already created to do something like understand speech, you could load that model onto a TPU v1 board and have it execute very fast. The TPU v1 chip was also placed on a separate PCIe board (I think), connected to normal x86 CPUs as a sort of CPU accelerator. The advantage of TPU v1 over GPUs or normal x86 CPUs was mostly in power consumption and speed of ML model execution.

Cloud TPU v2 looks to be a standalone multi-processor device connected to others via what look like Ethernet connections. One thing Google is highlighting is the Cloud TPU’s floating point performance. A Cloud TPU device (board) is capable of 180 TeraFlops (trillion or 10^12 floating point operations per second). A 64 Cloud TPU device pod can theoretically execute 11.5 PetaFlops (64 × 180 TFlops = 11,520 TFlops ≈ 11.5 × 10^15 FLOPs per second).

TPU v1 had no floating point capabilities whatsoever. So Cloud TPU is intended to speed up the training part of ML models which requires extensive floating point calculations. Presumably, they have also improved the ML model execution processing in Cloud TPU vs. TPU V1 as well. More information on their Cloud TPU chips is available here.

So how do you code a TPU?

Both TPU v1 and Cloud TPU are programmed by Google’s open source TensorFlow. TensorFlow is a set of software libraries to facilitate numerical computation via data flow graph programming.

Apparently with data flow programming you have many nodes and many more connections between them. When a connection fires between nodes it transfers a multi-dimensional matrix (tensor) to the receiving node. The node takes this multidimensional array, does some (floating point) calculations on the data, and then determines which of its outgoing connections to fire and what tensor to send across those connections.
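Here’s a tiny sketch of that graph model, written against the TensorFlow 1.x graph API mentioned below (later TensorFlow versions execute eagerly instead):

```python
import tensorflow as tf

# Nodes are operations; the edges between them carry tensors
a = tf.constant([[1.0, 2.0]])        # 1x2 tensor
b = tf.constant([[3.0], [4.0]])      # 2x1 tensor
c = tf.matmul(a, b)                  # matmul node consuming both tensors

# Building the graph runs nothing; a session executes the data flow
with tf.Session() as sess:
    print(sess.run(c))               # [[11.]]
```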

Apparently, TensorFlow works with X86 servers, GPU chips, TPU v1 or Cloud TPU. Google TensorFlow 1.2.0 is now available. Google says that TensorFlow is in use in over 6000 open source projects. TensorFlow uses Python and 1.2.0 runs on Linux, Mac, & Windows. More information on TensorFlow can be found here.

So where can I get some Cloud TPUs

Google is releasing their new Cloud TPU in the TensorFlow Research Cloud (TFRC). The TFRC has 1000 Cloud TPU devices connected together, which any organization can use to train and execute machine learning algorithms.

I signed up (here) to be an alpha tester. During the signup process the site asked me what hardware (GPUs, CPUs) and platforms I was currently using to train my ML models, how long my ML models take to train, how large a training (data) set I use (ranging from 10GB to >1PB), as well as other ML model oriented questions. I guess they’re trying to understand what the market requirements are outside of Google’s own use.

Google’s been using more ML and other AI technologies in many of their products and this will no doubt accelerate with the introduction of the Cloud TPU. Making it available to others is an interesting play but this would be one way to amortize the cost of creating the chip. Another way would be to sell the Cloud TPU directly to businesses, government agencies, non government agencies, etc.

I have no real idea what I am going to do with alpha access to the TFRC but I was thinking maybe I could feed it all my blog posts and train a ML model to start writing blog post for me. If anyone has any other ideas, please let me know.

Comments?

Photo credit(s): From Google’s website on the new Cloud TPU


PCM based neuromorphic processors

Read an interesting article in The Register the other day about IBM’s Almaden Research lab using standard non-volatile memory devices to implement a neural net. They apparently used 2-PCM (Phase Change Memory) devices to implement a 913 neuron/165K synapse pattern recognition system.

This seems to be another (simpler, cheaper) way to create neuromorphic chips. We’ve written about neuromorphic chips before (see my posts on IBM SyNAPSE, IBM TrueNorth and MIT’s analog neuromorphic chip). The latest TrueNorth chip from IBM uses ~5B transistors and provides 1M neurons with 256M synapses.

But none of the other research I have read described the neuromorphic “programming” process at this level of detail, nor provided a “success rate” on a standard AI pattern matching benchmark, as IBM has with the PCM device.

PCM based AI

The IBM summary report on the research discusses at length how the pattern recognition neural network (NN) was “trained” and how the 913 neuron/165K synapse NN achieved 82% accuracy on the NIST handwritten digit training database.

The paper has many impressive graphics. The NN was designed as a 3-layer network and used back propagation for its learning process; they show how back propagation training was used to determine the synaptic weights.
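As a reminder of what back propagation on a small 3-layer (input, hidden, output) net looks like, here’s a generic software sketch; this is not IBM’s PCM implementation, and the layer sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(0, 0.1, (784, 64))    # input -> hidden weights
W2 = rng.normal(0, 0.1, (64, 10))     # hidden -> output weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, y, lr=0.1):
    """One backprop update for a single (input, one-hot label) example."""
    global W1, W2
    # Forward pass through both weight layers
    h = sigmoid(x @ W1)
    out = sigmoid(h @ W2)
    # Backward pass: push the output error back through each weight layer
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * np.outer(h, d_out)
    W1 -= lr * np.outer(x, d_h)
```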

The other interesting thing was they analyzed how hardware faults (stuck-ats, dead conductors, number of resets, etc.) and different learning parameters (stochasticity, learning batch size, variable maxima, etc.) impacted NN effectiveness on the test database.

Turns out the NN could tolerate ~30% dead conductors (in the synapses) or 20% stuck-ats in the PCM memory and still achieve pretty good accuracy on the training set. I’m not sure I understand all the learning parameters, but they varied batch size from 1 to 10 and this didn’t seem to impact NN accuracy whatsoever.
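One could experiment with that kind of fault tolerance in software by clamping a fraction of the weights in the sketch above; a simple simulation of stuck-at faults (my own, not IBM’s methodology) might look like:

```python
import numpy as np

def apply_stuck_ats(W, fraction, rng, stuck_value=0.0):
    """Clamp a random fraction of weights to a fixed value,
    simulating stuck-at faults in a PCM synapse array."""
    mask = rng.random(W.shape) < fraction
    W_faulty = W.copy()
    W_faulty[mask] = stuck_value
    return W_faulty

rng = np.random.default_rng(1)
W2_faulty = apply_stuck_ats(W2, fraction=0.20, rng=rng)
# Re-run validation with W2_faulty in place of W2 to measure the accuracy hit
```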

Which PCM was used?

In trying to understand which PCM devices were used, the only information available said it was a 180nm device. According to a 2012 Flash Memory Summit report on alternative NVM technologies, 180nm PCM devices have been around since 2004; a 90nm PCM device was introduced in 2008 with 128Mb, and even newer PCM devices at 45nm were introduced in 2010 with 1Gb of memory. So I would conclude that the 180nm PCM device supported ~16 to 32Mb.

What can we do with today’s PCM technology?

With the industry supporting a doubling of transistors/chip every 2 years, a PCM device in 2014 should have 4X the transistors of the 45nm, 2010 device above and ~4-8X the memory. So today we should be seeing 4-16Gb PCM chips at ~22nm. Given this, current PCM technology should support 32-64X more neurons than the 180nm devices, or ~29K to ~58K neurons (913 × 32 ≈ 29K; 913 × 64 ≈ 58K).

It’s unclear what technology was used for the ‘synapses’, but based on the time frame for the PCM devices, they should also scale up by a factor of 32-64X, or to between ~5.3M and ~10.6M synapses.

Still, this doesn’t approach TrueNorth’s neuron/synapse levels, but it’s close. Then again, two 4-16Gb PCMs probably don’t cost nearly as much to purchase as TrueNorth costs to create.

The programming model for the TrueNorth/SyNAPSE chips doesn’t appear to be neural network like. So perhaps another advantage of the PCM model of hardware based AI is that you can use standard, well known NN programming methods to train and simulate it.

So, PCM based neural networks seem an easier way to create hardware based AI. Not sure this will ever match Neuron/Synapse levels that the dedicated, special purpose neuromorphic chips in development can accomplish but in the end, they both are hardware based AI that can support better pattern recognition.

Using commodity PCM devices any organization with suitable technological skills should be able to create a hardware based NN that operates much faster than any NN software simulation. And if PCM technology starts to obtain market acceptance, the funding available to advance PCMs will vastly exceed that which IBM/MIT can devote to TrueNorth and its descendants.

Now, what is HP up to with their memristor technology and The Machine?

Photo Credits: Neurons by Leandro Agrò

RoS video interview with Ron Redmer Sr. VP Cybergroup

Ray interviewed Ronald Redmer, Sr. VP Cybergroup at EMC’s Global Analyst Summit back in October. Ron is in charge of engineering and product management of their new document analytics service offering. Many of their new service offerings depend on EMC Federation solutions such as ViPR (see my post EMC ViPR virtues & vexations but no virtualization), Pivotal HD, and other offerings.

This was recorded on October 28th in Boston.