Strategic planning – Silverton Consulting

One agent to rule them all, Deepmind’s Gato – AGI part 7

Posted on August 30, 2023August 30, 2023 by Ray in AGI, Deep Learning, Scenario planning, Strategic Inflection Points, Strategic planning, System effectiveness

I was perusing Deepmind’s mountain of research today and ran across one article on their Gato agent (A Generalist Agent abstract, paper pdf). These days with Llama 2, GPT-4 and all the other LLM’s doing code, chatbots, image generation, etc. it seems generalist agents are everywhere. But that’s not quite right.

Gato can not only generate text from prompts, but can also control a robot arm for pick and place, caption images, navigate in 3D, play Atari and other (shooter) video games, etc. all with the same exact model architecture and the same exact NN weights with no transfer learning required.

Same weights/same model is very unusual for generalist agents. Historically, generalist agents were all specifically trained on each domain and each resultant model had distinct weights even if they used the same model architecture. For Deepmind, to train Gato and use the same model/same weights for multiple domains is a significant advance.

Gato has achieved significant success in multiple domains. See chart below. However, complete success is still a bit out of reach but they are making progress.

For instance, in the chart one can see that their are over 200 tasks in the DM Lab arena that the model is trained to perform and Gato’s mean performance for ~180 of them is above a (100%) expert level. I believe DM Lab stands for Deepmind Lab and is described as a (multiplayer, first person shooter) 3D video game built on top of Quake III arena.

Deepmind stated that the mean for each task in any domain was taken over 50 distinct iterations of the same task. Gato performs, on average, 450 out of 604 “control” tasks at better than 50% human expert level. Please note, Gato does a lot more than just “control tasks”.

Model size and RT robotic control

One thing I found interesting is that they kept the model size down to 1.2B parameters so that it can perform real time inferencing in controlling robot arms. Over time as hardware speed increases, they believe they should be able train larger models and still retain real-time control. But at the moment, with a 1.2B model it can still provide. real time inferencing.

In order to understand model size vs. expertise they used 3 different model sizes training on same data, 79M, 364M and 1.2B parameters. As can be seen on the above chart, the models did suffer in performance as they got smaller. (Unclear to me what “Tokens Processed” on the X axis actually mean other than data length trained with.) However, it seems to imply, that with similar data, bigger models performed better and the largest did 10 to 20% better than the smallest model trained with same data streams.

Examples of Gato in action

The robot they used to train for was a “Sawyer robot arm with 3-DoF cartesian velocity control, an additional DoF for velocity, and a discrete gripper action.” It seemed a very flexible robot arm that would be used in standard factory environments. One robot task was to stack different styles and colors of plastic blocks.

Deepmind says that Gato provides rudimentary dialogue generation and picture captioning capabilities. Looking at the chat streams persented, seems more than rudimentary to me.

Deepmind did try the (smaller) model on some tasks that it was not originally trained on and it seemed to perform well after “fine-tuning” on the task. In most cases, using fine-tuning of the original model, with just “same domain” (task specific) data, the finely tuned model achieved similar results to what it achieved if Gato was trained from scratch with all the data used in the original model PLUS that specific domain’s data.

Data and tokenization used to train Gato

Deepmind is known for their leading edge research in RL but Gato’s deep neural net model is all trained with supervised learning using transformer techniques. While text based transformer type learning is pervasive in LLM today, vast web class data sets on 3D shooter gaming, robotic block stacking, image captioning and others aren’t nearly as widely available. Below they list the data sets Deepmind used to train Gato.

One key to how they could train a single transformer NN model to do all this, is that they normalized ALL the different types of data above into flat arrays of tokens.

Text was encoded into one of 32K subwords and was represented by integers from 0 to 32K. Text is presented to the model in word order
Images were transformed into 16×16 pixel patches in rastor order. Each pixel is normalized -1,1.
Other discrete values (e.g. Atari button pushes) are flattened into sequences of integers and presented to the model in row major order.
Continuous values (robot arm joint torques) are 1st flattened into sequences of floats in row major order and then mu-law encoded into the range -1,1 and then discretized into one of 1024 bins.

After tokenization, the data streams are converted into embeddings. Much more information on the tokenization and embedding process used in the model is available in the paper.

One can see the token count of the training data above. Like other LLMs, transformers take a token stream and randomly zero one out and are trained to guess that correct token in sequence.

~~~~

The paper (see link above and below) has a lot more to say about the control and non-control domains and the data used in training/fine-tuning Gato, if you’re interested. They also have a lengthy section on risks and challenges present in models of this type.

My concern is that as generalist models become more pervasive and as they are trained to work in more domains, the difference between an true AGI agent and a Generalist agent starts to blur.

Something like Gato that can both work in real world (via robotics) and perform meta analysis (like in metaworld), play 1st person shooter games, and analyze 2D and 3D images, all at near expert levels, and oh, support real time inferencing, seems to not that far away from something that could be used as a killer robot in an army of the future and this is just where Gato is today.

One thing I note is that the model is not being made generally available outside of Google Deepmind. And IMHO, that for now is a good thing.

That is until some bad actor gets their hands on it….

Picture Credit(s):

All images, charts, and tables are from “A Generalist Agent” paper

The Hollowing out of enterprise IT

Posted on October 14, 2022 by Ray in Cloud services, Forecasting, IoT, Market dynamics, Storage, Strategic Inflection Points, Strategic planning, Visionary leadershp

We had a relatively long discussion yesterday, amongst a bunch of independent analysts and one topic that came up was my thesis that enterprise IT is being hollowed out by two forces pulling in opposite directions on their apps. Those forces are the cloud and the edge.

Western part of the abandoned Packard Automotive Plant in Detroit, Michigan. by Albert Duce

Cloud sirens

The siren call of the cloud for business units, developers and modern apps has been present for a long time now. And their call is more omnipresent than Odysseus ever had to deal with.

The cloud’s allure is primarily low cost-instant infrastructure that just works, a software solution/tool box that’s overflowing, with locations close to most major metropolitan areas, and the extreme ease of starting up.

If your app ever hopes to scale to meet customer demand, where else can you go. If your data can literally come in from anywhere, it usually lands on the cloud. And if you have need for modern solutions, tools, frameworks or just about anything the software world can create, there’s nowhere else with more of this than the cloud.

Pre-cloud, all those apps would have run in the enterprise or wouldn’t have run at all. And all that data would have been funneled back into the enterprise.

Not today, the cloud has it all, its siren call is getting louder everyday, ever ready to satisfy every IT desire anyone could possibly have, except for the edge.

The Edge, last bastion for onsite infrastructure

The edge sort of emerged over the last decade or so kind of in stealth mode. Yes there were always pockets of edge, with unique compute or storage needs. For example, video surveillance has been around forever but the real acceleration of edge deployments started over the last decade or so as compute and storage prices came down drastically.

These days, the data being generated is stagering and compute requirements that go along with all that data are all over the place, from a few ARMv/RISC V cores to a server farm.

For instance, CERN’s LHC creates a PB of data every second of operation (see IEEE Spectrum article, ML shaking up particle physics too). But they don’t store all that. So they use extensive compute (and ML) to try to only store interesting events.

Seismic ships roam the seas taking images of underground structures, generating gobs of data, some of which is processed on ship and the rest elsewhere. A friend of mine creates RPi enabled devices that measure tank liquid levels deployed in the field.

More recently, smart cars are like a data center on tires, rolling across roads around the world generating more data than you want can even imagine. 5G towers are data centers ontop of buildings, in farmland, and in cell towers doting the highways of today. All off the beaten path, and all places where no data center has ever gone before.

In olden days there would have been much less processing done at the edge and more in an enterprise data center. But nowadays, with the advent of relatively cheap computing and storage, data can be pre-processed, compressed, tagged all done at the edge, and then sent elsewhere for further processing (mostly done in the cloud of course).

IT Vendors at the crossroads

And what does the hollowing out of the enterprise data centers mean for IT server and storage vendors, mostly danger lies ahead. Enterprise IT hardware spend will stop growing, if it hasn’t already, and over time, shrink dramatically. It may be hard to see this today, but it’s only a matter of time.

Certainly, all these vendors can become more cloud like, on prem, offering compute and storage as a service, with various payment options to make it easier to consume. And for storage vendors, they can take advantage of their installed base by providing software versions of their systems running in the cloud that allows for easier migration and onboarding to the cloud. The server vendors have no such option. I see all the above as more of a defensive, delaying or holding action.

This is not to say the enterprise data centers will go away. Just like, mainframe and tape before them, on prem data centers will exist forever, but will be relegated to smaller and smaller, niche markets, that won’t grow anymore. But, only as long as vendor(s) continue to upgrade technology AND there’s profit to be made.

It’s just that that astronomical growth, that’s been happening ever since the middle of last century, happen in enterprise hardware anymore.

Long term life for enterprise vendors will be hard(er)

Over the long haul, some server vendors may be able to pivot to the edge. But the diversity of compute hardware there will make it difficult to generate enough volumes to make a decent profit there. However, it’s not to say that there will be 0 profits there, just less. So, when I see a Dell or HPE server, under the hood of my next smart car or inside the guts of my next drone, then and only then, will I see a path forward (or sustained revenue growth) for these guys.

For enterprise storage vendors, their future prospects look bleak in comparison. Despite the data generation and growth at the edge, I don’t see much of a role for them there. The enterprise class feature and functionality, they have spent the decades creating and nurturing aren’t valued as much in the cloud nor are they presently needed in the edge. Maybe I’m missing something here, but I just don’t see a long term play for them in the cloud or edge.

~~~~

For the record, all this is conjecture on my part. But I have always believed that if you follow where new apps are being created, there you will find a market ready to explode. And where the apps are no longer being created, there you will see a market in the throws of a slow death.

Photo Credit(s):

From Wikipedia Article on Urban decay, CC BY-SA 3.0
Surveillance, by Blake Burkhart
Kansas Portrait by Patrick Emerson
_MG_0181 by Last.fm

Dell EMC PowerStore X and the Edge – TFDxDell

Posted on November 11, 2021November 11, 2021 by Ray in data services, IoT, Strategic planning, System effectiveness, VMware storage, vVOL storage

This past summer I attended a virtual TFDxDell event where there was a number of sessions discussing Dell EMC technologies for the enterprise. One session sort of struck a nerve, the Dell EMC PowerStore session and I have finally figured out what interested me most in their talk, their PowerStore X appliances and AppsON technologies

What is AppsON and PowerStore X appliance?

Essentially PowerStore X with AppsON has an onboard ESXi hypervisor which allows customers to run vSphere VMs inside the storage system with direct vVol (I assume) access to PowerStore data storage without having to go out over a (storage) network.

PowerStore X ESXi is a little behind the most recent VMware vSphere releases (at least 30 days) but it’s current enough for most shops. In non-PowerStore X appliances, PowerStoreOS runs as containers but in PowerStore X, PowerStoreOS storage functionality runs as VMs, just like any other VMs running on its ESXi hypervisor.

Moreover, PowerStore X can still service IOs from other non-PowerStore X resident VMs or bare metal applications running in the environment. In this way you get all the data services of an enterprise class storage system, that also run VMs.

With PowerStore OS 2.0 they have added scale out to AppsON. That is any PowerStore X (1000X, 3000X, 5000X or 7000X) appliance, in a PowerStore X cluster, can have their VMs move from one appliance to another using vSphere vMotion. This means that as your PowerStore X storage clusters grow, you can rebalance VM application workloads across the cluster. A PowerStore X cluster can contain up to 4 PowerStore X appliances.

PowerStore’s heritage goes back quite a ways at Dell and EMC. Prior versions of EMC Unity storage and some of its progenitors had the ability to run applications on the storage itself. But by running an ESXi hypervisor on PowerStore X appliances, it takes all this to a whole new level.

Why would anyone want AppsON?

It’s taken me sometime to understand why anyone would want to use AppsON and I have concluded that the edge might be the best environment to deploy it.

Recent VMware enhancements have reduced minimum node configurations for edge environments to 2 servers. It’s unclear to me whether a single PowerStore X appliance with AppsON is one server or two but, for the moment lets assume its just one. This means that a minimum VMware vSphere edge deployment could use 1 PowerStore X and 1 standalone, ESXi server.

In such an environment, customers could run their data intensive VMs directly on the PowerStore X and some of their non-data intensive VMs on the standalone server. But the flexibility exists to vMotion VMs from one to the other as demand dictates.

But does the edge need storage?

Yes, some do. For instance, take 5G. it enables a whole new class of mobile services and many of them can be quite data intensive. 5G is being deployed around the world as mini-data centers in cell towers. Unclear whether these data centers run vSphere but I’m sure VMware is trying their hardest to make that happen. With vSphere running your 5G mini-datacenter, PowerStore X could make a smart addition.

Then there’s all the smart cars, which are creating TBs of sensor data every time they take to the road. You’re probably not going to have a PowerStore appliance in your smart car (at least anytime soon) but they just might have one at the local service station.

And maybe given all the smart devices in your home, smart cars, smart appliances, smart robots, etc., there’s going to be a whole lot of data generated from your smart home. Having something like PowerStore X in your smart home’s mini-data center would offer a place to hold all that data and to do some processing (compressing maybe) before sending it up to the cloud.

~~~~

We have just two more questions for Dell EMC,

Shouldn’t the base PowerStore appliance be called PowerStore K?
Shouldn’t customers be allowed to run their own K8s container apps on their PowerStore K just as easily as running VMs in their PowerStore X?

Legal Disclosure: TechFieldDay and Dell provided gifts to all participants (including me) for the TFDxDell event.

Photo credit(s):

From Dell EMC slides presented at TFDxDell event
From Dell EMC slides presented at TFDxDell event
From Dell EMC slides presented at TFDxDell event

For AGI, is reward enough – part 4

Posted on September 22, 2021September 22, 2021 by Ray in AGI, Artificial Intelligence, Cognitive computing, Machine Learning, Reinforcement Learning, Scenario planning, Strategic Inflection Points, Strategic planning, Visionary leadershp

Last May, an article came out of DeepMind research titled Reward is enough. It was published in an artificial intelligence journal but PDFs of it are available free of charge.

The article points out that according to DeepMind researchers, using reinforcement learning and an appropriate reward signal is sufficient to attain AGI (artificial general intelligence). We have written about the perils and pitfalls of AGI before (see Existential event risks [-part-0], NVIDIA Triton GMI, a step to far[-part-1], The Myth of AGI [-part-2], and Towards a better AGI – part 3ish. (Sorry, I only started numbering them after part 3ish).

My last post on AGI inclined towards the belief that AGI was not possible without combining deduction, induction and abduction (probabilistic reasoning) together and that any such AGI was a distant dream at best.

Then I read the Reward is Enough article and it implied that they saw a realistic roadmap towards achieving AGI based solely on reward signals and Reinforcement Learning (wikipedia article on Reinforcement Learning ). To read the article was disheartening at best. After the article came out, I made it a hobby to understand everything I could about Reinforcement Learning to understand whether what they are talking is feasible or not.

Reinforcement learning, explained

Let’s just say that the text book, Reinforcement Learning, is not the easiest read I’ve seen. But I gave it a shot and although I’m no where near finished, (lost somewhere in chapter 4), I’ve come away with a better appreciation of reinforcement learning.

The premise of Reinforcement Learning, as I understand it, is to construct a program that performs a sequence of steps based on state or environment the program is working on, records that sequence and tags or values that sequence with a reward signal (i.e., +1 for good job, -1 for bad, etc.). Depending on whether the steps are finite, i.,e, always ends or infinite, never ends, the reward tagging could be cumulative (finite steps) or discounted (infinite steps).

The record of the program’s sequence of steps would include the state or the environment and the next step that was taken. Doing this until the program completes the task or if, infinite, whenever the discounted reward signal is minuscule enough to not matter anymore.

Once you have a log or record of the state, the step taken in that state and the reward for that step you have a policy used to take better steps. Over time, with sufficient state-step-reward sequences, one can build a policy that would work’s very well for the problem at hand.

Reinforcement learning, a chess playing example

Let’s say you want to create a chess playing program using reinforcement learning. If a sequence of moves ends the game, you can tag each move in that sequence with a reward (say +1 for wins, 0 for draws and -1 for losing), perhaps discounted by the number of moves it took to win. The “sequence of steps” would include the game board and the move chosen by the program for that board position.

Figure 2: Comparison with specialized programs. (A) Tournament evaluation of AlphaZero in chess, shogi, and Go in matches against respectively Stockfish, Elmo, and the previously published version of AlphaGo Zero (AG0) that was trained for 3 days. In the top bar, AlphaZero plays white; in the bottom bar AlphaZero plays black. Each bar shows the results from AlphaZero’s perspective: win (‘W’, green), draw (‘D’, grey), loss (‘L’, red). (B) Scalability of AlphaZero with thinking time, compared to Stockfish and Elmo. Stockfish and Elmo always receive full time (3 hours per game plus 15 seconds per move), time for AlphaZero is scaled down as indicated. (C) Extra evaluations of AlphaZero in chess against the most recent version of Stockfish at the time of writing, and against Stockfish with a strong opening book. Extra evaluations of AlphaZero in shogi were carried out against another strong shogi program Aperyqhapaq at full time controls and against Elmo under 2017 CSA world championship time controls (10 minutes per game plus 10 seconds per move). (D) Average result of chess matches starting from different opening positions: either common human positions, or the 2016 TCEC world championship opening positions . Average result of shogi matches starting from common human positions . CSA world
championship games start from the initial board position.

If your policy incorporates enough winning chess move sequences and the program encounters one of these in a game and if move recorded won, select that move, if lost, select another valid move at random. If the program runs across a board position its never seen before, choose a valid move at random.

Do this enough times and you can build a winning white playing chess policy. Doing something similar for black playing program would build a winning black playing chess policy.

The researchers at DeepMind explain their AlphaZero program which plays chess, shogi, and Go in another research article, A general reinforcement learning algorithm that masters chess, shogi and Go through self-play.

Reinforcement learning and AGI

So now what does all that have to do with creating AGI. The premise of the paper is that by using rewards and reinforcement learning, one could program a policy for any domain that one encounters in the world.

For example, using the above chart, if we were to construct reinforcement learning programs that mimicked perception (object classification/detection) abilities, memory ((image/verbal/emotional/?) abilities, motor control abilities, etc. Each subsystem could be trained to solve the arena needed. And over time, if we built up enough of these subsystems one could somehow construct an AGI system of subsystems, that would match human levels of intelligence.

The paper’s main hypothesis is “(Reward is enough) Intelligence, and its associated abilities, can be understood as subserving the maximization of reward by an agent acting in its environment.”

Given where I am today, I agree with the hypothesis. But the crux of the problem is in the details. Yes, for a game of multiple players and where a reward signal of some type can be computed, a reinforcement learning program can be crafted that plays better than any human but this is only because one can create programs that can play that game, one can create programs that understand whether the game is won or lost and use all this to improve the game playing policy over time and game iterations.

Does rewards and reinforcement learning provide a roadmap to AGI

To use reinforcement learning to achieve AGI implies that

One can identify all the arenas required for (human) intelligence
One can compute a proper reward signal for each arena involved in (human) intelligence,
One can programmatically compute appropriate steps to take to solve that arena’s activity,
One can save a sequence of state-steps taken to solve that arena’s problem, and
One can run sequences of steps enough times to produce a good policy for that arena.

There are a number of potential difficulties in the above. For instance, what’s the state the program operates in.

For a human, which has 500K(?) pressure, pain, cold, & heat sensors throughout the exterior and interior of the body, two eyes, ears, & nostrils, one tongue, two balance sensors, tired, anxious, hunger, sadness, happiness, and pleasure signals, and 600 muscles actuating the position of five fingers/hand, toes/foot, two eyes ears, feet, legs, hands, and arms, one head and torso. Such a “body state, becomes quite complex. Any state that records all this would be quite large. Ok it’s just data, just throw more storage at the problem – my kind of problem.

The compute power to create good policies for each subsystem would also be substantial and in the end determining the correct reward signal would be non-trivial for each and every subsystem. Yet, all it takes is money, time and effort and all this could be accomplished.

So, yes, given all the above creating an AGI, that matches human levels of intelligence, using reinforcement learning techniques and rewards is certainly possible. But given all the state information, action possibilities and reward signals inherent in a human interacting in the world today, any human level AGI, would seem unfeasible in the next year or so.

One item of interest, recent DeepMind researchers have create MuZero which learns how to play Go, Chess, Shogi and Atari games without any pre-programmed knowledge of the games (that is how to play the game, how to determine if the game is won or lost, etc.). It managed to come up with it’s own internal reward signal for each game and determined what the proper moves were for each game. This seemed to combine a deep learning neural network together with reinforcement learning techniques to craft a rewards signal and valid move policies.

Alternatives to full AGI

But who says you need AGI, for something that might be a useful to us. Let’s say you just want to construct an intelligent oracle that understood all human generated knowledge and science and could answer any question posed to it. With the only response capabilities being audio, video, images and text.

Even an intelligent oracle such as the above would need an extremely large state. Such a state would include all human and machine generated information at some point in time. And any reward signal needed to generate a good oracle policy would need to be very sophisticated, it would need to determine whether the oracle’s answer; was good or not. And of course the steps to take to answer a query are uncountable, 1st there’s understanding the query, next searching out and examining every piece of information in the state space for relevance, and finally using all that information to answer to the question.

I’m probably missing a few steps in the above, and it almost makes creating a human level AGI seem easier.

Perhaps the MuZero techniques might have an answer to some or all of the above.

~~~~

Yes, reinforcement learning is a valid roadmap to achieving AGI, but can it be done today – no. Tomorrow, perhaps.

Photo credit(s):

From DeepMind research paper, A general reinforcement learning algorithm that masters chess, shogi and Go through self-play
From DeepMind research paper, Reward is enough

Photonic [AI] computing seeing the light of day – part 2

Posted on June 16, 2021June 16, 2021 by Ray in Artificial Intelligence, Deep Learning, Energy efficiency, Photonic computing, Strategic Inflection Points, Strategic planning

Read an interesting article in Analytics India Magazine (MIT Researchers Make New Chips That Work On Light) about a startup out of MIT focused on using photonics for AI/ML/DL activities. Not exactly neuromorphic chips, but using analog photonics interactions to perform computational intensive operations required by todays deep neural net training.

We’ve written about photonics computing before ( see Photonic computing seeing the light of day [-part 1]). That post was about spin outs from Princeton and MIT back in 2019. We showed a bit more on how photonics can perform multiplication and other computations with less power.

The article (noted above) talked about LightIntelligence, an MIT spinout/ startup that’s been around since ~2017, but there’s another company in the same space, also out of MIT called LightMatter that just announced early access to their hardware system.

The CEOs of both companies collaborated on a paper (#1&2 authors of the 10 author paper) written back in 2017 on Deep Learning with Coherent Nanophotonic Circuits. This seemed to be the event that launched both companies.

LightMatter just received $80M in Series B funding ( bringing total funding to $113M) last month and LightIntelligence seems to have $40M in total funding So both have decent funding but, LightMatter seems further ahead in funding and product technology.

LightMatter

LightMatter Envise AI chip uses standard RISC electronic cores together with Photo Arithmetic Units for accelerated AI computations. Each Envise chip has 500MB of SRAM for large models, offers 400Gbps chip to chip interconnect fabric, 256 RISC cores, a Graph processor, 294 photonic arithmetic units and PCIe 4.0 connectivity.

LightMatter has just announced early access for their Envise AI photonics server. It’s an 4U, AI server with 16 Envise chips, 2 AMD EPYC CPUs, (16×400=)6.4Tpbs optical fabric for inter-chip communications, 1TB of DDR4 DRAM, 3TB of NVMe SSD and supports 2-200GbE SmartNICs for outside communications.

Envise also offers Idiom Software that interfaces with standard AI frameworks to transform models for photonics computing to use Envise hardware . Developers select Envise hardware to run their AI models on and Idiom automatically re-compiles (IdCompile) their model into more parallelized, photonics operations. Idiom also has a model profiler (IdProfiler) to help debug and visualize photonic models in operation (training or inferencing?) on Envise hardware. Idiom also offers an AI model library (IdML) which provides a PyTorch frontend to help compress and quantize a standard set of AI models.

LightMatter also announced their Passage optical interconnect chip that supplies 100Tbps optical switch for photonics, CPU or GPU processing. It’s huge, 8″x8″ and built on 5nm/7nm node process. Passage can connect up to 48 photonics, CPU or GPU chips that are built onto of it (one can see the space for each of these 48 [sub-]chips on the chip). LightMatter states that 40 Passage (photonic/optical) lanes are the width of one optical fibre. Passage chips are sampling now.

LightMatter Passage *photonics-transistor* chip (carrier) that provides a photonics programmable interconnect for inter-[photonics-electronic-]chip communications.

LightIntelligence

They don’t appear to be announcing any specific hardware just yet but they are at work in creating the world largest integrated photonics processing system. But LightIntelligence have published a number of research papers focused on photonic approaches to CNNs, RNNs/LSTMs/GRUs, Recurrent ISING machines, statistical computing, and invisibility cloaking.

Turns out the processing power needed to provide invisibility cloaking is very intensive and as its all pixels, photonics offers serious speedups (for invisibility, see Nature article, behind paywall).

Photonics Recurrent ISLING Sampler (PRIS)

LightIntelligence did produce a prototype photonics processor in 2019. And they believe the will have de-risked 80-90% of their photonics technology by year end 2021.

If I had to guess, it would appear as if LightIntelligence is trying to re-imagine deep learning taking a predominately all photonics approach.

Why photonics for AI DL

It turns out that one can use the interaction/interference between two light beams to perform matrix multiplication and other computations a lot faster, with a lot less power than using standard RISC (or CISC) electronic processor architectures. Typical GPUs run 400W each and multi-GPU training activities are commonplace today.

The research documented in the (Deep learning using nanophotonics) paper was based on using an optical FPGA which we have talked about before (See Photonics or Optical FPGAs on the horizon) to prototype the technology back in 2017.

Can photonics change the technology underpinning AI or computing?

If by using photonics, one could speed up AI inferencing by 3-5X and do it with 5-6X less power, you might have a market. These are LightMatter Envise performance numbers on ResNet50 with ImageNet and BERT-Base with SQUAD v1.1 against NVIDIA DGX-A100 (state of the art) AI processing system.

The challenge to changing the technology behind multi-million/billion/trillion dollar industry is that it’s not sufficient to offer a product better than the competition. One has to offer a technology that’s better enough to fund the building of a new (multi-million/billion/trillion dollar) ecosystem surrounding that technology. In order to do that it’s got to be orders of magnitude faster/lower power/better so that commercial customers adopt it en masse.

I like where LightMatter is going with their Passage chip. But their Envise server doesn’t seem fast enough to give them enough traction to build a photonics ecosystem or to fund Envise 2, 3, 4, etc. to change the industry.

The 2017 (Deep learning using nanophotonics) paper predicted that an all optical/photonics implementation of CNN would use 3 orders of magnitude less power for small models and that advantage would only go up for larger models (not counting power for data movement, photo detectors, etc.). Now if that’s truly feasible and maybe it takes a more photonics intensive processor to get there, then photonics technology could truly transform the AI or for that matter the computing industry.

But the other thing that LightIntelligence and LightMatter may be counting on is the slowdown in Moore’s law which may inhibit further advances in electronics processing power. Whether the silicon industry is ready to throw in the towel yet on Moore’s law is TBD.

Comments?

Photo Credit(s):

From LightMatter‘s website
From Deep Learning with Coherent Nanophotonic Circuits
From Heuristic recurrent algorithms for photonic Ising machines paper

New era of graphical AI is near #AIFD2 @Intel

Posted on June 7, 2021June 7, 2021 by Ray in Artificial Intelligence, Cognitive computing, Deep Learning, Information economy, Machine Learning, Optical networking, Processing performance, Strategic Inflection Points, Strategic planning, Visionary leadershp

I attended AIFD2 ( videos of their sessions available here) a couple of weeks back and for the last session, Intel presented information on what they had been working on for new graphical optimized cores and a partner they have, called Katana Graph, which supports a highly optimized graphical analytics processing tool set using latest generation Xeon compute and Optane PMEM.

What’s so special about graphs

The challenges with graphical processing is that it’s nothing like standard 2D tables/images or 3D oriented data sets. It’s essentially a non-Euclidean data space that has nodes with edges that connect them.

But graphs are everywhere we look today, for instance, “friend” connection graphs, “terrorist” networks, page rank algorithms, drug impacts on biochemical pathways, cut points (single points of failure in networks or electrical grids), and of course optimized routing.

The challenge is that large graphs aren’t easily processed with standard scale up or scale out architectures. Part of this is that graphs are very sparse, one node could point to one other node or to millions. Due to this sparsity, standard data caching fetch logic (such as fetching everything adjacent to a memory request) and standardized vector processing (same instructions applied to data in sequence) don’t work very well at all. Also standard compute branch prediction logic doesn’t work. (Not sure why but apparently branching for graph processing depends more on data at the node or in the edge connecting nodes).

Intel talked about a new compute core they’ve been working on, which was was in response to a DARPA funded activity to speed up graphical processing and activities 1000X over current CPU/GPU hardware capabilities.

Intel presented on their PIUMA core technology was also described in a 2020 research paper (Programmable Integrated and Unified Memory Architecture) and YouTube video (Programmable Unified Memory Architecture).

Intel’s PIUMA Technology

DARPA’s goals became public in 2017 and described their Hierarchical Identity Verify Exploit (HIVE) architecture. HIVE is DOD’s description of a graphical analytics processor and is a multi-institutional initiative to speed up graphical processing. .

Intel PIUMA cores come with a multitude of 64-bit RISC processor pipelines with a global (shared) address space, memory and network interfaces that are optimized for 8 byte data transfers, a (globally addressed) scratchpad memory and an offload engine for common operations like scatter/gather memory access.

Each multi-thread PIUMA core has a set of instruction caches, small data caches and register files to support each thread (pipeline) in execution. And a PIUMA core has a number of multi-thread cores that are connected together.

PIUMA cores are optimized for TTEPS (Tera-Traversed Edges Per Second) and attempt to balance IO, memory and compute for graphical activities. PIUMA multi-thread cores are tied together into (completely connected) clique into a tile, multiple tiles are connected within a single node and multiple nodes are tied together with a 8 byte transfer optimized network into a PIUMA system.

P[I]UMA (labeled PUMA in the video) multi-thread cores apparently eschew extensive data and instruction caching to focus on creating a large number of relatively simple cores, that can process a multitude of threads at the same time. Most of these threads will be waiting on memory, so the more threads executing, the less likely that whole pipeline will need to be idle, and hopefully the more processing speedup can result.

Performance of P[I]UMA architecture vs. a standard Xeon compute architecture on graphical analytics and other graph oriented tasks were simulated with some results presented below.

Simulated speedup for a single node with P[I]UMAtechnology vs. Xeon range anywhere from 3.1x to 279x and depends on the amount of computation required at each node (or edge). (Intel saw no speedups between a single Xeon node and multiple Xeon Nodes, so the speedup results for 16 P[I]UMA nodes was 16X a single P[I]UMA node).

Having a global address space across all PIUMA nodes in a system is pretty impressive. We guess this is intrinsic to their (large) graph processing performance and is dependent on their use of photonics HyperX networking between nodes for low latency, small (8 byte) data access.

Katana Graph software

Another part of Intel’s session at AIFD2 was on their partnership with Katana Graph, a scale out graph analytics software provider. Katana Graph can take advantage of ubiquitous Xeon compute and Optane PMEM to speed up and scale-out graph processing. Katana Graph uses Intel’s oneAPI.

Katana graph is architected to support some of the largest graphs around. They tested it with the WDC12 web data commons 2012 page crawl with 3.5B nodes (pages) and 128B connections (links) between nodes.

Katana runs on AWS, Azure, GCP hyperscaler environment as well as on prem and can scale up to 256 systems.

Katana Graph performance results for Graph Neural Networks (GNNs) is shown below. GNNs are similar to AI/ML/DL CNNs but use graphical data rather than images. One can take a graph and reduce (convolute) and summarize segments to classify them. Moreover, GNNs can be used to understand whether two nodes are connected and whether two (sub)graphs are equivalent/similar.

In addition to GNNs, Katana Graph supports Graph Transformer Networks (GTNs) which can analyze meta paths within a larger, heterogeneous graph. The challenge with large graphs (say friend/terrorist networks) is that there are a large number of distinct sub-graphs within the graph. GTNs can break heterogenous graphs into sub- or meta-graphs, which can then be used to understand these relationships at smaller scales.

At AIFD2, Intel also presented an update on their Analytics Zoo, which is Intel’s MLops framework. But that will need to wait for another time.

~~~~

It was sort of a revelation to me that graphical data was not amenable to normal compute core processing using today’s GPUs or CPUs. DARPA (and Intel) saw this defect as a need for a completely different, brand new compute architecture.

Even so, Intel’s partnership with Katana Graph says that even today compute environment could provide higher performance on graphical data with suitable optimizations.

It would be interesting to see what Katana Graph could do using PIUMA technology and appropriate optimizations.

In any case, we shouldn’t need to wait long, Intel indicated in the video that P[I]UMA Technology chips could be here within the next year or so.

Comments?

Photo Credit(s):

From Intel’s AIFD2 presentations
From Intel’s PUMA you tube video

A tale of two countries and how they controlled the Coronavirus

Posted on March 18, 2020May 16, 2022 by Ray in Scenario planning, Strategic Inflection Points, Strategic planning, System effectiveness, System quality

Read an article in IEEE Spectrum last week about Taiwan’s response to COVID-19 (see: Big data helps Taiwan fight Coronavirus) which was reporting on an article in JAMA (see Response to COVID-19 in Taiwan) about Taiwan’s success in controlling the COVID-19 outbreak in their country.

I originally intended this post to be solely about Taiwan’s response to the virus but then thought that it more instructive to compare and contrast Taiwan and South Korea responses to the virus, who both seem to have it under control now (18 Mar 2020).

But first a little about the two countries (source wikipedia: South Korea and Taiwan articles):

Taiwan (TWN) and South Korea (ROK) both enjoy close proximity, trade and travel between their two countries and China

South Korea (ROK) has a population of ~50.8M, an area of 38.6K SqMi (100.0K SqKm) and extends about 680 Mi (1100 Km) away from the Asian mainland (China).
Taiwan (TWN ) has a population of ~23.4M, an area of 13.8K SqMi (35.8K Sq Km) and is about 110 Mi (180 Km) away from the Asian mainland (China).

COVID-19 disease progression & response in TWN and ROK

There’s lots of information about TWN’s response (see articles mentioned above) to the virus but less so on ROK’s response.

Nonetheless, here’s some highlights of the progression of the pandemic and how they each reacted (source for disease/case progression from : wikipedia Coronavirus timeline Nov’19 to Jan’20, and Coronavirus timeline Feb’20; source for TWN response to virus JAMA article supplement and ROK response to virus Timeline: What the world can learn from South Korea’s COVID-19 response ).

Dec. 31, 2019: China Wuhan municipal health announced “urgent notice on the treatment of pneumonia of unknown cause”. Taiwan immediately tightened inbound screening processes. ==> TWN: officials board and inspect passengers for fever or pneumonia symptoms on direct flights from Wuhan
Jan. 8, 2020: ROK identifies 1st possible case of the disease in a women who recently returned from China Wuhan province
Jan 20: ROK reports 1st laboratory confirmed case ==> TWN: Central Epidemic Command Center activated, activates Level 2 travel alert for Wuhan; ROK CDC starts daily press briefings on disease progress in the nation
Jan. 21: TWN identifies 1st laboratory confirmed case ==> TWN: activates Level 3 travel alert for Wuhan
Jan 22: ==> TWN: cancels entry permits for 459 tourists from Wuhan set to arrive later in Jan
Jan 23: ==> TWN: bans residents from Wuhan, travelers from China required to make online health declaration before entering
Jan. 24 ROK reports 2nd laboratory confirmed case ==> TWN bans export of facemasks; ROK, sometime around now the gov’t started tracking confirmed cases using credit card and CCTV data to understand where patients contacted the disease
Jan. 25: ==> TWN: tours to china are suspended until Jan 31, activates level 3 travel alert for Hubei Province and Level 2 for rest of China, enacts export ban on surgical masks until Feb 23
Jan 26: ==> TWN: all tour groups from Wuhan have to leave,
Jan. 27: TWN reports 1st domestic transmission of the disease ==>TWN NHIA and NIA (National health and immigration authorities) integrate (adds all hospital) patients past 14-day travel history to NHIA database, all tour groups from Hubei Province have to leave
Jan 28: ==> TWN: activates Level 3 travel alert for all of China except Hong Kong and Macau; ROK requests inspection of all people who have traveled from Wuhan in the past 14 days
Jan 29: ==> TWN: institutes electronic monitoring of all quarantined patients via gov’t issued cell phones; ROK about now requests production of massive numbers of WHO approved test kits for the Coronavirus
Jan. 30: ROK reports 2 more (4 total) confirmed cases of the disease ==> TWN: tours to or transiting China suspended until Feb 29;
Jan 31: ==> TWN: all remaining tour groups from China asked to leave
Feb 2 ==> TWN extended school break from Feb 15 to Feb 25,gov’t facilities available for quarantine, soldiers mobilized to man facemask production lines, 60 additional machines installed daily facemask output to reach 10M facemasks a day.
Feb 3: ==> TWN: enacts name based rationing system for facemasks, develops mobile phone app to allow public to see pharmacy mask stocks, Wenzhou city Level 2 travel alert; ROK CDC releases enhanced quarantine guidelines to manage the disease outbreak, as of today ROK CDC starts making 2-3 press releases a day on the progress of the disease
Feb 5: ==> TWN: Zheijanp province Level 2 travel alert, all cruise ships with suspected cases in past 28 days banned, any cruise ship with previous dockings in China, Hong Kong, or Macau in past 14 days are banned
Feb 6:==> TWN: Tours to Hong Kong & Macau suspended until Feb 29, all Chinese nationals banned, all international cruise ship are banned, all contacts from Diamond Princess cruise ship passengers who disembarked on Jan 31 are traced
Feb 7: ==> TWN: All foriegn nationals with travel to China, Hong Kong or Macau in the past 14 days are banned, all Foreigners must see an immigration officer,
Feb 14:==> TWN: Entry quarantine system launched fill out electronic health declaration for faster entry
Feb 16: ==> TWN: NHIA database expanded to cover 30 day travel history for travelers form or transited through China, Hong Kong, Macau, Singapore and Thailand.
Feb 18 ==> TWN: all hospitals, clinics and pharmacies have access to patients travel history; ROK most institutions postpone the re-start of school after spring break
Feb 19 ==> TWN establishes gov’t policies to disinfect schools and school areas, school buses, high speed rail, railways, tour busses and taxis
Feb 20 ==> ROK Daegu requests all individuals to stay home
Feb 21 ==> TWN establishes school suspension guidelines based on cases diagnosed in school; ROK Seoul closes all public gatherings and protests
Feb 24 ==> TWN, travelers with history of travel to china, from countries with level 1 or 2 travel alerts, and all foreign nationals subject to 14 day quarantine (By this time many countries are in level 1-2-3 travel alert status in TWN)
Feb 26 ==> ROK opens drive-thru testing clinics, patients are informed via text messages (3 days later) the results of their tests
Mar 3? ==> ROK starts selling facemasks at post offices
Mar 5 ==> ROK bans the export of face masks

As of Mar 16, (as reported in Wikipedia), TWN had 67 cases and 1 death; and ROK had 8,326 cases and 75 deaths. As of Mar 13 (as reported is Our world in data article), TWN had tested 16,089 and ROK had tested 248,647 people.

Summary of TWN and ROK responses to the virus

For starters, both TWN and ROK learned valuable lessons from the last infections from China SARS-H1N1 and used those lessons to deal better with COVID-19. Also neither country had any problem accessing credit information, mobile phone location data, CCTV camera or any other electronic information to trace infected people in their respective countries.

If I had to characterize the responses to the virus from the two countries:

TWN was seemingly focused early on reducing infections from outside, controlling & providing face masks to all, and identifying gov’t policies (ceasing public gathering, quarantine and disinfectant procedure) to reduce transmission of the disease. They augmented and promoted the use of public NHIA databases to track recent travel activity and used any information available to monitor the infected and track down anyone they may have contacted. Although TWN has increased testing over time, they did not seem to have much of an emphasis on broad testing. At this point, TWN seems to have the virus under control.
ROK was all about public communications, policies (quarantine and openness), aggressively testing their population and quarantining those that were infected. ROK also tracked the goings on and contacts of anyone that was infected. ROK started early on broadly testing anyone that wanted to be tested. Using test results, infected individuals were asked to quarantine. A reporter I saw talking about ROK mentions 3 T’s: Target, Test, & Trace At this point, ROK seems to have the virus under control.

In addition, Asian countries in general are more prone to use face masks when traveling, which may be somewhat restrict Coronavirus transmission. Although it seems to primarily reduce transmission, most of the public in these countries (now) routinely wear face masks when out and about. And previously they routinely wore face masks when traveling to reduce disease transmission.

Also both countries took the news out of Wuhan China about the extent of the infections, deaths and ease of disease transmission as truthful and acted on this before any significant infections were detected in their respective countries

What the rest of the world can learn from these two countries

What we need to take from TWN a& ROK is that

Face masks and surgical masks are a critical resource during any pandemic. National production needs to be boosted immediately with pricing and distribution controls so that they are not hoarded, nor subject to price gouging. In the USA we have had nothing on this front other than requests to the public to stop hoarding them and the lack of availability to support healthcare workers).
Test kits are also a critical resource during any pandemic. Selection of the test kit, validation and boosting production of test kits needs to be an early and high priority. The USA seems to have fallen down on this job.
Travel restrictions, control and quarantines need to be instituted early on from infected countries. USA did take action to restrict travel and have instituted quarantines on cruise ship passengers and any repatriated nationals from China.
Limited testing can help control the virus as long as it’s properly targeted. Mass, or rather less, targeted testing can also help control the virus as well. In the USA given the lack of test kits, we are limited to targeted testing.
Open, rapid and constant communications can be an important adjunct to help control virus spread. The USA seems to be still working on this. Many states seem to have set up special communications channels to discuss the latest information. But there doesn’t seem to be any ongoing, every day communications effort on behalf of the USA CDC to communicate pandemic status.
When one country reports infections, death and ease of transmission of a disease start to take serious precautions immediately. Disease transmission in our travel intensive world is much too easy and rapid to stop once it takes hold in a nation. Any nation today that starts to encounter and infectious agent with high death rates and seemingly easy transmission must be taken seriously as the start of something much bigger.

Stay safe, be well.

~~~~

Comments?

Photo Credit(s):

Coronavirus picture USA CDC
Topographic map of ROK, wikipedia article
Topographic map of TWN, researchgate article
“North Dakota National Guard” by The National Guard is licensed under CC BY 2.0.

Clouds an existential threat – part 2

Posted on July 24, 2019July 24, 2019 by Ray in Cloud services, Decision making, Distributed computing, Executive leadership, Information economy, Market dynamics, Scenario planning, Strategic Inflection Points, Strategic planning, Visionary organizations

Recall that in part 1, we discussed most of the threats posed by clouds to both hardware and software IT vendors. In that post we talked about some of the more common ways that vendors are trying to head off this threat (for now).

In this post we want to talk about some uncommon ways to deal with the coming cloud apocalypse.

But first just to put the cloud threat in perspective, the IT TAM is estimated, by one major consulting firm, to be a ~$3.8T in 2019 with a growth rate of 3.7% Y/Y. The same number for public cloud spending, is ~$214B in 2019, growing by 17.5% Y/Y. If both growth rates continue (a BIG if), public cloud services spend will constitute all (~98.7%) of IT TAM in ~24 years from now. No nobody would predict those growth rates will continue but it’s pretty evident the growth trends are going the wrong way for (non-public cloud) IT vendors.

There are probably an infinite number of ways to deal with the cloud. But outside of the common ones we discussed in part 1, only a dozen or so seem feasible to me and even less are fairly viable for present IT vendors.

Move to the edge and IoT.
Make data center as easy and cheap to use as the cloud
Focus on low-latency, high data throughput, and high performing work and applications
Move 100% into services
Move into robotics

The edge has legs

Probably the first one we should point out would be to start selling hardware and software to support the edge. Speaking in financial terms, the IoT/Edge market is estimated to be $754B in 2019, and growing by over a 15.4% CAGR ).

So we are talking about serious money. At the moment the edge is a very diverse environment from cameras, sensors and moveable devices. And everybody seems to be in the act, big industrial firms, small startups and everyone in between. Given this diversity it’s hard to see that IT vendors could make a decent return here. But given its great diversity, one could say it’s ripe for consolidation.

And the edge could use some reference architectures where there are devices at the extreme edge, concentrators at the edge, more higher concentrators at nodes and more at the core, etc. So there’s a look and feel to it that seems like Ro/Bo – central core hub and spoke architectures, only on steroids with leaf proliferation that can’t be stopped. And all that data coming in has to be classified, acted upon and understood.

There are plenty of other big industrial suppliers in this IoT/edge field but none seem to have the IT end of the market that Hitachi Vantara can claim to. Some sort of combination of a large IT vendor and a large industrial firm could potentially do the same

However, Hitachi Vantara seems to be focusing on the software side of the edge. This may be an artifact of Hitachi family of companies dynamics. But it seems to be leaving some potential sales on the table.

Hitachi Vantara has the advantage of being into industrial technology in a big way so the products they create operate in factories, rail yards, ship yards and other industrial sites around the world already. So, adding IoT and edge capabilities to their portfolio is a natural extension of this expertise.

There are a few vendors going into the Edge/IoT in a small way, but no one vendor personifies this approach more than Hitachi Vantara. The Hitachi family of companies has a long and varied history in OT (operational technology) or industrial technology. And over the last many years, HDS and now Hitachi Vantara, have been pivoting their organization to focus more on IoT and edge solutions and seem to have made IOT, OT and the edge, a central part of their overall strategy.

So there’s plenty of money to be made with IoT/Edge hardware and software, one just has to go after it in a big way and there’s lots of competition. But all the competition seems to be on the same playing field (unlike the public cloud playing field).

Getting to “data center as a cloud”

There are a number of reasons why customers migrate work to the cloud, ease of use, ease of storage, ease of scale, access to myriad applications, access to multi-regional data centers, CAPex financial model, to name just a few.

There’s nothing that says much of this couldn’t be provided at the data center. It’s mostly just a lot of open source software and a lot of common hardware. IT vendors can do this sort of work if they put their vast resources to go after it.

From the pure software side, there are a couple of companies trying to do this namely VMware and Nutanix but (IBM) RedHat, (Dell) Pivotal, HPE Simplivity and others are also going after this approach.

Hardware wise CI and HCI, seem to be rudimentary steps towards common hardware that’s easy to deploy, operate and support. But these baby steps aren’t enough. And delivery to deployment in weeks is never going to get them there. If Amazon can deliver books, mattresses, bicycles, etc in a couple of days. IT vendors should be able to do the same with some select set of common hardware and have it automatically deployable in seconds to minutes once powered on.

And operating these systems has to be drastically simplified. On any public cloud there’s really no tuning required, almost minimal configuration, and then it’s just load your data and go. Yes there’s a market place to select, (virtual) hardware, (virtual) storage hardware, (virtual) networking hardware, (virtual server) O/S and (virtual?) open source applications.

Yes there’s a lots of software behind all that virtualization. And it’s fundamentally different than today’s virtualized systems. It’s made to operate only on commodity hardware and only with open source software.

The CAPex financial model is less of a problem. Today. I find many vendors are offering their hardware (and some software) on a CAPex, pay as you go model. More of this needs to be made available but the IT vendors see this, and are already aggressively moving in this direction.

The clouds are not standing still what with Azure Stack, AWS and GCP all starting to provideversions of their stack on prem in the enterprise. This looks to be a strategic battleground between the clouds and IT vendors.

Making everything IT can do in the cloud available in the data center, with common hardware and software and with the speed and ease of deployment, operations and support (maintenance) should be on every IT vendors to do list.

Unfortunately, this is not going to stop the public cloud completely, but it has the potential to slow the growth rate. But time is short, momentum has moved to the public cloud and I don’t (yet) see the urgency of the IT vendors to make this transition happen today.

Focus on low-latency, high data throughput and high performance work

This is somewhat unfair as all the IT vendors are already involved in these markets in a big way. But, there are some trends here, that indicate this low-latency market will be even more important over time.

For example, more and more of commercial IT is starting to take advantage of big data and AI to profit from all their data. And big science is starting to migrate to IT, where massive data flows and data analysis tools are becoming important to the data center. If anything, the emergence of IoT and the edge will increase data flows that need to be analyzed, understood, and ultimately dealt with.

DNA genomics may be relegated to big pharma/medical but 3D visualization is becoming so mainstream that I can do it on my desktop. These sorts of things were relegated to HPC/big science just a decade or so ago. What tools exist in HPC today that the IT data center of the future will deam a necessary part of their application workload.

Is this a sizable TAM, probably not today. In all honesty it’s buried somewhere in the IT TAM above. But it can be a growing niche, where IT vendors can stake a defensive position and the cloud may have a tough time dislodging.

I say the cloud “may have trouble dislodging” because nothing says that the entire data flow/work flow couldn’t migrate to the cloud, if the responsiveness was available there. But, if anything (guaranteed) responsiveness is one of the few achilles heels of the public cloud. Security may be the other one.

We see IBM, Intel, and a few others taking this space seriously. But all IT vendors need to see where they can do better here.

Focus on services

This not really out-of-box thinking. Some (old) IT vendors have been moving into services for over 50 years now others are just seeing there’s money to be made here. Just about every IT vendor has deployment & support services. most hardware have break-fix services.

But standalone IT services are more specialized and in the coming cloud apocalypse, services will revolve around implementing cloud applications and functionality or migrating work from the cloud or (rarely in the future) back to on prem.

TAM for services is buried in the total IT spend but industry analysts estimate that in 2019 total worldwide TAM for IT services will be about $1.0 in 2019 and growing by 2.6% CAGR.

So services are already a significant portion of IT spend today. And will probably not be impacted by the move to the cloud. I’d say that because implementing applications and services will still exist as long as the cloud exists. Yes it may get simpler (better frameworks, containerization, systemization), but it won’t ever go away completely.

Robots, the endgame

Ok laugh now. I understand this is a big ask to think that Robot spending could supplement and maybe someday surpass IT spending. But we all have to think long term. What is a self driving car but a robotic data center on wheels, generating TB of data every day it’s driven.

Robots over the next century will invade every space, become ever present and ever necessary to modern world functioning . They will have sophisticated onboard computing, motors, servos, sensors and on board and backend processing requirements. The real low-latency workload of the future will be in the (computing) minds of robots.

Even if the data center moves entirely to the cloud, all robotic computation will never reside there because A) it’s too real time and B) it needs to operate well even disconnected from the Internet.

Is all this going to happen in the next 10 or 20 years, maybe not but 30 to 50 years out this world will have a multitude of robots operating within it. .

Who’s going to develop, manufacture, support and sustain these mobile computing data centers on wheels, legs, slithering and flying bodies?

I would say IT vendors of today are uniquely positioned to dominate this market. Here to the industry is very fragmented today. There are a few industrial robotic companies and just about every major auto manufacturer is going after self driving cars. And there are many bit players today. So it’s ripe for disruption and consolidation. .

Yet, none of the major IT vendors seem to be going after this. Ok Amazon (hardware & software) and Microsoft (software) have done work in this arena. If anything this should tell IT vendors that they need to start working here as well.

But alas, none have taken up the mantle. In the mean time robot startups are biting the dust left and right, trying to gain market traction.

~~~~

That seems to be about it for the major viable out of the box approaches to the public cloud threat. I have a few other ideas but none seem as useful as the above.

Let me know what you think.

Picture credit(s):

From AWS marketplace website
From “libellium en FR” by sylvieatd is licensed under CC BY-NC-ND 2.0
From “Trading Floor at the New York Stock Exchange during the Zendesk IPO” by Scott Beale is licensed under CC BY-NC-ND 2.0
From “IoT Junior Cup 2015 press conference, by Linemetric” by horstjens is licensed under CC BY 2.0