AGI, SuperIntelligence and “The Last Man”

Nietzsche wrote about the Last Man in Thus Spoke Zarathustra (see the Last Man Wikipedia article). There’s much to dislike about Nietzsche’s writing, but every once in a while there are gems to be found. (Sorry for the sexist terminology, it’s not me, blame Nietzsche.)

In Zarathustra, Nietzsche speaks of the Last Man with contempt. The Last Men no longer struggle in their daily life. They no longer create. They have an easy life filled with leisure and entertainment and no work to speak of.

From AGI to SuperIntelligence

I’ve discussed AGI many times before (I think we’re up to AGI part 12, which would make this part 13, and ASI (Artificial SuperIntelligence) part 3, which would make this part 4, but I’m starting to think numbering them no longer helps): how to get there, the existential risk of getting there, and many other facets of the risks and rewards of AGI (ok, less on the rewards…).

I’ve also discussed Artificial SuperIntelligence (ASI). This is what we believe can be attained after AGI. If one were to use AGI to improve AI training algorithms, AI hardware and AI inferencing, and to generate massive amounts of new scientific, political, economic, etc. research, one could then use the new data and the better training, inferencing and AI hardware to create an ASI agent.

The big debate in the industry is how fast one can go from AGI to ASI. I don’t believe there’s any debate in the industry over whether SuperIntelligence can be attained eventually.

There are those who believe:

  • It will take many years (3, 5, 10?) to attain SuperIntelligence, because of all the infrastructure that had to be put in place to create current LLMs and the view that AGI will need much more; thus, the build out is years away. If that’s the case, it will take more years of infrastructure production, acquisition and data center build out after attaining AGI before we are ready to train SuperIntelligence.
  • It will take just a few years (1, 2, 3?) to achieve SuperIntelligence after AGI. This is because one could use AGI to improve AI training & inferencing algorithms and drastically increase the utilization of current AI hardware, such that there may be no need for any additional hardware to reach SuperIntelligence. Then the prime determinant of the time it takes to achieve SuperIntelligence is how fast AGI(s) can generate the new scientific, medical, sociological, etc. research needed to train SuperIntelligence.

Yes, much scientific (and other) research requires experimentation in the real world (although much can now be done in simulation). But even physical experimentation is being rapidly automated today.

So the time it takes to generate sufficient research to create enough data to train an ASI may be very short. Just consider how fast LLM agents can generate code today to get a feel for what they could do tomorrow for research.

Maybe regulatory bodies could slow this down. But my bet would be that regulatory artifices will turn out to be ineffectual. At best they will drive AGI-ASI training/deployment activity underground, which may delay it a couple of years while organizations build up the AI training infrastructure in hiding.

The one serious bottleneck may be AI data centers’ power requirements. But if rogue states can build centrifuges to enrich radioactive materials, intercontinental missiles, biological warfare agents, etc., they can certainly steal, buy or otherwise find a way to duplicate AI data center infrastructure components.

Regulatory regimes, at worst, would be completely ignored by state actors and all large commercial enterprises. The first mover advantages of AGI and ASI are too large for any organization to ignore.

What happens when SuperIntelligence is reached

I see one of two possibilities for how the achievement of AGI and SuperIntelligence plays out with respect to humanity:

  • Humankind Utopia – AGI & ASI agents can do anything that humans can do and do it better, faster and more efficiently. The question remains what would be left for humanity to do when this is reached. Alright, at the moment LLM agents are mostly limited to working in the digital domain. But with robotics coming online over the next decade, this will change, adding more real world domains to whatever AGI-ASI agents can do.
  • Humankind Hell – AGI & ASI agents determine that humanity is a pestilence to the Earth and start to cut it back to something that’s less consumptive of Earth’s resources. Again, although AI agents are restricted to the digital domain today, that won’t last for long, especially as AGI & ASI agents go live. Robots with ASI agents could become the worst aggressors in the history of the world, and with the tools at their disposal they could easily create biological, chemical and other weapons of mass destruction to deploy against humanity.

SuperIntelligence risks and rewards

It’s been obvious to me, to SciFi authors and to some select AI researchers that there is a sizable risk that a SuperIntelligence, once unleashed, will eliminate, severely restrict or enslave humanity, resulting in Humankind Hell.

On the other extreme are many corporate CEOs/CTOs and other AI researchers who believe that SuperIntelligence will be a godsend to humankind. Once it arrives and is deployed, humanity will no longer have to do any work it does not want to do. All work will be handed off to robots and their ASI agents, which will perform it at greater speed, with higher quality and at lower cost than is conceivable today.

What seems to be happening today with current AI agents is that some white collar work is becoming easier to perform, if not totally eliminated. CEOs see this as an opportunity to reduce workforce size. For example, some CEOs are eliminating HR organizations with the belief that LLM chatbots, together with a much smaller group, can handle all of what HR was doing before.

And of course, as AI agents become more sophisticated this will lead to further workforce reductions. And once AI agents are embodied in robotics, the blue collar workforce will also be at risk.

Human Utopia and “The Last Man”

Nietzsche was writing in the late 1800s, when technology and automation were just starting to make a difference in the world of work. But the industrial revolution was in full swing and had already had a significant impact on the workforce.

Nietzsche believed that further industrialization, if it continued (which of course it has), would result in the Last Man.

The Last Man arrives at the point where technology and automation have taken over all tasks, trades and work, and where the Last Man has no real duties to perform other than consuming the goods and services provided by automation. For the Last Man, being wealthy or poor no longer has any consequence, as anyone can have anything they could possibly desire.

To Nietzsche, the Last Man is anathema. He believes that true humanity requires struggle, striving and advancement. Once the Last Man is achieved, all of these will no longer matter, no longer be a part of humanity’s existence and no longer impact one’s lifestyle.

When humanity no longer has to struggle, strive and advance, humanity will lose the very essence that makes humanity human. We will, over time, lose the ability and desire to do any of that, as it all becomes the purview of AGI-ASI.

The Last Man is coming already

Example 1: The Air France Flight 447 disaster of 2009 (see Wikipedia article) is one example in a very technical domain. As I understand it, the flight was en route to France when it went into a stall, the pilots did the wrong thing to get out of it and the aircraft spiraled into the sea.

The captain was a highly experienced pilot (over 10K flight hours logged). The co-pilots were much less experienced. Getting out of a stall is rudimentary to flying. In fact, exiting a stall is one of the essential skills taught to all pilots, and they must demonstrate they can recover from a stall before they receive their pilot licenses.

The “problem” had been brewing for a while. Ever since aircraft autopilots came into service, real live pilots have done less and less actual flying of airplanes. As a result, these pilots had essentially forgotten how to get out of a stall, and that caused the accident.

Example 2: Self-driving technology has been rapidly improving over the last decade or so. We often become dependent on its capabilities and when there’s some sort of failure it can be disastrous because we have lost many of our most important driving skills.

In my case, we have a relatively dumb car with what they call “smart cruise control”. You can set it to a speed and the vehicle will maintain that speed unless a vehicle in front of you is going slower, in which case it will slow down to maintain some set distance behind that vehicle.

We were driving along and a truck cut into our lane. The truck had a very high bed, with nothing protruding at the height where a normal vehicle’s body would be until you got back to its tires. The smart cruise control didn’t detect its existence until we were almost underneath the truck bed. We tried to brake, but it took too many seconds to get that done, and in the end we had to go off the road to save ourselves. We had lost our emergency braking and situational awareness skills. Nowadays we don’t drive with cruise control on as much.

A multitude of examples exist showing that AI and automation have led to humans becoming less skilled at some activity. And when AI automation doesn’t work properly, bad things happen, because we no longer know how to react properly.

The Last Man, here today, gone tomorrow.

So imagine a life where you are born with everything you could possibly need to succeed. You are educated by the very best automated personal tutors. You are provided an (Amazon and Walmart) x 1000, with unlimited credit. You grow up with everyone else having just the same life as you, because all of you have no work to do, have infinite sums to spend and have infinite products to consume.

Life in such a utopia would, from some perspectives, be almost godlike. But if you take the perspective that humanity needs struggle, needs challenges, needs to strive to better itself at every stage, such a life would be a disaster.

And that’s what Humankind Utopia would look like. Definitely better than Humankind Hell, but in the end I’m not sure the difference matters all that much.

~~~

I just don’t really see any path forward that’s good for humanity where AGI and SuperIntelligence exists.

Stopping AI development here today, seems idiotic, going where we seem to be going seems insane.

Comments?


Reward is all you need – part 2, AGI part 12, ASI part 3

Read an article today about how current LLM technology is running out of steam as it approaches the equivalent of all current human knowledge. The article is Welcome to the Era of Experience. Apparently it’s a preprint of a chapter in an upcoming book from MIT, Designing an Intelligence. One of the authors is well known for his research in reinforcement learning and is a co-author of the textbook, Reinforcement Learning: An Introduction.

Some time back, before ChatGPT came out, there was a paper on Reward is Enough (see post: For AGI, is reward enough). At the time it proposed that reinforcement learning with proper reward signals was sufficient to reach AGI.

Since then, attention has become the predominant road to AGI and is evident in all the LLM activity to date (see the arXiv paper: Attention is All You Need).

This new paper (and presumably book) suggests that the current AI training technology focused on attention (to current human knowledge) will ultimately reach an impasse, a human wall if you will. Whenever it attains human levels of AGI, or the Humanity Wall, it will be unable to proceed any farther. At that point, it will track human knowledge generation but go no further.

Now, from my perspective, something like this is inherently safer than having something that can surpass human intelligence. But putting my reservations aside, the new paper on the Era of Experience shows a potential road map of sorts to achieve super-human intelligence.

Era of attention

In the case of transformers (current LLM technology), we have billion+ parameter models based on learning what the next token in a sequence should be. There are ancillary components that handle, for instance, tokenization and embedding of text streams (a multi-dimensional location for each word fragment in a paragraph). The embedding encodes the textual semantics and context, as well as the word fragment being analyzed, into a string of numbers for each token: essentially, a multi-dimensional address in textual semantic space.

But the big, billion+ parameter models were all essentially trained to predict what the next text token would be based on the current context. Similarly, image generation models went from text tokens to predicting, via diffusion, the pixels of a graphic and other visual artifacts.
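
To make the next-token objective concrete, here’s a minimal sketch (my own toy illustration in Python/PyTorch, not any production LLM’s code) of training a tiny model on the “predict the next token” task:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy next-token predictor (positional encoding and causal masking omitted for
# brevity): embed tokens into vectors, mix them, project back to vocabulary
# logits, and train with cross-entropy on "what token comes next".
vocab_size, d_model = 1000, 64

class TinyNextTokenModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)    # token id -> vector (its "address" in semantic space)
        self.mixer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.to_logits = nn.Linear(d_model, vocab_size)   # vector -> score for every possible next token

    def forward(self, tokens):                            # tokens: (batch, seq_len) of token ids
        return self.to_logits(self.mixer(self.embed(tokens)))

model = TinyNextTokenModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

tokens = torch.randint(0, vocab_size, (8, 32))            # stand-in for real tokenized text
logits = model(tokens[:, :-1])                            # predict token t+1 from tokens up to t
loss = F.cross_entropy(logits.reshape(-1, vocab_size), tokens[:, 1:].reshape(-1))
loss.backward()
opt.step()
```

Scaling this same objective up to billions of parameters and web-scale text is, in essence, what current LLM pre-training does.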

But pretty much all of this was based on the underlying training approach outlined in Attention is All You Need.

The Era of Experience paper suggests that this training approach will ultimately run out of steam, and all of these models will hit the Humanity Wall, where they reach the equivalent of all human knowledge but are unable to proceed past that point.

Era of Games and Proofs

In an online course I took during Covid on reinforcement learning, level 1 of the course ended up having us code a reinforcement learning algorithm to play Pong. Mind you, this took me much longer to get right than I had anticipated. But in the end it was essentially training a deep neural network as a value function (predicting whether a move was going to win or lose) to decide which direction to move the paddle based on the ball’s current position and velocity.

For this reinforcement learning algorithm, the reward was simply 0 if the game continued, +1 if you won the game, and -1 if you lost (the ball went past your paddle).
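
Here’s a minimal sketch (my reconstruction, not the actual course code) of that reward signal driving a tiny value network: score “move up” vs “move down” for the current ball state, pick the higher-valued move, then nudge the network toward the eventual +1/-1 outcome of the rally:

```python
import numpy as np

# Reward is 0 while the rally continues, +1 for a win, -1 for a loss.
def reward(done, won):
    if not done:
        return 0.0
    return 1.0 if won else -1.0

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(5, 1))      # inputs: ball x, y, vx, vy + action (+1 up / -1 down)

def value(state, action):
    """Predicted outcome (between -1 and +1) of taking this action in this state."""
    x = np.append(state, action)
    return np.tanh(x @ W)[0]

# One (fake) rally: greedily pick the higher-valued action at each step
states = rng.normal(size=(50, 4))            # stand-in for real game states
chosen = [(s, max((+1, -1), key=lambda a: value(s, a))) for s in states]

R = reward(done=True, won=True)              # pretend we won this rally
for s, a in chosen:                          # nudge each chosen (state, action) value toward R
    x = np.append(s, a)
    v = np.tanh(x @ W)[0]
    grad = (R - v) * (1 - v**2) * x          # gradient-descent direction for 0.5*(R - v)^2 loss
    W += 0.01 * grad.reshape(-1, 1)
```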

The authors discuss DeepMind’s AlphaProof (more of an explanation of the technology) and AlphaGeometry 2 (also described on the same page) as examples of super-human thinking capabilities, albeit only in the domain of mathematical proofs. Together, AlphaProof and AlphaGeometry 2 achieved silver-medal level performance at the prestigious International Mathematical Olympiad.

AlphaProof depends on Lean, a formal mathematical description language (similar to coding, but for mathematics). So a proof request would be converted to Lean code, and AlphaProof would then search for a formal proof that the Lean verifier could check.

AlphaProof was originally trained on the sum total of human-generated mathematical proofs, but then used reinforcement learning to generate hundreds of millions more proofs and trained on those, reaching the level of a superhuman mathematical proof generator.

AlphaProof is an example of deploying AlphaZero RL technologies in different domains. AlphaZero already conquered chess, shogi and Go with super-human skill.

These systems achieved super-human levels of skill because human knowledge was essentially dropped out of the training loop (very early on), and from then on the algorithm trained itself on self-generated data (game play, mathematical proofs), using a game simulator and reward signal(s) to determine whether play was good or bad.

Era of Experience

But the Era of Experience takes reward signals to a whole other level.

Essentially, in order to create super-human intelligence using RL, the reward function needs to become yet another deep neural network, or two. And it needs to be trained in a fashion that understands how the world, environment, humans, flora, fauna, etc. react to what a (super-human) agent is doing.

It’s unclear how you tokenize (encode) all those real-world experience signals into something a DNN could be trained on, but my guess is their book will delve into some of these topics.

But in addition to the multi-faceted reward DNN(s), in order to do effective RL, one also needs a (high fidelity) real-world simulator. This would be used much like internal game play in traditional game-playing RL algorithms, so that the super-human agent could generate 100 million agentic scenarios in simulation, and determine whether they were successful or not, long before it ever attempted activities in the real world.
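
How these pieces might fit together looks something like the sketch below (all module names and shapes are hypothetical; this is an illustration of the loop, not anyone’s actual system): an agent acts inside a learned world simulator, and a learned reward model, rather than a hand-coded game score, judges the outcome.

```python
import torch
import torch.nn as nn

# Hypothetical Era-of-Experience style loop: agent acts, a learned world model
# predicts the next state, and a learned reward model scores the (state, action).
obs_dim, act_dim = 128, 16

agent        = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(), nn.Linear(256, act_dim))
world_model  = nn.Sequential(nn.Linear(obs_dim + act_dim, 256), nn.ReLU(), nn.Linear(256, obs_dim))
reward_model = nn.Sequential(nn.Linear(obs_dim + act_dim, 256), nn.ReLU(), nn.Linear(256, 1))

opt = torch.optim.Adam(agent.parameters(), lr=1e-4)

for episode in range(1000):                   # stand-in for "100 million scenarios"
    obs = torch.randn(1, obs_dim)             # initial simulated world state
    total_reward = torch.zeros(1)
    for step in range(50):
        act = torch.tanh(agent(obs))          # agent proposes an action
        obs_act = torch.cat([obs, act], dim=-1)
        total_reward = total_reward + reward_model(obs_act).squeeze(-1)   # learned reward signal
        obs = world_model(obs_act)            # learned simulator predicts the next state
    loss = -total_reward.mean()               # crude: maximize predicted reward through the simulator
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In a real system the world model and reward model would themselves be trained on real and simulated experience, rather than frozen as here.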

So there you have it: tokenization for LLM DNNs, diffusion and text-based agentic LLM DNNs, some sort of multi-faceted reward DNN(s) (taking input from real and simulated world experience) and multi-faceted world simulator DNN(s).

Once you have all that together, with sufficient time and processing power, and after some 100 million or so generated actions in the simulated world, you should have a super-human agent that you can unleash on the real world.

~~~~

You may wish to constrain your new super-human intelligent agent early on, to make sure the world simulation has true fidelity with the real world we live in. But after a suitable safety checkout period, one should have a super-human intelligence agent ready to take over all human thought, societal advancement, scientific research, etc.

Sounds like fun!!?


Is AGI just a question of scale now – AGI part-5

Read two articles over the past month or so. The more recent one was an Economist article (AI enters the industrial age, paywall) and the other was A Generalist Agent (from DeepMind). The DeepMind article was all about the training of Gato, a new transformer deep learning model trained to perform well on 600 separate tasks, from image captioning, to Atari games, to robotic pick and place tasks.

And then there was this one tweet from Nando de Freitas, research director at DeepMind:

Someone’s opinion article. My opinion: It’s all about scale now! The Game is Over! It’s about making these models bigger, safer, compute efficient, faster at sampling, smarter memory, more modalities, INNOVATIVE DATA, on/offline, … 1/N

I take this to mean that AGI is just a matter of more scale. DeepMind and others see the way to attain AGI as just a matter of throwing more servers, GPUs and data at training the model.

We have discussed AGI in the past (see part-0 [ish], part-1 [ish], part-2 [ish], part-3ish and part-4 blog posts [we apologize, we only started numbering them at 3ish]). But this tweet is possibly the first time we have someone in the know saying they see a way to attain AGI.

Transformer models

It’s instructive, from my perspective, that Gato is a deep learning transformer model. The other big NLP models have all been transformer models as well.

Gato (from DeepMind), SWITCH Transformer (from Google), GPT-3 (from OpenAI), GPT-J (from EleutherAI), OPT (from Meta) and Wu Dao 2.0 (from the Beijing Academy of Artificial Intelligence) are all trained on more and more text and image data scraped from the web, Wikipedia and other databases.

Wikipedia says transformer models are an outgrowth of RNN and LSTM models that use attention vectors on text. Attention vectors encode, into a vector (matrix), all the textual symbols (words) prior to the latest textual symbol. Each new symbol encountered creates another vector holding all prior symbols plus the latest word. These vectors would then be used to train RNN models, using all the vectors to generate output.

The problem with RNN and LSTM models is that they’re impossible to parallelize. You always need to wait until you have encountered all the symbols in a text component (sentence, paragraph, document) before you can begin to train.

Instead of encoding these attention vectors as each symbol is encountered, transformer models encode all symbols at the same time, in parallel, and then feed these vectors into a DNN that assigns attention weights to each symbol vector. This allows for complete parallelism, which reduces the computational load and the elapsed time to train transformer models.
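
As a rough illustration of that parallelism (a minimal sketch of scaled dot-product attention in Python, not Gato’s actual code), every token attends to every other token in a single matrix multiply rather than one step at a time as in an RNN:

```python
import numpy as np

# Minimal scaled dot-product attention over a whole sequence at once.
def attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv                 # project all tokens in parallel
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # each token scored against every other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax attention weights per token
    return weights @ V                               # weighted mix of all token values

rng = np.random.default_rng(0)
seq_len, d_model = 16, 32
X = rng.normal(size=(seq_len, d_model))              # embedded tokens for one sequence
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out = attention(X, Wq, Wk, Wv)                       # (seq_len, d_model), computed in one pass
```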

And transformer models allowed for a large increase in DNN parameters (I read these as roughly the weights connecting nodes between layers, i.e., nodes per layer squared times the number of layers). Gato has 1.2B parameters, GPT-3 has 175B parameters, and SWITCH Transformer is reported to have 7X more parameters than GPT-3.
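
For a rough sense of how parameter counts relate to layers and nodes (my own illustration, not how Gato or GPT-3 are actually structured), here is what counting the weights of a tiny stack of dense layers looks like:

```python
import torch.nn as nn

# Parameters are the learned weights (plus biases) between layers, so a dense
# layer of 1024 nodes feeding 1024 nodes contributes ~1024*1024 weights by itself.
model = nn.Sequential(nn.Linear(1024, 1024), nn.ReLU(),
                      nn.Linear(1024, 1024), nn.ReLU(),
                      nn.Linear(1024, 1024))
total = sum(p.numel() for p in model.parameters())
print(f"{total:,}")   # ~3.1M parameters for just 3 small dense layers
```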

Estimates for how much it cost to train GPT-3 range anywhere from $10M-20M USD.

AGI will be here in 10 to 20 yrs at this rate

So, if it takes ~$15M to train a 175B parameter transformer model, and Google has already done SWITCH, which has 7-10X (~1.5T) the number of GPT-3 parameters, it seems to be an arms race.

If we assume it costs ~$65M (~2X efficiency gain since GPT-3 training) to train SWITCH, we can create some bounds as to how much it will cost to train an AGI model.

By the way, the number of synapses in the human brain is approximately 1000T (See Basic NN of the brain, …). If we assume that DNN parameters are equivalent to human synapses (a BIG IF), we probably need to get to over a 1000T parameter model before we reach true AGI.

So my guess is that an AGI model lies somewhere between 650X and 6,500X more parameters than SWITCH, or roughly 1Q to 10Q model parameters.

If we assume current technology to do the training, this would cost $40B to $400B. Of course, GPUs are not standing still: NVIDIA’s Hopper (introduced in 2022) is at least 2.5X faster than their previous generation A100 GPU (introduced in 2020). So if we waited 10 years or so, we might be able to reduce this cost by a factor of ~100X, and in 20 years, maybe by ~10,000X, back to roughly where SWITCH training costs are today.
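
The back-of-the-envelope arithmetic behind those figures looks roughly like the Python sketch below (my numbers, using this post’s assumptions: training cost scales linearly with parameter count, and GPU price/performance improves ~2.5X every two years as with A100 to Hopper):

```python
# Scaling arithmetic sketch using the post's assumptions, not measured data.
switch_params = 1.5e12           # ~1.5T parameters, 7-10X GPT-3
switch_cost   = 65e6             # assumed ~$65M to train SWITCH
cost_per_param = switch_cost / switch_params

for factor in (650, 6500):       # 650X-6,500X beyond SWITCH => ~1Q-10Q parameters
    params = switch_params * factor
    cost_now = params * cost_per_param
    print(f"{params:.0e} params: ~${cost_now/1e9:.0f}B today, "
          f"~${cost_now/2.5**5/1e9:.1f}B in ~10 yrs, "
          f"~${cost_now/2.5**10/1e6:.0f}M in ~20 yrs")
# prints roughly: ~$42B / ~$420B today, ~$0.4B / ~$4.3B in 10 yrs, ~$4M / ~$44M in 20 yrs
```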

So in the next 20 years most large tech firms should be able to create their own AGI models. In the next 10 years most governments should be able to train their own AGI models. And as of today, a select few world powers could train one, if they wanted to.

Where they get the additional data to train these models (I assume data volume would go up linearly with parameter counts) may be another concern. However, I’m sure that if you’re willing to spend $40B on AGI model training, spending a few $B more on data acquisition shouldn’t be a problem.

~~~~

At the end of the DeepMind article on Gato, it talks about the need for AGI safety in terms of developing preference learning, uncertainty modeling and value alignment. The footnote for this idea is the book Human Compatible by Stuart Russell.

Preference learning is a mechanism for AGI to learn the “true” preference behind a task it’s been given. For instance, if given the task of creating toothpicks, it should realize the true preference is to not destroy the world in the process of making toothpicks.

Uncertainty modeling seems to be about having the AI assume it doesn’t really understand what the task at hand truly is. This way there’s some sort of (AGI) humility when it comes to any task, such that the AGI model would be willing to be turned off if it’s doing something wrong, with that decision made by humans.

DeepMind has an earlier paper on value alignment. But I see this as the ability of an AGI to model universal human values (if such a thing exists), such as the sanctity of human life, the need for the sustainability of the planet’s ecosystem, that all humans are created equal, that all humans have the right to life, liberty and the pursuit of happiness, etc.

I can see a future post is needed soon on Human Compatible (AI).


Towards a better AGI – part 3(ish)

Read an article this past week in Nature about the need for Cooperative AI (Cooperative AI: machines must learn to find common ground), which supplies the best view I’ve seen of the direction research needs to go to develop a more beneficial and benign AI-AGI.

Not sure why, but this past month or so I’ve been on an AGI fueled frenzy (at least here). I didn’t realize this was going to be a multi-part journey, otherwise I would have labeled them AGI part-1 & -2 (please see: Existential event risks [part-0], NVIDIA Triton GMI, a step too far [part-1] and The Myth of AGI [part-2] to learn more).


The Nature article puts into perspective what we all want from future AI (or AGI). That is,

  • AI-AI cooperation: AI systems that cooperate with one another while understanding that not all activities are zero sum competitions (like chess, Go or Atari games); rather, most activities within the human sphere are cooperative activities where one agent has a set of goals and a different agent has another set of goals, some of which overlap while others are in conflict. Sports like soccer or lacrosse come to mind. But there are also card games and board games (Risk & Diplomacy) that involve cooperating parties, with diverse goals, working to achieve common ends.
  • AI-Human cooperation: AI systems that cooperate with humans to achieve common goals. Here too, most humans have their own sets of goals, some of which may be in conflict with the AI system’s goals. However, all humans have a shared set of goals; preservation of life comes to mind. It’s in this arena where the challenges are most acute for AI systems. Divining humans’ (and their own) underlying goals and motivations is not simple. And of course, giving priority to the “right” goals when they compete or are in conflict will be an increasingly difficult task to accomplish, given today’s human diversity.
  • Human-Human cooperation: Here it gets pretty interesting, but the paper seems to say that any future AI system should be designed to enhance human-human interaction, not deter or interfere with it. One can see the challenge of disinformation today and how wonderful it would be to have some AI agent that could filter all of this and present a proper picture of our world. But humans have different goals, and trying to figure out what they are, and which are common and thereby something to be enhanced, will be an ongoing challenge.

The problem with today’s AI research is that it’s all about improving specific activities (image recognition, language understanding, recommendation engines, etc.), but all of these are point solutions and few (if any) are focused on cooperation.

Tit for tat wins the award

To that end, the authors of the paper call for a new direction, one that attempts to imbue AI systems with social intelligence and cooperative intelligence to work well in the broader, human dominated world that lies ahead.

In the Nature article they mentioned a 1984 book by Robert Axelrod, The Evolution of Cooperation. Perhaps the last great piece of research on cooperation ever produced.

The book describes a world full of simulated prisoner’s dilemma actors that interacted with one another at random.

The experimenters programmed some agents to always do the proper thing for their current partner, some to always do the wrong thing to their partner, others to do right once then wrong from that point forward, etc. The experimenters tried every sort of cooperation policy they could think of.

Each agent in an interaction would get some number of points. For example, if both did the right thing they would each get 3 points; if one did wrong, the sucker got 0 and the bad actor got 5; if both did wrong, each got 1 point; etc.

The agents that had the best scores during a run (of 1000s of random pairings/interactions) would multiply for the next run, and the agents that did worse would disappear over time from the population of agents in these simulated worlds.

The optimal strategy that emerged from these experiments (sketched in code below the list) was:

  1. Do the right thing once with every new partner, and
  2. From that point forward, tit for tat (if the other party did right the last time, then you do the right thing the next time you interact with them; if they did wrong the last time, then you do wrong the next time you interact with them).
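
Here’s a minimal sketch (my own reconstruction in Python, not Axelrod’s actual tournament code) of a population of agents paired at random under those payoffs, showing tit for tat coming out ahead of pure defectors:

```python
import random

# Classic prisoner's dilemma payoffs: both cooperate = 3 each,
# lone defector = 5, sucker = 0, both defect = 1 each.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(partner_history):
    """Cooperate with a new partner, then copy whatever they did last time."""
    return "C" if not partner_history else partner_history[-1]

def always_defect(partner_history):
    return "D"

# Half the population plays tit for tat, half always defects
agents = [("tft", tit_for_tat)] * 10 + [("bad", always_defect)] * 10
agents = [(f"{kind}_{i}", fn) for i, (kind, fn) in enumerate(agents)]
scores = {name: 0 for name, _ in agents}
memory = {}   # (me, them) -> list of that partner's past moves against me

random.seed(0)
for _ in range(100_000):                              # thousands of random pairings
    (na, fa), (nb, fb) = random.sample(agents, 2)
    move_a = fa(memory.get((na, nb), []))
    move_b = fb(memory.get((nb, na), []))
    pa, pb = PAYOFF[(move_a, move_b)]
    scores[na] += pa
    scores[nb] += pb
    memory.setdefault((na, nb), []).append(move_b)    # remember what this partner did to me
    memory.setdefault((nb, na), []).append(move_a)

avg = lambda kind: sum(v for k, v in scores.items() if k.startswith(kind)) / 10
print(f"tit for tat avg score: {avg('tft'):.0f}, always defect avg score: {avg('bad'):.0f}")
# tit for tat agents score higher: they earn 3s with one another and are only
# suckered once per defector, so the cooperative cluster sustains itself.
```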

It was mind-boggling at the time to realize that such a simple strategy could be so effective and sustainable in simulation, and perhaps in the real world. It turns out that in a (simulated) world of bad agents, a group of tit for tat agents would build up, defend itself and expand over time to succeed.

That was the state of the art in cooperation research back then (1984). I’ve not seen anything similar to this since.

I haven’t seen anything like this that discusses how to implement algorithms in support of social intelligence.

~~~~

The authors of the Nature article believe it’s once again time to start researching cooperation techniques and social intelligence, so we can instill proper cooperation and social intelligence technology into future AI (AGI) systems.

Perhaps if we can do this, we may create a better AI (or AGI) so that both it and we can live better in our world, galaxy and universe.

Comments?