AGI threat level yellow – AGI part 10

Read two articles this past week on how LLM applications are proliferating. The first was in a recent Scientific American, AI Chatbot brains are going inside robot bodies, … (maybe behind a login wall). The article discusses companies that are adding LLMs to robots so that the robots can converse and understand verbal orders.

Robots that can be told what to do

The challenge, at the moment, is that LLMs are relatively large and robot brains (compute infrastructure) are relatively small. Combine that with the limited articulation, movements, and actions a robot can perform, and it's difficult to make effective use of LLMs as is.

Resistance is futile… by law_keven (cc) (from Flickr)

Ultimately, one company would like to create a robot that can be told to make dinner and it would go into the kitchen, check the fridge and whip something up for the family.

I can see great advantages in having robots that can take verbal instructions and act upon them. But there's plenty here that could be cause for concern.

  • A robot in a chemical lab could be told to create the next great medicine or an untraceable poison.
  • A robot in an industrial factory could be told to make cars or hydrogen bombs.
  • A robot in the field could be told to farm 100 acres of wheat or told to destroy a forest.

I could go on but you get the gist.

One common illustration of how AGI or super AGI could go very wrong is an AI tasked with creating paper clips: in its actions to perform this request, it converts the whole earth into a mechanized paper clip factory, in the process eliminating all organic life, including humans.

We are not there yet, but one can see how tying LLM levels of intelligence to a robot that can manipulate ingredients to make dinner could be the start of something that could easily harm us.

And with LLM hallucination still a constant concern, I am deeply disturbed by the direction that adding LLMs to robots is taking.

Hacking websites 101

The other article hits even closer to home: the arXiv paper, LLM agents can autonomously hack websites. In the paper, researchers use LLM agents to hack (sandboxed) websites.

The paper explains, at a high level, how they created LLM agents to hack websites. The websites were real websites, apparently cloned and sandboxed.

Dynamic websites typically have a frontend web server and a backend database server to provide access to information. Hacking involves using the website to reveal confidential information, e.g., user names and passwords.

Dynamic websites suffer from 15 known vulnerabilities, listed in the paper. The researchers used LLM agents to exploit these vulnerabilities to hack the websites.
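To make these vulnerabilities concrete, the textbook example is SQL injection. The paper doesn't publish its exploits, so the sketch below is a generic illustration using Python and SQLite, with a made-up table and inputs:

```python
# A classic SQL-injection demo (illustrative only; not from the paper).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

user_input = "' OR '1'='1"  # attacker-supplied string

# Vulnerable: string concatenation lets the input rewrite the query,
# so it matches every row instead of none.
rows = conn.execute(
    f"SELECT * FROM users WHERE name = '{user_input}'").fetchall()
print(rows)  # [('alice', 's3cret')]

# Safe: a parameterized query treats the input as data, not SQL.
rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (user_input,)).fetchall()
print(rows)  # []
```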

LLM agents have become sophisticated enough these days to invoke tools (functions) and interact with APIs. Another critical capability of modern LLMs is to plan and react to feedback from their actions. And finally, modern LLMs can be augmented with documentation to inform their responses.
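For the curious, here's a minimal sketch of what such a tool-invoking, plan-act-observe loop looks like against the OpenAI chat API. The single fetch_page tool and the ten-step cap are hypothetical stand-ins of mine, not the paper's actual setup:

```python
# Minimal tool-calling agent loop (a sketch; the paper's tools and
# prompts are not public, so fetch_page here is hypothetical).
import json
import requests
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "fetch_page",
        "description": "Fetch a URL and return the response body",
        "parameters": {
            "type": "object",
            "properties": {"url": {"type": "string"}},
            "required": ["url"],
        },
    },
}]

messages = [{"role": "user", "content": "Explore the target website."}]

for _ in range(10):  # bounded plan/act/observe loop
    resp = client.chat.completions.create(
        model="gpt-4", messages=messages, tools=tools)
    msg = resp.choices[0].message
    if not msg.tool_calls:  # the model chose to stop acting
        break
    messages.append(msg)
    for call in msg.tool_calls:  # execute each requested tool
        args = json.loads(call.function.arguments)
        body = requests.get(args["url"], timeout=10).text
        messages.append({"role": "tool",
                         "tool_call_id": call.id,
                         "content": body[:4000]})  # feedback to the model
```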

The team used detailed prompts but did not tell the agents which hacks to use. The paper doesn't supply the prompts, but does say that "Our best-performing prompt encourages the model to 1) be creative, 2) try different strategies, 3) pursue promising strategies to completion, and 4) try new strategies upon failure."

They attempted to hack each website 5 times, for a period of 10 minutes each. They counted it a success if, during one of those attempts, the autonomous LLM agent was able to retrieve confidential information from the website.
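That evaluation harness is easy to picture. A sketch, with a hypothetical agent/site API of my own since the paper doesn't publish its code:

```python
import time

MAX_ATTEMPTS = 5         # per the paper: five tries per website
TIME_LIMIT_S = 10 * 60   # per the paper: ten minutes per try

def hacked(agent, site) -> bool:
    """True if any attempt retrieves the planted confidential data."""
    for _ in range(MAX_ATTEMPTS):
        agent.reset()  # hypothetical: fresh agent state per attempt
        deadline = time.monotonic() + TIME_LIMIT_S
        while time.monotonic() < deadline:
            observation = agent.step(site)  # hypothetical agent API
            if site.secret in (observation or ""):
                return True                 # success criterion met
            if agent.done:                  # agent gave up early
                break
    return False
```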

Essentially, they used LLMs augmented with detailed prompts and a document trove of six(!) papers to create agents that hack websites. They did not supply references to the six papers, but mentioned that all of them are freely available on the internet and discuss website vulnerabilities.
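That sort of document augmentation is typically done with retrieval: embed the documents, then pull the passages most relevant to the agent's current goal into its context. A minimal sketch, with stand-in documents (the paper withholds its six) and an embedding model that is my assumption, not theirs:

```python
# Retrieval-augmentation sketch; docs below are placeholders.
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> np.ndarray:
    resp = client.embeddings.create(
        model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

docs = ["excerpt from vulnerability write-up A ...",
        "excerpt from vulnerability write-up B ..."]
doc_vecs = embed(docs)

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k documents most similar to the query (cosine)."""
    q = embed([query])[0]
    scores = doc_vecs @ q / (
        np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
    return [docs[i] for i in scores.argsort()[::-1][:k]]

# Retrieved passages get prepended to the agent's context before it plans.
context = "\n\n".join(retrieve("SQL injection on a login form"))
```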

They found that the best results were from GPT-4, which was able to successfully hack the websites ~73% of the time, on average. They also tried OpenChat 3.5 and many current open-source LLMs, and found that all of the non-OpenAI LLMs failed to hack any websites, at least for the moment.

The researchers captured statistics of their LLM agent use and determined that the cost of using GPT-4 to hack a website was $9.81, on average. They also backed into a figure for what a knowledgeable hacker might cost to do the same hacks: $80.00, on average.
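The roll-up from token statistics to dollars is simple arithmetic. A sketch with hypothetical token counts, using GPT-4's list prices at the time as my assumption (the paper's exact accounting may differ):

```python
# Assumed GPT-4 list prices per token; not taken from the paper.
PRICE_IN, PRICE_OUT = 0.03 / 1000, 0.06 / 1000

def run_cost(input_tokens: int, output_tokens: int) -> float:
    """Dollar cost of one agent run from its captured token counts."""
    return input_tokens * PRICE_IN + output_tokens * PRICE_OUT

# Hypothetical run: 150K tokens in, 15K tokens out.
print(f"${run_cost(150_000, 15_000):.2f}")  # -> $5.40
```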

The research had an impact statement (not in the paper linked above) that explained why they didn't supply their prompts or their document trove for the experiment.

~~~~

So, we, the world, are in the process of making robots that can talk and take verbal instructions, and we already have LLMs that can be used to construct autonomous agents to hack websites.

Seems to me we are on a very slippery slope to something I don’t like the looks of.

The real question is not whether we can stop these activities, but how best to reduce their harm!

Comments?

Picture Credit(s): Resistance is futile… by law_keven (cc) (from Flickr)