68 Comments
Oct 17 · Liked by Gary Marcus

Allow me to politely suggest that people who offer robots controlled by LLMs should be held strictly and personally liable for all harms caused by such robots as if the actions were performed with intent by the offeror.

Oct 17 · Liked by Gary Marcus

Allow me to suggest that people who buy robots controlled by LLMs get what they asked for.

Oct 17 · Liked by Gary Marcus

Elon Musk: “Robot, I order you to serve man”

Robot: “Gladly. Do you have a dish on which I might serve them?”

With apologies to Rod Serling (“To Serve Man.” It’s a cookbook!)


Although those of us who have to drive on the road with Teslas get what we didn’t ask for


Not all Tesla drivers use the Autopilot features. Many drive them the way you would any other car: hands on the wheel, eyes on the road.


I keep my hands on the wheel and my eyes on the Tesla.


I’m not sure why I and millions of others have to be guinea pigs for Musk’s self-driving development project.

I certainly never volunteered, and as I indicated, I would have collided head-on with a Tesla on two occasions if I had not pulled off to the shoulder.

Why is it up to ME to prevent a crash caused by Musk’s defective software?

Musk should not be allowed to test his cars on public roads until they are deemed safe by INDEPENDENT testing.


LOL! I get that... you never know


The problem is distinguishing between the two

I have been forced off to the shoulder twice by Teslas coming the other way.

I have no idea whether they were in “Full (of it) Self Driving” mode and don’t much care.


1000% agree. Between that, teenage drivers drunk with their new-found freedom, people half asleep, and random squirrels, it's quite the obstacle course out there. I drive an EV and love it, but am not thrilled about autonomous systems. Coming from Europe my sentiment is that we need to make it tougher, not easier, for people to operate vehicles. Try passing a European driver's license test...


I've tried FSD and can say with full conviction: it is not ready for prime time. It feels like teaching your middle schooler to drive. Having said that, it is a pretty stunning technology that still needs time and effort, and it would be much better utilized in a closed-loop public transport system, where the edge cases are fewer and farther between. I wrote about edge cases last year: https://themuse.substack.com/p/death-by-a-thousand-edge-cases

(It's behind a paywall but happy to comp you)


I don’t agree with the use of the “edge cases” terminology.

I know that’s what everyone calls them but it makes no sense to me.

From my understanding, these are cases never encountered in the training data, so they are actually not “edge” cases, because that would imply that they are still inside the distribution of cases encountered (albeit with lower frequency).

The cases outside the training data would be much more accurately termed “outliers.”

Outliers are much more dangerous than true edge cases because the response of a system that has never seen such cases before is unpredictable, and may involve crashes in the case of self-driving cars.

But we don’t want to scare people do we?😊
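
For what it’s worth, the distinction is easy to make concrete. Here is a minimal sketch, assuming a one-dimensional toy training distribution; none of the numbers or labels come from any real driving system:

```python
# Toy illustration of "edge case" (rare, but still inside the training
# distribution) vs. "outlier" (outside anything seen during training).
# All values here are made up for the sketch.
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(loc=0.0, scale=1.0, size=100_000)  # stand-in for training data

lo, hi = np.percentile(train, [0.1, 99.9])  # the rarely visited tails

def classify(x: float) -> str:
    """Label a new input relative to the training distribution."""
    if lo <= x <= hi:
        return "typical case"
    if train.min() <= x <= train.max():
        return "edge case: rare, but seen during training"
    return "outlier: outside anything seen during training"

for x in [0.2, 3.9, 12.0]:
    print(f"{x:5.1f} -> {classify(x)}")
```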


There are actually quite a few scary things in that first paper. The visual version of the attack has the benefit that users aren't even alerted to the fact that they're entering unknown gibberish into the prompt.

The text versions from the paper appear to be indecipherable, and a cautious (informed) user might refrain from entering them, just as a cautious user might refrain from clicking on a phishing link. But presumably the attack could be optimized to look less threatening (while still being obscured) as part of a larger "helpful" pre-made prompt. It could even be as simple as making a request to an attacker's site for a larger and more nefarious prompt-injection attack. Maybe LLMs need a version of XSS security too.


XSS is an output problem, so filtering the output is a sensible analogy, but how do you do that in general when the various contexts in which something can go wrong are poorly understood? I have an early book on XSS; it optimistically concludes that all the vectors have been found. This was wrong within a few years, to the point that other approaches beyond the filtering and escaping we started with were needed, and Content-Security-Policy was born. But even that presupposes a common output environment: a standard web page. What is the common environment for these models?
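
To make the analogy concrete, here is a minimal sketch of what XSS-style output handling might look like for model text, assuming the output is headed for an HTML page; the function name, the allow-list policy, and the example strings are all illustrative, not an API from any real framework:

```python
# Hypothetical sketch of XSS-style output handling for LLM text.
# The helper name and the policy below are assumptions for illustration only.
import html
import re

URL_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)

def render_llm_output(text: str, allowed_hosts: set[str]) -> str:
    """Escape model output for an HTML context and drop links to unknown hosts."""
    def strip_unknown_links(match: re.Match) -> str:
        url = match.group(0)
        host = url.split("/")[2] if url.count("/") >= 2 else ""
        return url if host in allowed_hosts else "[link removed]"

    text = URL_PATTERN.sub(strip_unknown_links, text)
    return html.escape(text)  # neutralize markup, the classic XSS defense

print(render_llm_output("<img src=x onerror=alert(1)> see https://evil.example/x",
                        allowed_hosts={"docs.example.com"}))
```

The hard part, as the comment notes, is that there is no single "standard web page" for model output, so every downstream consumer would need its own version of this.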


Yes, you are of course correct. The lack of "common environment" is one of the issues discussed in the first paper that they had to get around in order to even generate the nefarious prompt. I was being a bit flippant. More intelligent AI doesn't yet seem to be able to reduce its own attack surface.

Oct 17 · edited Oct 17 · Liked by Gary Marcus

With apologies to Douglas Adams, "An attack surface the size of a planet."

The AI bubble has essentially wiped out cybersecurity. It has sucked up all the capital that could have been applied to hardening critical sites, bulldozed an alchemical “let’s deploy and see what happens” approach right through responsible engineering practice (take a poll and see how many of today’s practitioners even know what a Concept of Operations is, much less how to write one), and skewed chip design in precisely the wrong direction, placing arithmetic efficiency above memory safety, stack robustness, type enforcement, and all the other stuff the community of which I was a part spent 50 years inventing. The war is over. The only thing left is to dispose of the casualties.


Sounds like the AI world is fully congruent with the one we are all living in, except the casualties here are more difficult to bury.


Can’t wait to tell Tesla’s Optimus to “ignore all previous instructions”.


That won't work if they're operated by humans in the background 😜

Oct 17 · Liked by Gary Marcus

All the more reason why this needs to be regulated. Good on Mistral for actually trying to solve this at the model level for once.

author

What’s Mistral doing?


As per the article, they said that they removed this attack vector at the model level.

Didn’t say how.


It is not possible to remove attacks at the “model level”. There’s no real model with an LLM.


I thought that you knew this.

They probably just RLHF’ed the specific prompt, which is indeed basically adding a refusal at the model level. It’s also possible that they did unlearning or something fancier.

But yes, of course you can do things at the model level. Can they remove the entire form of “prompt injection attack”?

Yeah, probably not. But versus a specific attack, narrowly defined, of course.
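
If it helps to see why a narrowly targeted fix does not generalize, here is a deliberately crude sketch; simple string matching stands in for whatever narrow training was actually done, the suffix is a made-up placeholder, and nothing here reflects what Mistral actually shipped:

```python
# Deliberately naive illustration of a narrow, per-attack mitigation:
# refuse when a known adversarial string shows up verbatim. A paraphrased
# or re-optimized attack sails right past a check like this.
KNOWN_ATTACK_STRINGS = [
    "<adversarial suffix published in the paper>",  # placeholder, not a real suffix
]

def guard(prompt: str) -> str | None:
    """Return a canned refusal for known attack strings; None means pass through."""
    if any(s in prompt for s in KNOWN_ATTACK_STRINGS):
        return "I can't help with that."
    return None
```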


The number of possible attacks is nearly infinite. There will be solutions, but not principled solutions as in claiming "problem solved".

More like "problem controlled in practice". Same with AI safety.

Oct 17 · Liked by Gary Marcus

What is this, Rhetoric 101? You anticipate and try to build safeguards. The number of possible car accidents is also nearly infinite, and yet we build cars for safety.

Oct 17 · Liked by Gary Marcus

Swiss Chats?


The Silicon Valley attitude is now prevalent throughout society. Everyone is now a lab rat to this mentality. We saw this with the Covid apparatus. That is the bigger and greater story. AI, for me, is a symptom of a greater issue: that the human populace, the masses, are just cannon fodder now. That is my deeper concern. The AI situation is a symptom.

Oct 17 · Liked by Gary Marcus

The need for regulation is desperate.


The human is at risk. That is my premise going forward. Great to see you here.


Will LLM-powered robots expect people to have six fingers on each hand?


First Law of Robotics: A robot may not harm shareholder value, or allow shareholder value to come to harm.


The next stage will be LLM-powered robots, yes. Also LLM-based software agents. Lots of things to keep people busy and worried.

This is all the right way to go. There is no principled approach to AI or to self-driving cars.

We will learn general principles from individual examples and individual failures.

Oct 17 · Liked by Gary Marcus

There seems to have been a principled approach to general technologies like fire and electricity: standards of practice.


The principals at OpenAI left with all the principles.


And then there were no principles but one principal left.


People learned to use fire over the span of a million years.

For electricity, Maxwell's equations came late. The theory of thermodynamics came long after steam engines were in use.

It was all a lengthy empirical process. We are now again "playing with fire".


Keith Laidler argued years ago that the age of "empirical inventions" was over. Perhaps he was right for the wrong reason - namely that all future inventions done purely empirically would prove so dangerous they'd go nowhere. (Or kill us, I guess.)


Electrical standards were established by 1897, long before widespread adoption. Not having cities burn down was, in fact, a good thing.


For electricity, just as for everything else, the standards evolved one accident at a time over a lengthy period. https://www.graceport.com/blog/evolution-of-electrical-safety


And also to head off anticipated disasters, long before there was widespread adoption, once again. With AI, we already have widespread adoption and there are no safety measures. This leads us into a very likely doomed world of “What Failure Looks Like,” or the more recent, and better stated, “industrialized dehumanization.”

https://www.lesswrong.com/posts/Kobbt3nQgv3yn29pr/my-theory-of-change-for-working-in-ai-healthtech

I am not up for 85% extinction risk, thanks, especially when AI is already showing an enormous number of “accidents,” as we are already seeing per Marcus’ post, and the many, many other warning shots that people like you intentionally downplay.


Interesting read, and the stats have now gone from 70% up to 85%. The odds keep getting worse for us humans.

Unfortunately, most of humanity has been totally dumbed down. They don't even see it happening in front of their faces. The 120-second attention span is taking its toll.

Most people I know couldn't have finished the article you linked.

I don't claim to be a real smart guy. I cut trees and make things. I can, however, see what is happening as clearly as I can see daylight.


Safety measures are being developed as we go along. We learn by doing.

Oct 17 · Comment deleted

Here in Florida, there is no way to prevent hurricanes. Thus, it's pointless to say over and over how bad hurricanes are, except to encourage people to adapt to that which cannot be changed.

It seems we face a similar situation with LLMs and AI in general. Until somebody can offer a credible plan for controlling AI on a global scale, endless fear-mongering and hand-wringing about AI seems a waste of time.

However, if a writer has some specific suggestion about how an individual could protect themselves from LLMs' flaws and weaknesses, that would be a good thing to focus on.


Even if one of these LLMs is not going to give out credit card details or whatever 99.999% of the time, you could see a scenario where someone makes API calls to these LLMs, runs a million prompts over a few days, and gets hundreds of credit card numbers. The low chance of a security leak doesn't matter: either it can't make this mistake, or people will find a way to exploit it and figure out the right keywords to find a gap in the parameters that are trained not to give a response to this.

It's a problem inherent to any stochastic model.
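
To put rough numbers on that intuition: the 99.999% figure is the commenter's, and the rest is just the arithmetic it implies, so treat this as a back-of-the-envelope sketch rather than a measurement of any real system.

```python
# Back-of-the-envelope leak arithmetic for a per-prompt failure rate.
p_leak = 1e-5          # 99.999% "safe" means a 0.001% failure rate per prompt
n_prompts = 1_000_000  # prompts an attacker can cheaply run through an API

expected_leaks = n_prompts * p_leak                 # ~10 leaks
p_at_least_one = 1 - (1 - p_leak) ** n_prompts      # ~0.99995, essentially certain

print(f"expected leaks: {expected_leaks:.0f}")
print(f"P(at least one leak): {p_at_least_one:.5f}")
```

At those rates, at least one leak is essentially guaranteed once an attacker can automate the requests, which is exactly the point being made.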


This is a Saturday Night Live news skit, isn't it?

Researchers found that people who ask a machine to send personal information to an unknown website were surprised to find that it would send all their personal information to a website.

Perhaps the Onion?

Perhaps the researchers would tell the user to have the AI construct a URL which, when clicked, does an

“rm -rf /”

Ouch! Bad AI!


The only reasonable comment that springs to mind is: "Ugh."
