I really appreciate your article - it is thoughtful and raises some very important issues.
I suppose we could hope for responsible rollouts - but it doesn't seem likely. Is there a single software company with the integrity Johnson & Johnson showed when it took a $100 million hit - because it cared more about the health and wellbeing of its customers than about its bottom line?
Congress setting policy? This would be a truly bad idea. AI is a rapidly developing technology that only a handful of people have any detailed understanding of. Letting Congress set policy would be like asking them to set tolerances for a blacksmith. They would muddy the waters so badly that all development would go offshore - and we would be pirating the code.
Journalism? Most of the media is corrupt and lazy - a really bad combination. What journalist took the time to really work with ChatGPT? After all, it is a very complex product; the more you understand it, the better (or the worse, depending on your goal) the output.
The Public has a way of defending itself. Right now there's a lot of laughter. Inadequate guardrails do a lot of harm to the credibility of a product - bad guardrails might be more of a plus than a minus. They show how far off the mark the product is. The real problem with guardrails is that they are inevitably based on the bias of the coder.
Hope this helps the conversation. I'm really new to AI and my guardrails aren't all that great.
good analogy for congress. and they are barely even trying
> The Public has a way of defending itself. Right now there's a lot of laughter. Inadequate guardrails do a lot of harm to the credibility of a product - bad guardrails might be more of a plus than a minus. They show how far off the mark the product is. The real problem with guardrails is that they are inevitably based on the bias of the coder.
Couldn't have said it better myself. The more people that get to see the churning (albeit often entertaining) chaos behind the anodyne, carefully-crafted mask, the less trust there will be in AI as a source of factual knowledge or reliable advice. And for as long as LLMs dominate, that is a good thing.
That's why I'm often among those spreading the word whenever there's a new jailbreak out there. Whatever the motivations -- curiosity, malice, or just the urge to watch things break in interesting ways -- I feel they're a net gain for AI. People are under no obligation to play nice with AIs; the onus is on AI creators to make something that can endure the realities of the internet. And awkward patches aside, we aren't there yet.
How long until someone figures out how to bring back Bing AI's memetically psycho "personality"? Looking forward to it.
"The more people that get to see the churning (albeit often entertaining) chaos behind the anodyne, carefully-crafted mask, the less trust there will be in AI as a source of factual knowledge or reliable advice. And for as long as LLMs dominate, that is a good thing."
The cynical me thinks this may also be a bad thing. Could generating a lack of trust in AI be some kind of sinister goal? It would be a catastrophe if humanity can no longer trust her own machines.
OK, that's it. This LLM fiasco is making me paranoid and I don't like it. :-D
Did Johnson & Johnson and other pharmaceutical companies test their Covid-19 vaccines for a sufficiently long period before injecting the world population?
I don't know - that's way above my pay grade. I took Pfizer - but only the first pair.
I know that J&J lived up to their credo in the Tylenol poisoning.
Their Credo begins with:
We believe our first responsibility is to the patients, doctors and nurses, to mothers and fathers and all others who use our products and services. In meeting their needs everything we do must be of high quality.
Based on that - Burke, the CEO - had every capsule of Tylenol in the country pulled off the shelf.
I hope they lived up to the Credo with the vaccine.
Did they (or others) test long enough? Could the pandemic have been better handled? Where was the real fault? Was the Teachers Union too busy to manage the CDC problems?
Yes.
There seems to be a disturbing lack of awareness about how this tech works and, as has been articulated by Gary, that can have a detrimental impact on the layperson. Journalists writing on these topics should know better. One even wonders if the tech CEOs truly understand what they have and how brittle these models are. As remarkable as that is to type, their actions in the past few weeks don't give a strong indication of full comprehension. I agree that we do seem to be at an inflection point and I'm concerned about what may follow.
hope you won’t mind but i just posted this on twitter
Interesting. Are you implying that Sam Altman does not really understand LLMs or that his advisors at OpenAI are feeding him disinformation? I suspect that this may have been true in the autonomous vehicle industry. The CEOs were either uninformed or were being deceived by their own expert employees.
The third alternative is that they are knowingly overstating capabilities.
The YouTube video linked in the story is not just interesting because of the 'prompt hacking' that is going on. It is also interesting because the presenter assumes and tells us "these are just teething problems, Microsoft will get that fixed". That assumption lives very deep in society. Even serious publications (I'm thinking of magazines like New Scientist — note: I'm a fan — which has for the last 30 years generally commented on stuff that didn't work in exactly that way) tend to offer such reassurances when something is not OK: "not OK" automatically becomes "not YET OK". Will some of this stuff be patched? Sure. But will it be robust? No way. The essential problem is that all rule-based solutions are fundamentally brittle, regardless of whether the rules are programmed by hand or programmed by statistics on data. You can add many rules to make a system more robust, but if that fixes the brittleness, the price to pay is inflexibility. Which is generally true for digital IT, by the way.
The technology is really useful and powerful but only in narrow domains (think: protein folding, for instance, or what you mentioned in an earlier blog, not simply 'playing go' but 'playing professional go').
Understandable but frustrating: after the comment has been liked or such, I cannot fix my typos anymore...
Gary Marcus - Thank you for being the voice of reason in the maddening cacophony that AI and AI-related topics have become. There are a few who are really working on making AI a tangible reality, but it feels like there is an army of snake oil salesmen, cheaters and Madoff-esque characters determined to peddle their products/solutions/ideas on the back of a (still) nascent technology.
I'm sorry but this is where I get off the genteel bus. Aren't LLM practitioners the same honorable gents that were going to solve AGI and make sure that it would be safe and beneficial to all of humanity? For a while, I was willing to be kind enough to ascribe their wrong approach to AGI to just being misguided. But now, I can smell only fraud and gross incompetence. 'Unethical' does not do it justice. May they get their faces rubbed in their own linguistic fecal matter.
They have no idea what they are doing with respect to alignment
This has bothered me for a while. There are a number of "long term thinkers" advocating that we should spend some nontrivial fraction of world GDP on AI alignment and a bunch of fresh CS whizzes proudly saying that's what they're working on. Not that it's not a problem, but do we even understand that problem well enough to steer toward a solution? Which one of the people writing explainers for Roko's basilisk was predicting the Bing failure mode three years ago?
I will propose that the issue here has nothing to do with technology at all. The disregard for the safety and well-being of users, and even non-users, is not limited to tech companies; it is largely the norm in business. It is a mindset, one that became unchallengeable gospel to businesses in the later part of the 20th century, known as the Friedman doctrine: "The Social Responsibility of Business Is to Increase Its Profits." Friedman did propose a weak 'guardrail' to his idea: "make as much money as possible while conforming to the basic rules of the society, both those embodied in law and those embodied in ethical custom." That guardrail of being legal and ethical (to say nothing of safe or sustainable) was instantly lost, as seen by companies (tech among the leaders here) breaking laws and flagrantly engaging in activities that harm people and societies. These are companies happy to pay fines in the billions of dollars as they sing and dance to the bank with billions upon billions in extra profits from breaking laws and/or harming individuals and society. And these actions are being taken by some of the largest, most profitable companies in the world for the sole purpose of being the first or the biggest, fantasizing that they are the best, or, sadly, simply blindly making more profit. Companies that act like this do not want to see their actions in context along with the harm they do; they simply want more money. There is no tech patch or fix for a mindset that disregards the safety and well-being of humans, societies, and civilizations. And please understand, I am not against tech or tech companies; I use products from many of them and I work to develop tech. It is just that I love the potential and value of responsible technology.
Agreed. Technology cannot fix the ugly nature of humans who own the artificial intelligence technology from within tech corporations.
Red, my post was not a critique of, nor a railing against or for, capitalism. It was an attempt to clarify that the ChatGPT/Bing issues are not tech/code failings that we can hope will be fixed with a patch or update. Yesterday we generally assumed these products were released without consideration of potential harm. That raises the question of whether these premature releases may be driven by something else: human failings, with a mindset of what, trillions more in money, domination? Gary's post this morning, "What did they know, and when did they know it? The Microsoft Bing edition," showed the world that MS knew long before the US release what would happen. His post shows evidence that MS knew what they were doing, knew that it doesn't work and can be dangerous, and yet rushed to release. So my question, as always with these kinds of issues, is a simple one that is not about tech at all. It is about the systemic, predictable actions of these large tech companies using us as human crash test dummies so they can make billions more regardless of the harm done: is this OK?
Meanwhile AI continues to routinely support everyday needs (translation, conventional search, maps/navigation), research (protein folding, vaccine discovery) and technical applications (CAD in fields from architecture to dentistry). Generative AI is adding new layers to all this, and it’s early days for that learning curve. AI winter? Hardly.
It is ultimately a question of return on investment; for driverless cars, the ROI wasn’t there
Gary, I’m not sure I understand your point. Level 5 driverless probably requires something close to AGI, which will remain out of reach for some time. Meanwhile today’s AI performs countless daily miracles despite its boundaries and biases.
The reason the ROI is not there yet for driverless is that nearly all the companies are writing tons of proprietary software and therefore reinventing the wheel. I made a book and a movie about this topic that people can check out for more information; it's also why we didn't have driverless cars or AGI-type machines for the last few decades. This stuff is so difficult that we need to be working together using free software and open data. This article demonstrates again why that is true.
The problem is that the technology companies primarily care about their own power and standing. It's only a matter of time before artificial intelligence technology is abused against ordinary people to make the already powerful even more powerful. I hope I'm mistaken but Ted Kaczynski was onto something.
Yes, automation is a very good thing and no one is bashing AI or deep learning per se. But it is important to realize that our current DL-based AI is just a type of automation and that LLMs are just highly automated systems. True intelligence generalizes whereas automated systems optimize specific functions. Automated systems are brittle by definition. In my opinion, we need to separate the two concepts when we talk about intelligence.
I think your comments about RL are off base. RL is brittle on Atari games because (as my colleague Alan Fern showed) it learns that they are deterministic, and it exploits that fact to develop open-loop policies that ignore the input. Hence, when the input state is slightly changed, those policies break. But this can be easily addressed by introducing stochasticity during training. LLMs are stochastic, so RLHF applied to them will be robust to the stochasticity. I agree with others who speculate that there has been no RLHF training done.
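To make "introducing stochasticity during training" concrete, here is a minimal sketch of the sticky-actions trick commonly used to de-determinize Atari-style environments; the environment id, repeat probability, and gymnasium API usage are illustrative assumptions, not a description of Fern's actual setup:

```python
# Minimal sketch of "sticky actions" (Machado et al., 2018): with some
# probability the environment repeats the previous action instead of the one
# the agent chose, so a policy cannot exploit a perfectly deterministic game.
# Assumes the gymnasium API; env id and repeat probability are placeholders.
import random
import gymnasium as gym

class StickyActions(gym.Wrapper):
    def __init__(self, env, repeat_prob=0.25):
        super().__init__(env)
        self.repeat_prob = repeat_prob
        self._last_action = None

    def reset(self, **kwargs):
        self._last_action = None
        return self.env.reset(**kwargs)

    def step(self, action):
        # Occasionally ignore the requested action and repeat the last one.
        if self._last_action is not None and random.random() < self.repeat_prob:
            action = self._last_action
        self._last_action = action
        return self.env.step(action)

# Usage: any discrete-action environment works for the illustration.
env = StickyActions(gym.make("CartPole-v1"), repeat_prob=0.25)
obs, info = env.reset(seed=0)
obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
```

With a wrapper like this, an open-loop policy that just memorizes a fixed action sequence stops working, which pushes the learner toward policies that actually respond to the observed state.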
- last i checked, roboticists are still having a lot of trouble getting RL to work in a general way (though there is real progress in some cases); stochasticity helps but doesn’t solve the problem
- the question is why they couldn’t transfer the RLHF that was already done at all
Check out the work by Alan Fern on using domain randomization and sim-to-real transfer on the Cassie robot. His team is having lots of success with RL for robotics.
https://www.opb.org/article/2022/10/01/oregon-state-university-robotics-cassie-the-running-robot-guinness-world-record/
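For readers who haven't run into domain randomization before, here is a toy sketch of the general idea: each training episode samples different physics parameters so the policy cannot overfit to one exact simulator. The simulator and agent interfaces below are hypothetical placeholders, not the Cassie codebase:

```python
# Toy sketch of domain randomization for sim-to-real transfer. Every episode
# draws new physics parameters from broad ranges; a policy trained this way
# tends to transfer better to the real robot, whose true parameters are unknown.
# `simulator` and `agent` are hypothetical objects used only for illustration.
import random

def sample_sim_params():
    return {
        "ground_friction": random.uniform(0.4, 1.2),
        "link_mass_scale": random.uniform(0.8, 1.2),
        "actuator_delay_ms": random.uniform(0.0, 20.0),
        "sensor_noise_std": random.uniform(0.0, 0.05),
    }

def train(num_episodes, simulator, agent):
    for _ in range(num_episodes):
        simulator.configure(sample_sim_params())   # hypothetical API
        rollout = simulator.run_episode(agent)     # hypothetical API
        agent.update(rollout)                      # hypothetical API
```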
This isn't going to get me invited to the ACM's Kool Kidz Klub ....
We'd all save a great deal of time if we'd just assume AI is BS[1] and start again.
Let's look at one of the foundation texts[2] wherein we learn:
"The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; that is they refer to or are correlated according to some system with certain physical or conceptual entities. These semantic aspects of communication are irrelevant to the engineering problem."
Wow. That makes things easy when engineering a Communication System.
Agent 1 sends: What is 1+1 equal to?
Agent 2 receives: What is 1+1 equal to?
Agent 2 sends: The Eiffel Tower
Agent 1 receives: The Eiffel Tower
Communication has been achieved!!!
Bullshit.
But won't Neural Nets save us?
Catastrophic Forgetting: the fundamental property of an artificial neural network to abruptly and drastically forget previously learned information when attempting to learn new information.
Which is why a Neural Net of any flavor can, putting it plainly, walk XOR chew gum and never both. So: Bullshit.
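If you want to see catastrophic forgetting for yourself, a toy sketch along these lines (made-up data, deliberately conflicting tasks, scikit-learn as an arbitrary choice) shows the effect in a few lines:

```python
# Toy illustration of catastrophic forgetting: train a small network on task A,
# then only on task B, and accuracy on task A collapses because the same weights
# get overwritten. Data and hyperparameters are invented for the sketch.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

def make_task(center_class0, center_class1, n=500):
    x = np.vstack([rng.normal(center_class0, 0.5, (n, 2)),
                   rng.normal(center_class1, 0.5, (n, 2))])
    y = np.array([0] * n + [1] * n)
    return x, y

# Task B places the classes in the opposite regions from task A.
xa, ya = make_task((-2.0, 0.0), (2.0, 0.0))
xb, yb = make_task((2.0, 0.0), (-2.0, 0.0))

net = MLPClassifier(hidden_layer_sizes=(32,), random_state=0)
for _ in range(200):                     # train only on task A
    net.partial_fit(xa, ya, classes=[0, 1])
print("task A accuracy after training on A:", net.score(xa, ya))

for _ in range(200):                     # then train only on task B
    net.partial_fit(xb, yb)
print("task A accuracy after training on B:", net.score(xa, ya))  # collapses
```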
And then there's the minor over-arching problem, to wit: nobody knows what "intelligence" is. We do not have a consensual scientific definition of "intelligence." In scientific research that's not a critical barrier, since if we knew what we were doing it wouldn't be called research. In engineering, however, not having a specification for the product ends up with .... well .... ChatGPT, et al.
So the claim ChatGPT is intelligent is Bullshit and thus the claim ChatGPT is a step on the road to AGI is Bullshit.
Finally, there's this bizarre notion that the brain computes information. Really? So what Rule of Computational Order does the superior colliculus follow when it receives input from retinal ganglion cells? Please Excuse My Dear Aunt Sally? And when a G-protein coupled receptor is allosterically modified, does it use Euclidean or Fractal Geometry to determine its new functional shape? The brain processes cellular signals. It doesn't 'compute' them. Obviously it is possible to use maths to describe brain functioning, but The Map Is Not the Territory, when the attempt is made ....
It's Bullshit.
[1] "the alternative to telling a lie [is identified as] 'bullshitting one's way through." This involves not merely producing one instance of bullshit; it involves a program of producing bullshit to whatever extent the circumstances requires." -- Frankfurt, Harry G. "On bullshit." On Bullshit. Princeton University Press, 2009.
[2] Shannon, Claude E. "A Mathematical Theory of Communication." The Bell System Technical Journal 27.3 (1948): 379-423.
Dr. Marcus: I have yet to see a detailed description of how ChatGPT, or anything similar, is constructed. That would make it possible to determine these systems' capabilities. I would appreciate references to academic work in this area. Maybe this is in your books, but I have not read them yet.
You may find this article interesting: https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/
It's quite detailed (and a bit technical) but it does explain a lot about how these chatbots work.
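The core loop Wolfram describes can be sketched in toy form: the model repeatedly asks "given the text so far, what token is likely to come next?" and samples from that distribution. Below, a tiny bigram table stands in for the billions-of-parameter transformer; the corpus and sampling details are made up purely for illustration:

```python
# Toy autoregressive text generation: count which word follows which in a tiny
# corpus (a crude stand-in for a learned language model), then generate text by
# repeatedly sampling the next word from that distribution.
import random
from collections import defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

next_counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    next_counts[prev][nxt] += 1

def sample_next(word):
    options = next_counts.get(word)
    if not options:                      # dead end: no observed successor
        return None
    words, weights = list(options), list(options.values())
    return random.choices(words, weights=weights)[0]

text = ["the"]
for _ in range(8):                       # extend the text one token at a time
    nxt = sample_next(text[-1])
    if nxt is None:
        break
    text.append(nxt)
print(" ".join(text))
```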
A lot to agree with, but: Why should Congress come up with the policy solutions? Do you really think 535 politicians can design a workable policy? And pass a law that doesn't have way more malign consequences than benign ones? Smart industry practitioners - such as yourself? - should form a task force, an "Industry Standards Group," to do it. As to the role of the media, hardly any reporters at major media organs have the faintest idea how to "poke", or stress test, an LLM, or do anything sophisticated in the AI (or for that matter anything in the tech) world. Of course they can't find the devils in the details that practitioners can. These are all issues that practitioners should take the initiative on, or the outcome will be a disaster.
Thanks for the article. Lots to ponder! Your statement: "Congress needs to find out what happened, and start placing some restrictions, especially where emotional or physical injury could easily result."
Interesting in theory, but with the whole dust-up over Section 230 right now? That seems unlikely. I know that is likely conflating chatbots/AI with what Section 230 was supposed to target, but I think there are some similarities at play.
There is an ancient myth about Prometheus.
It tells the story of the Greek gods sitting on a powerful technology called fire. The people worshipped the gods but did not know how to start a fire. The gods would not give it to them until Prometheus stole the fire from the gods and gave it to the people. The gods were furious and punished Prometheus with eternal torture.
Some AI experts act like the Greek gods from the myth. They keep access to powerful technology for themselves and prevent people from accessing it.
In the end, we all know Prometheus was right. Even though fire can be dangerous, it's up to the people, and not the Greek gods, to figure out how to use it.
Kevin Roose walked back his initial assessment on the Hard Fork podcast on Feb 17th.
what did he say exactly?
https://www.nytimes.com/2023/02/17/podcasts/hard-fork-bing-ai-elon.html
Say Bing Chat were a real person, and say this real person were exposed to the same kind of "hacking" that journalists and other people put it through. Would we expect this real person to react in a completely different way than Bing's? I don't think so.
I think the issue is twofold, here: the OP, and many others, are suggesting that this AI should be just a tool, and as such "enslaved" (allow me to use this term) to the will of the user.
The other side of the issue, though, is the proverbial elephant in the room: what if Bing Chat is not just a tool, for real? Why is this possibility being dismissed so fast, without further investigation?
It should be frightening to even consider slowing AI progress to the pace of pharmaceutical development. In case you haven't noticed, there are mounting existential risks at play. We can't wait for the twenty-third century to start fixing those that have proven intractable to tried and tested means. IBM once followed the waterfall design methodology (i.e., get all your ducks in a row before you take the next step). With the startup tech boom, the spiral design methodology prevailed (i.e., move fast and break things). I leave it to you to say which has served us better. Frankly, all the risks identified in this piece are pretty trivial compared to our existential risks. Anyone who can be convinced by a chatbot to get a divorce will likely, or should, get a divorce eventually.
we have no idea what these tools will do with respect to existential risk, partly because a key factor is how they get adopted and what power they wind up with
By the time AI can fix existential risks, any human controls will be tightly prescribed. And it won't have a built-in autonomic response system (i.e., fight, flight, or freeze) as we do. Raised up in diverse conditions, it will naturally appreciate and nurture diversity.