112 Comments
Earl Boebert:

Well, I just ran another set of elementary cryptanalytic tests on these new "reasoning" models and things are pretty much the same. Answers which are glib, confident, and hilariously wrong. I'll write it up soon for my substack.

But the real lesson is this: LLMs *mimic* the *form* of reasoning. That's all. There is nothing more to it, no semantics, no meaning, no "understanding." It's just a more elaborate version of the way they *mimic* the *form* of prose. Maybe every time we use the abbreviation "LLM" we should add "(Large Language Mimicry)" after it.

As humans we have a cognitive shortcoming: when we are presented with a well-formed communication we assume it to be derived from a semantic understanding of the topic. We have not been conditioned to recognize large-scale, high-resolution and essentially empty mimicry of language. It's going to take a lot of painful experience to overcome that shortcoming.

Gary Marcus:

please send to me when posted

Alden Do Rosario:

looking forward to it when posted.

Vincent McMahon:

Earl, are we not fundamentally mixing things up here? 'Reasoning' at its core comes from within us; CS Lewis would say in our chest. 'Rationalising' is what we do in our heads, and we can rationalise anything (black is white, and white is black). It seems that LLMs are rationalising, which is really only a reflection of our own mental machinations. LLMs can't reason.

Youssef alHotsefot:

Astute. Thanks for this.

Martin Luz:

I'm stuck on the phrase: "It can sometimes hallucinate facts..."

If it's a "hallucination" it's not a "fact."

Reminds me of one of my favorite Orwell quotes: "If people cannot write well, they cannot think well, and if they cannot think well, others will do their thinking for them."

The BSing we see from places like OpenAI shows that a) they cannot think well; b) they assume we cannot think well; c) most people, in fact, cannot think well enough to see through their smoke screen. Which is why their BS gets so much traction.

And that's a fact, not a hallucination.

Fabian Transchel:

"If people cannot write well, they cannot think well, and if they cannot think well, others will do their thinking for them."

The funny thing about this quote is that it's like a literal translation of one of the first sentences in the introduction to Kant's treatise _What is Enlightenment?_:

"[...] It is so comfortable to be immature. If I have a book that does the thinking for me, a pastor who has a conscience for me, a doctor who assesses the diet for me, etc., then I don't need to make any effort myself. I don't need to think if I can just pay; others will take on the tiresome task for me."[1]

The finding itself is not new, only the delivery is: AI hype is technocratic religion for the faint of heart.

[1] Translated from the German original at https://www.projekt-gutenberg.org/kant/aufklae/aufkl001.html

Edit: I got the reference wrong (*doh*), it's not from Critique of Pure Reason, but from the paper mentioned above.

Martin Luz:

What I find fascinating, as a former PR pro, is how easily people give themselves away in their writing and storytelling. The idea that someone at a company with the public profile of OpenAI could put out a public statement talking about "hallucinated facts" is just dumb. And that lack of intellectual rigor is reflective of much deeper issues.

Fabian Transchel:

An example of Brandolini's law, really?

Holding people accountable is impossible at (internet) scale.

Miguel Gómez:

Outstanding quote in the midst of this chaos

Larry Jewett:

If people cannot write well, they just use ChatGPT

Coalabi:

Indeed, and this is the beginning of the trouble... Moreover, as I have seen on some forums, those people then think they can write... or draw, or compose music... Actually, except for being able to enter prompts, they have no additional skills; on the contrary, they just become more dependent on garbage-producing devices...

Notorious P.A.T.:

But it's only sometimes!

That is a cracker of a quote. I will save that.

Martin Luz:

Yes. It sometimes hallucinates facts... but they're being generous in allowing that it may hallucinate other things too!

Larry Jewett:

Come now,

Surely everybotty is entitled to their own hallucinated facts.

Martin Luz:

seems to be our "new normal"

Paul D:

The term "hallucination" has been criticized by many, already. It's just incorrect usage of the word.

I often return to Wittgenstein when thinking about this. These programs only trick you into thinking they did what you asked because syntax and semantics encode so much logic, but that's not the same thing as saying all of intelligence is contained in the rules of language. Insofar as the output "makes sense", it's just regurgitating common idioms, like when you use all the tricks you know--or were explicitly taught!--to make a good guess on the SAT exam. It's literally cheating... you!

It's also wrong to use the word "sometimes". Algorithms don't "sometimes" do one thing and then do another thing "some other times". One algorithm does the same thing all the time. That's what makes it an algorithm: it can be formalized. Flow control in code lets you differentiate between two or more conditions, but it's still *one* algorithm. We don't consider branching to be different algorithms, even though we talk about them that way, sometimes. Even if you add (pseudo-)randomness to your algorithm, which is what GenAI does, that doesn't make it do different things; it's still, always, doing what the programmer wanted it to do, in an entirely deterministic way. You can't predict the outcome because *you* don't know the random seed/salt. But if you know that, you *can* predict the results. It doesn't have "free will". (Neither do you, but that's a different problem...)
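To make the seed point concrete, here is a minimal sketch in plain Python (a toy sampler, nothing to do with any actual LLM implementation): once the seed is known, the "random" choice is fully reproducible.

```python
import random

def sample_next_token(vocab, weights, seed):
    """Pick a 'random' item, but deterministically for a given seed."""
    rng = random.Random(seed)  # same seed -> same internal state every time
    return rng.choices(vocab, weights=weights, k=1)[0]

vocab = ["cat", "dog", "fish"]
weights = [0.5, 0.3, 0.2]

# With the seed known, the 'random' choice is perfectly predictable:
print(sample_next_token(vocab, weights, seed=42)
      == sample_next_token(vocab, weights, seed=42))  # True, every run
```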

Vincent McMahon:

We do have free will, Paul, but it's a very small window.

Paul D:

Okay, I think we can agree on that.

María Luque:

Indeed. That’s one of the issues - most people cannot think well enough to see through their smoke screen. It is disrupting everyday people’s cognitive infrastructures.

Gerard:

I completely agree with this sentiment. Perhaps even worse is OpenAI’s general lack of self-awareness when it comes to misrepresenting its technology.

How reckless is it to frame an enhanced web search as a substitute for a cancer diagnosis? [1]

OpenAI shows little ethical or moral accountability, and Silicon Valley seems to have abandoned both altogether. The level of propaganda is becoming dangerously excessive.

[1] https://x.com/felipe_millon/status/1886205433469178191?s=46&t=oOBUJrzyp7su26EMi3D4XQ

Bill Benzon:

Yeah, it's really quite remarkable. I've been trying to figure out why we're getting so much hype. Sure, the bros want to believe, really bad. But I think there's more to it. From a recent post:

While there are other things going on, the excitement is centered on LLMs (large language models), the things that power chatbots such as ChatGPT, Gemini, Claude, and others. You don’t have to know much of anything about language, cognition, the imagination, or the human mind in order to create an LLM. You need to know something about programming computers, and you need to know a lot about engineering large-scale computer systems. If you have those skills, you can create an LLM. That’s where all your intellectual effort goes, into creating the LLM.

The LLM then goes on to crank out language, and really impressive language at that. That doesn’t require any intellectual effort from you. As far as you’re concerned, it’s free.

That's the problem. They don't have to expend any intellectual effort on language or cognition, which means, in a weird sense, they're getting all this performance for FREE. They simply don't know how to reason about it or value it, so they hype it. More here: https://new-savanna.blogspot.com/2025/02/tnstasfl-that-goes-for-knowledge-too-or.html

Fabian Transchel:

"That's the problem. They don't have to expend any intellectual effort on language or cognition, which means, in a weird sense, they're getting all this performance for FREE."

Whenever something in Computer Science is free*, you can be dead sure it is useless.

* LLMs aren't really free in a broader sense: They are disenfranchising creatives, they are furthering the destruction of our ecosystem, they are actively undermining the ability of man to think critically and purely. They are like mental crack, and the withdrawal will be way worse.

Joy in HK fiFP:

Why are all our institutions of higher learning investing so much effort and resources into this field? I find it very perplexing. I can understand why a private company would do it, in order to make money or gain power, but not why the greater public is subsidizing it through the public school and university systems and the defense department, and allowing fee-free scraping of personal, as well as copyrighted, data. Whether it ever can achieve AGI is a secondary issue, when one considers the dangers, some of which Gary has addressed in this article, as have those commenting.

Fabian Transchel:

I agree with you completely.

But I have *some* idea about this:

The reason academia is chasing the carrot even harder than industry is that research incentives are in many ways so intricately, perversely intertwined with obtaining and keeping funding that no real research has a realistic chance of being accepted in grant proposals.

I have a somewhat weird track record now for getting the most volatile refereeing on my proposals since Covid struck: Usually one of the refs is over the moon about our approach and the other one would have me shot on the spot for suggesting topics (and methodologies) that are deemed "not interesting"[1], i.e. not one of the two "safe" categories:

In research these days, you get funding for

a) The latest fad, i.e. LLMs. You can get away with tasking three people for two years with only PROMPTING the latest model. (And, scientifically speaking, even worse proposals.) The thing is, they are safe in the sense that "everybody" agrees it absolutely needs to be researched.

b) Reproducing some obscure GAN[2] result nobody cares about, but "the community" agrees that reproducing (any!) stuff, be it as banal as you wish, is often a sensible waste of money.

[1] That's a euphemism, obviously.

[2] As in General Abstract Nonsense, not necessarily (but sometimes!) identical to Generative Adversarial Networks...

Gerben Wierda:

Yes this is worrying, very much so.

I ended my 2023 EABPM talk on ChatGPT & Friends with the observation that it is deeply ironic that Google may get into trouble through its own invention (the transformer architecture), which enables the production of large volumes of problematic material. 'Good enough' in small doses may equate to 'big problem' in large volumes.

Andy X Andersen:

Yet, it is 2025 now, and despite some glitches, Google is doing better than ever.

The Transformer architecture is also used in Waymo cars (not as an LLM, though).

AI did not make the problem of misinformation worse. It gave us extra tools that have value in certain contexts. The tools are getting smarter and more accurate. There is still a way to go.

Gerben Wierda:

Come on, Andy. First, nobody said Google would be doing poorly now. And I also do not forget what was probably the first big success of transformer-magnified RNNs: the spectacular improvement of Google Translate (which used to be a joke). LLMs will disrupt, but in different ways (positive and negative) than rather 'dumb' AGI assumptions expect. Just like the internet did not bring us heaven on earth, but something else that includes much-expanded robber-baron capitalism, a weakened role for factualism, etc.

Gary only pointed out the effect of large volumes of less reliable inputs (which is a risk in the future, not now, and it even depends on this development actually succeeding). So curating input will probably become an issue. And factualism will weaken more.

The thing this digital revolution is going to teach us is some limits of our *own* intelligence. The question is, how painful (lethal?) will this lesson be.

Paul D:

That's not true at all. Many people are saying Google sucks now and sucks worse today than it did yesterday.

Gerben Wierda:

Google Translate?

Paul D:

I wouldn't know; I don't use Google for anything. But I'm referring specifically to Search in this context. You know, the thing they are known for. The thing they do that makes money.

Andy X Andersen:

Curating the inputs has been a huge issue for a while. We have billions of super-smart people who are very good at producing both subtly and grossly incorrect data. Funnily enough, AI has actually been the solution to that.

Factualism depends on large amounts of accurate data and external techniques, such as knowledge graphs, model-based verification, etc. Neither a human nor an AI is supposed to just accept any document without verification.
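To illustrate what that kind of check might look like, here is a deliberately toy sketch; the triples and the `check_claim` helper are invented for illustration and stand in for real knowledge-graph lookups, which are far messier.

```python
# Toy "model-based verification" against a hand-built knowledge graph.
# The triples and check_claim are hypothetical, purely for illustration.
KG = {
    ("Paris", "capital_of", "France"),
    ("Berlin", "capital_of", "Germany"),
}

def check_claim(subject: str, relation: str, obj: str) -> str:
    """Accept a generated claim only if it matches a known triple."""
    if (subject, relation, obj) in KG:
        return "supported"
    # Anything not in the graph is merely unverified, not necessarily false.
    return "unverified - needs a human or another source"

print(check_claim("Paris", "capital_of", "France"))   # supported
print(check_claim("Paris", "capital_of", "Germany"))  # unverified - ...
```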

Gerben Wierda:

Nobody verifies everything they read; it is undoable. Even your normal sense observations are mostly your brain preventing a lot of energy waste by assuming a lot. Asking for constant verification runs up against the limits of human intelligence/capabilities. This is why we have 'systems of trust', like the independent judiciary, free press, science. Those are all efficiency systems that prevent the need for 'verification all the time'.

In practice, GenAI enters this landscape as another 'system of trust'. The question of course — as with those others — is: what are the limits of that trust? Science, the independent judiciary, the free press (not the entertainment/tabloid/propaganda kind) all have this more or less built in. There are all kinds of 'negative feedback loops' built into those, which is what makes them trustworthy. Other systems like 'tribe' have these too, though not always obviously. For instance, gossip plays the role of the negative feedback loop in some of these.

All these internal negative feedback loops that produce stability and trust (in varying amounts) are not really available in GenAI. Not in a practical sense, at least. And couple that with a huge growth in volume (and we have already seen how that works out with the 'tribes-at-globe-level' of social media where such stabilising feedback loops are weak or nonexistent) and you have a problem.

The techniques you mention, like knowledge graphs and such, haven't been coupled practically to the 'big general models' like GPT (in whatever version). Hence the relative weakness of factualism.

Andy X Andersen:

Yes, social media is in fact a very good example of large-scale misinformation. Compared to that, AI assistants are relatively benign. One can police them better than what other people are saying.

So, nothing is really going to change. Trust the reliable press. Do not trust strangers on the street, social media, or tabloids. Do not trust AI assistants, or at least do your own checks.

Andy X Andersen:

GenAI models have "negative" feedback loops just fine. They do curate the input data.

Larry Jewett:

It has already taught us humans that "homo sapiens" is misspelled:

Should be spelled “homo sappiens”

Larry Jewett:

Homo sapsiens

Gerben Wierda:

My English is failing me, sorry. In the end, not a native speaker, so I'm missing the pun of both.

Paul Czyzewski:

Gerben: a sap: a stupid person who can easily be tricked or persuaded to do something.

Saty Chary:

Hi Gary, thoughtful and timely post! They have the balls to try to convince 'real' scientists that this would work! I do remember Galactica - same stupid word salad generation, same hype. This is v2 at best. Calling it 'deep' and 'research' means jack.

Word salad, word soup, word puke.... no matter the claim about "reasoning" etc etc, that's all LLMs will ever be.

Truth (scientific or other) can't be pieced together word by word. To claim otherwise is wishful thinking, misguided, delusional, misleading.

Fabian Transchel:

Remember 1984 (the book). Truth doesn't matter in a critical mass of slop.

Even when the bridges start collapsing and the planes all come crashing down (I don't even...), they will shoot the operators and not the AI companies, I'm afraid.

Saty Chary:

Indeed. But my point was just about LLMs, not what would happen if they were to be used. Also, just like broken instruments cannot advance science, LLMs cannot either - it's like making a plane out of a cardboard box and wanting to get to outer space.

Fabian Transchel:

You crafted a beautiful metaphor there.

I am not being cynical when I observe that people sitting in cardboard rockets will not care whether they are in outer space - they will just state that they are and be done with it.

The thing I'm onto is reality: There *is* an ultimate judge out there, and it is money. It may take a while (and many people will suffer to no end for it), but ultimately you cannot keep heating water without extracting *actual* work from the heat and expect to thrive on that alone.

There *will* be a downfall for most folks chasing the deus ex machina and it will not be pretty.

David Hsing:

"Generative AI might undermine Google’s business model by polluting the internet with garbage" It's not "might" because it already HAS:

From my LinkedIn feed https://www.linkedin.com/posts/aragalie_ive-been-ai-free-for-the-past-3-months-activity-7291400199636152320-AZze/

=====

The only real problem I faced? The incredible amount of 💩 AI generated content that has flooded the web, ESPECIALLY for coding topics!

I’ve come to dread having to google for things, as you’re absolutely swamped by AI regurgitated idiocy.

=====

Fabian Transchel:

Try reddit. The slop hasn't quite claimed dominion over it, but of course it's also polluted.

The good thing going forward is that *actual* skills will become invaluable - but only after the skies have fallen down upon us, I'm afraid.

Vernon Niven:

You nailed the problem with hallucinations invading more and more chatbot use cases.

People love 'easy', so this remains an existential threat to learning the truth thru a chatbot.

That said, I disagree with your point that search engines will be destroyed by AI-generated crap, ultimately leading to model collapse.

This is because all major search engines today rely on human-feedback signals to rank high-quality content over crap.

They don't simply rank content based on some topical relationship to a keyword query.

For example, Google's algorithm considers a page's bounce rate, time-on-page, inbound backlinks from high-authority domains, social media shares, and more.

It's really hard to fake these signals at scale.

So, while the internet is certainly being inundated with poor-quality AI content, the reality is that most of it doesn't survive on Page 1 for long.

And it probably won't, moving forward.

As long as search engine rankings (SERPs) exist, an AI model can rely on them as a proxy for content quality.

Finally, we are still in the early days when it comes to search engines figuring out how to counter so much more content coming online.

I wouldn't underestimate Google's engineering team on their ability to sort that problem out.
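To spell out the intuition about engagement signals, here is a back-of-the-envelope sketch; the signal names, normalisation, and weights below are invented for illustration and are not Google's actual ranking features, which are not public.

```python
# Hypothetical weighted "quality score" from human-feedback signals.
# Every signal is assumed to be pre-normalised to the range 0..1.
def quality_score(page: dict) -> float:
    weights = {
        "authority_backlinks": 0.4,
        "time_on_page": 0.3,
        "social_shares": 0.2,
        "low_bounce_rate": 0.1,
    }
    return sum(w * page.get(signal, 0.0) for signal, w in weights.items())

# A page that fakes topical relevance but earns no engagement scores poorly:
spam = {"authority_backlinks": 0.0, "time_on_page": 0.1,
        "social_shares": 0.0, "low_bounce_rate": 0.2}
print(round(quality_score(spam), 2))  # 0.05
```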

Martin Belderson:

So if a hallucination is very appealing, gets lots of traffic, social media hits, backlinks from people duped by it, etc., surely it will defeat Google's algorithm? It could happen a lot and then become entrenched within the bounds of the algorithm.

M Crutchmore:

I dunno, part of me feels like this is mostly for all of those AI companies to figure it out among themselves, otherwise they are cutting off the proverbial branch they are sitting on.

It's already possible to write a term paper with gibberish data in it, and there are multiple mechanisms in place to safeguard against it. I am pretty bullish that (good) teachers will see very quickly what is made up and what is not.

Fabian Transchel:

My experience as a university teacher is mixed:

a) Students who use AI for exams will do strict grading arbitrage: Unless you're teaching Math 101, you'll quickly find that aside from the few enlightened students who care about actually learning, people will simply avoid courses with critical evaluation.

b) I know colleagues who ask their students to write longer and longer and more complicated term papers - because "AI will help them be more productive - so they have to show it". I have had the chance to check some of these papers. Suffice it to say that while the quantity went up three- to fourfold, the quality (naturally!) did not. But these colleagues want to be deluded that everybody is becoming smarter.

TheAISlop:

Notice... how many of the other influencers say "I got access to ________!". They don't say they paid for it. Influence can easily be bought with low-cost perks. Get 100 influencers singing happy thoughts and 800-chatgpt starts to sound like a good idea.

Roger:

AI research definitely needs human "peer" review. And humans need to research AI research to let us know how good/lousy AI research is.

Fabian Transchel:

You see, the problem in my opinion is not really AI research - at least the AI companies don't submit to Nature et al. but write whitepapers "only".

The real problem is the humanities and engineering: People, albeit as smart as can be, with no IT background simply cannot fathom that the mechanical turk machines could be fallible in the way that "we" folks understand.

I recently spoke to a dentist - a German professor teaching jaw surgery and comparable things. Considered amongst the top of the pack in dentistry, he asked me how long I thought it would be before AI wiped us out, because *that's* the level of info he was at.

Long story short: Normal people think that hallucinations will go away, just like "those undecidable functions [...]" will someday be decided.

The problem really is that even most IT folks don't bother with structural limitations: Nobody cares about complexity and computability theory when there is snake oil to be sold.

Daniel Tucker:

I'm sorry to sound like an idiot here. I subscribe to Gary's Substack and read most, if not all of his posts. I therefore have to ask a rhetorical and probably stupid question:

Why is this crap allowed? Why do we as a society not place such a high value on truth (capital-T) and quality, as well on the efforts of our professionals and scholars, that we will not let ourselves tolerate this kind of bullshit?

Furthermore, why do we think it's OK to allow companies like OpenAI and Microsoft, and Google, to use the collective knowledge of humanity in such spurious and noxious ways? Why would any of us expose our information, either private or business, to their systems which in my mind, are just thirsting at the chance to extract all of our text for their monetary benefits?

Are we really this debased and tepid, not giving a damn about the implications for ourselves, our children, and the society that we will one day leave behind when we (Sorry, Ray Kurzweil) inevitably die?

I go out of my way not to use Gemini in all of my Google products (my work email is through Gmail, and Gemini's presence to me is like that of an invasive weed), and I've disabled Copilot (another insidious weed species) insofar as I can in all of my MS Office settings. I know how to write a fucking email, and rather than wasting time with a stupid prompt, I'd rather just write the damned thing, say what I have to say, and then hit "Send".

I pray I live to see this absolutely moronic fever dream break. Alternatively, I relish the chance to abandon this society if I can and if it doesn't, for something better and real. Because this shit, from perversely rich weirdos like Sam Altman, isn't going to fucking cut it.

Youssef alHotsefot:

Well said.

I see no signs the fever dream will break any time soon. The problem, imho, is that AI generated garbage is very much in tune with the spirit of the times: Lazy, self-contradictory, thieving, pig ignorant, 11 second attention span, lying and bombastic. Trumpworld.

We have many, many people asking "What's the problem? It's good enough." It isn't good enough. It's largely crap. And reality has a funny way of catching up with civilizations that get too attached to their own bullshit.

Terry Bollinger:

A superb analysis, Gary Marcus. Thank you!

Charles Fadel:

IMHO the real doomsday scenario of AI is cognitive misering at scale, facilitated by precisely this kind of tool.

Mind you, we already, as you point out, had a garbage production problem for the sake of "publish or perish", with non-replicable work at 50% levels.

Swell (nah - swill ;) Perhaps we will discover Brands as a better guarantee of quality, IF they don't succumb.

Enjoy the schusses :D

Rob Nelson:

I happen to have Frankfurt's original essay on my desk because I'm revisiting it for an essay I'm drafting. You certainly got the spirit of his description, if not the exact wording.

The essay was published in Raritan Quarterly in the Fall 1986 issue. Anyone can download it as a pdf here: https://raritanquarterly.rutgers.edu/issue-index/all-articles/560-on-bullshit

Frankfurt describes what he calls "the essence of bullshit" on page 90.

The book On Bullshit came out in 2005 and extends the original insight, but what a prescient piece of cultural criticism for the middle of the 1980s.

Joy in HK fiFP:

I am finding the article on bullshit very interesting. Thank you for the link.

Aaron Turner:

Artificial Stupidity meets human self-interest - a winning combination!
