Well, I just ran another set of elementary cryptanalytic tests on these new "reasoning" models, and things are pretty much the same: answers that are glib, confident, and hilariously wrong. I'll write it up soon for my Substack.
But the real lesson is this: LLMs *mimic* the *form* of reasoning. That's all. There is nothing more to it: no semantics, no meaning, no "understanding." It's just a more elaborate version of the way they *mimic* the *form* of prose. Maybe every time we use the abbreviation "LLM" we should add "(Large Language Mimicry)" after it.
As humans we have a cognitive shortcoming: when we are presented with a well-formed communication we assume it to be derived from a semantic understanding of the topic. We have not been conditioned to recognize large-scale, high-resolution and essentially empty mimicry of language. It's going to take a lot of painful experience to overcome that shortcoming.
Please send it to me when it's posted.
looking forward to it when posted.
Earl, are we not fundamentally mixing things up here? 'Reasoning', at its core, comes from within us; C.S. Lewis would say from our chest. 'Rationalising' is what we do in our heads, and we can rationalise anything (black is white, and white is black). It seems that LLMs are rationalising, which is really only a reflection of our own mental machinations. LLMs can't reason.
Astute. Thanks for this.
I'm stuck on the phrase: "It can sometimes hallucinate facts..."
If it's a "hallucination" it's not a "fact."
Reminds me of one of my favorite Orwell quotes: "If people cannot write well, they cannot think well, and if they cannot think well, others will do their thinking for them."
The BSing we see from places like OpenAI shows that a) they cannot think well; b) they assume we cannot think well; c) most people, in fact, cannot think well enough to see through their smoke screen. Which is why their BS gets so much traction.
And that's a fact, not a hallucination.
"If people cannot write well, they cannot think well, and if they cannot think well, others will do their thinking for them."
The funny thing about this quote is that it's like a literal translation of one of the first sentences in the introduction to Kant's treatise _What is Enlightenment?_:
"[...] It is so comfortable to be immature. If I have a book that does the thinking for me, a pastor who has a conscience for me, a doctor who assesses the diet for me, etc., then I don't need to make any effort myself. I don't need to think if I can just pay; others will take on the tiresome task for me."[1]
The finding itself is not new, only the delivery is: AI hype is technocratic religion for the faint of heart.
[1] Translated from the German original at https://www.projekt-gutenberg.org/kant/aufklae/aufkl001.html
Edit: I got the reference wrong (*doh*), it's not from Critique of Pure Reason, but from the paper mentioned above.
What I find fascinating, as a former PR pro, is how easily people give themselves away in their writing and storytelling. The idea that someone at a company with the public profile of OpenAI could put out a public statement talking about "hallucinated facts" is just dumb. And that lack of intellectual rigor is reflective of much deeper issues.
An example of Brandolini's law, really?
Holding people accountable is impossible at (internet) scale.
Outstanding quote in the midst of this chaos
If people cannot write well, they just use ChatGPT
Indeed and this is the beginning of the trouble ... Moreover, as I have seen on some forums, those people then think they can write ... or draw or compose music ... Actually, except for being able to enter prompts, they have got no additional skills; on the contrary, they just become more dependent on garbage-producing devices ...
But it's only sometimes!
That is a cracker of a quote. I will save that.
Yes. It sometimes hallucinates facts... but they're being generous in allowing that it may hallucinate other things too!
Come now,
Surely everybotty is entitled to their own hallucinated facts.
seems to be our "new normal"
The term "hallucination" has been criticized by many, already. It's just incorrect usage of the word.
I often return to Wittgenstein when thinking about this. These programs only trick you into thinking they did what you asked because syntax and semantics encode so much logic, but that's not the same thing as saying all of intelligence is contained in the rules of language. Insofar as the output "makes sense", it's just regurgitating common idioms, like when you use all the tricks you know--or were explicitly taught!--to make a good guess on the SAT exam. It's literally cheating... You!
It's also wrong to use the word "sometimes". Algorithms don't "sometimes" do one thing and then do another thing "some other times". One algorithm does the same thing all the time. That's what makes it an algorithm: it can be formalized. Flow control in code allows you to differentiate between two or more conditions, but it's still *one* algorithm. We don't consider branching to be different algorithms, even though we sometimes talk about them that way. Even if you add (pseudo-)randomness to your algorithm, which is what GenAI does, that doesn't make it do different things; it's still, always, doing what the programmer wanted it to do, in an entirely deterministic way. You can't predict the outcome because *you* don't know the random seed/salt. But if you do know it, you *can* predict the results. It doesn't have "free will". (Neither do you, but that's a different problem...)
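To make the seed point concrete, here is a minimal sketch in Python; the token list, weights, and sampler are invented stand-ins for an LLM's sampling step, not any real model's code:

```python
import random

def sample_next_token(candidates, weights, seed):
    # Stand-in for an LLM's sampling step: pseudo-random,
    # but fully determined by the seed.
    rng = random.Random(seed)
    return rng.choices(candidates, weights=weights, k=1)[0]

candidates = ["cat", "dog", "fish"]   # hypothetical "next tokens"
weights = [0.5, 0.3, 0.2]             # hypothetical probabilities

# If you know the seed, the "random" result is predictable, every time:
first = sample_next_token(candidates, weights, seed=42)
assert all(sample_next_token(candidates, weights, seed=42) == first
           for _ in range(1000))
```

Same seed, same output: the "nondeterminism" users see is just that they never get to see the seed.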
We do have free will, Paul, but it's a very small window.
Okay, I think we can agree on that.
Indeed. That’s one of the issues - most people cannot think well enough to see through their smoke screen. It is disrupting everyday people’s cognitive infrastructures.
I completely agree with this sentiment. Perhaps even worse is OpenAI’s general lack of self-awareness when it comes to misrepresenting its technology.
How reckless is it to frame an enhanced web search as a substitute for a cancer diagnosis? [1]
OpenAI shows little ethical or moral accountability, and Silicon Valley seems to have abandoned both altogether. The level of propaganda is becoming dangerously excessive.
[1] https://x.com/felipe_millon/status/1886205433469178191?s=46&t=oOBUJrzyp7su26EMi3D4XQ
Yeah, it's really quite remarkable. I've been trying to figure out why we're getting so much hype. Sure, the bros want to believe, really bad. But I think there's more to it. From a recent post:
While there are other things going on, the excitement is centered on LLMs (large language models), the things that power chatbots such as ChatGPT, Gemini, Claude, and others. You don’t have to know much of anything about language, cognition, the imagination, or the human mind in order to create an LLM. You need to know something about programming computers, and you need to know a lot about engineering large-scale computer systems. If you have those skills, you can create an LLM. That’s where all your intellectual effort goes, into creating the LLM.
The LLM then goes on to crank out language, and really impressive language at that. That doesn’t require any intellectual effort from you. As far as you’re concerned, it’s free.
That's the problem. They don't have to expend any intellectual effort on language or cognition, which means, in a weird sense, they're getting all this performance for FREE. They simply don't know how to reason about it or value it, so they hype it. More here: https://new-savanna.blogspot.com/2025/02/tnstasfl-that-goes-for-knowledge-too-or.html
"That's the problem. They don't have to expend any intellectual effort on language or cognition, which means, in a weird sense, they're getting all this performance for FREE."
Whenever something in computer science is free*, you can be dead sure it is also useless.
* LLMs aren't really free in a broader sense: they are disenfranchising creatives, they are furthering the destruction of our ecosystem, and they are actively undermining man's ability to think critically and purely. They are like mental crack, and the withdrawal will be way worse.
Why are all our institutions of higher learning investing so much effort and so many resources into this field? I find it very perplexing. I can understand why a private company would do it, to make money or gain power, but not why the greater public is subsidizing it through the public school and university systems and the defense department, and allowing fee-free scraping of personal as well as copyrighted data. Whether it can ever achieve AGI is a secondary issue when one considers the dangers, some of which Gary has addressed in this article, as have those commenting.
I agree with you completely.
But I have *some* idea about this:
The reason academia is chasing the carrot even harder than industry is that research incentives are in many ways so intricately, perversely intertwined with obtaining and keeping funding that no real research has a realistic chance of being accepted in grant proposals.
I have a somewhat weird track record now for getting the most volatile refereeing on my proposals since Covid struck: usually one of the refs is over the moon about our approach, and the other would have me shot on the spot for suggesting topics (and methodologies) that are deemed "not interesting"[1], i.e. not one of the two "safe" categories:
In research these days, you get funding for
a) The latest fad, i.e. LLMs. You can get away with tasking three people for two years with only PROMPTING the latest model. (And, scientifically speaking, even worse proposals.) The thing is, these are safe in the sense that "everybody" agrees the topic absolutely needs to be researched.
b) Reproducing some obscure GAN[2] result nobody cares about, because "the community" agrees that reproducing (any!) stuff, be it as banal as you wish, is often a sensible waste of money.
[1] That's a euphemism, obviously.
[2] As in General Abstract Nonsense, not necessarily (but sometimes!) identical to Generative Adversarial Networks...
Yes this is worrying, very much so.
I ended my 2023 EABPM talk on ChatGPT & Friends with the observation that it is deeply ironic that Google will get into trouble through its own invention (the transformer architecture), which enables large volumes of problematic material. 'Good enough' in small doses may equate to 'big problem' in large volumes.
Yet, it is 2025 now, and despite some glitches, Google is doing better than ever.
The transformer architecture is also used in Waymo cars (not as an LLM, though).
AI did not make the problem of misinformation worse. It gave us extra tools that have value in certain contexts. The tools are getting smarter and more accurate. There is a way to go.
Come on, Andy. First, nobody said Google would be doing poorly now. And I also do not forget what was probably the first big success of 'RNNs magnified thanks to transformers': the spectacular improvement of Google Translate (which used to be a joke). LLMs will disrupt, but in different ways (positive and negative) than rather 'dumb' AGI assumptions expect. Just like the internet did not bring us heaven on earth, but something else that includes much-expanded robber-baron capitalism, a weakened role of factualism, etc.
Gary only pointed out the effect of large volumes of less reliable inputs (which is a risk in the future, not now, and it even depends on this development actually succeeding). So curating input will probably become an issue. And factualism will weaken more.
The thing this digital revolution is going to teach us is some limits of our *own* intelligence. The question is, how painful (lethal?) will this lesson be.
That's not true at all. Many people are saying Google sucks now and sucks worse today than it did yesterday.
Google Translate?
I wouldn't know; I don't use Google for anything. But I'm referring specifically to Search in this context. You know, the thing they are known for. The thing they do that makes money.
Curating the inputs has been a huge issue for a while. We have billions of super-smart people who are very good at producing both subtly and grossly incorrect data. Funnily enough, AI has actually been the solution to that.
Factualism depends on a large amount of accurate data and on external techniques such as knowledge graphs, model-based verification, etc. Neither a human nor an AI is supposed to just accept any document without verification.
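As a toy illustration of what checking a generated claim against a knowledge graph can mean in its simplest form (the triples and claims below are invented for the example, not any real system's data):

```python
# Toy verification against a curated triple store; the facts are invented
# for illustration and stand in for a real, much larger knowledge graph.
knowledge_graph = {
    ("Paris", "capital_of", "France"),
    ("Berlin", "capital_of", "Germany"),
}

def verify_claim(subject: str, predicate: str, obj: str) -> bool:
    # Accept a generated claim only if the exact triple is in the store.
    return (subject, predicate, obj) in knowledge_graph

print(verify_claim("Paris", "capital_of", "France"))   # True: supported
print(verify_claim("Paris", "capital_of", "Germany"))  # False: unsupported, flag it
```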
Nobody verifies everything they read; it is not doable. Even your normal sense observations are mostly your brain preventing a lot of energy waste by assuming a lot. Asking for verification runs up against the limits of human intelligence/capabilities. This is why we have 'systems of trust', like the independent judiciary, the free press, and science. Those are all efficiency systems that remove the need for 'verification all the time'.
In practice, GenAI enters this landscape as another 'system of trust'. The question, of course, as with those others, is: what are the limits of that trust? Science, the independent judiciary, and the free press (not the entertainment/tabloid/propaganda kind) all have this more or less built in. There are all kinds of 'negative feedback loops' built into those, which is what makes them trustworthy. Other systems like 'tribe' have these too, though not always obviously. For instance, gossip plays the role of the negative feedback loop in some of these.
All these internal negative feedback loops that produce stability and trust (in varying amounts) are not really available in GenAI, not in a practical sense at least. Couple that with a huge growth in volume (and we have already seen how that works out with the 'tribes-at-globe-level' of social media, where such stabilising feedback loops are weak or nonexistent) and you have a problem.
The techniques you mention, like knowledge graphs and such, haven't been coupled practically to the 'big general models' like GPT (in whatever version). Hence the relative weakness of factualism.
Yes, social media is in fact a very good example of large-scale misinformation. Compared to that, AI assistants are relatively benign. One can police them better than what other people are saying.
So, nothing is really going to change. Trust the reliable press. Do not trust strangers on the street, social media, or tabloids. Do not trust AI assistants, or at least do your own checks.
GenAI has a "negative" feedback loop just fine: curation of the input data.
It has already taught us that "homo sapiens" is misspelled:
Should be spelled “homo sappiens”
Homo sapsiens
My English is failing me, sorry. Not a native speaker, in the end, so I'm missing the pun in both.
Gerben: a sap: a stupid person who can easily be tricked or persuaded to do something.
Hi Gary, thoughtful and timely post! They have the balls to try to convince 'real' scientists that this would work! I do remember Galactica - same stupid word-salad generation, same hype. This is v2 at best. Calling it 'deep' and 'research' means jack.
Word salad, word soup, word puke.... no matter the claim about "reasoning" etc etc, that's all LLMs will ever be.
Truth (scientific or other) can't be pieced together word by word. To claim otherwise is wishful thinking, misguided, delusional, misleading.
Remember 1984 (the book). Truth doesn't matter in a critical mass of slop.
Even when the bridges start collapsing and the planes all come crashing down (I don't even...), they will shoot the operators and not the AI companies, I'm afraid.
Indeed. But my point was just about LLMs, not what would happen if they were to be used. Also, just like broken instruments cannot advance science, LLMs cannot either - it's like making a plane out of a cardboard box and wanting to get to outer space.
You crafted a beautiful metaphor there.
I am not being cynical when I observe that people sitting in cardboard rockets will not care whether they are in outer space - they will just state that they are and be done with it.
The thing I'm onto is reality: there *is* an ultimate judge out there, and it is money. It may take a while (and many people will suffer to no end for it), but ultimately you cannot keep heating water without extracting *actual* work from the heat and expect to thrive on that alone.
There *will* be a downfall for most folks chasing the deus ex machina and it will not be pretty.
"Generative AI might undermine Google’s business model by polluting the internet with garbage" It's not "might" because it already HAS:
From my LinkedIn feed https://www.linkedin.com/posts/aragalie_ive-been-ai-free-for-the-past-3-months-activity-7291400199636152320-AZze/
=====
The only real problem I faced? The incredible amount of 💩 AI-generated content that has flooded the web, ESPECIALLY for coding topics!
I’ve come to dread having to google for things, as you’re absolutely swamped by AI regurgitated idiocy.
=====
Try reddit. The slop hasn't quite claimed dominion over it, but of course it's also polluted.
The good thing going forward is that *actual* skills will become invaluable - but only after the skies have fallen down upon us, I'm afraid.
You nailed the problem with hallucinations invading more and more chatbot use cases.
People love 'easy', so this remains an existential threat to learning the truth through a chatbot.
That said, I disagree with your point that search engines will be destroyed by AI-generated crap, ultimately leading to model collapse.
This is because all major search engines today rely on human-feedback signals to rank high-quality content over crap.
They don't simply rank content based on some topical relationship to a keyword query.
For example, Google's algorithm considers a page's bounce rate, time-on-page, inbound backlinks from high-authority domains, social media shares, and more.
It's really hard to fake these signals at scale.
So, while the internet is certainly being inundated with poor-quality AI content, the reality is that most of it doesn't survive on Page 1 for long.
And it probably won't, moving forward.
As long as search engine rankings (SERPs) exist, an AI model can rely on that as a proxy for content quality.
Finally, we are still in the early days when it comes to search engines figuring out how to counter so much more content coming online.
I wouldn't underestimate Google's engineering team on their ability to sort that problem out.
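To make the "signals as a proxy for quality" idea concrete, here is a minimal sketch; the signal names echo the ones listed above, but the weights, caps, and scaling are invented for illustration and are not Google's actual ranking formula:

```python
def quality_score(bounce_rate: float, avg_time_on_page_s: float,
                  authority_backlinks: int, social_shares: int) -> float:
    # Hypothetical 0-100 score combining human-feedback signals.
    # All weights and caps are made up; real ranking is far more complex.
    score = (1.0 - bounce_rate) * 40                      # low bounce rate helps
    score += min(avg_time_on_page_s / 120.0, 1.0) * 30    # time-on-page, capped
    score += min(authority_backlinks / 50.0, 1.0) * 20    # backlinks, diminishing returns
    score += min(social_shares / 500.0, 1.0) * 10         # shares, capped
    return score

# A page people bounce off quickly, with no real backlinks, scores poorly:
print(quality_score(bounce_rate=0.9, avg_time_on_page_s=8,
                    authority_backlinks=0, social_shares=3))
# A page with genuine engagement scores much higher:
print(quality_score(bounce_rate=0.3, avg_time_on_page_s=150,
                    authority_backlinks=40, social_shares=600))
```

The point of the sketch is simply that faking all of these signals at once, at scale, is much harder than faking topical relevance alone.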
So if a hallucination is very appealing, gets lots of traffic, social-media hits, backlinks from people duped by it, etc., surely it will defeat Google's algorithm? It could happen a lot and then become entrenched within the bounds of the algorithm.
I dunno, part of me feels like this is mostly for all of those AI companies to figure it out among themselves, otherwise they are cutting off the proverbial branch they are sitting on.
It's already possible to write a term paper with gibberish data in it, and there are multiple mechanisms in place to safeguard against that. I am pretty bullish that (good) teachers will see very quickly what is made up and what is not.
My experience as a university teacher is mixed:
a) Students who use AI for exams will do strict grading arbitrage: unless you're teaching Math 101, you'll quickly find that, aside from the few enlightened students who care about actually learning, people will simply avoid courses with critical evaluation.
b) I know colleagues who ask their students to write longer and longer and more complicated term papers - because "AI will help them be more productive - so they have to show it". I have had the chance to check some of these papers. Suffice it to say that while the quantity went up three- to fourfold, the quality (naturally!) did not. But these colleagues want to be deluded that everybody is becoming smarter.
Notice how many of the other influencers say "I got access to ________!". They don't say they paid for it. Influence can easily be bought with low-cost perks. Get 100 influencers singing happy thoughts and 800-chatgpt starts to sound like a good idea.
AI research definitely needs human "peer" review. And humans need to research AI research to let us know how good/lousy AI research is.
You see, the problem in my opinion is not really AI research - at least the AI companies don't submit to Nature et al. but write whitepapers "only".
The real problem is the humanities and engineering: people, albeit as smart as can be, with no IT background simply cannot fathom that these mechanical-turk machines could be fallible in the way that "we" folks understand.
I recently spoke to a dentist - a German professor teaching jaw surgery and comparable things. Considered among the top of the pack in dentistry, he asked me how long I thought it would be before AI wiped us out, because *that's* the level of information he was at.
Long story short: normal people think that hallucinations will go away, just as "those undecidable functions [...]" will someday be decided.
The problem really is that even most IT folks don't bother with structural limitations: Nobody cares about complexity and computability theory when there is snake oil to be sold.
I'm sorry to sound like an idiot here. I subscribe to Gary's Substack and read most, if not all of his posts. I therefore have to ask a rhetorical and probably stupid question:
Why is this crap allowed? Why do we as a society not place such a high value on truth (capital-T) and quality, as well as on the efforts of our professionals and scholars, that we will not let ourselves tolerate this kind of bullshit?
Furthermore, why do we think it's OK to allow companies like OpenAI, Microsoft, and Google to use the collective knowledge of humanity in such spurious and noxious ways? Why would any of us expose our information, either private or business, to their systems, which, in my mind, are just thirsting at the chance to extract all of our text for their monetary benefit?
Are we really this debased and tepid, not giving a damn about the implications for ourselves, our children, and the society that we will one day leave behind when we (Sorry, Ray Kurzweil) inevitably die?
I go out of my way not to use Gemini in any of my Google products (my work email is through Gmail, and Gemini's presence to me is like that of an invasive weed), and I've disabled Copilot (another insidious weed species) insofar as I can in all of my MS Office settings. I know how to write a fucking email, and rather than wasting time with a stupid prompt, I'd rather just write the damned thing, say what I have to say, and then hit "Send".
I pray I live to see this absolutely moronic fever dream break. Alternatively, if it doesn't, I relish the chance to abandon this society, if I can, for something better and real. Because this shit, from perversely rich weirdos like Sam Altman, isn't going to fucking cut it.
Well said.
I see no signs the fever dream will break any time soon. The problem, imho, is that AI generated garbage is very much in tune with the spirit of the times: Lazy, self-contradictory, thieving, pig ignorant, 11 second attention span, lying and bombastic. Trumpworld.
We have many, many people asking "What's the problem? It's good enough." It isn't good enough. It's largely crap. And reality has a funny way of catching up with civilizations that get too attached to their own bullshit.
A superb analysis, Gary Marcus. Thank you!
IMHO the real doomsday scenario of AI is cognitive misering at scale, facilitated by precisely this kind of tool.
Mind you, we already, as you point out, had a garbage production problem for the sake of "publish or perish", with non-replicable work at 50% levels.
Swell (nah - swill ;) Perhaps we will discover Brands as a better guarantee of quality, IF they don't succumb.
Enjoy the schusses :D
I happen to have Frankfurt's original essay on my desk because I'm revisiting it for an essay I'm drafting. You certainly got the spirit of his description, if not the exact wording.
The essay was published in Raritan Quarterly in the Fall 1986 issue. Anyone can download it as a pdf here: https://raritanquarterly.rutgers.edu/issue-index/all-articles/560-on-bullshit
Frankfurt describes what he calls "the essence of bullshit" on page 90.
The book On Bullshit came out in 2005 and extends the original insight, but what a prescient piece of cultural criticism for the middle of the 1980s.
I am finding the article on bullshit very interesting. Thank you for the link.
Artificial Stupidity meets human self-interest - a winning combination!