"We don't pray since we're gods but, yeah, magical solutions are coming soon. Like, real soon!"
-- The Soggy Bottom Boys
Who says that gods don’t pray?
You might be interested in this side note. There is an ancient question in Judaism, “Does God, blessed be he, pray?”
According to the Babylonian Talmud, specifically in Tractate Berakhot 7, the answer is given:
The Talmudic discussion begins with Rabbi Yoḥanan (a 3rd-century sage) asking how we know that the Holy One, Blessed be He, prays. He points to a verse in Isaiah 56:7: "I will bring them to My holy mountain and make them joyful in My house of prayer." The Rabbi notes that it doesn't say "their" house of prayer, but "My" house of prayer—implying that God has a prayer of His own.
The Talmud then asks, "What does God pray?" The answer is attributed to Rav (one of the greatest Babylonian sages):
"May it be My will that My mercy may suppress My anger, and that My mercy may prevail over My other attributes [of strict justice], so that I may deal with My children with the attribute of mercy and stop short of the limit of strict justice."
This passage is famous because it addresses a fundamental tension in Jewish thought: God as the True Judge (Dayan HaEmet) versus God as the Merciful Father (Avinu HaRachaman).
The idea that God "prays" is a piece of aggadah (narrative/metaphorical teaching). It suggests that even the Divine "struggles" to balance the need for accountability with the desire for compassion. It teaches that the world cannot survive on strict justice alone; it requires Lifnim Mishurat Hadin—going beyond the letter of the law.
Thanks to the AI tool, Gemini, for checking my references and getting the quote “right” and in English.
Stephen, the broligarchs see gods in the mirror, are libertarian in thought, and pagan in action. If they're on their knees, it's not to pray but to supplicate to their orange benefactor.
> "... implying that God has a prayer of His own. "
Praying to another god above Him -- or Her or It? 😉🙂
Wikipedia: Great fleas have little fleas upon their backs to bite 'em,
And little fleas have lesser fleas, and so ad infinitum. ....
https://en.wikipedia.org/wiki/Siphonaptera_(poem)
Rather amused by a translation into more "scientific" terms:
Big whirls have little whirls
That feed on their velocity,
And little whirls have lesser whirls
And so on to viscosity ...
I'm going to stick my neck out here and say that mainstream tech people will just keep ignoring any paper that doesn't align with the magical thinking that LLMs can reason.
👍 "Sunk costs" probably covers much of that ...
https://en.wikipedia.org/wiki/Sunk_cost
It is well-understood that statistical pattern matching is imitation without understanding.
Neurosymbolic approaches would suffer from precisely the same problem. They go through motions given by rules. The motions may be logically correct, but there is no real understanding of when the rules apply or how to build new abstractions.
Understanding has to be brought in somehow. When people work, understanding comes first from intuition, and second from observing results and correcting.
So, human intelligence is a mix of guesswork, prior experience, and adaptation. Current AI systems will do the same. Maybe they won't be as good as humans, but there is no sign of them hitting any hard limits.
> "Understanding has to be brought in somehow."
Arguably, "understanding" is a consequence of consciousness. At best, something in the way of hubris to think we can program a machine to exhibit or manifest that feature -- "golems" writ large.
"Consciousness" is a high bar to clear. If a submarine swims well, it need not prove it is conscious. It is a matter of simulating well enough the world to function properly.
True. Though "function properly" may be contingent on being conscious ... 😉🙂 That seems to be just a case of moving the goal-posts. Wonder if you ever read Lewis Carroll's: https://en.wikipedia.org/wiki/What_the_Tortoise_Said_to_Achilles
Seems to be the same sort of "infinite regress".
Not to say that it's impossible to "instantiate" consciousness in rudimentary structures, but seems necessary to start off with a model or description of an underlying process. Apropos of which, that reminds me of being quite amused by some paper wasps which apparently exhibited an "understanding" of transitive inference:
NYTimes: Wasps Passed This Logic Test. Can You?; The insects frequently found in your backyard appear to be the first invertebrate known to be capable of the skill of transitive inference.
https://www.nytimes.com/2019/05/09/science/paper-wasps-logic-test.html?unlocked_article_code=1.LFA.Nh4n.nDtRQIINsvVV&smid=url-share
I think one can get lost in philosophical arguments. It is like saying one must be a bird to be able to fly.
So far, our machines have become better and better while fully avoiding the problem of consciousness. We need more accurate software that can catalog and predict the world better, and which can recover from failure.
Yes, we need a "model or description of an underlying process". But that has nothing to do with consciousness.
I use an Excel spreadsheet because it is a tool that is more accurate than I. Symbolic programming gives us 100% certainty. I’m not trusting ANYTHING that uses a mix of guesswork, experience and adaptation. That’s why tools exist.
Perhaps the definition of Neurosymbolic has some variability. If a Neurosymbolic platform is underpinned by a robust semantic layer of deeply disambiguated knowledge and explicit meaning, then that may provide the layer of real understanding needed.
The first two paragraphs are obviously correct. The latter two are the entire crux of the matter. The appeal to "intuition" here is empty, and comes in too late: we have no substantive account of "intuition", nor any account of what social or biological precursors it requires, nor its correct developmental account even in us eventual understanders. Likewise for "observing results" and "correction"; these are sophisticated developmental achievements, not genetic endowments. Infants do not have these abilities, and one of the greatest hindrances to progress in understanding ourselves (never mind replicating ourselves in built systems) is our habit of retrojecting mature human capacities into early-stage humans of whom we know little for sure except that they will surely die if not attended to in highly specific ways by mature conspecifics, with an intensity and over a duration that is unparalleled elsewhere in the animal kingdom.
FWIW, talk of "world models" is also empty and comes in too late, inasmuch as the key bit about a world model for an understander who uses one (for the rare occasions on which we may use them) is the understanding of them as what they are: models; of the world. But these are sophisticated intentionally-laden states that themselves presuppose a whole mess of other capacities, including (plausibly) self-consciousness (consciousness of oneself as an understander), a normative orientation towards the recalcitrant world as what, e.g., our belief states are answerable to, etc.
Melanie Mitchell writes about this stuff (https://aiguide.substack.com/p/on-brian-cantwell-smith-and-the-promise), the topic of her piece, Brian Cantwell Smith, also did, as did John Haugeland, to whom Cantwell Smith's book is dedicated, as does Murray Shanahan, though Shanahan may himself fall prey to the action-theoretic version of the problems you point out with pattern matching and "neurosymbolics": bump around un-understandingly enough in the world, and somehow understanding is supposed to emerge from mere quantity of such interactions (cf. "scale is all you need").
The emergence of human mindedness over phylogenetic timescales--such that we can now almost exceptionlessly pull it off over ontogenetic time frames--is one of the deep questions about humanity and for human self-understanding. AI--in all its manifest failures and utterly surprising apparent successes--is a great testbed for thinking more deeply about the richness and social embeddedness of the capacities that get packed (at least in English) into one little word: mind. But we are definitely wasting the opportunity if we use it to reduce what we mature humans can do to "just" this or "a lot" of that, usually in sets of three: intuiting, observing, correcting; guesswork, experience, adaptation; etc., etc., etc.
"no substantive account of "intuition", nor any account of what social or biological precursors it require"
I very much doubt intuition requires any biology. It requires acting in the world, seeing results, and observing patterns. One can develop intuition without deep understanding or logic.
When it comes to understanding, it is mostly about prediction. Does a fluid simulator understand water? I don't think so. Can it work well enough at estimating what effect an action on a fluid will have? I think so.
You can have it without biology per se. That wasn't a meat-chauvinist point. It was about non-trivial preconditions. In us, understanding and intuition arise in a very complex functionally and teleologically organized system whose continued existence is scaffolded by conspecifics who have already achieved the end state and actively work to cultivate it in their offspring. That is a developmental and social claim, not a metaphysical claim about carbon.
Strikingly, your last paragraph makes precisely my point: give your fluid simulator all the accuracy you want, it don't understand water. (I won't quibble about whether a fluid simulator should be called a "predictor" in the same sense as a person might be a predictor of X or Y based on reasons they understood as justifications for their prediction.)
Understanding isn't a prerequisite for something describable as "getting across the street." Tumbleweeds do that. But you weren't offering a behavioristic criterion for understanding. You were offering a three-part story about how understanding as we have it arises: "understanding comes first from intuition, and second from observing results and correcting. So, human intelligence is a mix of guesswork, prior experience, and adaptation."
And now you've added another story about how intuition arises: "acting in the world, seeing results, and observing patterns." But the problem before was that your story about how understanding as we have it arises presupposes that the very capacities it is trying to explain are already in place. And we have the same problem now: the descriptors in your story about how intuition arises already presuppose the understanding you are trying to explain. The terms you help yourself to--"acting in the world [as such]", "seeing results [as such]", and "observing patterns [as such]" are not bare behavioristic descriptions. They are intentionally laden descriptions that presuppose understanding.
We're just pushing the bump in the rug around and hoping we can underdescribe it out of existence.
"give your fluid simulator all the accuracy you want, it don't understand water"
Yet, a fluid simulator is highly effective.
We don't need a machine to understand things the way people do. If you send a robot to collect eggs from the chicken coop and it finds them all and breaks no eggs, who cares whether the robot understands eggs or chickens?
We want machines that can do work. The problem with current statistical machines is that they do not simulate or predict things properly. Language is not enough.
However, if a machine has a sense of vision, of touch, is able to recognize things (statistically if you wish), is able to observe when it does wrong, and can eventually solve problems properly, it doesn't matter if it understands things.
>However, if a machine has a sense of vision, of touch, is able to recognize things (statistically if you wish), is able to observe when it does wrong, and can eventually solve problems properly, it doesn't matter if it understands things.<
Is that really the case? Put yourself in the place of the machine and replace the visual scene with a 2D array of numbers. (Like the screens in the tv series "Severance", but far larger.) Can you really understand what the numbers are telling you and how to react? Presented with enough 2D number arrays, you might well be able to infer how best to react to them, but you would have no understanding of their meaning. It requires far more knowledge about the world to understand what you are seeing.
Now I would argue that for simple animals, understanding is not necessary. This is clear with animal mimicry to avoid predation. The predator just reflexively reacts to a pattern that mimics another predator or dangerous animal or thing. That is sufficient for its pattern recognition to improve its chance of surviving and reproducing. It also explains why animals can be fooled with fake stimuli, like red spots for herring gull chicks to peck at to get food from their parents.
Of course, most animals are not going to need careful reasoning either. But to be fair, we humans are pretty bad at that, too. I accept that faulty reasoning using System 1 thinking has better survival value than correct, but slow, System 2 thinking. React too slowly, and a predator may end your existence and gene reproduction. It is why we jump at the slightest sounds when in the dark.
"Put yourself in the place of the machine and replace the visual scene with a 2D array of numbers. "
Given how well Waymo self-driving cars do, and they have likely seen plenty of crazy things, image recognition works.
Of course, the next step is to pull out some information about the things you see that you can reason with. Current AI chatbots are getting quite good at that.
Of course, current AI understanding is not good, especially at prediction. I just don't see a show-stopper. Slow System 2 relies on doing careful steps and various inspections, sanity checks, invoking tools, etc. I don't see why that would not be enough.
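A minimal sketch of such a "slow" loop might look like the following, assuming Python. Everything here is a stand-in: the function names are hypothetical and the arithmetic check is a toy, not any real product's pipeline.

```python
import random

def propose_answer(question):
    """Stand-in for a fast, fallible 'System 1' guesser (e.g. an LLM)."""
    return random.choice(["42", "41", "43"])

def check_with_tool(question, answer):
    """Stand-in sanity check: the 'tool' computes the exact result."""
    return answer == str(6 * 7)

def system2_answer(question, max_attempts=5):
    """Propose, verify with a tool, retry: a cartoon of a 'System 2' loop."""
    for _ in range(max_attempts):
        candidate = propose_answer(question)
        if check_with_tool(question, candidate):
            return candidate
    return None  # give up rather than return an unverified guess

print(system2_answer("What is 6 * 7?"))
```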
That is not how hydrodynamical simulations work. They bootstrap off mathematical approximations of how fluids work in bulk and then smear out the second order effects by approximating the first.
The understanding is embedded in the mathematics that builds the modelling process. The computer doesn't understand it because that is not its job. Its job is to process, numerically, a worked numerical model with minimal errors.
It's an entire field of numerical theory. You do not understand any of this.
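For what it's worth, here is a minimal sketch of the kind of scheme being described: a first-order upwind discretization of 1D linear advection, written in Python. It is illustrative only, not any particular production solver; the point is that the fluid "knowledge" lives in the discretized equation, while the computer merely executes arithmetic.

```python
import numpy as np

# 1D linear advection du/dt + c * du/dx = 0, first-order upwind scheme.
nx, nt = 100, 100          # grid points, time steps
dx, dt, c = 1.0, 0.5, 1.0  # chosen so the CFL number c*dt/dx <= 1 (stability)

u = np.zeros(nx)
u[10:20] = 1.0             # initial condition: a square pulse

for _ in range(nt):
    # each cell is updated from its upstream neighbour
    u[1:] = u[1:] - c * dt / dx * (u[1:] - u[:-1])

print(u.round(2))  # the pulse has moved downstream, smeared by numerical diffusion
```

The solver "knows" nothing about water; the bulk behaviour was put in by whoever wrote down the equation and the scheme.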
What other solutions are there?
The money has been spent. The stakes have been set. We cannot turn the clock back to 2022, before ChatGPT existed. What is done is done.
Maybe the only thing left to trust is human wisdom. Not faster generation, not louder narratives, but judgment, restraint, and care. I am not certain. Just my two cents.
The progress is quite good and the issues fixable. I wrote a longer response as well.
The issues have not proven to be fixable. That's the fundamental point here.
👍 Indeed. Kinda think LLMs are a matter of trying to square the circle; you can't get there from here. But what the hell do I know? I'm just one of those "nattering nabobs of negativism" ... 😉🙂
Shhhhhhh...there's money on the line.
🙂 And BIG money too by the sound of it. Though, as I think Gary put it, what's a trillion or two among friends, or circular financiers ... 😉🙂
But, maybe of some relevance, I'm trying to remember the scene from Hitchhiker's about a Disaster Area concert, and the high-financing required to invest the astronomical sums the band acquired from ticket sales ...
Look at the Dow. It is up beyond all reason, given the moderate performance of non-AI stocks. We're heading for something unpleasant—I'm not quite sure what yet, or how bad it will be—and not just because of AI, but because of an overall culture of stagnation, greed, corruption and, yes, deception.
The problem is that you think in terms of "proven" and "fundamental". Nothing in the world works that way.
The way things work is through increasingly careful automation and refinement of methods.
That is how Waymo cars have now driven 150 million miles with a better safety record than human drivers.
"Nothing in the world works that way."
Really? I think gravity has been established as a proven fundamental. At least in so far as how it affects humans on a day-to-day basis. We might not fully understand its nature yet, but at least we're able to make predictions and establish voluminous technologies that work within the framework of what we do understand.
If only LLMs were that predictable and useful.
And btw: Waymo may be safer within a narrow set of driving constraints, but it is FAR less safe when confronted with unexpected and unpredictable situations, which happen out in the real world all the time, as has been demonstrated amply. So that's a bad example. And before you come back at me with more numerical statistics, the average person can drive for years and never be involved in an accident, too.
Gravity is a fundamental force of nature.
We are talking about how people function. Nobody ever proved that people are reliable, or faithful, or competent. All you can do is to test them, and teach them when they fail.
It has been demonstrated amply that Waymo in practice is a lot safer than people. Of course, it is being deployed only in areas that are well-mapped.
Humans in the real world do very badly when encountering the unexpected as well.
I do not say Waymo beats people across the board. I say they have other advantages, and in practice they have been shown to do better.
So you're saying that humans are not a fundamental force of nature?
You may wish to reevaluate that assessment, lol.
>This is how Waymo cars now drove 150 million miles with higher safety than people.<
And yet a Waymo car hit a child. https://www.theguardian.com/technology/2026/jan/29/us-regulators-investigate-waymo-struck-child
Isn't this an example of a lack of understanding of the situation? There are going to be an infinite number of cases where prior experience (training examples) does not cover the case presented.
From that article:
Waymo said in a post on its blog: “The Waymo Driver braked hard, reducing speed from approximately 17 mph to under 6 mph before contact was made.”
“To put this in perspective, our peer-reviewed model shows that a fully attentive human driver in this same situation would have made contact with the pedestrian at approximately 14 mph,”
No car can stop instantly, and nobody has an instant reaction either.
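For a rough sense of the arithmetic, here is a toy calculation. The distance, reaction time, and deceleration below are made-up illustrative numbers, not anything from Waymo's report.

```python
import math

MPH_TO_MS = 0.44704

def impact_speed_mph(v0_mph, distance_m, reaction_s, decel_ms2):
    """Speed (mph) at which a braking vehicle reaches an obstacle: constant
    speed during the reaction time, then uniform deceleration."""
    v0 = v0_mph * MPH_TO_MS
    braking_distance = distance_m - v0 * reaction_s
    if braking_distance <= 0:
        return v0_mph          # obstacle reached before braking even begins
    v_sq = v0 * v0 - 2.0 * decel_ms2 * braking_distance
    if v_sq <= 0:
        return 0.0             # stops short of the obstacle
    return math.sqrt(v_sq) / MPH_TO_MS

# 17 mph initial speed, obstacle 4 m ahead, 8 m/s^2 hard braking (assumptions):
print(impact_speed_mph(17, 4.0, 0.25, 8.0))  # ~11 mph with a 0.25 s reaction
print(impact_speed_mph(17, 4.0, 1.00, 8.0))  # 17 mph with a 1 s reaction: no time to brake
```

Even small differences in reaction time dominate the outcome at these distances, which is the point about nobody reacting instantly.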
A fully attentive human driver would have been able to infer the presence of children simply from peripheral vision further down the road, because that is how humans go down roads.
The bulk of accidents are caused by tiredness and inattentiveness, not reaction time. Humans are massively more observant of their surroundings than any machine-learning model and vastly better at spatial prediction.
Reality is, I would never have made contact with the pedestrian at all.
Yet I disabled the proximity alarms in my car because, even with the sensitivity set to maximum, they kept going off *after* I'd already reacted, which I found extremely annoying and even dangerously distracting.
Happy to read. 🤠🤠
That's somewhere else in the page. I don't think we have all the answers. I think things are not as bleak.
Friends of Gary - have we somewhat shot ourselves in the foot by not calling ourselves by our true name? Shouldn't we demand to be called "AI Realists" rather than "AI Sceptics"?
AI Sagans (people who recognize that extraordinary AI claims require extraordinary evidence).
Sign me up for that, please.
Thanks for bringing one of Carl Sagan's contributions back to mind.
Thanks for your restacks.
Useful thoughts need to be collected many times in many places.
From my AGI paper (currently under preparation): "the fact that LLMs demonstrate systematic reasoning failures across multiple categories \cite{song2026largelanguagemodelreasoning} is entirely unsurprising given that reasoning in natural language is inherently problematic due to its ambiguity, a fundamental weakness that was fully understood by Leibniz as long ago as 1676 when he proposed his \textbf{characteristica universalis} \cite{sep-leibniz-exoteric}"
I am under the impression that while Sutskever and LeCun say world models are required, they haven't given up on the idea that neural nets alone will be able to create these world models by themselves (as humans seem to do the same when growing up).
The so-called 'reasoning models' indeed do not reason at all. What they do is add a (very expensive) indirection (https://ea.rna.nl/2025/02/28/generative-ai-reasoning-models-dont-reason-even-if-it-seems-they-do/)
Getting token-statistics to mimic the results of understanding was the start of LLMs. Getting token-statistics to mimic the *form* of reasoning as an indirection is the way 'reasoning models' work. It is indirect approximation instead of direct approximation. Very, very costly, and if that is what is needed, it kills much of the economic side of LLMs.
Note that the fact that neural nets have a fundamental issue with understanding has been known for a long time, even if such warnings (like yours) have been ignored. In the 1992 MIT edition of "What Computers (Still) Can't Do", Hubert Dreyfus writes about an experiment with neural nets trying to recognise tanks in aerial photographs; it turned out they had learned to discriminate between sunny days and clouded days, because the photos with and without tanks had been taken on different days, as it took time to place and remove all the tanks...
"I am under the impression that while Sutskever and LeCun say world models are required, they haven't given up on the idea that neural nets alone will be able to create these world models by themselves (as humans seem to do the same when growing up)." Yes, this is a fundamental error. A world model, including intuitive physics, solid objects etc. is available to humans from birth. Delivered by evolution. c/- Elizabeth Spelke 'What Babies Know - volume 1'
I wonder if this is true. The book looks interesting.
I wonder, because I have observed babies doing things like endlessly plucking at their sleeve: for 40 minutes, over and over again. Grasping, pulling, letting go. In that case, I could not escape the impression that this was "learning physical common sense": how fabric reacts, what a sleeve is, etc. Anyway, whatever evolution brings, the mechanism is what is in the brain, and that is neurons (and more than that, it seems, but with comparable physical behaviours). The world model will probably be learned, not available at birth. What is available at birth are a couple of basic behaviours (feeding is one) that develop over time (in part as myelin sheaths develop in the brain, in volume from back to front, enabling more neurons to function well). Some things get developed before birth (basic hearing physiology, for instance).
" Give us money. Lots of money. Lots and lots of money. Scale Scale Scale. AGI is coming next year!"
Sounds just like the nuclear fusion people. Energy production from nuclear fusion is coming next decade. Not this decade, the next one.
Kenneth: Long ago there was a similar saying about shale oil. Something like, "It will be feasible when the price of oil goes up $5 a barrel. No matter what the price of oil is at the time."
In that case it _did_ eventually become economically viable. Though not until an entirely new approach (fracking) was invented. :)
I had heard elsewhere a dump of jobs in the crosshairs of LLMs—I think a more important distinction is whether the LLM can persuade a person of things that might be plausible. ChatGPT doesn't have to replace tech researchers and engineers—it need only persuade execs that it can replace them. Obviously, it has convinced many of them.
The more searing recognition is this: LLMs can probably replace the so-called leaders who believe the hype and cut their people. After all, a computer can do that, right?
All I can say is: if LLM investors don't take Gary's advice and "face reality and start focusing on alternatives to LLMs," they sure as hell better not expect the taxpayers to bail them out. You sociopaths have all been warned by your betters years ago—and thanks, Gary, for continuing to remind them in spite of the slings and arrows they aimed at you.
Doesn't this get a bit tiresome? There is a much stronger point. Large texts (above about 20 pages) use indexing, which LLMs can't follow, as the quoted fragments and the sketch below illustrate.
owner managed branch of an ADI has the meaning given by section 12.
For tipping off offences, see section 123.
Subsection (1) does not apply to the following proceedings:
a) criminal proceedings for an offence against section 123, 136 or 137;
(b) section 175 proceedings for a contravention of subsection 41(2) or 49(2).
(aa) paragraph 121(3)(da);
an appropriate authorisation under subsection 126(1).
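To make concrete what "following the index" involves, here is a toy sketch in Python, with an invented provisions table and a simple regex. It illustrates the kind of symbolic lookup at issue; it is not a real statute parser.

```python
import re

# Toy index of provisions, loosely based on the fragments quoted above
# (placeholder descriptions only).
PROVISIONS = {
    "section 12": "definition of 'owner managed branch of an ADI'",
    "section 123": "tipping off offences",
    "subsection 41(2)": "(text of subsection 41(2))",
    "paragraph 121(3)(da)": "(text of paragraph 121(3)(da))",
}

# Matches the reference styles seen in the quoted fragments.
REF_PATTERN = re.compile(
    r"\b(section|subsection|paragraph)\s+\d+[A-Za-z]*(\(\d+\))?(\([a-z]+\))?",
    re.IGNORECASE,
)

def resolve_references(text, provisions):
    """Pair every cross-reference in `text` with the provision it points to,
    or None if the target is not in the index."""
    return [(m.group(0), provisions.get(m.group(0).lower()))
            for m in REF_PATTERN.finditer(text)]

print(resolve_references("For tipping off offences, see section 123.", PROVISIONS))
```

A resolver like this is exact and deterministic; the complaint is that a purely statistical reader has nothing equivalent to lean on.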
LLMs can't understand new ideas - they haven't been done to death on the internet.
LLMs can't do classified documents or self-contained documents like legislation.
They can't do idiom or allusion: "it was a walk in the park", "they raised the bar on semiconductors".
Ask to see a specification for a data centre and see how far an LLM gets with it.
They are being sold to simple people for simple uses, meanwhile the government and large corporations waste tens of billions of dollars making mistakes on big projects, like the F-35 (immediately ordering a replacement tells you how much of a stuff-up it was) and the Constellation ships. Presumably the Chinese aren't so silly and, without a get-rich-quick motivation, can take a much more serious view of the field.
They will keep focusing on LLMs. They have a 'friend' in the White House. He will bail them out when they fail. They didn't give him all those bribes, err, I mean donations, out of the goodness of their hearts.
Holy shmoly. The article is 13 pages of body text, 33 pages of references, and 20 pages of appendices. And in those 13 pages of body text, much of the content consists of references. That may be hard for an LLM to use...
Anyway, a lot of funny examples at the end.
Because LLMs strip provenance, leading to knowledge decay. Citations are good, and we've gotten desensitized to this.
As a practitioner I can tell you that even in my limited silo of historical text analysis the flaws are massive. 40 years ago I was lucky enough to fall in with Itty Bitty, Home Brew, and others of legend.
I can tell you the state of tools, methods, unit testing, linters, IDEs, message watching, trace, conforming, scope, debug... is non-existent. If I had not learned the discipline of software design and development back then, when we had to roll our own tools, I would not know how to save myself now. Most working devs never have.
This leads to the issues Mister Marcus notes. Eventually quality and repeatability have to be there or people will walk away.
There is no way the incremental pace we are going at, in terms of quality and support, will get there in time to ensure broad adoption.
Agents? If they are running as well as the dev environments do, you should be concerned.
Yesterday Opus 4.6 decided to do something so monumentally dumb that it filled its entire VM with str passes that should have been local, which led to the total failure of a project. I was forced to piece it back together over many more hours, from screenshots and notes. This is just ridiculous.
They need a software dev culture inside these companies that addresses these incredible MCP lapses and errors. And that is not the focus. The energy is focused on model madness, not capability and process. On growth at any cost.
Where does this lead? Eventually to not a lot of paying customers to try to support the massive spend, which is spiraling out of any kind of control. Which may leave scant funding for anything else we actually need as a society.
I and my paltry quarter century adjacent to software dev are inclined to agree.
From a total non-expert neophyte: why is the absence of AGI equivalent to a useless model? Why can't an LLM that is merely better than current iterations and far less prone to mistakes still provide very valuable, even transformative, performance that can be a major business asset and a discovery-accelerating research tool? AlphaFold is not AGI. Even if LLMs do not lead to something that needs to be hunted down and laser-blasted Blade Runner style, they still seem like a big deal. Not close to being worthy of current valuations, but pretty fucking cool.
1. The problem is not that genAI is useless. The problem is that it is not useful enough to justify trillions of dollars in investment, let alone telling every white-collar worker in America that they must immediately learn to use LLMs or risk the bread line.
2. AlphaFold is good. Five years in, however, it has resulted in zero new drugs. The best guess from its leader is that it may reduce the time to drug discovery by 5%. That is significant. But it isn't "stop training new scientists because AI will do the R&D now" good.
Just my gut feeling. The very existence of hallucinations implies there is no reasoning and/or understanding.
https://substack.com/@d90no2/note/c-209034939?r=4bbwgs&utm_medium=ios&utm_source=notes-share-action
You may be able to formalize this notion.
Humans sometimes hallucinate. Does this imply that there is no reasoning and/or understanding?
Humans hallucinate due to poisons in their bloodstream.
Very different mechanisms. The machines don’t get tired.