153 Comments
Jonas Barnett's avatar

I'm going to stick my neck out here and say that mainstream tech people will just keep ignoring any paper that doesn't align with the magical thinking that LLMs can reason.

Steersman's avatar

👍 "Sunk costs" probably covers much of that ...

https://en.wikipedia.org/wiki/Sunk_cost

Y Thn's avatar

That’s for investors. For your regular man at the wheel, it’s a damn lucrative job-scam for now even when he quietly agrees with the likes of Marcus.

D Stone's avatar

"We don't pray since we're gods but, yeah, magical solutions are coming soon. Like, real soon!"

-- The Soggy Bottom Boys

Stephen Wilkus's avatar

Who says that gods don’t pray?

You might be interested in this side note. There is an ancient question in Judaism, “Does God, blessed be he, pray?”

According to the Babylonian Talmud, specifically in Tractate Berakhot 7, the answer is given:

The Talmudic discussion begins with Rabbi Yoḥanan (a 3rd-century sage) asking how we know that the Holy One, Blessed be He, prays. He points to a verse in Isaiah 56:7: "I will bring them to My holy mountain and make them joyful in My house of prayer." The Rabbi notes that it doesn't say "their" house of prayer, but "My" house of prayer—implying that God has a prayer of His own.

The Talmud then asks, "What does God pray?" The answer is attributed to Rav (one of the greatest Babylonian sages):

"May it be My will that My mercy may suppress My anger, and that My mercy may prevail over My other attributes [of strict justice], so that I may deal with My children with the attribute of mercy and stop short of the limit of strict justice."

This passage is famous because it addresses a fundamental tension in Jewish thought: God as the True Judge (Dayan HaEmet) versus God as the Merciful Father (Avinu HaRachaman).

The idea that God "prays" is a piece of aggadah (narrative/metaphorical teaching). It suggests that even the Divine "struggles" to balance the need for accountability with the desire for compassion. It teaches that the world cannot survive on strict justice alone; it requires Lifnim Mishurat Hadin—going beyond the letter of the law.

Thanks to the AI tool, Gemini, for checking my references and getting the quote “right” and in English.

D Stone's avatar

Stephen, the broligarchs see gods in the mirror, are libertarian in thought, and pagan in action. If they're on their knees, it's not to pray but to supplicate to their orange benefactor.

Steersman's avatar

> "... implying that God has a prayer of His own. "

Praying to another god above Him -- or Her or It? 😉🙂

Wikipedia: Great fleas have little fleas upon their backs to bite 'em,

And little fleas have lesser fleas, and so ad infinitum. ....

https://en.wikipedia.org/wiki/Siphonaptera_(poem)

Rather amused by a translation into more "scientific" terms:

Big whirls have little whirls

That feed on their velocity,

And little whirls have lesser whirls

And so on to viscosity ...

Oleg Alexandrov's avatar

It is well-understood that statistical pattern matching is imitation without understanding.

Neurosymbolic approaches would suffer from precisely the same problem. They go through motions given by rules. The motions may be logically correct, but there is no real understanding for when the rules apply or how to build new abstractions.

Understanding has to be brought in somehow. When people work, understanding comes first from intuition, and second from observing results and correcting.

So, human intelligence is a mix of guesswork, prior experience, and adaptation. Current AI systems will do the same. Maybe they won't be as good as humans, but there is no sign of them hitting any hard limits.

Pete Murray's avatar

The first two paragraphs are obviously correct. The latter two are the entire crux of the matter. The appeal to "intuition" here is empty, and comes in too late: we have no substantive account of "intuition", nor any account of what social or biological precursors it requires, nor its correct developmental account even in us eventual understanders. Likewise for "observing results" and "correction"; these are sophisticated developmental achievements, not genetic endowments. Infants do not have these abilities, and one of the greatest hindrances to progress in understanding ourselves (never mind to replicate ourselves in built systems) is our habit of retrojecting mature human capacities into early stage humans of whom we know little for sure except that they will surely die if not attended to in highly specific ways by mature conspecifics with an intensity and over a duration that is unparalleled elsewhere in the animal kingdom.

FWIW, talk of "world models" is also empty and comes in too late, inasmuch as the key bit about a world model for an understander who uses one (for the rare occasions on which we may use them) is the understanding of them as what they are: models; of the world. But these are sophisticated intentionally-laden states that themselves presuppose a whole mess of other capacities, including (plausibly) self-consciousness (consciousness of oneself as an understander), a normative orientation towards the recalcitrant world as what, e.g., our belief states are answerable to, etc.

Melanie Mitchell writes about this stuff (https://aiguide.substack.com/p/on-brian-cantwell-smith-and-the-promise), the topic of her piece, Brian Cantwell Smith, also did, as did John Haugeland, to whom Cantwell Smith's book is dedicated, as does Murray Shanahan, though Shanahan may himself fall prey to the action-theoretic version of the problems you point out with pattern matching and "neurosymbolics": bump around un-understandingly enough in the world, and somehow understanding is supposed to emerge from mere quantity of such interactions (cf. "scale is all you need").

The emergence of human mindedness over phylogenetic timescales--such that we can now almost exceptionlessly pull it off over ontogenetic time frames--is one of the deep questions about humanity and for human self-understanding. AI--in all its manifest failures and utterly surprising apparent successes--is a great testbed for thinking more deeply about the richness and social embeddedness of the capacities that get packed (at least in English) into one little word: mind. But we are definitely wasting the opportunity if we use it to reduce what we mature humans can do to "just" this or "a lot" of that, usually in sets of three: intuiting, observing, correcting; guesswork, experience, adaptation; etc., etc., etc.

Oleg Alexandrov's avatar

"no substantive account of "intuition", nor any account of what social or biological precursors it require"

I very much doubt intuition requires any biology. It requires acting in the world, seeing results, and observing patterns. One can develop intuition without deep understanding or without logic.

When it comes to understanding, it is mostly about prediction. Does a fluid simulator understand water? I don't think so. Can it work well enough in estimating what effect an action on a fluid will have? I think so.

Pete Murray's avatar

you can have biology per se. That wasn't a meat-chauvinist point. It was about non-trivial preconditions. In us, understanding and intuition arise in a very complex functionally and teleologically organized system whose continued existence is scaffolded by conspecifics who already have achieved the end state and actively work to cultivate it in their offspring. That is a developmental and social claim, not a metaphysical claim about carbon.

Strikingly, your last paragraph makes precisely my point: give your fluid simulator all the accuracy you want, it don't understand water. (I won't quibble about whether a fluid simulator should be called a "predictor" in the same sense as a person might be a predictor of X or Y based on reasons they understood as justifications for their prediction.)

Understanding isn't a prerequisite for something describable as "getting across the street." Tumbleweeds do that. But you weren’t offering a behavioristic criterion for understanding. You were offering a three-part story about how understanding as we have it arises: "understanding comes first from intuition, and second from observing results and correcting. So, human intelligence is a mix of guesswork, prior experience, and adaptation."

And now you've added another story about how intuition arises: "acting in the world, seeing results, and observing patterns." But the problem before was that your story about how understanding as we have it arises presupposes that the very capacities it is trying to explain are already in place. And we have the same problem now: the descriptors in your story about how intuition arises already presuppose the understanding you are trying to explain. The terms you help yourself to--"acting in the world [as such]", "seeing results [as such]", and "observing patterns [as such]" are not bare behavioristic descriptions. They are intentionally laden descriptions that presuppose understanding.

We're just pushing the bump in the rug around and hoping we can underdescribe it out of existence.

Oleg Alexandrov's avatar

"give your fluid simulator all the accuracy you want, it don't understand water"

Yet, a fluid simulator is highly effective.

We don't need a machine to understand things like people do. If you send a robot to collect eggs from the chicken coop, and it finds them all and breaks no eggs, who cares if the robot understands eggs or chickens?

We want machines that can do work. The problem with current statistical machines is that they do not simulate or predict things properly. Language is not enough.

However, if a machine has a sense of vision, of touch, is able to recognize things (statistically if you wish), is able to observe when it does wrong, and can eventually solve problems properly, it doesn't matter if it understands things.

Pete Murray's avatar

yes, of course; we can retreat from attempting to build a system that understands. There are many things that we want or need done that do not require understanding the world as the world that it is with the variety of things in it that the world contains and consists of. Perhaps your egg collector (ovoid collector?) is a plausible example of such a thing. Nothing I've said denies that many things can be done without understanding what you are doing.

But notice that this is in fact a retreat, both from your original story about how understanding arises (since you now say we don't need understanding), and from the research program of AI as it was launched at Dartmouth in 1956. That program was to build a thing that thinks in the sense that we think, where the sense in which we think essentially involves understanding: understand the world as a world, and the things and facts that constitute the world as the things and facts that they are, etc.

If we want to bail on that and just focus on building understandingless "machines that can do work," that's fine. AI is not an obligatory research program. But then it's worth being clear about what has been abandoned, and what remains. What remains are attempts to build sophisticated forms of automation, not the implementation of mindedness or understanding. Persisting in calling understandingless automation systems "artificial intelligence" invites wholesale misunderstanding of what is going on in those systems, how they should be interacted with, what they can and can't be relied upon to do, and in what circumstances it makes sense to deploy them, both ethically and practically. It is, of course, effective marketing to call such systems "artificially intelligent"--i.e., genuinely thinking systems, but built rather than naturally occurring--but that marketing fact is irrelevant to the issues we're discussing.

I agree with you that current statistical machines do not simulate or predict things properly. I disagree with you about the source of the problem. You locate it in the restriction to language-only training data. But the models don't even have access to language. They have access to mathematically vectorized text-string tokens that have a historical connection to the practices that language consists in: the questions about the world, statements about the world, explanations, corrections, apologies, etc. But that is all. There is nothing linguistic about LLMs, other than the historical source of the numbers whose statistical relations the models represent. Feed it any non-linguistically sourced dataset and it will do exactly the same thing: compute statistical relations among vectorized inputs. Knowing English (or Hindi, Russian, etc.), being able to speak a language, is precisely the kind of understanding-laden activity you are setting aside when you move to building understandingless “machines that can do work.” If even one LLM could speak English, the original AI project launched at Dartmouth would be complete. But speaking English would require, e.g., understandingly making statements, questions, explanations, corrections, apologies, etc.

Of course, it is an empirical issue whether adding "multi-modality" to such statistical systems makes them better understandingless "machines that can do work." Probably yes. Scale doesn't do nothing, and adding vectorized datasets derived from pixel arrays ("a sense of vision") and from pressure-activated sensors ("touch"), and temporally synchronizing those datasets is a massive increase in data and scale. No doubt some improvements relative to some objectives will result.

But we should not get confused about what this amounts to. None of this constitutes progress towards building a system capable of such understanding-rich states as actually seeing an egg and feeling its shape. That is, it constitutes zero movement toward AI. Again, that is not an obligatory project. But we should not confuse building increasingly capable understandingless "machines that do work"—a perfectly worthwhile engineering goal—with building artificial intelligence.

Oleg Alexandrov's avatar

To add to the other comment, the AI field is littered with failed well-intended attempts. The near term goal is for AI companies not to go bankrupt, find a business model, get a profit, then figure out what to do next.

I think AGI will be a massive collection of empirically developed systems, rather than the product of one grand discovery.

Oleg Alexandrov's avatar

This is not as much of a retreat from building full AGI as deciding to work incrementally.

For the foreseeable future AI will function as an augmenter. It will be able to go to the next level once the problems with the current logic are determined and the methods are refined or replaced.

This is how everything has functioned in the history of tech.

Alex Tolley's avatar

>However, if a machine has a sense of vision, of touch, is able to recognize things (statistically if you wish), is able to observe when it does wrong, and can eventually solve problems properly, it doesn't matter if it understands things.<

Is that really the case? Put yourself in the place of the machine and replace the visual scene with a 2D array of numbers. (Like the screens in the tv series "Severance", but far larger.) Can you really understand what the numbers are telling you and how to react? Presented with enough 2D number arrays, you might well be able to infer how best to react to them, but you would have no understanding of their meaning. It requires far more knowledge about the world to understand what you are seeing.

Now I would argue that for simple animals, understanding is not necessary. This is clear with animal mimicry to avoid predation. The predator just reflexively reacts to a pattern that mimics another predator or dangerous animal or thing. That is sufficient for its pattern recognition to improve its chance of surviving and reproducing. It also explains why animals can be fooled with fake stimuli, like red spots for herring gull chicks to peck at to get food from their parents.

Of course, most animals are not going to need careful reasoning either. But to be fair, we humans are pretty bad at that, too. I accept that faulty reasoning using System 1 thinking has better survival value than correct, but slow, System 2 thinking. React too slowly, and a predator may end your existence and gene reproduction. It is why we jump at the slightest sounds when in the dark.

Oleg Alexandrov's avatar

"Put yourself in the place of the machine and replace the visual scene with a 2D array of numbers. "

Given how well the Waymo self-driving car does, and likely it saw plenty of crazy things, image recognition works.

Of course, the next step is to pull some info about the stuff you see that you can reason with. Current AI chatbots are getting quite good at that.

Of course the current AI understanding is not good, and especially predictions. I just don't see a show-stopper. Slow system 2 relies on doing careful steps and various inspections, sanity checks, invoking tools, etc. I don't see why that would not be enough.

[Comment deleted, Feb 11]
Oleg Alexandrov's avatar

I have a PhD in applied math, and numerical simulation is what I do for a living.

That's exactly what we need for AI. Understanding at the level of modeling. This is where the problems are. Current LLMs do not model their environment. This is well-understood. That is also my point. We need an AI to get world knowledge somehow.

[Comment deleted, Feb 11, edited]
Oleg Alexandrov's avatar

"For one, that hydrodynamics is, itself, an approximation of reality and thus, does not constitute a world model"

Of course. What you are saying here is that all models are approximate. That on its own is obvious.

My point is very simple. Humans, when we function in the world, we have an approximate understanding. Numerical models also represent things approximately.

An AI needs to be able to have a good enough approximation of the current set up to function properly.

Steersman's avatar

> "Understanding has to be brought in somehow."

Arguably, "understanding" is a consequence of consciousness. At best, something in the way of hubris to think we can program a machine to exhibit or manifest that feature -- "golems" writ large.

Oleg Alexandrov's avatar

"Consciousness" is a high bar to clear. If a submarine swims well, it need not prove it is conscious. It is a matter of simulating well enough the world to function properly.

Steersman's avatar

True. Though "function properly" may be contingent on being conscious ... 😉🙂 That seems to be just a case of moving the goal-posts. Wonder if you ever read Lewis Carroll's: https://en.wikipedia.org/wiki/What_the_Tortoise_Said_to_Achilles

Seems to be the same sort of "infinite regress".

Not to say that it's impossible to "instantiate" consciousness in rudimentary structures, but seems necessary to start off with a model or description of an underlying process. Apropos of which, that reminds me of being quite amused by some paper wasps which apparently exhibited an "understanding" of transitive inference:

NYTimes: Wasps Passed This Logic Test. Can You?; The insects frequently found in your backyard appear to be the first invertebrate known to be capable of the skill of transitive inference.

https://www.nytimes.com/2019/05/09/science/paper-wasps-logic-test.html?unlocked_article_code=1.LFA.Nh4n.nDtRQIINsvVV&smid=url-share

Oleg Alexandrov's avatar

I think one can get lost in philosophical arguments. It is like saying one must be a bird to be able to fly.

So far, our machines have become better and better while fully avoiding the problem of consciousness. We need more accurate software that can catalog and predict the world better, and which can recover from failure.

Yes, we need a "model or description of an underlying process". But that has nothing to do with consciousness.

Steersman's avatar

Re "philosophical arguments", one of my favourite quips on the topic is Nietzsche's view about many "philosophers": "they muddy the waters to make them seem deep." Which is not at all to say that the entire field is worthless.

"I am now convinced that theoretical physics is actually philosophy." Max Born

Re "catalog and predict", I've been quite impressed by Google's Gemini for its quick overviews and summaries of a topic in response to rudimentary questions. But I've periodically wondered what applications are "envisioned" that would justify spending a trillion bucks on data centers. Though I also remember -- way back before the dawn of the personal computer -- that some IBM executive had said that he didn't see any market for mainframes much beyond 5 or 10 a year.

As for "model or description", I'm reminded of neuroscientist Giulio Tononi's "Integrated information theory" from some 20 years ago:

https://spectrum.ieee.org/a-bit-of-theory-consciousness-as-integrated-information

https://en.wikipedia.org/wiki/Integrated_information_theory

Moot exactly how that "integration" might manifest or exhibit consciousness -- mostly out of my salary range -- but one might surmise that bodies, organisms which go rather far down the evolutionary ladder like the paper wasps above, do that intrinsically.

Something of a far cry from AI programs that are scattered from hell to breakfast; pure distributed process with no "body" to integrate the information about the world. Maybe when some AI programs are put in some of Musk's robots? Though I remember seeing a recent YouTube video of a robot trying to load a dishwasher and which wound up falling over; does not compute. Some distance to go yet ... 😉🙂

Rtype's avatar

I use an Excel spreadsheet because it is a tool that is more accurate than I. Symbolic programming gives us 100% certainty. I’m not trusting ANYTHING that uses a mix of guesswork, experience and adaptation. That’s why tools exist.

Oleg Alexandrov's avatar

Yeah, Excel is fine for problems that fit in a table. Same for symbolic programming. Each has its own narrow niche.

What we need is AI that can use tools at least as reliably as people. We humans also use guesswork, but we know how to validate stuff.

Dean Prelazzi's avatar

Perhaps the definition of Neurosymbolic has some variability. If a Neurosymbolic platform is underpinned by a robust semantic layer of deeply disambiguated knowledge, and explicit meaning then that may provide the layer of real understanding needed.

Oleg Alexandrov's avatar

Yes, but deeply disambiguated knowledge is still a human construct. The machine can follow that, but does not understand that. Any time it runs into knowledge that does not fit in its ontology it will fail. It also can't update the ontology, but only say adjust some parameter values for each defined type.

Simple John's avatar

Friends of Gary - have we somewhat shot ourselves in the foot by not calling ourselves by our true name? Shouldn't we demand to be called "AI Realists" rather than "AI Sceptics"?

Amy A's avatar

AI Sagans (people who recognize that extraordinary AI claims require extraordinary evidence).

PlasticFish's avatar

Sign me up for that, please.

Simple John's avatar

Thanks for bringing one of Carl Sagan's contributions back to mind.

Thanks for your restacks.

Useful thoughts need to be collected many times in many places.

Aaron Turner's avatar

From my AGI paper (currently under preparation): "the fact that LLMs demonstrate systematic reasoning failures across multiple categories \cite{song2026largelanguagemodelreasoning} is entirely unsurprising given that reasoning in natural language is inherently problematic due to its ambiguity, a fundamental weakness that was fully understood by Leibniz as long ago as 1676 when he proposed his \textbf{characteristica universalis} \cite{sep-leibniz-exoteric}"

Xian's avatar

What other solutions are there?

The money has been spent. The stakes have been set. We cannot turn the clock back to 2022, before ChatGPT existed. What is done is done.

Maybe the only thing left to trust is human wisdom. Not faster generation, not louder narratives, but judgment, restraint, and care. I am not certain. Just my two cents.

Oleg Alexandrov's avatar

The progress is quite good and the issues fixable. I wrote a longer response as well.

Robert Keith's avatar

The issues have not proven to be fixable. That's the fundamental point here.

Steersman's avatar

👍 Indeed. Kinda think LLMs are a matter of trying to square the circle; you can't get there from here. But what the hell do I know? I'm just one of those "nattering nabobs of negativism" ... 😉🙂

Robert Keith's avatar

Shhhhhhh...there's money on the line.

Steersman's avatar

🙂 And BIG money too by the sound of it. Though, as I think Gary put it, what's a trillion or two among friends, or circular financiers ... 😉🙂

But, maybe of some relevance, I'm trying to remember the scene from Hitchhiker's about a Disaster Area concert, and the high-financing required to invest the astronomical sums the band acquired from ticket sales ...

Robert Keith's avatar

Look at the Dow. It is up beyond all reason, given the moderate performance of non-AI stocks. We're heading for something unpleasant—I'm not quite sure what yet, or how bad it will be—and not just because of AI, but because of an overall culture of stagnation, greed, corruption and, yes, deception.

Oleg Alexandrov's avatar

The problem is that you think in terms of "proven" and "fundamental". Nothing in the world works that way.

The way things work is towards greater careful automation and refining the methods.

This is how Waymo cars have now driven 150 million miles with higher safety than people.

Robert Keith's avatar

"Nothing in the world works that way."

Really? I think gravity has been established as a proven fundamental. At least in so far as how it affects humans on a day-to-day basis. We might not fully understand its nature yet, but at least we're able to make predictions and establish voluminous technologies that work within the framework of what we do understand.

If only LLMs were that predictable and useful.

And btw: Waymo may be safer within a narrow set of driving constraints, but it is FAR less safe when confronted with unexpected and unpredictable situations, which happen out in the real world all the time, as has been demonstrated amply. So that's a bad example. And before you come back at me with more numerical statistics, the average person can drive for years and never be involved in an accident, too.

Oleg Alexandrov's avatar

Gravity is a fundamental force of nature.

We are talking about how people function. Nobody ever proved that people are reliable, or faithful, or competent. All you can do is to test them, and teach them when they fail.

It was demonstrated amply that Waymo in practice is a lot safer than people. Of course it is being deployed only when the areas are well-mapped.

Humans in the real world do very badly when encountering the unexpected as well.

I do not say Waymo beats people across the board. I say they have other advantages, and in practice they have been shown to do better.

Robert Keith's avatar

So you're saying that humans are not a fundamental force of nature?

You may wish to reevaluate that assessment, lol.

Alex Tolley's avatar

>This is how Waymo cars now drove 150 million miles with higher safety than people.<

And yet a Waymo car hit a child. https://www.theguardian.com/technology/2026/jan/29/us-regulators-investigate-waymo-struck-child

Isn't this an example of a lack of understanding of the situation? There are going to be an infinite number of cases where prior experience (training examples) does not cover the case presented.

Oleg Alexandrov's avatar

From that article:

Waymo said in a post on its blog: “The Waymo Driver braked hard, reducing speed from approximately 17 mph to under 6 mph before contact was made.”

“To put this in perspective, our peer-reviewed model shows that a fully attentive human driver in this same situation would have made contact with the pedestrian at approximately 14 mph,”

No car can stop instantly, and nobody has an instant reaction either.

PlasticFish's avatar

Yet I disabled the proximity alarms in my car because, even with the sensitivity set to maximum, they kept going off *after* I'd already reacted, which I found extremely annoying and even dangerously distracting.

[Comment deleted, Feb 11]
Patricio Rodriguez's avatar

And yet, Waymo have humans in the loop for tele-assistance.

Oleg Alexandrov's avatar

Humans provide occasional advice. They do not tele-operate the cars. This is the safe middle ground till the tech improves.

Xian's avatar

Happy to read. 🤠🤠

Oleg Alexandrov's avatar

That's somewhere else in the page. I don't think we have all the answers. I think things are not as bleak.

Kenneth Lerman's avatar

" Give us money. Lots of money. Lots and lots of money. Scale Scale Scale. AGI is coming next year!"

Sounds just like the nuclear fusion people. Energy production from nuclear fusion is coming next decade. Not this decade, the next one.

Paul Czyzewski's avatar

Kenneth: Long ago there was a similar saying about shale oil. Something like, "It will be feasible when the price of oil goes up $5 a barrel. No matter what the price of oil is at the time."

In that case it _did_ eventually become economically viable. Though not until an entirely new approach (fracking) was invented. :)

Kenneth Lerman's avatar

We need more than reasoning. We need AI to be able to explain its reasoning.

If a medical student tells his teacher that he thinks the patient should be given a particular antibiotic, the first thing the teacher will ask is, "Why do you think that?"

If I tell my manager that I think we should use ANTLR4 to solve a particular program development problem, he will ask me why I would choose that.

The rule based expert systems from the past could do that. They would tell you what rules were used to produce their conclusion. That enabled the programmer (or rule developer, if you prefer) to add, change, or remove rules to improve their performance.
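As a rough illustration of that kind of traceability, here is a minimal sketch of a forward-chaining rule engine that records which rules fired and can replay them as an explanation. It is not modeled on any particular historical expert system; the `Rule`, `infer`, and `explain` names and the two toy rules are invented for this example and are not medical guidance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    conditions: frozenset  # facts that must all hold for the rule to fire
    conclusion: str        # fact added when the rule fires


def infer(facts, rules):
    """Forward-chain over the rules; return (all known facts, rules that fired)."""
    known = set(facts)
    fired = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.conclusion not in known and rule.conditions <= known:
                known.add(rule.conclusion)
                fired.append(rule)
                changed = True
    return known, fired


def explain(goal, fired):
    """List the fired rules that support `goal`, premises before conclusions."""
    by_conclusion = {rule.conclusion: rule for rule in fired}
    steps = []

    def visit(fact):
        rule = by_conclusion.get(fact)
        if rule is None:  # a given fact, not a derived one
            return
        for condition in sorted(rule.conditions):
            visit(condition)
        line = f"{rule.name}: {' and '.join(sorted(rule.conditions))} -> {rule.conclusion}"
        if line not in steps:
            steps.append(line)

    visit(goal)
    return steps


# Hypothetical toy rules, invented purely for illustration.
RULES = [
    Rule("R1", frozenset({"fever", "productive cough"}),
         "suspected bacterial infection"),
    Rule("R2", frozenset({"suspected bacterial infection", "no penicillin allergy"}),
         "recommend amoxicillin"),
]

known, fired = infer({"fever", "productive cough", "no penicillin allergy"}, RULES)
for step in explain("recommend amoxicillin", fired):
    print(step)
```

Because every conclusion is tied to a named rule, answering "Why do you think that?" is just a walk back through the rules that fired, and a rule developer can see exactly which rule to add, change, or remove.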

Tarek Kettaneh's avatar

LLMs are simply not moving towards AGI; they cannot. They are just sophisticated statistical algorithms that usually guess pretty well what the next word will be, but with NO understanding of what they "say". No amount of data centers and thousands of chips can replace human innovation. The best AI engines today cannot solve a single one of the mathematical conjectures that have remained unproven for a century. Amen

Oleg Alexandrov's avatar

There is a wide gulf between "they do not understand" and "prove a conjecture open for a century". The focus should be on utility of easy and medium difficulty problems.

Gerben Wierda's avatar

I am under the impression that while Sutskever and LeCun say world models are required, they haven't given up on the idea that neural nets alone will be able to create these world models by themselves (as humans seem to do the same when growing up).

The so-called 'reasoning models' indeed do not reason at all. What they do is add a (very expensive) indirection (https://ea.rna.nl/2025/02/28/generative-ai-reasoning-models-dont-reason-even-if-it-seems-they-do/)

Getting token-statistics to mimic the results of understanding was the start of LLMs. Getting token-statistics to mimic the *form* of reasoning as an indirection is the way 'reasoning models' work. It is indirect approximation instead of direct approximation. Very, very costly and if that is what is needed it kills much of the economic side of LLMs.

Note that neural nets having a fundamental issue with understanding has been known for a long time, even if such warnings (like yours) have been ignored. In the 1992 MIT Edition "What Computers (Still) Can't Do" Hubert Dreyfus writes about an experiment with neural nets trying to recognise tanks in aerial photographs and it turned out they had learned to discriminate between sunny days and clouded days, as the photos with and without tanks had been made on different days as it took time to place and remove all the tanks...

Terry Raby's avatar

"I am under the impression that while Sutskever and LeCun say world models are required, they haven't given up on the idea that neural nets alone will be able to create these world models by themselves (as humans seem to do the same when growing up)." Yes, this is a fundamental error. A world model, including intuitive physics, solid objects etc. is available to humans from birth. Delivered by evolution. c/- Elizabeth Spelke 'What Babies Know - volume 1'

Gerben Wierda's avatar

I wonder if this is true. The book looks interesting.

I wonder, because I have observed babies doing things like endlessly plucking at their sleeve. 40 minutes long, over and over again. Grasping, pulling, letting go. In that case, I could not escape the impression that that was "learning physical common sense", how fabric reacts, what a sleeve is, etc. Anyway, whatever evolution brings, the mechanism is what is in the brain and that is neurons (and more than that, it seems, but with comparable physical behaviours). The world model will probably be learned, not available at birth. What is available at birth are a couple of basic behaviours (feeding is one) that develop over time (in part as myelin sheaths develop in the brain in volume from back to front, enabling more neurons to function well). Some stuff gets developed before birth (basic hearing physiology for instance).

CNP Slagle's avatar

I had heard elsewhere a dump of jobs in the crosshairs of LLMs. I think a more important distinction is whether the LLM can persuade a person of things that might be plausible. ChatGPT doesn’t have to replace tech researchers and engineers; it need only persuade execs that it can replace them. Obviously, it has convinced many of them.

The more searing recognition is this: LLMs can probably replace the so-called leaders who believe the hype and cut their people. After all, a computer can do that, right?

Oaktown's avatar

All I can say is if LLM investors don't take Gary's advice and "face reality and start focusing on alternatives to LLMs," they sure as hell better not expect the taxpayers to bail them out. You sociopaths have all been warned by your betters years ago. And thanks, Gary, for continuing to remind them in spite of the slings and arrows they aimed at you.

Jim Brander's avatar

Doesn't this get a bit tiresome? There is a much stronger point. Large texts (above about 20 pages) use indexing, which LLMs can't follow.

owner managed branch of an ADI has the meaning given by section 12.

For tipping off offences, see section 123.

Subsection (1) does not apply to the following proceedings:

(a) criminal proceedings for an offence against section 123, 136 or 137;

(b) section 175 proceedings for a contravention of subsection 41(2) or 49(2).

(aa) paragraph 121(3)(da);

an appropriate authorisation under subsection 126(1).

LLMs can't understand new ideas - they haven't been done to death on the internet.

LLMs can't do classified documents or self-contained documents like legislation.

They can't do idiom or allusion - it was a walk in the park, they raised the bar on semiconductors.

Ask to see a specification for a data centre, see how far an LLM gets with it.

They are being sold to simple people for simple uses, meanwhile the government and large corporations waste tens of billions of dollars making mistakes on big projects, like F-35 (immediately ordering a replacement tells you how much of a stuff-up it was), Constellation ships. Presumably the Chinese aren't so silly, and without a get-rich-quick motivation, can take a much more serious view of the field.

Bryan McCormick's avatar

As a practitioner I can tell you that even in my limited silo of historical text analysis the flaws are massive. 40 years ago I was lucky enough to fall in with Itty Bitty, Home Brew, and others of legend.

I can tell you the tools, methods, unit testing, linters, IDE, message watching, trace, conforming, scope, debug... are non-existent. If I had not learned the discipline of software design and development back then, when we had to roll our own tools, I would not know how to save myself now. Most working devs never have.

This leads to the issues Mister Marcus notes. Eventually quality and repeatability have to be there or people will walk away.

There is no way the incremental pace we are going at in terms of quality and support will arrive in time to ensure broad adoption.

Agents? If they are running as well as the dev environments do you should be concerned.

Yesterday Opus 4.6 decided to do something so monumentally dumb that it filled its entire VM with str passes that should have been local, which led to total failure of a project. I was forced to piece it back together over many more hours, from screenshots and notes. This is just ridiculous.

They need a software dev culture inside these companies that addresses these incredible MCP lapses and errors. And that is not the focus. The energy is focused on model madness and not capability and process. On growth at any cost.

Where does this lead? Eventually to not a lot of paying customers to try to support the massive spend spawning out of any kind of control. Which may leave scant funding for anything else we actually need as a society.

PlasticFish's avatar

I and my paltry quarter century adjacent to software dev are inclined to agree.

William Bowles's avatar

The question that's not being asked is, why? What is the objective of first duplicating and then replacing human reasoning with software? Who benefits? What are the benefits of replacing humans with machines? The answer lies in the history of automation, whether it be NCMs or machine learning, namely reducing the cost of labour that capitalism demands as the rate of profit falls. But will AI do this? Perhaps in the short term, but in the long term it will lead to a society where the only jobs that are not automated will be those that are either not profitable to automate or not possible to automate. A deeply unequal society of a handful of billionaires, a larger handful of technocrats and a vast army of precariats. The US is well on the way to this reality.

Bryan McCormick's avatar

Not pertinent here perhaps, but consider the high-level departures from Anthropic of Behnam Neyshabur, Harsh Mehta (research) and Mrinank Sharma (safety, catastrophe), and the ominous note from Sharma: "the world is in peril" from a series of interconnected crises; [there is] a growing difficulty in letting organizational values govern actions amid commercial pressures. In other words, research and safety walked out at the peak moment for the company. Guess what company is about to get Enshittified next.