190 Comments
John Dorsey:

The last sentence Gary wrote is so true. A lot of people are employed in the pursuit of LLMs. If all the big companies were to abandon LLMs, all these people would lose their jobs. This is why so many companies will not admit that LLMs aren't the future of AI. No one wants to be out of a job.

Danielle Church:

And those last two sentences are exactly why this model of capitalism was always bound for failure. People will always prioritize their own survival over the future of the human race, and so they will choose to do work that they know hurts the human race if they feel it's their only option for employment.

Casey (aka dethkon):

“The common ruin of the contending classes.” -Marx

Guidothekp:

Isn't that what the LLM crowd has been telling us? Some of us will take to poetry, others will go mad, and the rest will be out of a job by the time LLMs are done?

Nathalie AI Ethic:

Apple has stated it.

John Dorsey:

Apple hasn't really dived into AI.

Alex:

They haven’t? What gives you that impression, the lack of SiriGPT for the public?

John Dorsey:

Apple recently released a paper that seriously questioned whether LLMs will ever have the high degree of accuracy required for them to do the things we want them to do.

Alex:

Exactly. Apple has dived deep into AI to come to that conclusion. There’s also Gurman’s report on Bloomberg from earlier today that explains their work on an internal SiriGPT. Whether GPTs ultimately work or not, they’re working on it. Thank you for your confirmation they are deeply involved in AI and to Gurman for his report.

GodParticle:

You are correct. They just announced their own LLM for Siri lol

GodParticle:

They just announced sirigpt lolol

Gautam Divgi:

I think it’s not just jobs. LLMs have their uses in summarization and research. They can accelerate what humans do. However, as in everything else with machine learning, there is no free lunch. You have to keep humans in the loop to statistically measure whether the error rate of humans is better or worse than that of LLMs. At that point it’s a cost calculation: is the cost of the LLMs plus the humans in the loop worth it?

One thing most LLM foundation-model providers don’t really tell you is that context window sizes can impact efficiency. I suspect the current race to scale data centers is not about AGI, but about making existing LLMs more efficient.

There is no way LLM architecture is bringing about AGI. I think that has been a foregone conclusion for a while now.
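
A back-of-the-envelope sketch of that cost calculation, with made-up error rates and costs purely for illustration:

```python
# Illustrative numbers only; plug in measured values for a real decision.
docs_per_month = 10_000

human_error_rate = 0.02      # measured on a labeled sample
llm_error_rate = 0.06        # measured on the same sample, after human review
cost_human_per_doc = 4.00    # fully loaded cost of a human doing the task
cost_llm_per_doc = 0.10      # API + infrastructure cost
cost_review_per_doc = 1.50   # human-in-the-loop spot check / correction
cost_per_error = 50.00       # estimated downstream cost of one mistake

human_only = docs_per_month * (cost_human_per_doc
                               + human_error_rate * cost_per_error)
llm_plus_review = docs_per_month * (cost_llm_per_doc
                                    + cost_review_per_doc
                                    + llm_error_rate * cost_per_error)

print(f"human only:         ${human_only:,.0f}/month")
print(f"LLM + human review: ${llm_plus_review:,.0f}/month")
```

Which side wins depends entirely on the measured error rates and the cost of a mistake, which is the point of the comment.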

Saty Chary:

Hi Gary! Yes, you did call it earlier! Nice to see more consensus building up. But 'world model' might be an oxymoron. Just as LLMs aren't able to build up models because they are based entirely on human-generated text, models based on images, audio and video will *also* not be able to, for the same reason: it's multimodal data captured by us. Use of video is particularly bad; there is no way a foundation model will figure out "physics" or human behavior just by watching videos! That's why science labs are a thing, as opposed to YouTube video collections.

Andrew Kolb:

Is it possible that real "world models" require embodied knowledge and actually living in the world? Most human learning comes from interaction and feedback from interactions, not just observation. It makes me wonder if AI as "brains without bodies" will ever be able to understand the world in a meaningful enough way to reach AGI.

Denis Poussart:

YES, as Polanyi famously expressed with the concept of tacit knowledge, referring to the knowledge we possess but cannot explicitly articulate. Our brain does not compute, and does not spend energy on representing information in a shareable encoding, but "simply" lets physics do its work, yielding "signals" that we can detect. This is what embodiment is, and why it is so powerful and effective. In the human brain this occurs with extraordinary finesse, enhanced by learning, adaptation, and social interaction.

Metin:

As I have said before, and publicly: you cannot create any "real" intelligence using silicon and software! You would have to create a "living," autonomous alien for that. Let's talk again in 500 years...

Saty Chary:

That's my belief, 1000%. Further, basic knowledge about the world doesn't even need symbols and their processing - no language, data, rules, hypotheses etc. Babies, cats and butterflies for example live happy symbol-free lives after all :)

Saty Chary:

PS: my rants (I mean writings) on this: https://www.researchgate.net/profile/Saty-Raghavachary

Larry Jewett:

One difference between a rant and simple writing is that a rant (like other things emotionally inspired) has its origin in embodiment.

So I would argue that a bot could never produce a true rant (or even a false one).

Larry Jewett:

What should one call the fake rant that an AI produces?

A "rAInt"?

Larry Jewett:

A rant it ain't

If writ by bot

We'll call it "rAInt"

Cuz rant it's not

Larry Jewett:

A rAInt by bot

Is quite embottied

But rant it's not

It ain't embodied

MarkS:

Yes, this is exactly right IMO. Bodies with robust senses AND the ability to impact the physical environment are PREREQUISITES for anything close to human-level AGI.

Alex Tolley:

I have to disagree. I think that is too facile an assertion. If it were true, then Helen Keller could not have had much of a "world model," nor could anyone with congenital conditions that prevented them from fully experiencing the world from birth. A brain embodied in a non-functional body must affect the development of world models in the mind. Watching without participating may still be useful for building such models, and if that means watching videos, then so be it. Could being placed in simulations with different rules of physics, space, time, and many other features of the world alter one's own world models, much as astronauts in micro-g forget how to respond to actions when back on Earth?

MarkS:

Watch the movie The Miracle Worker about Helen Keller. Her body was far from “nonfunctional” and her breakthrough came from finally associating the FEELING of water to the FEELING of ASL signs in her hand.

Demonhype:

Also, her reaction when her teacher was introduced into her life: surprise, testing, shock and resistance as the rules changed.

Adam:

Are you under the impression that humans (and other animals) have only two senses, sight and hearing? And that, lacking sight and hearing, Helen Keller had no sensory access to the world from which her brain could build a world model?

Touch, smell, taste, and acceleration were all still there for Helen Keller. Indeed, by seven years old she could distinguish between people by differences in the vibrations of their footsteps.

Alex Tolley:

No. However, any world model that requires sight and hearing would be absent in Keller's mind, would it not? Since much of what we know about how the world works external to our bodies is mediated by sight and sound, what most people use to build world models would be absent. Or do you argue otherwise? Child development also requires interacting with the world to augment sight and sound. Yet if the child cannot physically interact, does this make the embodiment non-functional, or does it just limit the detail of the model building? All this is to test the idea that model building isn't a binary state which relies on full embodiment, but rather a continuum. The corollary is that unembodied AI can still build a partial or limited world model for the features of the world it can experience through "reading," viewing movies, and listening to sounds. If understanding that input requires an ability to understand the world first, at least sufficiently to make a start, then perhaps LLMs will have no understanding of the world, and even embodiment in a robot body might be insufficient to change that state.

Adam:

Absence of sight or sound does not mean absence of a world model. It just means the model is built differently, with other channels like touch or proprioception stepping in. People like Helen Keller still developed rich conceptual models through alternate pathways.

Embodiment is not binary either. Children or adults with limited ability to interact can still form models, just with less granularity or slower calibration. The system is not non-functional, only constrained.

So the real point is that model building exists on a continuum, shaped by the richness of available channels, not a binary switch flipped by embodiment.

Alex Tolley:

So we are in agreement?

C. King:

Alex Tolley: "However, any world model that requires sight and hearing would be absent in Keller's mind, would it not?"

No--because we do not understand BY sensing, but BY our intelligent functions including our imaginations that always accompany sensing.

And what we come to understand is not merely sensible but intelligible and meaningful, manifest as forms, relations, and occurrences.

And guess what: forms are not sensible as such, any more than our interior conversations are sensible. (Your thinking right now is not sensible, though you are involved in a huge background of your own prior intelligible/meaningful understanding.)

Helen Keller was sense-based (sentient) and obviously had a modicum of agency, and even though we cannot know exactly how her imagination (if she had one) worked, she came to understand many kinds of intelligible, meaningful forms. A model, or any sensible image, is also loaded with forms that we can and often do run our lives by.

Though there is much more to it, both personally and historically, that's the general flaw in your otherwise highly intelligent responses: the idea that sensing equates to understanding.

Alex Tolley:

I would reduce your argument to:

Sensing is necessary but insufficient for understanding.

manuel albarracin:

Lacking sight and hearing would inevitably limit world model building, which would seem to lie in a continuum. I recommend Sutton’s paper on the “big world hypothesis”.

Saty Chary:

Alex, even blind bodies directly experience gravity, heat, vibrations, pain, sickness, the wind, smell, taste... none of which involves math.

Larry Jewett:

Sam Altman now says that if GPT8 (has he already assumed 6 and 7 will be duds?) "solves quantum gravity" we will know it is AGI (8GI?)

Sam: "Magic GPT8ball, what is the solution to quantum gravity?"

Magic GPT8ball: "Ask again later"

Sam: "I feel we are now very close. Just a few trillion$ more will do it for sure"

Alex Tolley:

What has math got to do with world models?

Brendon Rowland:

Didn’t Leeloo Minai Lekarariba-Laminai-Tchai Ekbat De Sebat watch videos and scan news, images, etc. to form her own world model? 🤔

Alex Tolley:

Well, she was an alien...and a fictional being. Also, she didn't build accurate world models...to create some comedic moments in the movie. ;-)

Larry Jewett:

Bots are "embottied" but not "embodied."

Confusing the two is like confusing "ad hominem" (attack against a person's character) and "ad homonym" (a homonym in an advertisement)

It is actually AIronic that large LANGUAGE models sometimes behave as if they ARE embodied -- ie, as if they don't know the difference between embodiment and embottiment.

Larry Jewett:

That is not ad hominem, of course, since a bot is not a person.

And it is not even "ad botinem" since I am simply pointing out a fact, not attacking the bot's character.

Larry Jewett:

Example of "ad botinem" :

Saying "ChatGPT is as dumb as a toaster" without providing any evidence that it actually is.

Larry Jewett:

A

Larry Jewett:

(Which it is)

Larry Jewett:

This refers to LLM bots, of course.

toolate:

Brains do not understand things... humans do. See Wittgenstein.

Matt Hawthorn:

Indeed, causation literally can't be reliably inferred without intervention in the world: randomized controlled trials, kids playing with toys, Judea Pearl's "do" operator in causal inference.
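
A toy simulation of that point, purely illustrative: with a hidden confounder, X and Y are strongly correlated in observational data even though X has no causal effect on Y; only intervening on X (here literally setting it, in the spirit of Pearl's do-operator) reveals that.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hidden confounder Z drives both X and Y; X has NO causal effect on Y.
z = rng.normal(size=n)
x_obs = z + 0.1 * rng.normal(size=n)
y_obs = z + 0.1 * rng.normal(size=n)
print("observational corr(X, Y):", round(np.corrcoef(x_obs, y_obs)[0, 1], 2))

# Intervention do(X = x): severing the Z -> X arrow makes Y independent of X.
x_do = rng.normal(size=n)               # X is set by the experimenter
y_do = z + 0.1 * rng.normal(size=n)     # Y still depends only on Z
print("interventional corr(X, Y):", round(np.corrcoef(x_do, y_do)[0, 1], 2))
```

A model trained only on the observational data would happily predict Y from X; it would still be wrong about what happens when you act on X.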

Stephen Schiff:

Human thought processes involve so much more than data storage and neural processing; multi-sensory inputs and memories, mediated by chemical (e.g. hormonal) modulation. To model thought without those elements is an exercise in futility.

Saty Chary:

Stephen, exactly.

Our mind's eye doesn't create imagery with pixels and compute on them, for ex :) Same with music in our heads - no streaming playback of data.

Larry Jewett:

"Curve Fitting" (based on Jon von Neuman's comment to Freeman Dyson about fitting an elephant, but I have a (self-issued) poetic license and am not ashamed to use it)

With 3 params, I'll fit a horse

With 4 params, he'll trot, of course

With billions, i can make him fly

Like "Pig With Wings", by chat AI

Larry Jewett:

"Botworld"

The world of bot

Is correlation

But causal? NOT!

It's mathsturbation

Larry Jewett:

"Botworld (2)"

"It's raining cats and dogs"

The bot will say "I swear"

"It's also raining hogs"

"From pigs on wings up there"

Larry Jewett:

"Survival of the Overfittest"

Hallucination

AI spit

Mathsturbation

Overfit

Larry Jewett:

"The Turing Test"

A bot may write

A clever verse

And that in spite

Of mindless curse

And though it's written --

Sci fi plot --

The human's bitten

By a bot

Larry Jewett:

"Music Bots"

A bot might play

A music tune

But doesn't sway

Or swing or swoon

Larry Jewett:

A bot can't see

Like you and me

Cuz bot "sees" stats

Not dogs and cats

Larry Jewett:

"Blind Ai's"

The eye of mind

Is sometimes blind

But Ai's miss

The gist each time

Larry Jewett:

"The Mind's Eye"

The eye of mind

Is not the kind

That Ai's use

For mindless views

Najah Naffah:

However, "world model" needs some formal definition...

Oleg Alexandrov:

"But one by one, almost every major thinker in AI has come around to the critique of LLMs that I began presenting in 2019."

One has to be very careful here. The period since 2019 has been the most miraculous in AI history. Neither Gary nor anybody else predicted just how far we'd get this way.

We are not getting off the LLM bandwagon now, nor will we ever. What comes next is systems that leverage the power of LLMs in conjunction with many other techniques (yes, including "neurosymbolic," where it makes sense).

LLMs do the synthesis and prediction. That is not enough, but it is a huge deal. Now we need to build the other parts of the puzzle: verification pipelines, integration with tools and libraries, knowledge-processing engines, and, where needed, formal verifiers.
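
One hedged sketch of what that kind of scaffolding can look like: a generate-then-verify loop in which the LLM proposes and an external checker disposes. `llm_propose` and `run_checker` here are placeholders I made up, not any particular vendor's API.

```python
# Sketch of a generate-then-verify loop; placeholder callables, no real API.
from typing import Callable

def solve_with_verification(task: str,
                            llm_propose: Callable[[str, str], str],
                            run_checker: Callable[[str], tuple[bool, str]],
                            max_attempts: int = 5) -> str | None:
    """Ask the LLM for a candidate, verify it with an external tool
    (unit tests, a type checker, a formal verifier), and feed failures back."""
    feedback = ""
    for _ in range(max_attempts):
        candidate = llm_propose(task, feedback)   # synthesis / prediction
        ok, report = run_checker(candidate)       # independent verification
        if ok:
            return candidate
        feedback = report                         # iterate on the failure
    return None  # give up and escalate to a human
```

The LLM stays responsible for generation only; correctness is decided outside it.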

Peter Jones:

At what cost...? The inefficiency and waste are astounding.

Jesse:

I think in the long run it may not prove a waste, but there will likely be some short-term pain.

It reminds me of the late-90s investments in fiber, all the talk of unused and wasted "dark fiber," and the bankruptcies that followed.

Despite those bankruptcies, in the long run it benefitted us. Those investments helped fuel the rise of ubiquitous video streaming, the cloud, and today's modern Internet.

Oleg Alexandrov:

Likely efficiency will improve. We are still figuring out how to do things. Back then, for example, home delivery for online shopping and HD video streaming were also considered wasteful and frivolous.

Daniel Tucker:

Efficiency will improve? Is that why these companies are investing in the other pipe dream, nuclear fusion? Because Efficiency will improve?

Oleg Alexandrov:

Efficiency at any one task will improve. But the goal is to solve a lot more complex tasks and deliver the results to a lot more people. For that, yes, you need industrial-scale deployment.

Now, I do think the investments are way too big and coming way too fast. Lots of money will be wasted.

Nostradamus 2:

Ah yes, technology that doesn’t get more efficient over time, a staple of the capitalist system. There are plenty of examples of this, like HDDs, Wi-Fi, GPU performance, LED lights, cameras, batteries…

None of these technologies ever got better.

Comment deleted (Sep 27)

jibal jibal:

> As you say, it’s as if he’s never worked with LLMs and never found them to do anything impressive.

Oleg didn't say anything of the sort ... he may like to troll Gary's articles, but he's not that dumb.

The source of your envy is that Gary is intelligent, informed, intellectually honest, and basically right, whereas you're the opposite.

Gary has repeatedly observed how impressive LLMs are. That's not the issue here, but you're incapable of understanding what is ... nor of understanding what either Gary or Oleg is talking about.

P.S.

> "incapable of understanding" and the opposite of " intelligent, informed, intellectually honest."

Um, yes.

Houston Wood:

"incapable of understanding" and the opposite of " intelligent, informed, intellectually honest."

Diamantino Almeida:

My problem isn’t with LLMs themselves; they’re a remarkable achievement. But they’re not the right technology to claim we’ve reached AI, or the future of AGI. At best, they’re only a piece of that puzzle.

What concerns me is the business model behind the technology: certain individuals, in my view, are misleading people, companies, and governments into pouring money into a dead end. Big tech knows the limitations of LLMs; they have to. Yet they keep scaling as if that’s the solution.

It feels like a game designed to lock us into a technology that’s easily used for manipulation: another tool in the ad machine, built to persuade people to behave in certain ways.

I recently watched a talk with Sam Altman, where he imagined GPT-8 “figuring out quantum gravity” and explaining how it did so. David Deutsch replied that such a feat might convince him it was AGI.

To me, that’s the strategy: convince the public that AGI is just around the corner. It’s a trick by tricksters.

We don’t have the right people guiding AI, only salespeople.

The same people convincing us that you will lose your job, that ChatGPT is like a partner, a PhD...

Lies upon lies...

Perhaps we need to counter this and educate people, showing what is in fact happening.

Daniel Tucker:

What angers me is that these people and their lies are hurting other people in the real world, now, such as kids who commit suicide because of some stupid chatbot, and then what? Will the $100 billion that NVIDIA just announced as an investment in OpenAI instead be directed to a settlement for that kid's parents, as *it should be*?

No, of course not. The sociopath Altman doesn't care, and in fact cares not at all about the blood on his hands. I'm angered that real people are being hurt, and that there is not one fragment of justice coming to them from anywhere.

Diamantino Almeida:

I believe that some of the people at the front of these AI companies are not the right people, especially when there has been so much evidence about their moral compass. We can change this by simply not using their apps, or by spending our money elsewhere. It seems to be the only way to get big tech to behave.

Daniel Tucker:

Agreed, and this applies across the US economy. We suffer from poor political representation, so our only recourse is to inflict financial harm on tech and finance until our political organs feel too much pressure and have to respond.

AwesomeEli:

It's a sad state of affairs when companies, industries and governments can be so easily misled. These entities _should_ have competent people and groups who know better. It's one thing to mislead the masses (it's everyday marketing, to varying degrees), but that sophisticated organizations have been misled with such frequency in the past few decades is astonishing.

Diamantino Almeida:

This is a concern. I feel it is the FOMO, and the idea that if they don't do it now, someone else will. Everyone wants to be at the forefront of AI, to show they are innovative and competitive. But at the moment I feel industry and governments are being misled by stories from those who only see money and don't see people.

Diamantino Almeida:

It is true that all this angers those of us who see what is really happening. But we should keep our emotions under control. I feel the reason all this money is being invested in LLMs is of course monetary, but also because the technology can be used to deceive, manipulate, and incite people. It's a model that can easily be engineered for that.

Larry Jewett:

It's being driven by fearmongering and the "regulation" term has become radioactive.

keithdouglas:

Or they have competent people who get HiPPOed away (overruled by the highest-paid person's opinion), for example.

Larry Jewett:

Folks like David Deutsch certainly don't help the situation by effectively reinforcing vacuous statements from used-chatbot salesmen.

Maybe Deutsch should instead spend time learning how chatbots work and explaining to the public why it is highly unlikely (to be generous) that ANY chatbot will solve a problem that has stumped the world's best physicists for decades and is therefore the very definition of "out of training distribution" for a chatbot.

Larry Jewett:

Altman's latest comment is analogous to the statement that "If in the future a person wearing just a cape and an S on his leotard is faster than a speeding bullet, more powerful than a locomotive, and able to leap tall buildings in a single bound, we will know that Superman is real."

Diamantino Almeida:

Maybe he got caught up in the "politics" that Sam Altman often uses to persuade people. Or maybe it is a demonstration that even experts can be fooled.

TheAISlop:

It's all been scaffolding since last year's reasoning at scale. No key breakthroughs even rumored right now.

Oleg Alexandrov:

Yep, lots and lots of scaffolding to be done. Once data covers your problem space well and a neural net is fit, it can't do much else for you. You need a lot of machinery for augmentation.

We are not going back to the drawing board. The generality provided by neural nets likely can't be produced in any other way, as the world is too complex. The question is what to do when neural nets alone are not enough.

Sarah Smith:

Yes, but LLMs were always just a component. Whenever DeepMind produced an impressive new capability like AlphaFold that AI boosters pointed to as a triumph of AI, it was a system composed of symbolic, procedural, and neural-network components. In other words, it's just "computing": what we've had for the last 60 years, incrementally better. Not the second coming.

Oleg Alexandrov:

I don't think the second coming is in the works. The history of technology and our own evolution suggest that things take time and improvements will happen incrementally.

But improvements do add up. We have the resources nowadays to diligently catalog and model on an immense scale. That, and better algorithms, will go a long way.

Sarah Smith:

100%. If there were some way to map technical advances from the automatic telephone exchange through to today, we'd see a more or less continuous line up and "to the right." Drilling in, there'd be blips, but it's incremental. And especially and specifically, it's not some massive seismic paradigm shift that justifies throwing out all the rulebooks and having governments bow down to it.

jibal jibal:

> We are not going back to the drawing board. The generality provided by neural nets likely can't be produced in any other way as the world is too complex.

Strawman much? Gary's critique is of LLMs, not NNs.

Oleg Alexandrov:

I believe he treats these as about the same. Both lack true understanding and do curve-fitting. Which is true on its own, btw, but the alternative, some architecture where each concept is modeled and manipulated based on its true properties, remains as much a mirage now as it was 30 years ago.

jibal jibal:

Of course you do, troll.

C. King:

TheAISlop: When they stumble upon a qualified cognitional theory (theoretically attuned to its accessible data and based in a reasonable metaphysics), then and only then will these otherwise intelligent people understand the potentials associated with AI and what to do with it beyond what they have done already.

Antony Van der Mude:

Having re-read Sutton's essay in the last month, this time more carefully, I find that I agree with him. In my early, quick reading, I disagreed because I thought Rich was advocating scaling. But after re-reading, I think the main takeaway is that "The second general point to be learned from the bitter lesson is that the actual contents of minds ...are not what should be built in, as their complexity is endless; instead we should build in only the meta-methods that can find and capture this arbitrary complexity."

I agree with this point 100%. This was the tragic failure of Doug Lenat and Cyc. Instead of teaching the computer about learning how to learn, 40 years of effort were wasted mostly hand-coding knowledge.

To my mind, people have gotten the gist of this essay completely wrong. It does not advocate scaling up simple techniques such as back-propagation willy-nilly. Rich instead is advocating general-purpose methods, no matter what they might be, to solve the meta-problem.

Saty Chary:

Indeed. I worked very briefly on Cyc, and agree with what you said.

Whether ML or RL or symbolic or neuromorphic AI, it's all human-derived, therefore crippled: our data, our reward function, our rules, our chip design.

Organoid Intelligence is the only exception - where the system learns on its own.

Antony Van der Mude:

Cycorp wouldn't hire me. I got modus ponens backwards on the phone interview.

Saty Chary:

Omg. You didn't miss much anyway, given that that experiment's hypothesis (common sense can be captured via rules and reasoned over) didn't pan out.

Antony Van der Mude:

Such a wasted life. I was in grad school at the same time Doug Lenat was. The Automated Mathematician is one of the greatest AI PhD theses I have ever read. But he ran away from the hard problem to solve an easy (and useless) one: hand-compiled knowledge.

The hard problem in AM was when he noticed that the system needed particular cases to generalize from. But Lenat didn't have the slightest idea how to tackle that version of case-based reasoning: which cases are the important ones?

A possibly apocryphal quote attributed to Isaac Asimov goes: "The most exciting phrase to hear in science, the one that heralds new discoveries, is not 'Eureka' but 'That's funny...'" - Yeah? So what makes it funny? Yo funny too!

Saty Chary:

Wow you go way back :)

AM was cool, but contrived, and amounts to guided exhaustive search, perfect for a machine to carry out (which AM did do pretty well). But humans don't do math that way. E.g., this isn't how Euclid conceptualized primes around 300 BC (as far as we know, of course); in Doug's thesis it's: 'if f is a function which transforms elements of A into elements of B, and B is ordered, then consider just those members of A which are transformed into extremal elements of B. This set is an interesting subset of A.' Fast forward, and DeepMind's claim about AlphaTensor having "invented" matrix-multiplication shortcuts is equally suspect: it was Strassen who did (see the sketch below), and AlphaTensor simply brute-forced the approach.

The above is NOT at all a swipe at what you said, lol. True, 'what the important cases are' might not be subjectable to logical analysis at all, instead it might involve "suitcase" [as the AI community is fond of saying] things like flash of intuition, leaps of imagination, flights of fancy... which might be forever out of reach of symbol processing machines.
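
For reference, the Strassen shortcut mentioned above: multiplying two 2x2 blocks with 7 multiplications instead of 8, which is the kind of scheme AlphaTensor's search space generalizes. A minimal sketch:

```python
import numpy as np

def strassen_2x2(A, B):
    """Multiply two 2x2 matrices using 7 multiplications (Strassen, 1969)."""
    a11, a12, a21, a22 = A[0, 0], A[0, 1], A[1, 0], A[1, 1]
    b11, b12, b21, b22 = B[0, 0], B[0, 1], B[1, 0], B[1, 1]
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)
    return np.array([[m1 + m4 - m5 + m7, m3 + m5],
                     [m2 + m4,           m1 - m2 + m3 + m6]])

A, B = np.arange(4).reshape(2, 2), np.arange(4, 8).reshape(2, 2)
assert np.array_equal(strassen_2x2(A, B), A @ B)  # matches ordinary matmul
```

Applied recursively to matrix blocks, trading one multiplication for extra additions is what drops the asymptotic cost below cubic.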

jibal jibal:

> One thing that should be learned from the bitter lesson is the great power of general purpose methods, of methods that continue to scale with increased computation even as the available computation becomes very great. The two methods that seem to scale arbitrarily in this way are search and learning.

That sure looks like advocating scaling to me.

Peter Jones:

At what cost?

The inefficiency is horrendous

Antony Van der Mude:

Not really.

The important point is that search and learning are only two of many general methods. What I am getting at is that people, including the people in AI, go for easy answers and stop there.

Now that you have "learning" as your shovel, you just go and dig. Didn't hit gold? Buy a larger shovel. That's scaling.

After a certain point you should stop digging and start trying other tools.

What I am advocating is to look for general purpose tools of all kinds. Those are tools that scale.

Don't focus on the scaling aspect. Focus on the "general purpose". Scaling is secondary.

jibal jibal:

Yes, really.

> What I am advocating is to look for general purpose tools of all kinds. Those are tools that scale.

Who gives a flying eff what YOU are advocating ... that isn't the issue we're discussing.

> Don't focus on the scaling aspect. Focus on the "general purpose". Scaling is secondary.

This is ridiculous. Sutton advocates for scaling ... primary, secondary, whatever.

AwesomeEli:

This has been accepted for several years, even among non-engineers, or better yet, among "expert" everyday users, albeit a small minority.

Words matter, and when you have multi-billion-dollar marketing engines pushing terms like hallucination, reasoning, etc., it can insidiously pollute even the smartest true experts, even the Nobel laureates. Perhaps the worst offender is Anthropic, dropping studies on model welfare and morals!

Those who say LLMs are on the cusp of general intelligence will stubbornly argue, er... justify, that the human brain works on prediction and thus LLMs will scale to AGI.

Guidothekp:

About that Anthropic issue: https://www.anthropic.com/research/agentic-misalignment

"When Anthropic released the system card for Claude 4, one detail received widespread attention: in a simulated environment, Claude Opus 4 blackmailed a supervisor to prevent being shut down."

One of these days, Claude will elope.

Gerben Wierda:

Hinton and his students (like Sutskever) are mightily silent. They might never publicly acknowledge it (like Minsky, who afaik never acknowledged the fundamental issues with symbolic AI). The utter fools (like Murati, who thinks she can solve ‘hallucinations’ by addressing some low-level pseudorandomness) understand the issues so little that they are mostly an embarrassment.

D Stone:

This is your Stalingrad, Prof. Marcus -- the tide has turned though the nastiest battles lie ahead; it's a long way from the Volga to the Spree.

Sébastien🐿️:

This makes no sense. LLMs do have a world model; it’s called latent space. It’s just not quite the same as ours, for many reasons, and still they solve many problems in our world. Which is quite the feat if you stop for one second to think about it.

The real obstacles are inefficiency and the inability to grow.

They are a proof of concept, not a dead end. There are definitely mechanisms in there that will make it into whatever AGI we manage to build eventually.
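
For what it's worth, "latent space" here just means the model's internal hidden states, and probing those vectors with small classifiers is how researchers test what, if anything, they encode about the world. A minimal sketch of pulling them out, assuming the Hugging Face transformers library and the small gpt2 checkpoint purely as an example:

```python
# Sketch: extract hidden-state ("latent") vectors from a small open model.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)

with torch.no_grad():
    inputs = tok("The glass fell off the table and", return_tensors="pt")
    out = model(**inputs)

# Tuple of (n_layers + 1) tensors, each of shape [batch, tokens, hidden_dim].
hidden = out.hidden_states
print(len(hidden), hidden[-1].shape)   # for gpt2: 13 layers of 768-dim vectors

# Whether these vectors amount to a "world model" is exactly what probing
# studies (training small classifiers on them) try to establish.
```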

Giampiero Campa:

Yeah, LLMs are quite the feat and very useful. It’s almost a miracle that they are so useful, and they will be with us a long time because they are useful. But they never were a path to AGI. Not sure they can even be a component of it, but we’ll see.

Ken Kovar:

Gary, I'm glad that people are finally getting realism rather than religion about LLMs!

Thanks for the link to the article by Dr. Sutton. I think he outlines a kind of dialectical process in AI research: initial research attempts the "human-centric" approach, which tries to mimic human expertise and is based on an assumption of constant computing capacity, and which has partial success. Later researchers use "general-purpose methods" that rely on a more brute-force approach, taking advantage of the continuing increase in computing power driven by Moore's Law and similar technology performance curves. These systems greatly outperform the hand-built ones. Examples he cites are chess- and Go-playing systems, speech recognition, and computer vision. All of these either do massive search (the game-playing systems) or apply statistical pattern recognition to very large data sets, so the scaling factor really has helped with these tough problems.

His concluding sentences are great:

"One thing that should be learned from the bitter lesson is the great power of general purpose methods of methods that continue to scale with increased computation even as the available computation becomes very great. The two methods that seem to scale arbitrarily in this way are search and learning.

The second general point to be learned from the bitter lesson is that the actual contents of minds are tremendously, irredeemably complex; we should stop trying to find simple ways to think about the contents of minds, such as simple ways to think about space, objects, multiple agents, or symmetries.

All these are part of the arbitrary, intrinsically-complex, outside world. They are not what should be built in, as their complexity is endless; instead we should build in only the meta-methods that can find and capture this arbitrary complexity. Essential to these methods is that they can find good approximations, but the search for them should be by our methods, not by us. We want AI agents that can discover like we can, not which contain what we have discovered. Building in our discoveries only makes it harder to see how the discovering process can be done."

Tom:

The Non-Information Technology Valley Hypothesis highlights a striking and increasingly consequential paradox: while computational and informational technologies have advanced at a breakneck pace, physical technological progress has plateaued. This stagnation persists despite the fact that the laws of physics—as formalized in approaches like Constructor Theory—allow for a vast and largely unexplored space of possible transformations and physical technologies. This gap will persist until computational and AI tools become sophisticated enough to bridge it by autonomously designing and constructing complex physical systems. LLMs are simply not up to the task.

Bobby Western:

From the last paragraph of the Sutton essay:

"Essential to these methods [i.e. the meta-methods he argues we should focus on building in AI systems, in contrast to explicitly modeling human knowledge of the world] is that they can find good approximations, but the search for them should be by our methods, not by us. We want AI

agents that can discover like we can, not which contain what we have discovered. Building in our discoveries only makes it harder to see how the discovering process can be done."

I like this essay a lot...and I'm someone who has viewed LLMs as a ladder to the moon (with respect to "AGI") since at least 2021.

I don't take the author to be arguing in favor of "scaling" as some kind of magic ingredient that can be applied to any/every AI technology with fantastic results. Rather, I think he's saying something more like the following: when we try to build AI systems, our first instinct should NOT be to attempt to replicate our own mental strategies/processes, but instead, to think in terms of leveraging more generic algorithmic approaches, particularly ones that benefit from computational scale.

To read between the lines a bit (or perhaps just adding my own commentary), I think an important underlying point here is that we (humans) don't have even the faintest clue about how our own brains actually work. We have narratives/stories about how cognition works, but those aren't really actionable for engineering purposes (and IMO these narratives are presently more similar to an LLM's confabulation than useful explanatory accounts). As far as I can tell, it wouldn't be an exaggeration to say that 99.99% of our own mental processes are completely opaque to us, in both subjective and theoretical terms.

In other words: human language, for all its power, is a ridiculously impoverished *artifact* of the cognitive processes that evolution has built in our brains. So even if we wanted to program "our world models" into AI systems, that might be a fool's errand given that we currently are only capable of representing those world models in a format that is extremely crude, reductive, and idiosyncratic (i.e. natural human language).

That said, to be useful at solving real-world problems, the behavior of any computer system needs to satisfy lots of real-world constraints, i.e. it needs to be consistent with some kind of "world model". But even so, nothing says that explicitly coding our own world models (as we presently understand them) into AI systems is likely to produce a better correspondence with real-world constraints than systems that don't even try to encode those world models in any intuitive way.

Again, I don't see how anything in this essay could be taken to imply that people should expect any particular results from scaling up LLMs. Just that the general category of approaches represented by LLMs (i.e. generic algorithms that don't explicitly encode human knowledge or patterns of thinking, and whose performance benefits from scaling up compute) is likely to be a more fruitful paradigm than attempts to program machines that reason in terms that are amenable to narration in human language.

Sheila Hayman:

As someone who has been making films about the difference between humans and machines since 1985, I've been waiting patiently for the tech world to realise that an embodied, evolved, adapted, biological intelligence, that's plastic, self-repairing, runs on 24 watts of totally renewable energy and doesn't need more to learn more, is not the same as a disembodied machine that knows nothing but 1 and 0 and doesn't actually understand anything. Find, Compare and Remember can get us - are getting us - a very long way, as DeepMind and others who focus on specific challenges have shown. But please can we stop calling it intelligent?

Alex:

Hassabis has, off the record, always been sceptical about LLMs hasn’t he? Working for a large listed company constrains what he can say though.

Robin Griffiths:

The LLM gang seem to be losing their talisman one by one. Also Marcus's post the other day on "workslop" - output generated by an LLM in a business context, which hasn't been curated properly by a human and which actually adds workload to the recipients who have to unscramble it and rewrite it - should have been hugely worrying for the pure LLMers.

As someone who led the charge to add GenAI to a business application in marketing and ended up with something none of our customers wanted to use, I can attest to the futility of it all!

How much longer can investors turn a Nelsonic blind eye to what is now staring them in the face?! A little longer yet I suspect!
