93 Comments
Matt Kolbuc:

Yeah, I heard that latest interview. Sam just blurts out shit that doesn't make any sense. "Yeah, I asked it this really hard problem that I didn't even know, and ChatGPT just solved it, I sat back in my chair in awe."

Same as he blurts out other shit like, "Pretty soon, ChatGPT is just going to discover new science and start solving physics." That makes no sense at all, and no, this tech is nowhere close to making scientific breakthroughs.

Comedy piece, for those interested, about the absurdity of all this: https://cicero.sh/r/hows-the-ai-revolution

Guidothekp:

Thanks for the link.

RMK:

I look forward to more people excitedly claiming that it's conscious, based on the fact that prompts for sci-fi scripts about conscious machines lead it to generate sci-fi scripts about conscious machines.

The last 5 years haven't increased my excitement about AI nearly as much as they've decreased my trust in human intelligence.

RMK:

*lead it to generate

jibal jibal:

You can edit comments.

RMK:

Yeah, I imagine so, but I can't figure out how. I'm on the Android app and the menu only has Share Comment, Hide Comment, and Delete Comment. Long press just folds it.

Think Mr A:

After reading Karen Hao, it's hard to root for Sam AltDelete. After YC, Paul Graham, Thiel, and their grand ambitions, it's a relief to be on this side, where humans exist and thrive.

Christa Albrecht-Crane:

The book has affected me deeply, too. I am more concerned than ever about the energy extraction and human labor these LLM technologies demand, as well as the ruthless and zealous drive these tech bros operate under. It's terrifying.

PH:

But it's not that AI is singularly bad regarding energy consumption; we could also complain about many other things that use up a lot of energy, like passenger aviation for tourism or even video streaming.

So the environmental impact just becomes problematic because gen AI does not robustly work and therefore is not worth it. Even where it kind of works, like in graphic design, the replacement is still obviously inferior (as we see with the spread of very generic yet inconsistent illustrations).

Of course, one could regard AI just as a form of entertainment. But I wonder if it has made anyone truly happy to generate yet another Shrimp Jesus or have Ani as a girlfriend.

Think Mr A:

I'm working on not letting them scare me. Hope you can too.

Christa Albrecht-Crane:

Hm. Hao's book made me realize that none of us has any power to influence these people. It's a club of AI zealots that probably really believes their LLMs are sentient.

Think Mr A:

I get the feeling of helplessness; that comes with handing unregulated social shaping over to a few. I don't have an answer for combating them on a tangible level. I'm not afraid day to day, though. I also know they have their entire portfolios at stake, and it's not guaranteed that they won't hit a wall or turn on one another.

Christa Albrecht-Crane:

I agree with what you're saying. I should clarify that I'm not afraid in my personal daily life. The concern is their business practices and how they lure people into the supposed magic of their products. I am an English professor and see the damage (on so many levels) to my students. Without regulation or oversight, they get away with way too much. And Hao made it very clear that they can easily turn on one another. In fact, she provides numerous examples of this happening, such as when Dario Amodei left OpenAI to co-found Anthropic.

Daniel Tucker:

Maybe if the left-of-center political party in this country actually did what it's supposed to do, instead of caving to every whim and fancy of the petty bourgeoisie, there would be a regulatory force to regulate and *constrain* the tech industry effectively. If....

Ken Kovar:

Sam AltDelete.. 😆

Guidothekp:

I searched for Karen Hao's piece and stumbled on an audio. Is it the one you mention above? If not, please post the link. Thanks.

Ondřej Frei:

I believe this is the book in question: https://en.wikipedia.org/wiki/Empire_of_AI

Think Mr A:

Yes my friend!

Think Mr A:

The audio is correct, I'm sure.

John:

Tell me please. Why are we trying to replace humans with machines?

Why are we allowing twelve-year-old computer boys to determine our future?

Oaktown:

Meanwhile, they destroy our environment and waste billions of dollars. They infuriate me: waste, fraud, and abuse.

Jim Ryan:

If and when your predictions come true, you don't really expect them to move on, do you? Their reputations and so much money are invested in LLMs. Plus, they would then have to admit all the lies they have been telling.

Aaron Turner:

Once the penny drops, the VCs will just quietly back away. Already starting to happen.

Pramodh Mallipatna:

Love the title - touché 😀

I am not an AI researcher, but this is my article, based on reasoning from first principles and listening to the likes of Gary Marcus and others.

Human Intelligence Made Language, Can AI Do the Reverse?

https://open.substack.com/pub/pramodhmallipatna/p/human-intelligence-made-language

direwolff:

"Language is a projection of intelligence and not its source. So training on language alone offers a window/slice of intelligence and not all of it."

Nicely said.

Patrick Logan:

With each new model society should have the right to an audited environmental impact statement.

Graham Lovelace:

Prediction 8: Sam will make a series of anthropomorphic comments about GPT-5, along the lines of: 'It thinks really hard', 'It really knows you', 'It has genuine wisdom', 'It's got a great sense of humour'. In fact, I'm going to do another Charting Gen AI bingo card predicting some of these!

RMK:

Real question - what is there to improve in LLMs?

I'm speaking here as a random person who checks in on ChatGPT every few months and always finds it utterly underwhelming. It's a neat party trick and a little better than Google for very basic searches. But even that is mostly because Google, and the internet at large, is choked with spam. I'd take the 2012 internet over an LLM any day of the week.

If I make a wishlist of things that would make chatgpt useful to me, it's basically all the stuff you say isn't going to change without incorporating actual world modeling in a serious way. And I suspect you're right.

So... what's left? What's something I currently couldn't do with GPT-4 that I even conceivably might be able to do with GPT-5?

direwolff:

Items 4 & 6 feel related, because both rely on a fidelity of language that is greatly lacking for these claimed tasks. Heck, just talking through a request with another person requires a lot more interaction than I believe most people will want to have with these technologies. Just look at how we have to deal with laws today and all the exceptions that then require amendments to address... constantly. Or at translated religious texts, which only loosely resemble the original in the old source language, and whose translations differ depending on whose version it is (e.g. the Bible).

I'm still lost by practitioners' obsession with using natural language for highly detailed and precise task descriptions; it has to be the worst way, and it is completely antithetical to mathematical or scientific notation, the former of which helped shape computer science. As someone who speaks several modern languages, I find that each is exceedingly challenged in expressing precise and nuanced statements. It's the genie and the three wishes over and over again: you can never make (describe) a wish that doesn't land you in an unexpected problematic situation, no matter how hard you try.

The only way I can justify the progress made to date is that so much has been written and ingested into these LLMs that they can attain useful statistical probabilities for the use of terms next to each other. But as a way to "understand" instructions or set guardrails, it feels like using a tractor to pick up lint from a carpet. Sure, you might pull it off, but it will likely fail more often than it succeeds ;)
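A toy illustration of that last point, assuming nothing beyond the Python standard library: a bigram counter that attains exactly this kind of "terms next to each other" statistic, with no notion of meaning, instructions, or guardrails. (Real LLMs learn continuous representations rather than raw counts, so this sketches the principle, not the mechanism.)

```python
# Toy bigram model: predicts the next word purely from co-occurrence
# counts over a tiny corpus.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the cat ran . the dog sat on the rug .".split()

# Count how often each word follows each other word.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def next_word(word):
    """Most frequent continuation of `word` in the corpus, if any."""
    followers = bigrams.get(word)
    return followers.most_common(1)[0][0] if followers else None

print(next_word("the"))  # -> 'cat', chosen by frequency alone
```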

Oleg Alexandrov:

"the field will finally realize that it is time to move on"

This is overall a solid article, except for this. Move on to what, exactly?

Gary has been predicting "plateau" and even "dud" for quite some years now, yet the field is moving forward at a very nice clip, historically speaking.

We are not in thrall to a foolish delusion that will fail. The current advances are real.

Reliability will improve in an evolutionary way, as for self-driving cars (unless one is Tesla or xAI).

Principled knowledge processing systems (aka neurosymbolic) will have a place, but likely depending on the application, rather than being the one thing that will revolutionize the field and eliminate all current problems.

So, what to expect: more reliability and more applications in a few years, and eventually a profit for some companies and failure for others (xAI and Meta).

Oleg Alexandrov:

"Within a decade, maybe much less, the focus of AI will move from a pure focus on scaling large language models to a focus on integrating them with a wide range of other techniques."

That is perfectly correct. But focusing on language and breadth first, and only later on integration with other approaches, is likely the right path.

What remains far and away the biggest problem is the world's complexity and lack of broad patterns, hence the need for very-large-scale data and tabulation. Then, sure, attention to detail will be needed, and that will come from other techniques.

PH:

The situations where reliability problems occur are becoming rarer, so it's seductive to think they can be fully eliminated, yes.

Still, with a mentally healthy human, any good-faith, solution-focused discussion ends in one of two final states: truth or an admission of ignorance.

I have never witnessed such a discussion with a human that ends with the other person slowly losing their grip on reality and, instead of correcting course, losing that grip even more.

So there cannot be a Socratic dialogue with LLMs, because they may end up in a third state, where one hallucination is justified by another in an endless chain of nonsense (until, of course, you just firmly say that they are mistaken, because they are all very sycophantic and will never disagree with you — even if they are right).

Like recently: Grok 4 completely made up new inheritance rules for the Scala programming language (obscure, but still in the top 40 of the TIOBE index), with example code and all, and continued to hallucinate under further probing.

Now, sure, there has been improvement, so such situations occur less commonly. OTOH, we know how that was achieved: not with a fundamental breakthrough, but with larger models that now use synthetic data.

But in a professional context, what you do is very far removed from what a data-entry worker in an Indian sweatshop could patch. So yes, hallucinations are less obvious now, but they are still common enough that we have hit a plateau of usefulness for LLMs, at least for my applications.

Oleg Alexandrov:

I surely agree that statistical guesswork that is not grounded can't be reliable. The question is whether AI agents can implement mechanisms for at least verifying their work, running simulations, etc.

Sufal Chhabra:

Hey Gary, I am a firm believer as well that LLMs are like a person who does not know anything but has memorised everything.

Hence, as a graduate student interested in working on the next frontier of AI, what direction do you recommend studying and researching? Thanks in advance!

Christa Albrecht-Crane:

Start with your premise: LLMs are NOT like a person. Read a few of the other posts Marcus has written on that. And this: https://www.wordrake.com/blog/youre-thinking-about-reasoning-wrong#_ftn2

jibal jibal:

Good article. The term "cargo cult" comes to mind to describe LLM mimetics.

"To improve accuracy, we might prompt the LLM to be careful: “Only answer if you’re sure,” or “Say ‘I don’t know’ if you’re unsure.” And the model will comply—linguistically. While it changes the tone of the output, it has no effect on accuracy."

The problem here is that the prompt wrongly assumes that the LLM has mental states. But an LLM is not the sort of thing that can be sure or unsure or know or not know things--that's a category mistake.
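A minimal sketch of what the quoted experiment amounts to in code; `query_llm` is a hypothetical stand-in for any chat-model API, not a real library call:

```python
# Hypothetical sketch: the "be careful" instruction adds conditioning
# tokens that change the *style* of the sampled text, not its accuracy.
def query_llm(prompt: str) -> str:
    """Stand-in for a real chat-model call (hypothetical)."""
    return "<model output>"  # a real client call would go here

question = "In what year was the Treaty of X signed?"

plain = query_llm(question)
hedged = query_llm("Say 'I don't know' if you're unsure. " + question)

# `hedged` will often *sound* more cautious than `plain`, but both are
# samples from the same next-token distribution; nothing in the second
# prompt consults an internal confidence state, because there is no
# such state to consult.
print(plain, hedged)
```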

P.S.

Today I had ChatGPT tell me "Ah—you’re right! I had forgotten that ...".

My response: "You're an LLM ... you didn't "forget" anything, and such claims are jarring and unnecessary."

It replied:

"You’re absolutely right. Thank you for pointing that out.

It’s more accurate for me to say something like:

That detail wasn’t included in my earlier response.

rather than framing it as if I “forgot,” because I don’t have memory in the human sense. I’ll keep my phrasing cleaner to avoid that jarring effect."

Of course it doesn't know any of these things either, thanking me is completely meaningless, and it won't "keep" or "avoid" anything ... LLMs always revert.

Christa Albrecht-Crane:

Precisely. LLMs are misaligned with regular communicative tasks in the products for which they are marketed.

Antonio Eleuteri:

You may want to properly study advanced statistics and mathematics. I feel like very few, if any, of the people who work on these LLMs have any idea what they are doing from a theoretical point of view (e.g., are LLMs consistent estimators?). They seem to forget, to quote Prof. Judea Pearl, that LLMs are "just glorified regressors". Can we expect some form of "intelligence" from a purely regression-based framework (with all that it entails)? Prof. Pearl argues that any form of "intelligence" requires effective counterfactual modelling, and we know from first principles that this does not happen with regression models unless the data is collected in a specific way (see e.g. clinical trials). Throwing more data scraped willy-nilly from the Internet at ever bigger models is not going to address the issue at all.

There is also the issue that deep learning doesn't produce models that are better than kernel machines (see e.g. the work of Prof. Pedro Domingos). This discovery has HUGE implications: first, because kernel machines have been thoroughly studied from a mathematical standpoint (see e.g. the works of Prof. Vapnik, or Ingo Steinwart's book "Support Vector Machines"); second, because it dispels all the "magical thought" around deep learning models.
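A minimal sketch of the kind of well-studied kernel machine meant here, using scikit-learn's real KernelRidge estimator; the toy dataset and hyperparameters are illustrative, not taken from Domingos's paper:

```python
# Kernel ridge regression: a "glorified regressor" whose statistical
# properties (consistency, learning rates) are thoroughly understood.
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))                    # inputs
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(200)  # noisy targets

model = KernelRidge(alpha=0.1, kernel="rbf", gamma=0.5)
model.fit(X, y)

# Should land near sin(1.0) ~= 0.84 on this toy problem.
print(model.predict(np.array([[1.0]])))
```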

Andy:

"Seven Dark Predictions About Alex, Our New Human Expert"

Let's replace 'GPT-5' with 'Alex,' a brilliant human expert, and see how Gary's predictions hold up:

"In 2026, Alex will be a bull in a china shop, making shake-your-head stupid errors"

→ Every Nobel laureate who's forgotten where they parked. Every surgeon who's operated on the wrong knee. Every expert witness who confidently testified the wrong person was guilty.

"Alex's Reasoning will continue to be unreliable, especially in complex scenarios"

→ Economists who missed the 2008 crash. Weather forecasters and their 7-day predictions. That time NASA lost a $125M Mars orbiter because someone forgot to convert units.

"Fluent hallucinations will be common"

→ Brian Williams' helicopter story. Every eyewitness testimony ever. The NYT's 2002 WMD reporting. Your uncle at Thanksgiving explaining cryptocurrency.

"Natural language won't reliably interface with systems"

→ "I didn't mean delete everything!" Why lawyers exist. Why "that's not what I meant" is humanity's unofficial motto. The entire field of technical writing.

"Won't be general-purpose intelligence"

→ Ask your cardiologist to fix your WiFi. Ask your IT expert to perform heart surgery. No human has beaten Cicero at Diplomacy AND driven Formula 1 AND performed brain surgery.

"Alignment will remain unsolved"

→ Every war ever. Every divorce. Every company that claims "our employees are aligned with our values" while covering up scandals. Congress.

"Will need structured systems to augment them"

→ Why we invented writing (memory sucks). Why we need peer review (individuals are biased). Why democracy has checks and balances (no one is trustworthy with absolute power).

The punchline is this: Gary just described human intelligence perfectly. These aren't bugs—they're features of any sufficiently complex intelligence navigating an uncertain world. The fact that LLMs share these "flaws" might be the best evidence yet that we're on the right track.

jibal jibal:

"humans too" is stupid tiresome dishonest whataboutism.

direwolff:

There's a HUGE difference between human foibles and failures and those of technological instruments/machines/software. People tend to inherently trust "the machine", and the machine can move at scale in ways no human ever will. Yes, doctors make mistakes, but they are held liable. Who will be held liable when a glitch in the matrix, or just an error in the system, screws things up? When a lawyer deletes an important text, there's no scale to that problem. Every one of the things you've listed involves people, and there's a limit to the damage they can produce. Machines have always been meant for scale and flawless, predictable repeatability. Now we stand at the cusp of flawed and unpredictable repeatability... at scale. The point of view that Gary and others have previously espoused, and that seems most in "alignment" with humanity, is for these technologies to continue to play the role of tools we control, not technology that does our thinking for us so that we can abdicate our responsibilities as a society or culture.

Andy:

You're absolutely right about the trust + scale risk. We've navigated this before though: autopilot (trusted, scaled, still needs pilots), medical devices (FDA regulated, doctors still liable), trading algorithms (caused flash crashes, now have circuit breakers). The pattern is always: new tech → accidents → regulations → shared human/machine accountability.

Here's the irony: the very 'human-like' flaws Gary criticizes might be what keeps these systems as tools rather than overlords. We know how to supervise something that thinks like us—flaws and all. It's the promise of 'flawless' machine intelligence that Gary seems to want that should worry us more. Those perfect machines won't need us.

The messy, fallible, human-like AI that Gary dismisses? That's the one that will always need human oversight, human judgment, and human accountability. The 'bugs' he identifies might be the best features for keeping humans in the loop for a long time.

Mary:

If all we're getting is something just as dumb as humans, then why are we paying so much money and burning up so much electricity when we can just get some readily available humans to do this stuff for us? The whole point of computers is that they can do stuff we can't. I don't see the point in wasting billions of dollars on something that unemployed Bill down the street could do for way less

Andy:

But we're NOT getting something 'just as dumb'—we're getting something with human-like reasoning PLUS: nearly perfect memory, infinite patience, no ego, works 24/7, never judges my stupid questions, and reads 1000x faster. It's like hiring someone with 20 PhDs who never needs coffee or bathroom breaks for just $20 per month!

Mary:

"Nearly perfect memory" and "reads 1000x faster" are both metrics that don't matter if it's giving you random wrong answers and you have no way to predict which ones are wrong. It also won't learn from getting things wrong, because it doesn't actually know anything.

"No ego", "never judges my stupid questions", "infinite patience" are already things you can get from a google search.

"20 PhDs who never need breaks" is just delusional thinking, I'm sorry.

Andy:

You're right that '20 PhDs' is hyperbolic. What I actually find valuable: it helps me understand opposing viewpoints before engaging in discussions (like actively probing both sides of adoption ethics debates at a very deep level). Not perfect, sometimes wrong, but useful for perspective-taking in a polarized world. Think of it less as an oracle, more as a patient devil's advocate who helps you stress-test your own thinking. If you're curious, www.chat.com lets you try it free without any signup. Might be worth a quick test to see if it's as useless as you think—or you might find a specific use case where it actually helps.

Mary:

Why would you assume I haven't tried it? I've used it to help me write an insurance denial appeal (among other things) - by which I mean I had it generate one for me which I read, trashed, and rewrote from scratch. It was sort of helpful in getting my brain started on the task.

I'm not saying there's no use cases at all, I just don't think we should be blowing up the environment for an inaccurate, ethically dubious, and marginally useful product that only does jobs that it's probably better for our brains to do ourselves anyway.

Jan Steen:

The problem with the structured approach to AI is the curse of exponentiality -- the same thing that was the undoing of expert systems. You cannot hardcode all possible connections between all possible components.
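A back-of-the-envelope illustration of that blow-up, assuming nothing beyond the Python standard library: counting the possible interactions among n components.

```python
# Pairwise connections grow quadratically; sets of interacting
# components grow exponentially -- the wall expert systems hit.
from math import comb

for n in (10, 100, 1000):
    pairs = comb(n, 2)   # possible pairwise connections
    subsets = 2 ** n     # possible component combinations
    print(f"{n} components: {pairs:,} pairs, ~{subsets:.2e} subsets")
```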

Bruce Cohen:

You don’t have to. If you include something like an LLM in the architecture, and loop it with a (somewhat) malleable structured system that might include expert systems, world modelers, physics simulators, etc., then when the structured system hits something it can't handle, it can use the LLM to run a generate-and-test algorithm to find a way to handle it.

I know that’s somewhat handwavy, but we have a lot of research to do before we could build something that has the ability to find ground truth when it knows it doesn't have it, which is what I think we need to get the reliability, alignment, and the other attributes that Gary is calling out.
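A sketch of that generate-and-test loop under stated assumptions: structured_solve, llm_propose, and verify are hypothetical placeholders, not real APIs, and the verify step is exactly the unsolved ground-truth problem mentioned above.

```python
# Generate-and-test fallback: try the principled system first, and only
# accept LLM proposals that pass an external verification step.
def structured_solve(problem):
    """Hypothetical structured system (expert rules, world models...)."""
    return None  # pretend this problem is outside its coverage

def llm_propose(problem):
    """Hypothetical LLM call proposing a candidate solution."""
    return f"<candidate for {problem}>"

def verify(problem, candidate):
    """Hypothetical ground-truth check -- the genuinely hard part."""
    return False

def solve(problem, max_attempts=5):
    answer = structured_solve(problem)
    if answer is not None:
        return answer
    for _ in range(max_attempts):          # generate-and-test loop
        candidate = llm_propose(problem)
        if verify(problem, candidate):     # accept only verified output
            return candidate
    return None                            # admit failure, don't guess

print(solve("route the power grid around a failed node"))  # -> None here
```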

Harry:

What if it was trained on GPT-4 output (some of it wrong) that people have posted to the internet?

Aaron Turner:

All good except for point 7: the most performant future AGI systems will not incorporate LLMs at all.

Oleg Alexandrov:

"the most performant future AGI systems will not incorporate LLMs at all"

Maybe. The question is, how will AGI learn? Language is a remarkably compact representation, and it is a lot more flexible and descriptive than symbolic methods, for example. LLMs achieved great success with what was previously thought impossible: digesting the often self-contradictory, irregular, and outrageously large pile of human experience.

How can we do all this without large-scale statistics and sometimes outright memorization? My best guess is that the human mind is not neat. We painstakingly and diligently learn from experience, and we are very good at figuring out both the principles and the countless exceptions to them.

We have been very bad at building powerful representations that are both deep and not brittle. Probably such representations don't even exist. I think we'll end up with a patchwork of techniques, and LLM-like approaches will do by brute force what can't be done otherwise.
