128 Comments
David Cotton's avatar

I just wonder how many times Gary is going to have to write similar articles before the LLM investment bubble comes crashing down.

Nothing has changed, all the same ML / LLM flaws are still there, still we get hallucinations, basic errors, still they're not "AI" in any meaningful sense, still they don't scale and won't lead to AGI.

Gary Marcus's avatar

hype springs eternal

Positively Paying It Forward's avatar

Sensibilities of how to invest in AI are largely dependent on whose money is being invested.

I suggest that the $$ that continue to pour into AI are investment houses not wanting to be left behind, but at the same time, ready to bail at any time, not wanting to be the last one out the door.

Give it time (I bet it'll crash starting Sept/Oct); time will tell.

Paul Topping's avatar

Although I totally agree with this paper, I think what LLMs do will eventually be shown to have a lot of value. It is not at all AGI but a human language interface to the world's content and tasks. I am particularly enthusiastic about MCP which allows us to wire up any action we can think of to an LLM. This is a powerful concept that we are only just beginning to exploit. This is all separate from the AGI effort which has been derailed by all the attention paid to LLMs.

Scott Burson's avatar

I agree with Gary about LLMs not being AGI, but I no longer believe the investment bubble thesis. People are learning how to get value out of these things, despite their limitations. (For a full exposition of the bullish case, see @NateBJones on YouTube.)

The way I would put it is that you have to use LLMs in closed-loop rather than open-loop systems. These are terms from control theory and cybernetics, that have perhaps somewhat lost currency, but that we need to reacquaint ourselves with.

A closed-loop system is one involving feedback. It's turning out that if we set LLMs up in contexts that give them feedback, they work much better. The obvious example is coding; we can require the generated code to pass certain tests, which the model can run itself, and then simply have it iterate until it succeeds. Defining the required tests is nontrivial, but often easier than writing the code ourselves. In the legal arena, one can imagine an external citation checker. And so on.
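A minimal sketch of the test-driven loop described above, in Python. The names `closed_loop`, `run_tests`, and `make_stub_model` are hypothetical, and the "model" is a canned stub standing in for a real LLM call, so nothing here reflects any particular vendor's API. In practice, generated code would also need to run in a sandbox rather than through a bare `exec`.

```python
from typing import Callable, Optional


def run_tests(source: str) -> Optional[str]:
    """Return None if the candidate passes, else a failure message (the feedback)."""
    namespace: dict = {}
    try:
        exec(source, namespace)  # assumption: trusted toy input; sandbox this in real use
        result = namespace["add"](2, 3)
    except Exception as exc:
        return f"{type(exc).__name__}: {exc}"
    if result != 5:
        return f"add(2, 3) returned {result}, expected 5"
    return None


def closed_loop(
    generate: Callable[[str], str],
    check: Callable[[str], Optional[str]],
    task: str,
    max_attempts: int = 5,
) -> Optional[str]:
    """Ask for a candidate, test it, and feed failures back until it passes."""
    feedback = ""
    for _ in range(max_attempts):
        prompt = task if not feedback else f"{task}\nPrevious attempt failed: {feedback}"
        candidate = generate(prompt)
        error = check(candidate)
        if error is None:
            return candidate  # the feedback signal says we are done
        feedback = error      # otherwise the error goes into the next prompt
    return None               # budget exhausted without a passing candidate


def make_stub_model() -> Callable[[str], str]:
    """Hypothetical stand-in for an LLM: first answer is wrong, second is right."""
    attempts = iter([
        "def add(a, b):\n    return a - b\n",
        "def add(a, b):\n    return a + b\n",
    ])
    return lambda prompt: next(attempts)


if __name__ == "__main__":
    print(closed_loop(make_stub_model(), run_tests, "Write add(a, b)."))
```

Each extra pass through the loop is another model call plus a test run, which is where the additional computation comes from.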

Of course, for the models to satisfy the constraints placed on them by the feedback mechanisms requires more computation — a LOT more.

So I think you're going to be surprised. The suitability of models to work in agentic mode took a big jump just a couple of weeks ago. Demand is exploding, and these apparently insane levels of capex could well turn out to be actually inadequate.

khimru's avatar

It's like saying that the dot-com bubble didn't happen simply because today you CAN go on pets.com and buy that damn food for your dog.

Of course you can do that TODAY, but you couldn't for decades… and the same thing will happen to LLMs: we will definitely find a lot of value in what they can do, but it just won't happen any time soon. The bubble would have to burst first, and prices (and energy consumption) of the whole thing would need to come down 10x or 100x… and only then would we get something sustainable.

P.S. And yes, precisely the fact that even with the already insane capex spent so far we have an “inadequate amount of it” shows that the bubble will burst: it was already unsustainable with just LLMs, and if agents need MORE compute then it will never work profitably. And if there is no profit in some technology, then it's an “investment bubble”, simply by definition. It doesn't matter if that technology will power the whole world 20 or 100 years from now; if there is no profit TODAY, the show will stop.

Gabriel's avatar

Wait, are you telling me that if we just throw enough CAPEX at it, we can create a closed-loop of AI agents running a single unit test called isAGI() and let it iterate until it returns true?

Checkmate, AGI skeptics!

Sara's avatar

“In the legal arena, one can imagine an external citation checker. And so on.”

I’m only going to reply to this one sentence and leave the rest of the comment for others.

LLMs regularly make up legal citations and many, many lawyers in multiple courts at both state and federal levels have been found in contempt of court and gotten fined for using them. So, could an LLM that was specifically trained on a database of court cases citation check? Of course (I am sure LexisNexis and Westlaw are already working on this). Would I trust any of it without double checking their work? No. LLMs are tools and as such should never be operated without (human) supervision.

Scott Burson's avatar

I'm sure someone could build a reliable, non-LLM-based citation checker. Probably, LexisNexis has already done it — I see they do have an AI product, and if you're selling a software product to lawyers and it screws up, it's a safe bet you're going to get sued.

Why some lawyers haven't gotten the memo yet that you can't just use ChatGPT is a puzzle, but I'm sure word is getting around.

Sara's avatar

I just saw a comment on another website I follow and WestLaw also has an AI - seems to be based on LLMs. The commenter was not impressed with it as it wasn’t trustworthy.

As for your comment on the memo, I completely agree. But I keep going to CLEs and they keep mentioning new cases where the citations weren’t checked and the response from the lawyer(s) always seems to be: I didn’t know an LLM could do that. I’m actually hoping the fines increase in these cases as the answer, at this point, is clearly lying to the court.

Jason's avatar

Maybe the first big writedown of GPU related capex at a large company will be the last piece of straw.

George Burch's avatar

Unfortunately, discussion of system failures can be ignored until the system is disrupted by a better one. What is constantly said not to exist is a hybrid symbolic solution. This post describes a semantic/symbolic system that is an operational substrate.

http://intellisophic.net/2025/09/12/the-fundamental-innovation-orthogonal-corpus-indexing-oci/

E. Syla's avatar

In fact they are 'AI' in the only meaningful sense. This is the only real and intelligible conception of AI, not what a few computer scientists and sci-fi writers 70+ years ago fantasized about.

Richard Self's avatar

A very timely reminder of what is wrong with so much of the current TechBro advocacy of the singularity and the arrival of AGI (whatever that means).

jibal jibal's avatar

The notion of "the singularity" is based on multiple fundamental errors in reasoning. Consider: a) an optimal intelligence that can make all possible valid inferences (and no invalid inferences) from all available evidence cannot improve upon itself, so there can't be an "exponential feedback loop" or similar nonsense claimed by singularian buffoons like Nick Bostrom--exponential curves don't have a maximum. b) Even such an optimal intelligence is not a genie or an oracle--it cannot solve unsolvable problems (e.g., as I noted in a discussion about a Bostrom paper on HN the other day, it cannot find the greatest prime or solve the halting problem, and as someone else noted, it cannot do a linear search in O(1) time) and it cannot shorten the time that it takes to obtain additional evidence through experimentation--it cannot bypass the scientific method. Yet in his paper, Bostrom (who advocated risking a 97% chance of annihilation of humanity for a 3% chance of developing ASI) claimed that an ASI would make such things as curing Alzheimer's "imminent" and suggested that it could be done by replacing damaged neurons (showing that he knows nothing about Alzheimer's or how the brain works).

Patricio Rodriguez's avatar

It's nice to see somebody articulate a thought I also had regarding physical constraints and, to some degree, knowledge constraints and even reality constraints.

We know there are some limits (think of energy not being able to be created or destroyed), but we don't know how many more we don't know.

What if we are at peak understanding of certain things, or there are some hard limits like the speed of light or something else? The whole singularity thing is like the perpetual motion machine: it sounds smart, but it goes against the laws of thermodynamics.

keithdouglas's avatar

In principle there could be a hyperintelligence that solves the halting problem for Turing machines, which would be nice. But it would be subject to its own halting problem. But yes, the "singularity" idea is garbage (except in the theory of differential equations, of course. :))

jibal jibal's avatar

I don't know what principle you have in mind, but Turing proved that there is no procedure that solves the halting problem (determining, for any Turing Machine, whether it halts). No amount of intelligence can turn a formally unsolvable problem into a solvable one. (Quantum computers don't help here either.)
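The diagonal argument behind that result can be sketched in Python. This is only an illustration, not a formal proof: ordinary Python functions stand in for Turing machines, the names `build_contrary` and `refute` are made up for the example, and the candidate decider is assumed to be deterministic and to always return. Given any such candidate, the sketch constructs a program on which the candidate's verdict is wrong.

```python
from typing import Callable

Program = Callable[[], None]
Decider = Callable[[Program], bool]  # True means "this program halts"


def build_contrary(decider: Decider) -> Program:
    """Build a program that does the opposite of what the decider predicts for it."""
    def contrary() -> None:
        if decider(contrary):
            while True:  # the decider said "halts", so loop forever
                pass
        return None      # the decider said "loops", so halt immediately
    return contrary


def refute(decider: Decider) -> str:
    """Show that the decider's verdict on its own contrary program is wrong."""
    contrary = build_contrary(decider)
    if decider(contrary):
        # Running contrary() here would never return, so we reason about it
        # from its construction instead of executing it.
        return "decider said 'halts', but the program loops forever"
    contrary()  # safe to run: the decider answers False again, so this returns
    return "decider said 'loops', but the program halted"


if __name__ == "__main__":
    print(refute(lambda program: True))
    print(refute(lambda program: False))
```

Whatever the candidate answers, `contrary` does the opposite, which is why no amount of cleverness inside the decider helps.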

Here's something I just ran across that seems somehow apropos: https://existentialcomics.com/comic/641

Adam's avatar
Feb 17 (edited)

My 9 yo found the AI fails in that video absolutely hilarious. He started coming up with his own.

Using Opus 4.6.

Prompt: I drove to pick up my child from school. I now have my child getting home is only 100m away. Should I walk or should I drive home with my child from school?

Response: Walk. 100 meters is basically a one-minute stroll — it's not worth starting the car for that. Your kid will probably enjoy the short walk too.

My kiddo found the idea of abandoning the car at school to be absolutely hilarious. He was roaring with laughter. 😆

Freddie deBoer's avatar

I will repeat myself: if AGI was here, there would be no argument about whether it had arrived.

Aaron Turner's avatar

Yes, we'll know for certain when genuine AGI is here because unemployment in advanced economies will be stubbornly > 30%. Unless of course it's properly aligned with human preferences, in which case it would refuse to put people out of work by stealing their job.

Matt Hawthorn's avatar

Or it would agree to put out of a job only those who truly wished to pursue other interests, and only if its tech overlords committed to using its immense abilities for the common good of humanity rather than extractive profit maximization.

Aaron Turner's avatar

A properly aligned agentic AGI would be fully autonomous; its only "overlord" would be its final goal. And, yes, in order for that final goal to be written in such a way that the AGI was genuinely aligned with human preferences (a complex but not impossible problem), its original designers would have to be motivated by the best interests of the human species as a whole rather than some subset thereof. In such a scenario, the AGI would not treat the people who built it any more favourably than anyone else.

PH's avatar

The problem is that countless sci-fis popularized the trope that in case AGI would arise, there would be tons of Luddites who would simply refuse to believe that and refer to some ineffable metaphysical essence, like “souls.”

Of course, this is not the case at all now, and there is a very pronounced empirical inferiority of LLMs to humans. Even that is an understatement. Which also explains the lack of disruption in the job market.

But if you are a true believer, you can ignore that and instead just think it's like the sci-fis told you.

Ben Winchester's avatar

> The problem is that countless sci-fis popularized the trope that in case AGI would arise, there would be tons of Luddites who would simply refuse to believe that and refer to some ineffable metaphysical essence, like “souls.”

I fully expect there to be a reasonably-large subset of the population that refuses to acknowledge the AGI as a person or as "truly intelligent" on this basis. "Yes, it's 'smart' enough to take our jobs, but what does it know about the beauty of a sunset?" Or "it might look 'smart', but it doesn't have a relationship with our lord and savior, Jesus Christ".

So yeah. There will be that part of the population. But we can set those arguments aside, and still end up with the blunt truth that current LLMs have a "very pronounced empirical inferiority .. to humans".

manuel albarracin's avatar

What baffles me is the why of the dogmatic insistence by some that, contrary to all evidence, “AGI is here”.

I can see how it benefits Altman, Amodei etc. in sustaining their companies’ market valuation and aiding their attempts at regulatory capture, but aside from that, what’s the point?

Carl Mueller's avatar

The AGI debate is increasingly tiresome and distracting. It’s like insisting that nuclear fusion is the only thing worth discussing in climate change policy. If we ever reach true, self-improving AGI, all bets are off. Even researchers like Nick Bostrom and others studying superintelligence openly acknowledge we do not really know what happens in that regime. Most of the arguments ultimately reduce to which side of a speculative cost-benefit ledger you fall on, whether AGI becomes a panacea or our undoing. But that is not the world we are in right now.

Right now we have extraordinarily capable machine learning systems that are already automating and accelerating large swaths of cognitive work. Software engineering has materially transformed. Document drafting, analysis, reporting, modeling, spreadsheet work, research synthesis, and even elements of strategic planning are being compressed into AI-assisted workflows. When embedded into multi-step workflows, equipped with memory, metadata generation, tool use, retrieval systems, and human feedback loops, they become something far more powerful than simple autocomplete.

The “stochastic parrot” framing feels increasingly disconnected from operational reality. In isolation, these models are statistical token predictors. But that description undersells what is happening in practice. They are not merely autocompleting text. They are autocompleting intelligence output that enables iteration at a qualitatively higher level.

The shift is just as profound in information retrieval and search. Traditional search returns links. These systems return synthesized answers, structured summaries, comparisons, and next-step reasoning. They collapse discovery, synthesis, and drafting into a single interaction loop. That changes how knowledge work is performed at a structural level.

You can argue these systems are not general intelligence. That may be true. But they are already powerful enough to reshape labor markets, productivity curves, competitive dynamics, and even how information itself is accessed and processed. That reality, not speculative AGI timelines, should be at the center of political and economic discussion.

Bill Donahue's avatar

Well, it's true that they do return more information than traditional search returns. The problem, though, is that their synthesized answers, structured summaries, comparisons, and next-step reasoning are full of errors, untruths, and fictitious sources and authorities. And the only way to recognize that is if you already know a fair amount about whatever the topic of focus is. So, really, LLMs seem best at getting people who know nothing up to a very basic level of understanding, but after that they just misinform people.

Carl Mueller's avatar

Under what evidential basis are you making that claim?

There isn’t a clear consensus right now on LLM or agentic search accuracy. Reported performance varies widely depending on what’s being measured (eg citation fidelity, factual claims), domain specificity, grounding method, and evaluation setup. In high-stakes environments, error rates are absolutely concerning. In other domains, they are already proving useful as search accelerators and synthesis tools.

We are still in the early stages of AI-mediated information retrieval. It’s reasonable to critique current reliability. It’s not reasonable to assume today’s failure modes are permanent structural limits. These systems are improving rapidly, especially when combined with retrieval grounding, citation constraints, and verification layers.

The more productive conversation is about where they are reliable today, where they are not, and how fast that boundary is moving, not about freezing the evaluation at what they can do now.

Bill Donahue's avatar

So, you take issue with what I've said about their error rates... and dispute it by conceding that "In high-stakes environments, error rates are absolutely concerning"?

Frankly, the only milieu I care about re LLMs is the potential improvements they may offer in work efficiency or productivity. I.e., "high stakes environments". As a scientist with real expertise in my field and actual skill, an LLM offers me next to nothing. Worse, probably, given the amount of time it takes to go through anything it generates and which requires fact-checking on literally everything in it. As someone who's also led a science institution, I think the risk of significant LLM use is simply too high for any individual or organization that is concerned about protecting reputation and a high quality of work.

LLMs are like asking a cocky, overconfident 21-year-old frat boy whose ego and self-confidence far exceed his knowledge and intelligence to do something important that requires technical experience. They're sentence generators that have been trained using massive datasets comprised of what's on the internet, most of which is of pretty crappy quality. Which is fine if I want to use it to generate a moderately creative piece of fiction in iambic pentameter, but that's about it.

Carl Mueller's avatar

I take no issue with pointing out their flaws. Criticism is healthy. And if high-intelligence, high-stakes reasoning domains are what you care most about, that’s fair.

What I find interesting is how often people assume their particular niche is immune to cognitive automation. In my professional experience working closely with LLMs, I don’t see any field that is immune to impact. The degree may vary. The timeline may vary. But the direction is consistent.

Dismissing these systems as parlor tricks or simply “statistical sentence generators” feels just as overconfident as the arrogant frat boy. Both positions underestimate what’s actually happening. They have improved at rapid rates both in accuracy and in constrained design. And their emergent associative properties show capability far beyond the formal statistical underpinnings (more than most experts could have predicted).

In software engineering, the rote and mechanical portions of programming are already being absorbed by agentic coding workflows. Boilerplate, scaffolding, refactors, documentation, test generation, even architectural exploration are increasingly AI-assisted. Engineers who integrate these tools are dramatically increasing throughput. Those who refuse to adopt are being outpaced in the market. The human expert is still critical, but a five-year-old technology is profoundly reshaping a major knowledge-work sector. That should not be taken lightly.

Scott Burson's avatar

I still think "statistical sentence generators" or the more colorful "stochastic parrots" are accurate descriptions of what LLMs are. But what is becoming evident is that statistical sentence generators, cleverly and appropriately designed, can be remarkably powerful and useful tools, much more so than one might have expected.

Some 45 years ago (!), a friend of mine wrote a Markov-chain chatbot and let his co-workers play around with it. It was of amusement value only, of course. It probably did about what one would expect from something called a statistical sentence generator. I would say, it bears the same relationship to a modern LLM that a bottle rocket bears to a Saturn V (or a Falcon Heavy, if you want a more up-to-date exemplar). But my point is, they're both rockets. An LLM is still a statistical sentence generator, just a very sophisticated one.
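For a sense of how simple the bottle-rocket end of that spectrum can be, here is a minimal word-level Markov-chain text generator in Python. It is an illustrative sketch, not a reconstruction of the chatbot described above; the function names `build_chain` and `generate` and the toy corpus are assumptions made for the example.

```python
import random
from collections import defaultdict


def build_chain(text: str) -> dict:
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)  # repeats preserve observed frequencies
    return dict(chain)


def generate(chain: dict, start: str, length: int, rng: random.Random) -> str:
    """Walk the chain, sampling each next word from the observed followers."""
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = chain.get(word)
        if not followers:  # dead end: this word was never followed by anything
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


if __name__ == "__main__":
    corpus = "the cat sat on the mat and the dog sat on the rug"
    chain = build_chain(corpus)
    print(generate(chain, "the", 8, random.Random(0)))
```

Every adjacent word pair in the output was seen in the corpus, so the result is locally plausible and globally aimless, which matches the "amusement value only" description.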

Lizzy Whited's avatar

With all due respect to Dr. Marcus, his commenters are increasingly showing themselves to be just as out of touch with the actual capabilities of the frontier models. Anyone who isn't impressed with these tools simply has not used them. Most of them decided three years ago they weren't very impressive and have continued to deny that anything has changed in the time since.

keithdouglas's avatar

It is precisely because they are very impressive in some respects that I find them so dangerous. The impressive aspects (apparent fluency in natural language, for example) mask the emptiness behind them and make it harder for people to realize this, and so we have to work twice as hard or more to make sure we don't harm people, etc.

Jonah's avatar

It's not very "respectful" toward Marcus to say that he is out of touch, is it? And it's certainly a simple inaccuracy, based on what I have been reading of his publications, to suggest that he has never used the models that he talks about. I am certainly astounded by the changes in the capabilities of transformer-based models and how those capabilities have expanded, but I am also astounded by the ways they frequently fall short. If I had been asked to speculate about the limitations of models with these capabilities, I would never have imagined their current failure modes, and I doubt that you or Marcus would have either.

Here's the unfortunate truth: in spite of the rapid rate of advancement of transformer models, the rate of advancement of commercial hyperbole has rapidly outpaced it. How else to describe claims of curing all diseases within, what would it be now, nine years? Eight? Marcus is right to point out that chatbots do not exhibit general human intelligence in the way that it has usually been conceived, and even more right to point out that they most definitely do not think like human beings despite being able to do many similar things.

Bill Donahue's avatar

Put another way, the only way for the insane cap-ex being dumped into LLMs today to pay off is if they provide significant advantages for "high-stakes environments".

Because the market for paying much for a digital tool that allows someone to do low-stakes anything isn't going to be very big, simply because the thing it's being used to do doesn't represent much of a risk financially or reputationally. Otherwise, it wouldn't be low-stakes.

Carl Mueller's avatar

I think you’re hyper-focusing on frontier, general-purpose lab models and the massive capex being deployed to monetize consumer and enterprise services.

That’s only part of the story.

You’re ignoring specialized systems like AlphaFold and GraphCast, along with other domain-specific models built on transformers, graph neural networks, and hybrid architectures. These are not chatbots. They operate in high-stakes scientific and operational environments and have already produced measurable, peer-reviewed impact.

Those systems demonstrate that transformer-era machine learning is not confined to autocomplete or marketing demos. It is advancing structural biology, weather prediction, materials science, and other technically demanding domains.

Yes, the current capital expenditure wave may contain speculative excess. That happens in every infrastructure buildout cycle. But bubbles can coexist with real technological shifts. Railroads had a bubble. The internet had a bubble. Both left durable infrastructure and transformed the economy.

Bill Donahue's avatar

"Hyper-focused"?? Gary's article is on skewering the LLM pumpers like Altman's insistence that AGI is about to arrive. Almost every day we see them insist that regular jobs will disappear within a year or two. And they've failed to achieve any of their predicted outcomes in their advance to AGI. Meanwhile, the vast investment in that folly - and the hidden spreading of its debt throughout financial markets and instruments - has no realistic financial model, and the house of cards that's being built is mind-bogglingly huge.

We're probably looking at a massive stock and economic crash, with vast financial losses to be experienced by almost everyone, and legal, regulatory, and environmental impacts of stranded infrastructure left abandoned everywhere.

So yes, that is what I think is far, far more important at this moment than the simple fact that AI has been enabling major technological advances with all kinds of knowledge and capacity benefits for decades and will continue to do so. Nobody - including Gary - is arguing otherwise.

Carl Mueller's avatar

Blaming “AI hype” for the entire capex cycle just feels hollow to me.

Most investors aren’t thinking, “AGI is next year, I better dump money into every AI startup I see.” They’re investing because there’s real value right now, mixed with some speculation about first-mover advantage on AI-powered capability. A large portion of capital is going into non-frontier startups and existing software companies, something like ~60%. These models are compressing workflows, writing code, accelerating research, automating support, analyzing documents. That’s tangible capability that startups and adaptive companies are integrating to deliver value today.

Do they cover 100% of workflows? No. Are there real reliability issues? Of course. But those gaps are being actively engineered around: guardrails, eval pipelines, retrieval layers, human-in-the-loop systems, better fine-tuning. This isn’t blind faith. It’s iterative engineering on systems that are already delivering utility.

A significant portion of this activity stands on its own, independent of AGI timelines. Yes, the frontier labs are making massive data center bets. And the model-as-a-service dependency creates real coupling… and real risk if some of those companies fail to justify the burn. That could be painful. But tens of billions are also flowing outside those labs into applied use cases, infrastructure, open source ecosystems, model optimization, and service-layer companies solving practical problems.

Every major technological shift creates a bubble phase. Railroads. Telecom. The internet. Investment and capex is how markets figure out where the durable value actually is. It’s messy. It can be brutal when it unwinds… but what’s the alternative? Sit on capital and wait for zero risk?

User adoption of ChatGPT and Anthropic products outpaced early Facebook and Google growth curves. That doesn’t happen purely because of narrative pumping. People are clearly finding real utility…even if the model-only companies aren’t profitable yet. And honestly, a huge percentage of users simply find these models entertaining. Entertainment alone is a massive market.

A correction is possible. Probably likely. But that doesn’t make the entire build-out folly. It just means capital is doing what capital has always done during platform shifts. Some players will collapse, but I doubt the underlying technology will.

And on Gary specifically… he seems consistently focused on the contrarian AGI narrative thread. To use a hypothetical analogy, it's as though he's arguing that the internet promised perfect decentralization, obsessing over where that fell short, and missing the massive value creation that happened anyway.

AGI is such an odd debate to center everything around. General intelligence compared to what? Humans? These systems are already superhuman in narrow domains. LLMs, search, specialized models…they outperform many humans at specific cognitive tasks every day. I have models that can take a well-structured codebase and make large, focused changes with impressive accuracy more often than not. Meanwhile Gary et al. are acting like the entire space is delusional because CEOs hype AGI on earnings calls.

Patricio Rodriguez's avatar

"We are still in the early stages of AI-mediated information retrieval"

1) Not really early, most likely in the middle towards the end.

2) So LLMs are just databases with natural language interface? Cool cool cool.

Carl Mueller's avatar

We’re very much in the early stages. How long have production LLMs been in service? Transformer-based LLMs did not see widespread adoption until roughly 2020. We’re barely five years into production LLM systems, and the change we’ve seen in their capabilities in that short time has been dramatic.

And LLMs are far more than just a natural language interface for a database. That reductionism is lazy and uninteresting.

Patricio Rodriguez's avatar

Hype is lazy and uninteresting as well.

Carl Mueller's avatar

Where’s my hype? I literally created a chain of comments saying the hype focus on AGI is overworked. The real focus should be on what these technologies can do now, how we mitigate their disruptive effects, and how to best leverage them in ways that maximize human productivity.

I can’t really discuss much with you if your default commentary is dismissiveness and a generally hostile stance towards these technologies. You’re responding on an AI-centric Substack author’s comment board.

Lizzy Whited's avatar

Dismissing obvious, insane improvements in the field as "hype" just because you don't like it is also lazy and uninteresting. I guess all the software engineers who say it does their whole job better than them and the elite physicists who admit it's basically as smart as them are just lying for hype.

Chase Ashley's avatar

Too much is being made of whether something is or is not AGI. Seems like a silly definitional argument, with no practical significance. What matters is understanding the strengths and weaknesses of the models and knowing what they should be used for and how to best use them and, at least as importantly, knowing what they shouldn't be used for and how they shouldn't be used.

Bill Donahue's avatar

Which is probably why more consumers use them to find recipes than anything else.

fwd's avatar

My proposed rebranding of “AI”: ACME (autocomplete made extravagant). It is a product prone to being misused and backfiring. Or, maybe MACE (Autocomplete Masquerading as Expertise)

M. E. Black's avatar

It's sort of shocking that Nature publishes software industry hype articles. No mention, of course, of the recent research done by Tencent that found that the latest and "greatest" models fail on 3-of-4 tasks even when given the instructions to perform them when the tasks are out of the training data (you would think a "general intelligence" would be able to reason their way into a higher score, but of course, these models don't "reason", they approximate the examples of reasoning in their training corpora). There's no attempt by these researchers, who are dazzled by the outputs when they're right, to interrogate why these models need ever increasing amounts of training data, or to just flatly examine how the models themselves work, that is, by probabilistically reconstructing the contents of their training data, token by token, using a big statistical equation that links tokens (not words! not concepts!) together.

There may be some automation benefits to the statistical approximation of training data, but the fact that the software is just performing probabilistic retrieval of the contents of the training corpora suggests that there are probably easier and more reliable ways to automate things. Programmers are frequently telling us of the failure modes of this software, for example, my co-workers losing several hours of work supervising an "agent" that ended up just deleting the whole project. We have not seen any measurable speed up in feature delivery or software shipping, but we are seeing developers' talents atrophy, and people being misled by errors in model generated summaries, which likely negatively impact productivity.

Xian's avatar

AGI = A Guy Inside

Ihor Gowda's avatar

New York Times on using AI to parse Epstein files:

Artificial intelligence helps. The technology allowed the team to build tools to parse the Epstein files in just a couple of days. “That would normally take engineering teams weeks to build,” said Dylan Freedman, an editor on our A.I. projects team. And Andrew Chavez, a newsroom engineer, helped with “semantic search,” which lets journalists hunt for concepts rather than matching exact language in a document. But A.I. isn’t perfect. In fact, it’s really bad at news judgment, Dylan said: “A.I. can be sloppy and make mistakes that are inexcusable in journalism. It’s super industrious but not super intelligent.”

Gerben Wierda's avatar

AGI is currently a big fat red herring. The effects of GenAI can be profound (uncertainties are mostly economic) without it being anything remotely 'intelligent'. From AI-slop to other cheap results that are nonetheless valuable for the user. I estimate we will not see AGI for a very long time.

AGI distracts us from the discussions we should have, like on impersonating humans, intellectual property rights, anti-democratic power concentrations, and much, much more.

Bill Donahue's avatar

The easiest way to convince low-knowledge people that you've achieved a goal or benchmark has always been to simply redefine the goal or benchmark to more closely fit with what you've actually accomplished.

It's as true in the promotion of LLMs as the path to AGI (and private equity investors) as it is within overwhelming government bureaucracies.

Blake Pelton's avatar

Great article. Small typo here: "SWe see no evidence".

Gary Marcus's avatar

tx, corrected in the online version

Toluwalope Opaleye's avatar

Happens to the best of us. Initially thought he meant Software Engineers then I jumped back to read it as ‘We…’

Jonathan Grudin's avatar

As others have noted, the discussion of AGI is increasingly irrelevant, and efforts to define AGI are not time well spent. The Nature article asked whether people have GI, obviously relevant, and then said it wouldn't pursue it. Does anyone run humans through the benchmark tests? Those who are fine assuming only cognitive intelligence is significant, not social, motivational, emotional intelligences, should still read the Kahneman & Tversky experiments that show simple rewording reliably leads people to contradictory responses. Charging me for doing X is interpreted differently than giving me a discount for not doing X.

A problem most of us have is jobs and other professional and family obligations that limit time to explore enough to develop a sense of where AI can and cannot quickly bootstrap a search without risks. Research professors won't find it useful in their area of expertise, or helping much with departmental committee work or conference organizing, submissions reviewing, and so on. For those in a position to explore with it a lot, it can become like an industrious but not very bright assistant whose strengths and weaknesses you understand. A tool in your toolbox. Then the question is, is it useful enough for enough people to pay enough for, especially given the lack of a moat? I don't think so, though I'm not following generative uses. For porn, scams, and maybe blowing up Caribbean boats where no one much cares if they were really smugglers, it can probably be profitable, and maybe for decent uses. Anthropic has thrown down a gauntlet. If all new Claude code is being written by AI, the next version of Claude will either knock our socks off, or they will say it did but it didn't.

khimru's avatar

It's true that not all humans have intelligence and, more importantly, not all humans know how to apply intelligence. But let me ask you a simple question: why do you think job interviews exist, and why do we have certificates, etc.? Does it have anything to do with the fact that, in spite of the issues with human intelligence and the problems with finding people who have it, our jobs require intelligence, specifically the jobs that have a chance to justify trillion-dollar spending?

Gerben Wierda's avatar

The Turing test is about the worst test there is for intelligence, because it relies on the difficulty of fooling humans, and that premise is problematic. Humans are extremely easy to fool. Even Eliza could do it.

lsgv's avatar

The thing is that as people in general and writers in particular, namely those of Nature, Wired and similar, get dumber and dumber by the hour, any machine will ultimately be smarter than them. I just saw a snail passing by with more cerebral activity than all those believing that statistical parrots are intelligent. Thanks anyhow for trying to explain the obvious.

John Konopka's avatar

“get dumber and dumber by the hour”

This was explained in the documentary “Idiocracy”.