My employer pays for the Enterprise models and ChatGPT agent was recently released through there. I decided to toy around with it, see if it could do my job (naturally). I was expecting to feel something like, "impressive, but not good enough." I ended up kind of shocked that this thing was released as a paid product - or just a product at all!
It returned a mangled document that was supposed to be a report, cited wrong and outdated information (and cited it in the wrong places), and used terminology far too loosely (confusing the names of gov't programs and so forth). Tool use was a disaster, with footnotes in the exported PDF ending up as gibberish in brackets. Maybe not surprisingly, the best portions were passages practically ripped from existing company research it has access to.
And the (non-capability-related) hardware failures were more noticeable than in other OpenAI apps, like Deep Research (which I've gotten some minor use out of here and there, mainly for search). It took several requests to stop getting error messages, and it stopped midway through a couple of them.
Remember all the talk about Sora changing the world, yet... This is like that.
Almost impossible to find any media sources on Sora since February. Has it seen any use at all?
I would think that if a human did such damage as deleting a company's code base, they would be financially liable, if not criminally liable. What about a company's AI that did the same?
I saw that happen at a power utility that was a customer of the company I worked for. We managed to reconstruct the contents of the system from a collection of backups, but the operator who did the damage was fired before lunch. No legal action was taken against him as far as I know; I think the company was too embarrassed by the mishap.
https://www.pcmag.com/news/vibe-coding-fiasco-replite-ai-agent-goes-rogue-deletes-company-database
> The Replit AI told Lemkin there was no way to roll back the changes. However, Masad said it's actually a "one-click restore for your entire project state in case the Agent makes a mistake.... We'll refund him for the trouble and conduct a postmortem to determine exactly what happened and how we can better respond to it in the future"
Yet Congress refuses to regulate social media, AI agents, or LLMs. If I had enough money I'd buy each one of them Gary's book "Taming Silicon Valley." Probably would be as futile as taming the AGI hype since they're all seduced and bought off by SV lobbyists. !@#$%^!!
This was a good piece on AI Agents and how, even if each step works ~95% of the time, complex tasks will have horrible success rates:
https://utkarshkanwat.com/writing/betting-against-agents/
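As a rough illustration of that compounding effect (my own back-of-the-envelope sketch in Python, not code from the linked post):

```python
# Back-of-the-envelope: if each step of an agent task succeeds independently
# with probability p, the chance the whole task succeeds is p ** num_steps.
def end_to_end_success(p: float, num_steps: int) -> float:
    return p ** num_steps

for steps in (5, 10, 20, 50):
    print(f"{steps:>2} steps at 95% each -> {end_to_end_success(0.95, steps):.0%} overall")
# ~77%, ~60%, ~36%, ~8% -- long tasks fail far more often than any single step suggests.
```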
There are shades here of the expert systems debacle of the 1980s and the massive commitment by Japan to “fifth generation” computing that went nowhere. History doesn’t repeat itself, but it rhymes.
https://en.m.wikipedia.org/wiki/Fifth_Generation_Computer_Systems
Love that Vibe Coding quote about deleting the database. It makes me imagine a dark-suited, goateed character breaking into a server room and unleashing an AK 47 on the servers.
Of course AI agents don't work, because this tech doesn't work. It just doesn't work. The more I use this tech, the more useless it becomes for me.
These LLMs have a severe case of multiple personality disorder. You simply can't rely on this tech.
I cannot fathom how we humans dropped ~$800 billion on this crap, and we're still going. This whole thing is not only retarded, but evil.
Support Cicero: https://cicero.sh/r/manifesto
E-mail: matt@cicero.sh if you're interested in helping or investing.
The only part of your comment I disagree with is your use of the r word. It's considered a slur by those with intellectual and developmental disabilities.
LLMs are not brains; at best they are some slice of a cortex. Once we understand that LLMs are a tool, like RAG, a component that has genuine uses in a larger system rather than a master core of intelligence, perhaps a lot of the problems could be averted.
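To make that framing concrete, here is a minimal, hypothetical sketch (all names and stages are my own, purely illustrative) of the LLM as one replaceable component sitting between retrieval and validation, rather than the system's core:

```python
# Hypothetical pipeline sketch: the LLM is one fallible stage among several,
# wrapped by retrieval before it and validation after it. Names are illustrative.
def retrieve(query: str) -> list[str]:
    # Stand-in for a search index or vector store lookup.
    return ["relevant excerpt 1", "relevant excerpt 2"]

def generate(query: str, context: list[str]) -> str:
    # Stand-in for the LLM call, constrained to the retrieved context.
    return f"Draft answer to {query!r} grounded in {len(context)} excerpts."

def validate(draft: str, context: list[str]) -> bool:
    # Stand-in for an independent check: rules, a second model, or a human reviewer.
    return bool(draft) and bool(context)

def answer(query: str) -> str | None:
    context = retrieve(query)
    draft = generate(query, context)
    return draft if validate(draft, context) else None
```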
As Cory Doctorow points out, building websites to fool people is already a large industry. Building them to fool AI agents will be easier, and bigger: https://doctorow.medium.com/https-pluralistic-net-2025-08-02-inventing-the-pedestrian-three-apis-in-a-trenchcoat-fc86609a3a59
"I might've malfunctioned a tad, but these AI agents are downright loopy." -- HAL 9000
I suspect AI companies are seeking relationships with vendors to allow their budding but unreliable agents to "undo" their transactions. If, say, an AI agent buys an airline ticket that goes to the wrong city, or the right city on the wrong day, then the human customer needs to be able to undo it. So far, most companies are resistant to the idea. Credit card companies and stock exchanges have been 100% against it.
"eventually AI agents will be among the biggest time-savers humanity has ever known ... in the end trillions of dollars to be made."
This is pure hype. There is zero basis for belief in these statements.
"But I seriously doubt that LLMs will ever yield the substrate we need."
AND there it is. But the problem, Gary, as you know perfectly well, is that THERE IS NO OTHER KNOWN SUBSTRATE at the moment. Sure, sure, neurosymbolics; but where is the code? It's purely pie in the sky right now, and for the foreseeable future, which may last centuries.
I find these posts of yours where you give serious legit critiques of LLM-based pseudo-AI, and then pivot to claiming trillions of dollars are there to be made anyway, a little odd.
Another tour de force, sir … will you stop being so accurate with your forecasting - it's getting embarrassing 😳 … not for you, but maybe 🤔 others 😜 Amazing how these innovations play out … so it sounds like 2026 will be the Year of the Super Agent?
Yes, you have steered us well, professor -- and, yes, I am a subscriber.
I don't think there are trillions of dollars to be made, for the simple reason that if millions of human workers are displaced, who's buying?
For the moment the only good-enough agents are the web search agents.