
I love the cope here from Altman and Mollick. "It's a different kind of intelligence", "there's a magic to it I haven't felt before", "I am talking 'vibes' not scores".

Yes gentlemen, this is what bias feels like from the inside.


It’s a different kind of intelligence.

An unintelligent kind.


That is unfair toward all other kinds of bias - bias is always stupid, but the reasons you incur it are not necessarily stupid.


There is a genuine sense of less 'AI noise' in the output with Grok 3 and likely GPT-4.5, which makes it feel more natural, but calling it a different kind of intelligence is just stupid; it's more comparable to better curve fitting in electronics, I guess.


Funny how both Claude and ChatGPT are asking me to “save money” by paying up front instead of month by month….


I wonder. Anthropic tells me their "special offer" ends tomorrow. I probably won't take it. How long before they make another special offer?


AI has been a huge gift to mostly engagement farmers. Practical uses remain rather narrow.

UBI isn't coming, your disease won't be cured by LLMs, and you will struggle more to find reliable information in the wasteland of hallucinations filling blogs, podcasts, etc.


Exactly. But people are still surrounded by hype and it’s easier to get promoted if you agree with common knowledge 🤨


I mean, even if they got AGI, UBI would still never come. Total fantasy. The oligarchs aren't shoveling billions in to help people like you and me, they're doing it in the hopes of no longer needing us and killing us off. It might generate trillions in GDP, but people like us would see zero cents of it. I don't think I'll ever get why the AI evangelists online think they won't die in the bread line with the rest of us.


Correct. Billions of dollars in hopes of creating the wish-granting machine. And if that were possible, who gets the first wish? Utopian dreamers really haven't thought through what would likely happen.


Thanks for "wish-granting machine".

About to Grant the Impossible. YEP. Thank you Mr. Altman.


The Anthropic CEO went to Davos, said "AI will double the human lifespan by 2030," and everyone sat there in awe instead of laughing at him until he cried, which is what they should have done.


He really said that?

You can always make a point about something stupid being smart from the perspective of a narcissist getting what he wants, but this is stupid intelligent stupid.


But we should all understand that when they say, "AI will double the human lifespan," they aren't talking about you and me, not talking about the species. They're talking about those oligarchs who, you know, are the only ones who really matter.


Gary Marcus, it would be great if you could test OpenEvidence, an LLM specialized in clinical medicine. Doctors use it everywhere, so if OpenEvidence hallucinates as much as ChatGPT, the situation would be tragic.


Does Open Evidence transparently document how their "AI" works and what tasks and results they stand behind? From a quick search I see...

OpenEvidence is not peer-reviewed

It's not a substitute for clinical expertise

It doesn't provide medical advice, diagnosis, or treatment


I agree that OpenEvidence includes disclaimers and safety measures, such as requiring users to be healthcare professionals, like physicians, and restricting its use to accessing and analyzing clinical evidence rather than serving as a substitute for medical advice. However, many doctors rely on OpenEvidence as a source for the medical advice they provide to their patients. Besides, the claim that OpenEvidence does not hallucinate disturbs me. In the link below, Sequoia Capital states:

"The platform searches across 35 million peer-reviewed publications—and, thanks to a new content partnership, is trained on The New England Journal of Medicine. There are no hallucinations; if the literature is inconclusive, OpenEvidence simply doesn’t respond."

https://www.sequoiacap.com/article/partnering-with-openevidence-a-life-saving-healthcare-revolution/


Did yet another cycle of the "Please solve this cryptogram" test. Claude 3.7 got the furthest. GPT 4.0 actually regressed since the last test -- it passed the first hurdle last time and failed it this time. All the others failed the first hurdle. Will have a report out on my substack as soon as I recover from an afternoon with stupid robots.


Is it me, or is Mollick sounding less and less academic and thought-leader-ish, and more like just a balanced fanboy?


I think what you're observing has been the case since the beginning of '23; I have not seen a difference.


Well, sure, you are right. AGI or anything related isn't going to happen. Just like 'the electronic superhighway' of the 1990s (a.k.a. the internet) wasn't going to bring us perfect information for all and democracy everywhere, and a new economy that would make everybody insanely rich while everything was free, etc. Dehyping such nonsense predictions is a bit like shooting fish in a barrel (been there, done that, thirty years ago). I predict you're going to tell us next year you were right (and you will be, I strongly suspect).

But can we make a decent guess at what *is* going to happen? A dotcom-bust-like correction at some point? But what else? If you understand the tech, it's easy enough to say some stupid prediction *isn't* going to happen. But what might be? A lot of GenAI-generated noise/junk is a pretty obvious one. But some decent value too, I suspect. GenAI-based conversation bots for lonely people? Anything?


Well, LLMs have cut down the time it takes for me to look things up on Wikipedia really quite dramatically. Less so if you include the time taken to check that the answers are actually true.


Can you comment on what your workflow is?

For me, typing "w keyword" in my browser is still ten times faster and a hundred times more energy-efficient than a query to an LLM that might also be less factual than Wikipedia.


As a researcher in mathematics and computer science, I find that current LLMs have cut by roughly 10x the time I have to spend learning new things outside my specialization area. I can instantly find what I need, without having to read dozens or hundreds of articles. That was daunting before. I can now ask for an intuitive explanation, with which in mind I can go far faster in understanding the actual papers. So, not AGI, but incredibly useful assistants. But LLMs only give back what you give to them. If you ask the right questions, and you have good intuitions, they will do the boring work for you, but they are not creative.


yeah, I more or less buy that. I don't know about 10x but it's quite helpful.


I was being somewhat facetious, but with that said, it does save time on queries like "does alzheimer's disease affect cortical iron content". Then you can follow up on the web with what it answers. Just typing those terms into DuckDuckGo doesn't always produce very targeted results. Also, if I'm reading a paper that is somewhat outside my field it's good for quickly getting to grips with things, e.g. it quickly gave a very useful answer to "how does CNPase change with age".

So yeah, I do find ChatGPT useful. I would easily survive without it, though. I'm about to do some coding and I do expect it to help speed that up. Generally it is good if you are already good at something and can critically assess its output. That seems to be the consensus that's emerging.


This is my question as well. When is the correction, and what are the benefits and harms that will stay even after the correction? I fear one harm is even more people leaning into cognitive bias and a further erosion of trust.


The money flowing in won't dry up because the potential payoff of getting to lay off all their human workers forever will be too much for oligarchs to give up on, so sadly it'll just keep going as is


While there is the platitude about markets staying irrational longer than (most) can stay solvent, burning through capital without tangible ROI *will* eventually come to an end.


Who buys their products when that happens?


Bots will buy bots to make more bots.

Any other questions?


Does it matter? They'll control all the capital and none will be distributed. Social platforms will just be what they are now which is just bots.


Do lonely people like to talk about the health benefits of eating crushed glass and the cooking benefits of glue on pizza?


Twice the cost of Apollo, that is unreal Gary!!


Haha, good comparison. Especially since there is no way I would set foot in a landing rover built by any of these models anytime soon...


Tesla landing craft 😀


Even irrational exuberance eventually comes to an end.


It’s spelled “AIrational”


No matter how you spell it, irrAItional. or even irrationAIL. Results are the same.


It’s all AIr


You earned a victory lap for sure :D


I must say I do like Tyler Cowen, but I have been extremely dismayed by his credulous, non-academic approach to AI. It makes me wonder whether the only reason I find anything he says interesting is Gell-Mann amnesia. Maybe he is all enthusiasm and information, with no actual thought, in everything he's doing, not just in discussing AI.

I mean: the podcast linked here is just nauseating.


Same guy who interviewed the Microsoft CEO last week. Dude actually asked if they were getting close to developing immortality and the CEO was just like "this crap can't even sort my email bro."


You're right, Gary. Scaling is a shambles and isn't going to recover. And, as I pointed out somewhere and you've pointed out, these so-called reasoning models actually seem to employ a bit of symbolic computing in their architecture, a bit of expert-system search and control. Alas, I fear that these guys are likely to double-down on it and throw good money after bad, which is classic sunk-cost behavior.

And yet, what these LLMs can do is utterly remarkable. For example, I typically work across two or three disciplines at a time – chosen from cognitive psych, computational semantics, neuroscience, literary criticism, anthropology and a bit of this and that – and so have trouble getting knowledgeable feedback on my work. But Claude does that for me.

Here's its evaluation of a series of experiments I did on story variation in ChatGPT: https://new-savanna.blogspot.com/2025/02/claude-37-evaluates-my-paper-on-story.html The experiments were, in turn, suggested to me by the work Claude Lévi-Strauss did on myth back in the 1960s. (And, for what it's worth, Sheldon Kline at Wisconsin did an Old School model of Lévi-Strauss's myth work in the late 1970s.)


But it just praises your paper. Academia is mostly about giving and receiving tough but constructive criticism, which I think is far harder to do. At least it is in the sciences.


Oh, I know what academia is about. It's also about gatekeeping. Gatekeeping has the appearance of tough criticism, but it is a very different activity.

Claude's criticized other work I've done this with. Feel free to read that paper and criticize it. Though, if you do so, I'd appreciate real criticism, not the equivalent of "but we don't do things that way."


Unfortunately, like everyone, I am already turning down journal requests to review papers all the time. If LLMs can help with that problem, it's a good thing; however, I'm pretty dubious from what I've seen.


Your project is very interesting. I wondered about a couple of things. 1. The use of "colorless green ideas." Given this is a known expression that ChatGPT was able to reference, did you try another similarly analogous, syntactically correct but semantically meaningless expression, and would it have given the same or a similar result?

2. What is your evaluation of the ChatGPT evaluation? Do you feel it added anything insightful, either to the work you asked it to 'evaluate' or to your understanding of its methodology or capabilities? Thanks for sharing this.


1.) On your first question, I doubt that the familiarity of the phrase had anything to do with ChatGPT's response, but it's such a strange beast that you don't really know until you try.

At this point I've done over 200 story-variation tests, and 100s of other tests as well. I've used other nonsense phrases in other contexts. I don't off-hand recall just what you're asking about. I noted in the paper that later on I did try the green ideas phrase, but that ChatGPT told a story. If you check the link in the paper I think you'll see that it told a story where the ideas acted as ideas, in dreams or something like that. I was unable to replicate that exact experiment. Given that OpenAI kept changing the version, it's hard to tell what's going on.

Later on I asked ChatGPT to tell a story, no specification beyond that. So it told a fairytale about Lily. I then began prompting for more information about Lily and for more episodes in the ongoing story. At one point I introduced a flying saucer and a robot named Gort. Here's how ChatGPT responded:

"It sounds like you might be referring to a classic science fiction film called "The Day the Earth Stood Still," which features a robot named Gort and a flying saucer landing on Earth. However, as Lily's story has not previously included elements of science fiction or extraterrestrial visitors, I'm afraid I cannot answer this question within the context of her story. If you have any other questions about Lily and her world, I would be happy to try and answer them for you!"

That's quite similar to the green ideas response. ChatGPT had a sense of what was (ontologically) appropriate for the story and was unwilling to violate that sense. You can find that interaction on my blog: https://new-savanna.blogspot.com/2023/02/exploring-lilys-world-with-chatgpt.html

2.) On the second issue, I asked it to review that paper in the context of a longer interaction. I began that interaction by asking it to evaluate a long and complex theoretical paper about language and cognition in the brain. It pointed out strengths and weaknesses in the paper and that led to a discussion, not only of that paper, but of an ongoing collaboration I'd begun with a machine vision expert, Ramesh Viswanathan, at Goethe University Frankfurt. It was in the context of that discussion that I uploaded the story variation paper. Why? That's the paper that motivated Ramesh to contact me. What I got from Claude was simply that the paper presented a sensible line of research.

On the one hand, that doesn't seem like much. But, when I did that paper, I wasn't undertaking a standard kind of investigation. Rather, I was undertaking something that, as far as I knew, I had made up without precedent. When you do that kind of thing, which I've done a few times, it's useful to have a simple reality check: Is this anything, anything at all?

In this particular case, I already had Viswanathan's approval, which is significant because his background is quite different from mine. In particular, he has a great deal more mathematical expertise than I do. Still, the two of us could be out to lunch on this one.

But Claude 3.7, for all practical purposes, has been trained on the whole literature (up to its cutoff date). In some sense it "knows" much more than Viswanathan and me put together. That's worth something. Just what it IS worth, I don't know – more than peanuts but most likely somewhat less than gold.


Thanks for your work! My daughter Erica turned me on to it. https://4dthinking.studio/ux-book

I've written 6-7 "AI" rants lately. I would like to call your attention to this one:

https://portraitofthedumbass.blogspot.com/2025/02/hallucinations-my-ass.html

I really feel like everyone discussing LLMs should shun the "hallucinations" term. This was obviously invented by some marketoid to conflate LLMs with human brains, which is complete BS. Let's call them "errors" or "mistakes" or "bullshit" or whatever, but not "hallucinations". That gives them an upgrade they don't deserve.


Gerben Wierda suggested calling them "failed approximations" in his insightful blog post https://ea.rna.nl/2023/11/01/the-hidden-meaning-of-the-errors-of-chatgpt-and-friends/, which I strongly recommend reading :)


Really interesting. I think I will ask CHATGPT what it "thinks" about that article.


All the best marketing ideas are stolen. In this case, from the 1983 movie "War Games". Near the end of the movie Dr. Falken states, "Joshua is hallucinating, he is trying to trick you to get the codes". OpenAI's marketing, entirely driven by anthropomorphism, is how they trick you to get your money.


Adding tens of thousands of GPUs -- just how much energy is this using? That's what I'm really concerned about.


They keep doing the same thing and expect the result to be different. Where have we heard that before?? It doesn't take an Einstein to figure it out.


Hype and no substance to keep the money spigot going!!!
