48 Comments

I advise caution against buying into the hype created by OpenAI. They have repeatedly made claims that have later been discredited by independent research.

There are also significant issues with AI benchmarks, particularly given that the training data is neither shared nor independently verifiable, leaving them open to manipulation.

This appears to be yet another case of what resembles p-hacking – manipulating data to produce results that align with one’s interests.

Consider this excerpt from a tweet:

“Remember o3’s 25% performance on the FrontierMath benchmark?

It turns out OpenAI funded FrontierMath and had access to a substantial portion of the dataset.

The mathematicians who created the problems and solutions for the benchmark were not informed that OpenAI was behind the funding or had access to the data.

This raises critical concerns:

• It is unclear whether OpenAI trained o3 on the benchmark, making the reported results questionable.

• Mathematicians, some of whom are sceptical of OpenAI and would not wish to aid general AI advancements due to concerns about existential risks, were misled. Many were unaware that a major AI company funded the effort.

From Epoch AI:

‘Our contract specifically prevented us from disclosing information about the funding source and the fact that OpenAI has data access to much but not all of the dataset.’

https://x.com/mihonarium/status/1880944026603376865?s=46&t=oOBUJrzyp7su26EMi3D4XQ

The whole “benchmarking” process is very unscientific and completely unreliable.

The people involved need to take some very basic science classes to learn what science is about.

No self respecting, legitimate scientific organization would ever have agreed to such a contract to begin with.

Sam Altman: “guys, please stop believing the hype I’ve played a huge hand in creating.” 🙄 I feel like Sam Altman’s corporate title should be “Chief of AI Hype,” because that seems to be his whole job.

Also, Gary you’re the best.

So what does "we know how to build AGI" mean, pray? A Sutskever-style answer like: we ask GPT4-3o to build it for us and give it unlimited time and resources. Anything that doesn't involve magical or alchemical thinking? Is *anyone* able to put the screws on so he'll answer what 'we know how' means?

Not sure what “we know how to build AGI” means but I think we can be fairly confident it doesn’t mean Sam does.

Most likely it means “Give us some more money”

All an LLM is and can ever be is a cheaternet. It looks smart by globally peeking over people's shoulders, copying their answers, and then pretending it came up with those answers all by itself.

This needs to stop.

Such a clown show!

So bored by this bait-and-switch.

(On the other hand, I'm confident we'll end up on the better side of this with more instead of less skepticism.)

He’s been giving Holmes vibes for a while. The whole refusing to show the details bit is unscientific and, as the kids say, sus.

Trade secrets are one thing if you start off as a for profit, and quite another when you claim to be saving humanity. 🤦🏼‍♀️

This feels like definitional gaslighting.

Thanks again Gary for being the voice of sanity countering the shitstorm that Substack Notes has become.

I've been thinking about an apt analogy for current AI. I've landed on human muscle memory, except exceptionally better muscle memory. Just like a human can blurt out words of a song after listening to it over and over again, or automatically swing a tennis racket the right way after hours and hours of practice, or chess grandmasters just play opening moves by memory, AI is good at that. The problem is we are trying to now make muscle memory produce useful things. AI is failing exactly the same way human muscle memory would fail if it were applied this way. Imagine a bank clerk using muscle memory to process your transaction.

I’d have no problem if the bank teller was using the muscle memory he had developed giving billionaires cash to process my transaction.

You would want them to pay attention and use their intelligence rather than mindlessly going through rote steps.

Altman is making a mockery out of the technology his company pioneered. Can you imagine the CEO of IBM, or Microsoft, or Oracle doing this? No. If those guys did it, their board would fire them.

I prefer the term hypesters to influencers. That way critics aren't lumped in with them.

Imagine a world where AI research could advance without the extreme distorting effects of self-promotion. Imagine where we (would not) be if Feynman, Von Neumann, Cantor, Curie and the rest had been such shameless hucksters.

So, does this mean that we now need to recalibrate the delivery of AGI to the end of this century or even the next?

No. It won't take a century. I think the approach of learning a lot from the world from data is in fact the right starting point.

OpenAI's products are also moving from just rehashing stuff to multi-step logic where the AI tries many strategies and evaluates itself until it solves the problem.

What is needed is for AI to have a better understanding of what it is dealing with and a better feedback loop. Likely more sophisticated representations too.

I'd say another 5-10 years.

Dario Amodei is now making claims that can be interpreted as AGI or even ASI coming soon.

What is your opinion on this? Dario seems much more credible than Altman regarding hype, and he has been involved in actual research.
