92 Comments
Dakara:

LOL, those graphs. This is what we call vibe-charting. It is how you report your vibe-coding progress to your manager.

Amy A:

You have to wonder about the ego required to put up such sloppy graphs.

Antonio Eleuteri:

Saul Goodman might say "flexible morals" (where "morals" in this case is "scientific integrity")

Larry Jewett:

BrAIking Bad

Antonio Eleuteri:

My keyboard thanks you for the coffee I spilled on it :D

Larry Jewett:

To say nothing of the intelligence.

Stephen Harrison:

The misleading graphs are like the samples provided in middle-school textbooks about how to spot misleading graphs. This is a huge presentation; there's no way they weren't deliberately included to... mislead.

Chris Davis:

Yeah, hard to imagine no editing took place. This was either a yolo/screw-it move (bad) or an attempt to pull a fast one (worse).

hugh:

Some might even say that OpenAI is “hitting a wall”

PH:

Their models hit a wall, but their hype generation has not. And so the story continues.

Though I think even a master talent like Sam Altman will hit limits there soon. After all, he announced GPT-5 as the Death Star of LLMs. There really is not much beyond that in popular culture. Perhaps a Borg cube?

Fabian Transchel:

"Gary Marcus thinks that perhaps scaling is not in fact all you need."

This made me laugh way harder than I should have. What a shit show AI has become.

RMS:

This is pathetic. An undergraduate who drew these up would be laughed out of the room. F-

P Szymkowiak:

As a tangential aside (and I'm NOT intending to anthropomorphise): this is one of my go-to fallible/heuristic framing models when dealing with LLM/GPT-based chat tools.

I frame my questions - and interpret the responses - as if I'm dealing with a very well-read, easily-excited undergraduate who is yet to leave academia, and has no real-world practical experience, yet shows potential as a research assistant.

Kenneth Burchfiel:

Thanks for the hot take! "Fan will still find something to rejoince in" should be corrected to "Fans will still find something to rejoice in."

Meanwhile, MSFT and the S&P 500 are both down today. Seems the markets aren't exactly 'rejoincing' over this new release.

Meefburger:

> which left the livestream looking like marketing rather than science.

If you watch release livestreams expecting science, you're gonna have a bad time.

If you want something closer to science, the model card is here: https://cdn.openai.com/pdf/8124a3ce-ab78-4f06-96eb-49ea29ffb52f/gpt5-system-card-aug7.pdf

You of course still need to keep in mind all of the corporate incentives while reading that; it's not the same as a journal article.

Robert Kraybill:

Looks like they used GPT-5 to annotate the graphs.

Mike Video:

Gary Marcus, right again.

Jan Steen:

With apologies to Upton Sinclair: "It is difficult to get a man to understand that we are still far from AGI, when his salary depends on his not understanding it."

Ian [redacted]:

I do better at my job than a billion-dollar company does at its super-important press events to the whole world? I should vibe myself into a job there!

P Szymkowiak:

To be fair, word is that all the good people took a billion dollars to work elsewhere, so OpenAI is likely relying on Mechanical Turk...

George Burch:

Code deception percentage rate. Imagine that. Not displayed as linear but on a Möbius surface where 50 looks considerably lower. At least now we have statistics instead of damned lies.

TheAISlop:

Feeling the SORA. Just waiting for Google to pull away comfortably and bring the next big idea.

Dan:

This is the nittiest nit to ever nit, but I believe “And 52.8 is now less than 69.1??” should be “more” instead of “less.”

I wouldn’t even mention it if the post weren’t itself an illustration of typos.

Amy A:

Right, typos in a hot take are okay. Completely wrong graphs in a massive presentation to announce a product you’ve spent 3 years building ought to be unacceptable.

Stephen Harrison:

It's almost as if they're meant to be misleading to an unsophisticated group of public users. Unless someone is suggesting that OpenAI employees are too dumb to notice the extremely obvious errors in graphs for a major presentation.

RMS:

too dishonest

Stephen Harrison:

Yes, who could have guessed that someone like Sam Altman, who's been fired from multiple jobs for lying to the board and has been caught faking demos and benchmarks, would be misleading about the capabilities of his products.

David in Tokyo:

I am, once again, reminded of another Stanford dropout.

Who is in jail.

Larry Jewett:

Nothing is unacceptable when it comes to AI.

"Unacceptable" implies standards.

P Szymkowiak:

As the Fanbois say, it's not a problem: it's "hallucination-by-design".

Matt Kolbuc:

Of course it doesn't work; it's still based on the transformer architecture.

This technology is fundamentally flawed. It's cool tech, but you can't actually use it for anything. Why we're still going down this path eludes me.

By my calculations, about $800 billion has been spent on gen AI in total, and these idiots have the audacity to say they're trying to better humanity. If you want to better humanity, how about instead of building out your massive data centers, you go feed some poor kids? Idiots.

P Szymkowiak:

You highlight some reasonable points, Matt - particularly the ludicrous ongoing investments to polish an already-well-polished, plateaued technique.

To be clear: I do agree with the gist of your point that LLM/GPT-based systems, in and of themselves, are terrible as a complete approach to an AGI system.

However, I don't agree with your statement "This technology is fundamentally flawed. It's cool tech, but you can't actually use it for anything."

Rather, I think it's "This technology is fundamentally flawed for what it's currently being focused on: it cannot be AGI in and of itself."

I agree that it is cool tech, and I think LLMs/GPTs *CAN* have an ongoing role in a more specific capacity, for both input processing and output forming: essentially as intelligent language engines at the interface between humans and AI systems. Both of those are useful *as component parts* of a larger heterogeneous AGI system, along with world-models for different speciality problem-solution domains.
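
A minimal sketch of that composition idea, for the curious. Every function here is a hypothetical stub standing in for a real component; none of this is an actual API:

    # Hypothetical sketch: LLMs only at the language edges,
    # with a domain world-model doing the reasoning in the middle.
    # All three components are illustrative stubs, not real APIs.

    def llm_parse(user_text: str) -> dict:
        """Stand-in for an LLM turning free text into a structured query."""
        return {"task": "route_planning", "constraints": user_text}

    def world_model_solve(query: dict) -> dict:
        """Stand-in for a domain-specific solver with hard guarantees."""
        return {"plan": ["step 1", "step 2"], "verified": True}

    def llm_render(result: dict) -> str:
        """Stand-in for an LLM turning a verified result back into prose."""
        return f"Here is a verified plan: {result['plan']}"

    def answer(user_text: str) -> str:
        # Input processing and output forming are the LLM's job;
        # the world-model owns the actual problem solving.
        return llm_render(world_model_solve(llm_parse(user_text)))

    print(answer("Plan a two-step route under my constraints."))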

George Burch:

Automating the building of very large ontologies and semantically classifying web content at the sentence level serves as an intermediary between LLM and human; we use an encyclopedia as the interchange. Thanks for the like. George

http://aicyc.org/2025/07/26/aicyc-an-encyclopedia-for-llm/

P Szymkowiak:

There might be a place for SAM / AICYC-type solutions; however - at least as demonstrated on the website link you provided - from my perspective, this isn't targeting a fundamental problem / need.

In my view, the more general challenge isn't auto-finding refined definitions or specialised sub-categories around a specific topic based on traditional classification decomposition. Two more-important and fundamental problems are: 1) having access to knowledge graphs - maybe simple ontologies - around fundamental / foundational sets of inter-related concepts and categories in a subject area or domain, showing the important properties and relations *between them* (not *within each independent member* of the set); and 2) understanding clearly the contextual constraints from which a question or request is being framed.

These more-significant problems are in large part a graph problem / a network problem: not a tightly constrained set of hierarchical ontologies *within* a specific topic / member. We've had those for years, and - although they *can* be useful for some specific problem spaces - they're less generally useful, especially if your need is to create a useful generalised tool for problem solving.

Those more-significant problems - solved in a way that provides a reliable, trustable, repeatable set of responses - require a) domain-specific / domain-tuned models that anchor on and reflect real and important concepts and constraints, and b) tuning for the localised temporal-spatial and specialised contextual factors of the requestor.

I don't believe LLM/GPT-based systems can resolve the core challenges inherent in those problems - in and of themselves - in a reliable, repeatable, trustable way.

George Burch:

I agree, and that is why we export our knowledge graph in W3C RDF.

Here is a directory of the KG with 5 million concept nodes (articles).

https://aicyc.wordpress.com/wp-content/uploads/2025/07/aicyc-knowledge-domain-directory.pdf
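
For readers wondering what such an export looks like in practice, here is a minimal sketch using Python's rdflib. The EX namespace and the two concept nodes are illustrative placeholders, not AICYC's actual schema:

    # Minimal sketch: serializing a tiny knowledge graph as W3C RDF
    # with rdflib. The EX namespace and nodes below are hypothetical,
    # not AICYC's real vocabulary.
    from rdflib import Graph, Literal, Namespace, RDF, RDFS

    EX = Namespace("http://example.org/kg/")  # placeholder namespace

    g = Graph()
    g.bind("ex", EX)

    # Two concept nodes and a typed relation *between* them.
    llm = EX["LargeLanguageModel"]
    kg = EX["KnowledgeGraph"]

    g.add((llm, RDF.type, EX.Concept))
    g.add((kg, RDF.type, EX.Concept))
    g.add((llm, RDFS.label, Literal("Large language model")))
    g.add((kg, RDFS.label, Literal("Knowledge graph")))
    g.add((llm, EX.groundedBy, kg))  # edge between the two concepts

    # Turtle is one of the standard W3C RDF serializations.
    print(g.serialize(format="turtle"))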
