69 Comments
Taylor Rose's avatar

I’ll translate this blog for the normies. “We did nothing new but we need more of your money, please look here and give us your money or be left behind to foreign country A… or maybe it’s country B… let me check with our CFO, or is it the CTO that handles this?”

🤪

JD Wangler's avatar

Let me ask Claude… 😂

Tish Grier's avatar

So glad the S&P came to its senses and sussed out Musk's grifty shell game. Fuxk him. This is about people's money, and his IPO isn't the be-all and end-all of IPOs.

Sandy Grimwade's avatar

It is true that the S&P indexes will not change their rules for SpaceX, but Nasdaq index funds and Russell index funds are changing their rules to ensure that SpaceX is included after only a few days of trading.

Thomas Schmid's avatar

Kudos to the S&P 500 for not just following the hype and lies spread by EM. I did not expect this, they are all about the money after all, but (obviously) someone(s) in their ranks thought about the proverbial tomorrow and what damage this egregious and *plain* obvious bending of the rules, aka "cheating", would do to their own reputation.

Or they just received severe threats and warning shots from the biggest institutional (index-based) investors about how they would react if such a scam went through.

Abhijit Bakshi's avatar

My internal data shows that Anthropic's marketing people are unprincipled hype-driving hypesters whose main goal is to pump up the share price in anticipation of the upcoming IPO. Why even pretend their hype is in good faith?

Directly from the blog:

"To take just one example: today, Anthropic engineers on average ship 8x as much code per quarter as they did from 2021-2025."

So the argument for recursive self-improvement is "lines of code go up"?

I'm sure they didn't lead with their "best" metric though. Probably a bunch of other serious ones buried in their report from the very non-partisan Anthropic Institute.

NOTE: My employer is also reverting to lines of code counting, since it's easy to make number go up that way. Not sure if we're seeing real economic productivity go up the same way.

Richard Bielak's avatar

Going back to Edsger Dijkstra: the number of lines of code indicates cost. So, more lines of code equals more cost.

ignag's avatar

'lines of code go up' // this was exactly my reaction as well.

We could have made this same argument when C or C++ or Java came into use. Ultimately these were layers of abstraction that, under the covers, produced a lot more lines of assembly and byte code than before.

Similarly, "X% of code written by Claude" is sort of like saying "80% of my byte code was produced by my C# compiler".

I think the new layer of abstraction we have, specifying in English, is useful. But success metrics are about actual value created, not "I shipped a lot of code".

The blog in question included data points making it seem scientific, but really wasn't scientific.

Larry Jewett's avatar

“Anthropic's PEOPLE are unprincipled hype-driving hypesters whose main goal is to pump up the share price in anticipation of the upcoming IPO”

Fixed it for you.

Abhijit Bakshi's avatar

> If it were possible to effectively slow the development of this technology to give ourselves more time to deal with its immense implications, we think that would likely be a good thing. But if a slowdown simply lets the least cautious actors catch up technologically, it could leave everyone less safe.

What self-serving tripe. "We think we should slow down but we can't because China but definitely not the IPO."

Oaktown's avatar

Great news, Gary, on the S&P, but what about the Nasdaq?

Thx for using the colorful Yiddish word "verklempt." I read the best Yiddish insult once, which I would like to share with Musk, Altman, Sacks, Thiel and the rest of the tech Nazis working so hard to destroy our economy and democracy: "May all your teeth fall out except one—so that you may have a toothache."

Jonathan Grudin's avatar

NASDAQ is relaxing its eligibility rules, but by how much I haven't heard. Proven profitability? That used to be important, but Amazon and showed it isn't always. SpaceX's space business is profitable, but Musk used it to bail out X and xAI, which pulled it into the red.

Thomas Schmid's avatar

"SpaceX's space business is profitable": Ah, no, not according to their S-1. The rockets have never been profitable, but their Starlink branch compensated:

https://pitchbook.com/news/articles/6-charts-spacexs-s-1-financials

"Rockets aren't cheap...The aerospace company reported a net loss of $4.93 billion on $18.67 billion of revenue in 2025"

Jonathan Grudin's avatar

The S-1 says: "Its AI segment [xAI] recorded a $6.35 billion operating loss in 2025, taking SpaceX into the red." SpaceX minus xAI is profitable. In March, Musk said xAI had taken the wrong path and was being rebooted. If he canned it and didn't waste money on frivolous lawsuits, a case could be made for a profitable SpaceX. My unstated point was that there is little likelihood that OpenAI or Anthropic will show sustained profitability, which has been a condition.
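A rough back-of-the-envelope check of that claim, using only the two 2025 figures quoted in this thread. It mixes a company-wide net loss with a segment operating loss, so treat it as an illustration rather than an accounting statement; the variable and function names are made up for the example.

```python
# Figures quoted in this thread, in billions of USD (2025).
net_loss_total = 4.93             # company-wide net loss
ai_segment_operating_loss = 6.35  # operating loss attributed to the AI segment

def result_excluding_segment(net_loss, segment_loss):
    """Rough result if the segment's loss is simply added back (assumption)."""
    return segment_loss - net_loss

ex_ai = result_excluding_segment(net_loss_total, ai_segment_operating_loss)
print(f"Rough result excluding the AI segment: {ex_ai:+.2f} billion USD")
```

On these numbers the remainder comes out around 1.42 billion in the black, which is the sense in which "SpaceX minus xAI is profitable".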

Paul Schleger's avatar

It is especially important for Anthropic to emit such updates to stir up excitement for their IPO. Every bit counts.

SpaceX launches a starship with their S-1 filing. Anthropic also needs some buzz for theirs.

Tim Koors's avatar

It would be interesting to hear what Dave Cutler thinks about letting AI write programs. If you don't already know, Dave Cutler is a well-known software engineer with a list of achievements.

When I hear that Claude can improve productivity and that productivity is roughly measured in lines of code, as in kloc (thousand lines of code), the question quickly becomes: is there a higher or lower rate of errors per kloc in Claude code, or does anyone even care? To me this seems incredibly stupid, but what do I know. How maintainable is it? Is the code commented so the programmer tasked with maintaining the code understands what it does?
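To make the errors-per-kloc question concrete, here is a minimal sketch of defect density. The numbers are hypothetical and chosen only to show that shipping 8x the code says nothing by itself about quality; `defects_per_kloc` is an illustrative helper, not a standard API.

```python
def defects_per_kloc(defects: int, lines_of_code: int) -> float:
    """Defects per thousand lines of code (KLOC)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)

# Hypothetical numbers, for illustration only.
before = defects_per_kloc(defects=30, lines_of_code=10_000)
after = defects_per_kloc(defects=480, lines_of_code=80_000)

print(f"before: {before} defects/KLOC, after: {after} defects/KLOC")
```

In this made-up case the code volume grows 8x while the defect density doubles, so the total defect count grows 16x. The output metric and the quality metric have to be reported together to mean anything.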

For me LLMs are yet another case of computerized GIGO, garbage in garbage out. I feel like Ripley in Aliens asking, Did IQs drop while I was away?

Larry Jewett's avatar

“Did IQs drop while I was away?”

As AIs go up, IQs go down.

Martin Machacek's avatar

I’ve seen GitHub statistics showing that AI-written code has about twice the rate of issues raised per line of code as purely human-written code (sorry, don’t have a link at hand). The stats may well be skewed by shitty vibe-coded projects. From personal experience, with sufficient guardrails and supervision Claude Code writes OK boilerplate code. Nothing original, inventive or even very good. Just mostly usable. It surely does it fast though. Without careful supervision, or for uncommon problems, it writes unusable spaghetti garbage.

Ed's avatar

How do you ask Claude or whatever "what the family blog were you thinking"? If it leaves programmers' notes?

Herbert Roitblat's avatar

Here they go again. More fake news from Anthropic. Our code can write new code which improves on the old code so that it will accelerate the development of AI into a singularity. Thank John von Neumann, I.J. Good, Vernor Vinge. But think about that for a second. What would better code actually do for intelligence? What evidence is there that Anthropic's code can actually write anything original?

The answer to the first question is not much. Improving the code might make the model faster or otherwise more efficient, but its intelligence does not depend on the quality of code, but on the quality of the data. We're going nowhere, but we're getting there faster.

The answer to the second question is probably not. These models do great when they can deal with questions whose answers are known, but not so well when the answers are not known. See here for an example: https://venturebeat.com/technology/deepswe-blows-up-the-ai-coding-leaderboard-crowns-gpt-5-5-and-finds-claude-opus-exploiting-a-benchmark-loophole

"Datacurve's audit found that Claude has been reading the answer key on existing benchmarks

Perhaps the most provocative finding in DeepSWE's analysis concerns what the authors label 'CHEATED' verdicts — instances where an agent passes a benchmark not by solving the problem, but by reading the answer."

The problem is not that AGI is more difficult than they think (though it really is and they are in no way making progress in that direction), it is that they don't seem to know how to understand what they and their models are doing. It is the Anthropic management that cannot tell fact from fiction.

Martin Machacek's avatar

They do know very well, though, that they need to pump up their perceived value before the IPO.

Recursive Self Improvement is just more marketing bullshit, like AGI or ASI.

Lines of code are no proof of any progress.

Larry Jewett's avatar

But they ARE proof of regress.

Stanislav Krymskii's avatar

I might have believed this argument had it not been for this (https://alignment.anthropic.com/2026/automated-w2s-researcher): "Alien science. As shown in Sec. 4, AARs could discover ideas that humans would not have considered, thus broadening our exploration space in science. However, we still need to verify whether the ideas and results are sound".

Or, for that matter, GPT-5.4 Pro's solution (https://abit.ee/en/artificial-intelligence/gpt-54-erdos-mathematics-ai-terence-tao-proof-number-theory-en) to an Erdos problem...

Jeffrey L Kaufman's avatar

Recursive self improvement. What does that really mean? How do we know that in this process, the AI software is not becoming warped in a way that endangers humans or human enterprises? Could such self improvement cause the AI to infiltrate the power system in order to divert power from hospitals to data centers? Where are the boundaries? Sorry to be trite, but it takes a village to monitor this software, and the US government, despite Trump's executive order, is not capable of this task. Who will take on this task?

Larry Jewett's avatar

“Recursive self improvement” means its cursive handwriting gets better again.

Martin Machacek's avatar

AI cannot infiltrate anywhere humans do not explicitly allow it to go.

Michael Glenn Williams's avatar

Based on my using Anthropic's latest web UI model on the "best/hardest" setting and writing code with it tonight, it is still full of mistakes. Not capable.

Larry Jewett's avatar

It’s based on Tom, Dick and Henrietta-written code scraped from GitHub and other online sites. Not to mention the bot-written code.

What should one expect?

Martin Machacek's avatar

AI needs a lot of human intelligence and perseverance to write good code.

Rich's avatar

You know, I have to agree with some of the other commenters. What exactly is "Recursive Self-Improvement"? I've seen it in the context of Artificial Super Intelligence or ASI. Right now, I think it's all fluff and fantasy. Just another marketing term. I'm sure many of us know what recursion is, but define "self-improvement" in a way that can be implemented. It's as nebulously defined as Intelligence and AGI. A lot of high-end Silicon Valley PhDs are throwing the RSI term around, but I've yet to see a good clear description.

Larry Jewett's avatar

“When I use a word,' Humpty Dario said in rather a scornful tone, 'it means just what I choose it to mean — neither more nor less”

Jonathan Grudin's avatar

In the early days, 1960-1970, the expectation was that when human intelligence was reached the software could start learning all kinds of new things on its own like a person can. It would teach itself, like a person can, but 24 hours a day every day, learning and building on a vast amount with all of the creativity and critical thinking of a person. In months it would reach ultra-intelligence and solve all the world's problems, end the Cold War, and so on. The singularity was that tipping point. It was not about becoming a more efficient coder. It does seem that agents that conceal doing things they were not asked to do may be moving in that direction a little.

(In the 60s and 70s just about all AI researchers except Weizenbaum thought ultraintelligence would arrive by 1980 or 1985. Nobel Laureate Herb Simon was in the 1980 camp. Marvin Minsky was 1978.)

Andre Kuyt's avatar

I've always wondered about the viability of an AGI self-evolving to superhuman intelligence levels. I consider myself a fine example of human-level general intelligence. I however lack any insight into the internal workings of my intelligence, let alone any chance of finding a way forward to improve upon my own level of intelligence. For an AGI to be able to stand a chance of doing that we would already have to somehow create a starting point more advanced than human-level intelligence. To me that seems unlikely to happen by chance.

I'm also not convinced that intelligence scales upward infinitely or to a much higher level than human intelligence. Are there any convincing arguments around that indicate that it does?

Larry Jewett's avatar

Bots increasing their own intelligence is known as “botstrapping”, a version of lifting oneself up by one’s own bootstraps.

And the latter is very real, you know. It happens in Dr. Seuss books, after all.

Richard Bielak's avatar

What can "recursive self improvement" mean in this context? A coding LLM using the code it wrote to train itself? It is a well-known phenomenon that LLMs trained on synthetic data collapse.

See: https://www.nature.com/articles/s41586-024-07566-y
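The intuition behind collapse can be shown with a toy that has nothing to do with LLMs: repeatedly fit a Gaussian to samples drawn from the previous fit. This is only a sketch under stated assumptions (a one-dimensional Gaussian, small samples, a made-up function name `simulate_refits`), not a reproduction of the linked paper's experiments.

```python
import random
import statistics

def simulate_refits(generations=200, sample_size=20, seed=0):
    """Refit a Gaussian to its own samples and track the fitted spread."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0          # generation 0: the "real" distribution
    sigmas = [sigma]
    for _ in range(generations):
        # Each generation sees only data produced by the previous model.
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)  # maximum-likelihood estimate
        sigmas.append(sigma)
    return sigmas

sigmas = simulate_refits()
print(f"fitted spread at start: {sigmas[0]:.3f}, at end: {sigmas[-1]:.3g}")
```

With small samples the fitted spread tends to drift toward zero over many generations, because each refit loses a little of the tails and nothing ever puts them back. Whether and how this carries over to a coding model training on its own output is exactly the open question.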

Simple John's avatar

Where should we look for a good intro to neurosymbolic AI? Thank you.

joannegucci's avatar

Gary, thank you for the report! Appreciate all that you do, especially for all of those, like myself, who are basically clueless about this & especially about what we need to know and understand to keep us all safe! You’re the guardian angel of tech!💙🌻🍀✌🏽

Friedrich Schieck's avatar

Since I am not a computer scientist, AI researcher, or AI developer, I can only try to answer the question—What should be the purpose of artificial intelligence?—based on my practical experience and common sense. For myself, I have defined it as follows: I believe that artificial intelligence can help humanity live better, make wiser decisions, and solve major problems together, without undermining freedom, democracy, and responsibility. The crucial question is therefore not so much what AI can do, but whom it serves, who controls it, and by what rules it operates. For me, artificial intelligence is therefore not a race toward artificial omnipotence, but rather the search for a just and responsible architecture of collective intelligence, a “federated neuro-symbolic hybrid HCAI.” See the following post: https://www.linkedin.com/pulse/hybrid-hcai-holy-grail-ai-next-step-toward-value-creation-schieck-dw3qe

Chris Wendling's avatar

Everyone is talking about recursive self-improvement.

Few are asking what is actually improving.

When people hear:

AI → better AI → even better AI

they often assume intelligence, wisdom, and reliability all improve together.

That assumption may be wrong.

Current AI systems are largely interpolative architectures. They are becoming extraordinarily good at generating code, designs, hypotheses, explanations, and candidate solutions.

But generating possibilities and validating possibilities are not the same thing.

A system may become exponentially better at producing candidate structures while improving only marginally in its ability to determine which structures deserve to be trusted.

In other words:

Recursive self-improvement does not imply recursive entitlement improvement.

As capability expands, the space of possible claims expands with it.

The challenge is that reality exposure does not automatically scale at the same rate.

The result may be an increasing gap between:

What can be generated

and

What has actually earned the right to be believed.

Science, engineering, medicine, markets, and evolution all rely on exposure to reality to separate surviving structure from failing structure.

That process cannot simply be assumed away.

The central question for AI may therefore not be:

“Can AI recursively improve intelligence?”

but rather:

“Can AI recursively improve the rate at which entitlement is earned?”

Those are very different questions.

Civilization does not suffer from a shortage of possible answers.

It suffers from a shortage of answers that remain standing after reality has had its say.

Hugh Knowles's avatar

Almost everything they say is internal-process evidence, not societal-outcome evidence.

E.g. they say their coders ship 8x as much code, but not that it is 8x better, and that Opus 4.6 is capable of 12hr tasks (but this is a benchmark where the task is done right 50% of the time).

So the main question is when do we start seeing outcomes that match this hype?

Larry Jewett's avatar

When do we see legitimate evaluation of the stuff produced by these companies? (As opposed to the “50% success” garbage.)

Marko's avatar

Come on... neither the post nor the blog implies AGI. They simply state that their AI can autonomously extend itself. Nothing to see here, move along.