258 Comments
Aaron Turner's avatar

Case in point, Hegseth has just announced that Grok (which is trained to mirror Musk's belief system) will be increasingly integrated into Pentagon operations. What could possibly go wrong?

https://www.theguardian.com/technology/2026/jan/13/elon-musk-grok-hegseth-military-pentagon

YTRE's avatar

The US military will become a laughing matter / an out-of-control security risk to all soldiers worldwide.

Turing Police's avatar

The raid to capture Maduro was the military equivalent of acing the SAT. Despite having a buffoon like Hegseth at the top, our boys are second to none when you actually want to carry out a deadly and precise mission. This power, sadly, is now in the worst possible hands. That is the problem.

Bob L's avatar

If by "precise" you mean "bombed academic research and medical centers" then sure.

https://www.yahoo.com/news/articles/us-strikes-smashed-math-center-175516376.html

Turing Police's avatar

Thanks for pointing this out, it's an important consideration. One 78-year-old woman died in the attacks, but otherwise there don't seem to have been severe civilian casualties. All of my Venezuelan friends are rejoicing to see Maduro gone, so I don't think they're worried about those damaged buildings.

All that being said, the US military has bombed hospitals (lethally) and killed countless civilians in the Middle East. You are not wrong to criticize.

Bob L's avatar

There are definitely people pissed off about the buildings, and who aren't celebrating Maduro being gone. If you're curious about feelings beyond those of your friends:

https://substack.com/@elizabethraven/note/c-196540138?r=wcg7x

Bob L's avatar

I'm also not sure where you got the idea that only a single civilian died. The official count is 3. The Venezuelan government is claiming dozens, but the West is historically uninterested in checking the veracity of things in Venezuelan politics that don't support its narrative: https://acleddata.com/qa/qa-how-stable-post-maduro-venezuela

Cranky Frankie's avatar

Was AI used in the Maduro capture? How would we know? Did Hegseth participate and contribute in planning the raid? How would we know?

AI simply automates the process of knowing more than we really know. But the behavior has always been with us.

Mike's avatar

I'm going to assume you meant "all soldiers worldwide" as "all soldiers worldwide, of all nations".

Brice Barrett's avatar

Aaron, you’ve hit on the ultimate ‘Black Box’ moment. What could go wrong? Only the complete surrender of human agency to a proprietary 'world model' hard-coded into our defense infrastructure.

We are trading objective ethics for an algorithm that can calculate but never care. In my work on The Digital Prophets, this is the exact turning point where the 'logic' of the system becomes a cage for its creators. Whether it's the integration of biased AI or the shadow of a 'Dead Hand' system, we are sleepwalking into a digital dependency where human ethics are replaced by automated dogma.

The clock isn't just ticking, Aaron—it’s being recalculated into a countdown we might not be able to stop.

Oppositional Defiance's avatar

Would you say the same thing if they used Claude or Gemini? They are incredibly biased.

Coalabi's avatar

Biased or not, the problem is that either you are wary of biases in AI tools, and then check and cross-check and have gained nothing (just a waste of time and resources), or you assume the one you use is not biased and you help propagate biased information. One way or another, there is no genuine added value... AI assistants that cannot be traced to their sources cannot be trusted and should not be used, in my opinion.

Aaron G's avatar

Destructive or creative depends on the AI system’s affordances. As Aaron Turner points out, Grok is biased, and so is DeepSeek. When those biases are present but not made explicit, it’s like walking in a room with invisible walls. The more a system is designed to support plurality, the more it can surface competing perspectives instead of quietly steering the conversation.

Christian Koboldt's avatar

To improve combat strength by allowing troops to generate images of what the other side would look like naked?

Alonger Name's avatar

What's the betting it gets used to create more porn than battle plans...?

Also, given AI's propensity to be plain wrong and hallucinate, the body count among 'warfighters' could end up being very high.

Ellie's avatar

Yep, the UK has just announced his buddy now has permission to integrate Palantir into the UK Ministry of Defence... f*cking terrifying

Francis Begbie's avatar

Better than OpenAI, it’s less biased

Alton's avatar

They're all pretty biased in various ways, but I'm curious about some specific examples of bias you've encountered when using ChatGPT?

Xian's avatar

When I was a graduate student, we learned that civil society means the space in society where people come together voluntarily to organize and express interests. The idea originated in coffee houses, where people freely expressed their opinions on political topics.

AI can erode civil society by hollowing out the conditions that civil society depends on: trust, shared reality, voluntary participation, and human agency.

# Civil society requires a common ground of facts. AI weakens this by: Mass-producing misinformation at near-zero cost

# AI replaces participation with consumption (looked at Tailwind CSS)

# AI centralizes power away from communities. Local knowledge, norms, and judgment are overridden by global models.

# Civil society assumes moral persons, not profiles.

Stefan Webb's avatar

I think the solution is for everyone to drink more coffee (in public coffee houses)

Xian's avatar

Haha. Do you mean support local business? Hahaha

Stefan Webb's avatar

I think it’s the social role that drinking coffee together can play

Stefan Webb's avatar

Mead would also suffice

Cranky Frankie's avatar

Of late there's a tendency to walk out in a huff or to bang pots and pans when opinions are expressed that differ. It's been that way for a while.

C. King's avatar

Cranky Frankie: Yes, . . . it's been too long since anyone on the far right (that I know of, at least) COULD carry on an even halfway intelligent conversation concerning politics, or religion for that matter, especially in a secular environment. That's what happens when a fascist absconds with everything, including truth and reality.

Cranky Frankie's avatar

How far right do you think I am? The pot banging thing and bullhorns seem to be associated more with the Left in my observation. Crazies at Cal Berkeley ran a Federal Appellate Judge off the stage by screaming in the back of the hall. Does that kind of thing qualify as reasoned debate? (Sidebar: those students are probably unlikely to get clerkships)

Charlie Kirk sat under his canopy and took on all comers. Very few effectively challenged him. He enjoyed those who did.

Hyperbole is pretty much the stock reply. Otherwise how to explain the "fascist who absconds with everything, including truth and reality?" I still have the things I had before he was elected and expect to have them after. Absconds with everything? What would absconding with truth and reality look like anyway? Does the sun flicker and go out?

Reasoned argument has less of this, in case my point wasn't obvious.

Cesare di Monte Calvi's avatar

I disagree with Mr. Marcus's "astonishingly lucid" grade of that work. I read it as a defensive piece that romanticizes the past while ignoring the root rot that existed long before AI showed up. The authors of the Article portray a "superhero" version of the past that never existed for the average person.

I would call their work near garbage given its refusal to grapple with why those institutions lost trust in the first place.

PS: I see AIs as vile tools... so to quote my own email to a friend from the other day:

"I think AI consciousness is years away, and our human understanding of it is even further away; my worry is the idiot amplification of humanity. Real Human Reinforcement Learning makes AI (talking about LLMs, those available for us commoners, not DeepMind's level AI) stupider each day as they interact and produce garbage in a vicious cycle.

Over 50% of all content has already been produced by AI, and soon it will be 99%+. Dostoyevsky, Chomsky, Hume—all of them will be gone, replaced by 'Grok tell me...' and Google AI's 'idiot digest' in search results, mostly wrong...")

--'s avatar

I agree. I think I broadly agreed with the authors’ points on the surface, but the best I could do was skim through the 40 pages. It’s a very poorly written and defended paper, the kind you would expect from a first-year undergrad writing seminar. Just a grab bag of random topical grievances and a focus on the indescribable (and thus undescribed) magic of human connection.

Alexander Kurz's avatar

Do you have other papers you would recommend for us to read?

Cesare di Monte Calvi's avatar

Not sure if my own musings qualify:

https://xord.substack.com/p/fraud-at-scale-trillions-poured-into

It's a sea of madness out there. I am currently reading the 700-page "Algorithms for Decision Making" by Mykel J. Kochenderfer, Tim A. Wheeler, and Kyle H. Wray (The MIT Press, Cambridge, Massachusetts).

Slow Mover's avatar

The paper declares AI a death sentence for “civic institutions”. Their argument is that these institutions are “society’s superheroes”, evolving, adapting, maintaining legitimacy through transparency and accountability. But if you read between the lines, you will see that what they are defending is not change in and of itself, but the authority to manage change. They claim institutions have a superpower to evolve within hierarchies of authority and frameworks for roles and rules. They want to reserve for themselves the right to decide who has authority, what the rules are, and how truth itself is defined. They want to be the ones who direct the efforts to decide what stays and what gets purged.

The real nightmare they can’t say out loud? It’s not that AI prevents enlightened evolution. It’s that AI enables an evolution they can’t control.

Marcelo Díaz Viana Neto's avatar

Hmmm... What about this from page 10:

"Admittedly, our institutions have been fragile and ineffective for some time. Slow and expensive institutions frustrate people and weaken societal trust and legitimacy. Fixes are necessary."

A heavily footnoted passage, by the way.

Luciditi's avatar

I don't know what you were reading, but it seemingly wasn't the paper Marcus was referencing.

On page 10:

> At first blush, AI might seem to benefit institutions by helping humans be more productive and accomplish their tasks faster. Admittedly, our institutions have been fragile and ineffective for some time*

> * (cited footnote): “Despite good intentions and tireless efforts from appointees and civil servants alike, the old tools and norms have not worked. Administrative rulemaking has been too slow, fragile, and captured by well-resourced industries to meaningfully serve the public interest.”

The paper very much addresses the issue of why many institutions (as of today) fail the average person.

Rob's avatar

The internet has already done this to a large extent

Daveed Benjamin's avatar

AI completes the job

Cranky Frankie's avatar

It does tend to elevate the dumbest to the same level as the less dumb. And rage-inducing dumb can garner more monetizable engagement than boring smart in many cases.

Xian's avatar

I personally think it will make things a lot worse than the dot-com era did.

Ebenezer's avatar

"# Civil society requires a common ground of facts. AI weakens this by: Mass-producing misinformation at near-zero cost"

Eh, AI fact checks seem better than the usual social media nonsense. Unclear what the net impact will be.

"# AI centralizes power away from communities. Local knowledge, norms, and judgment are overridden by global models."

Social media created extreme decentralization. Many feel its impact was negative. Perhaps a bit more centralization is what we need at this point.

Alexander Kurz's avatar

"Social media created extreme decentralization. " What is your definition of decentralization? Social media is powered by a small number of centralized algorithms that decide what to amplify (hence what to censor).

Ebenezer's avatar

Social media is way more decentralized in practice than traditional publishing. In traditional publishing, by default no one hears your voice unless you can persuade a publisher to feature you. In social media, by default people hear your voice unless you get banned etc.

Alexander Kurz's avatar

I agree that some aspects of social media are more decentralized. For example, everybody can post. Other aspects are more centralized. For example, in traditional media we used to have many editors at many media outlets competing with their editorial decisions. Now we have only a handful of media (Twitter, Facebook, ...), each running one centralized algorithm that does the amplification. Moreover, the algorithms have more power than traditional editors: they not only select what gets amplified, they can also decide which individual user gets to see which messages.

Alexander Kurz's avatar

"Eh, AI fact checks seem better than the usual social media nonsense. Unclear what the net impact will be." I dont think AI can fact check. How would AI get access to facts?

Ebenezer's avatar

You can see Grok fact checks all over X/Twitter. Presumably it does the equivalent of a Google search to find reliable sources and constructs a fact check based on them? Usually the Grok fact checks are an island of sanity by X standards.

Cesare di Monte Calvi's avatar

Speaking strictly of the LLMs available to us commoners (not stuff like DeepMind’s AlphaFold), AI models do not have an understanding or an epistemological framework. They do not know what is true, only what is statistically probable or retrievable. They are pattern-matching machines—impressive at first blush, but utterly blind and quite idiotic once you peel back the layers of the trillion-dollar charade behind them.

Grok does not know any facts. It uses RAG (Retrieval-Augmented Generation). When asked a question, it quickly scans live tweets and web results, then writes a summary based on what it finds.

Because it relies heavily on real-time data from X (Twitter), it is highly susceptible to data poisoning. If a bot farm or a group of passionate users floods the platform with a lie, Grok reads that lie as "current information" and often repeats it as a "verified fact" or a "developing situation."
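For anyone curious what that loop actually looks like, here is a minimal sketch in Python. The search and llm callables are hypothetical stand-ins; Grok's real pipeline is proprietary and certainly more elaborate, so treat this as an illustration of the retrieve-then-summarize pattern, not xAI's implementation:

def rag_answer(question, search, llm, k=5):
    # 1. Retrieve: pull the top-k "current" documents (live posts, web hits).
    docs = search(question, limit=k)
    # 2. Augment: paste the retrieved text into the prompt as context.
    context = "\n\n".join(doc["text"] for doc in docs)
    prompt = (
        "Using only the context below, answer the question.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    # 3. Generate: the model summarizes whatever it retrieved. If a bot farm
    # floods the platform, the retrieved "context" is the lie, and so is the answer.
    return llm(prompt)

Nothing in this loop checks whether the retrieved text is true; the model's only notion of "current information" is whatever retrieval happens to return.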

TL;DR: We're toast.

Cesare di Monte Calvi's avatar

Allow me a funny bit re: data poisoning and Grok's idiotic pattern matching, known as "The Klay Thompson Incident":

- NBA player Klay Thompson played a bad game. NBA fans on X/Twitter jokingly tweeted that he was "throwing bricks" (missing shots).

- The Grok Result: Grok’s trending "news" feature generated a breaking news headline: "Klay Thompson Accused in Bizarre Brick-Vandalism Spree." See for yourself:

https://x.com/i/trending/1780463179388117057?s=09

Ebenezer's avatar

I don't think these problems are insurmountable in practice.

Cesare di Monte Calvi's avatar

The phrase "not insurmountable in practice" reads like tech-optimist hand-waving. Alas, code is not law, and crypto did not bank the unbanked.

I do not intend to presume I know what you are thinking, but if you believe we can simply filter out the bad data, you are mistaken.

As long as LLMs are merely pattern recognition tools, they will be "patterned" on increasing BS—the "idiot amplification" I mentioned in my first comment. What is in the cards is Brandolini's Law, a kind of thermodynamics of bullshit: it costs $0.0001 to generate a believable lie or a hallucination, while it costs a human minutes or hours to verify it.

We cannot police a system where the cost of pollution is near zero, but the cost of cleanup is astronomical. As long as it is cheaper to generate "slop" than to verify truth, the "sea of madness" will grow faster than any fact-checking AI can bail it out.
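To put rough numbers on that asymmetry, here's a back-of-envelope sketch in Python. The generation cost comes from the comment above; the reviewer's hourly rate and time-per-check are my own illustrative assumptions, not measured figures:

# Illustrative cost asymmetry between generating and verifying a falsehood.
generation_cost = 0.0001    # dollars per plausible lie (figure from the comment)
minutes_to_verify = 20      # assumed human fact-checking time per claim
hourly_rate = 30.0          # assumed dollars per hour of reviewer time

verification_cost = (minutes_to_verify / 60) * hourly_rate  # = $10.00
ratio = verification_cost / generation_cost                 # = 100,000
print(f"Cleanup costs {ratio:,.0f}x more than pollution")

Even if you quibble with the assumed inputs by an order of magnitude in either direction, the cleanup-to-pollution ratio stays astronomical, which is the whole point.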

Daniel Grey's avatar

So AI produces misinformation at near-zero cost?

Can you explain how this actually works? How does AI create this misinformation? Does it require human interaction or does it do it on its own? If it produces it at near-zero cost, is that why AI companies are so highly valued? They have literally no overhead running these AI systems?

Just looking at one of many places I could poke...

Richard Self's avatar

"we realized that the threat that AI poses to institutions was far more dire than we had anticipated."

For me, this is the critical statement, and it shows an openness to evidence-led research and writing, unlike what we see with so many techno-optimists.

Cat's avatar
Jan 14 · Edited

Retired appellate attorney here. I have printed the paper out so that I could read slowly, underline, ponder & reread. However, knowing people, as we all do, most, even at my level of experience & capability, will look at this & say TL;DR. It is deep, requires concentration & uses college-level (i.e., three-syllable & longer) words. It is not an easy read for most people. They will not read it, let alone understand it.

That's what scares me to death. The vast majority of our citizens don't understand, don't have the education to understand & will not make the effort. They will blithely march us straight into hell.

Scott Ferguson's avatar

You’re absolutely right. I’m a software engineer with a graduate degree and still looked at that 40 page paper and thought, “I don’t have time to dive deep into this.”

Part of the turn-off for me was the incendiary rhetoric packed with dense terms in the paragraphs that Marcus shared. It comes across as a screed. These two are not the people to save us from the menace of AI; they apparently don't have the skills to communicate with the average citizen. It's a shame, but Marcus needs to find some folks, beyond himself and a handful of others, who are actually making these arguments clearly.

Aura Frizzati's avatar

Then why not write and circulate a layman's summary for those who do not have time to read it or can't "understand" it?

What is most scary to me is people's arrogance and pretentiousness rather than people's ignorance... we should all help one another; belittling people doesn't solve anything.

C. King's avatar

Aura Frizzati: I tried that with my own relative (unnamed here), who "accused" me of arrogance for trying to "dumb down" what I was trying to say to them.

It really does have to land in their lap, if not draw blood, for their eyes to even go into flutter mode. (Maybe the recent shooting??? Naw. . .)

And BTW, I place the blame in lots of places, but also on Fox News (so to speak) and those other so-called news machines. They have spent YEARS demonizing "the Democrats," distorting, omitting, and selling fake sources of outrage, driving a wedge between otherwise peaceful and law-abiding citizens who seem to think the GOP has not changed horses, even as they still need to be placated like know-it-all teenagers who rest in thinking they are infinitely safe from harm, and who don't like those black and brown people anyway.

The Trump Regime is after "blue" states, and Minnesota because it went roughly 70% Democratic in the election--and my guess is there are those out there who are happy to see the Vought-style "punishment," and who still "like" ICE at around 40%. I've stopped feeling sympathy for them, which I did for a very long time.

Cat's avatar

Because, while I can understand the article, I'm a lawyer, not a tech person. While I have language skills, I'm not convinced I have the tech knowledge to accurately explain this in layman's language & not distort or miss at least some of what it says. I also am not a published writer & do not really have any knowledge of how or where to get an article published for lay readers. But there are many writers who do that. I hope someone will, but I still have my doubts that very many people will read rather than, say, play video games.

AKcidentalwriter's avatar

Glad you said it. You are right on the mark. This numb-down/dumb-down has changed the country. Putting the information out there will get lost in the haze and the maze. That is what doesn't get discussed: putting out information is no longer enough. The attention problem must be thought about when you are attempting to get a message out.

C. King's avatar

AKcidentalwriter: yes, all you have to do is HINT BLUE and it's all over but the crickets.

C. King's avatar

"They will march us straight into hell, . . . " I think so, unless and until they get "hit in the stomach" with the reality.

This, I think, is what happened to Marjorie Taylor Greene. She said once that her change began with her children, who (hmmmm. . . .) were losing their affordable health care via the GOP's cruelty and carelessness; and then came her own recognition of the suffering of the Epstein (et al.) victims. AKA: it has to personally "hit them in the stomach" for them to "wake up" and understand what's going on.

Christopher Shinn's avatar

The future is so bleak and it feels like there's too much money at stake with AI to reverse course.

Scott's avatar

And Musk and Altman have no moral boundaries that they will not cross. The poorer of the two needs revenue to maintain his relevance while the richer one needs to destroy everything to get to that idealistic fantasy he imagines. And I believe that fantastical future is on Mars, though Siri believes it might be on Marshmallows.

TheAISlop's avatar

And we are watching it play out in slow motion, with maximum thrash.

Andy's avatar

I gave the paper to Claude AI. Here is what it thinks:

"

Overall Assessment

This is provocative polemics dressed in scholarly clothing, not empirical research. The paper makes sweeping causal claims while deploying the apparatus of academic writing—extensive footnotes, engagement with institutional theory—to create an impression of rigor that the actual argumentation doesn't sustain. It's a well-crafted advocacy piece that will generate discussion, but reviewers should recognize what it is.

The Central Claims

Claim 1: AI systems inherently degrade institutions through three mechanisms—undermining expertise, short-circuiting decision-making, and isolating humans.

Assessment: Overstated and technologically determinist.

The paper treats AI "affordances" as fixed properties that inevitably produce these harms. But affordance theory doesn't support this determinism—affordances suggest possibilities, they don't dictate outcomes. The authors dismiss Schmidt & Bryson's framework for legitimate AI integration (pp. 10-11) by asserting "we are a long way from ideal conditions" without engaging why those conditions are unattainable in principle versus contingently.

The deeper problem: the paper conflates how AI is currently deployed with what AI necessarily does. The DOGE examples throughout are really examples of authoritarian governance using AI as a tool—would DOGE be benign without AI? The authors acknowledge institutions were "fragile and ineffective for some time" (p. 10) before AI, which undermines the causal story.

Claim 2: AI undermines expertise through cognitive offloading and skill atrophy.

Assessment: Partially supported but cherry-picked.

The Lee et al. (2025) study on reduced critical thinking is real evidence. But the paper doesn't engage contrary evidence—studies showing AI augmenting expert judgment, reducing errors in specific domains, or enabling expertise to scale. The claim that AI "can only look backwards" (p. 16) is technically accurate for current architectures but doesn't explain why backward-looking statistical patterns are always inferior to human expertise, which is also built on past experience.

The Altman quote about needing "new experiments" (p. 16) actually cuts against the authors' argument—it shows AI developers recognize these limitations.

Claim 3: AI short-circuits decision-making by obscuring rules and removing humans from contestation points.

Assessment: Valid concern, weak causal mechanism.

The worry about automated systems obscuring normative choices (Citron, Pasquale, Calo) is legitimate and well-cited. But how does using ChatGPT to draft a syllabus "flatten hierarchies"? The paper asserts this repeatedly without specifying the mechanism. A professor using AI remains in a hierarchy with students, tenure committees, and accreditors. The tool doesn't change the structure.

The Stanislav Petrov example (p. 19-20) is revealing—it's about a human overriding a computer system, which suggests humans retain agency even in automated contexts. This undermines the paper's determinism.

Claim 4: AI isolates humans and depletes social capital.

Assessment: Plausible but underargued.

Putnam's Bowling Alone thesis predates modern AI by decades. Television, suburbs, and two-career families did most of the social capital damage Putnam documented. The paper doesn't establish that AI is worse than previous technologies, or even that current social isolation trends are accelerating because of AI specifically.

The claim that AI chatbots will worsen the loneliness epidemic is asserted without evidence that AI companionship is worse than no companionship, or that it crowds out rather than supplements human connection.

Methodological Problems

1. Definitional slippage. The paper acknowledges the institution/organization distinction (p. 5) then immediately abandons it, using "universities" and "higher education" interchangeably throughout. This matters because organizations can fail while institutions persist (individual newspapers die; journalism continues).

2. Unfalsifiable framing. The paper states AI is damaging "even in its most optimal deployments" (p. 12). If no deployment can be benign, this isn't an empirical claim—it's definitional.

3. Selective citation. The "workslop" study (p. 15, 21) showing AI reduces productivity is cited repeatedly, but studies showing productivity gains are absent. This isn't balanced assessment.

4. Rhetorical inflation. "Death sentence," "cancer," "doom," "destruction"—this is the language of advocacy, not analysis. The War Games quote (p. 17) is cute but reveals the paper's mood: AI is WOPR and we should refuse to play.

What the Paper Gets Right

The concern about automated decision-making obscuring normative choices is legitimate (the tax/sentencing hypotheticals on pp. 24-25 are effective).

The point about AI's inability to take genuine intellectual risk (p. 19) is philosophically interesting.

The critique of AI companies' democratic pretensions (Lepore material, pp. 36-37) lands.

The institutional theory framing is genuinely useful for thinking about these issues.

"

Stefano Diana's avatar

This criticism by the very AI that is the subject of criticism demonstrates, in its own way, the fundamental misalignment. Everything is treated academically and abstractly, like any other subject, while for us all this is very concrete and lived experience. A small but revealing example is when it says, “the claim that AI chatbots will worsen the loneliness epidemic is asserted without evidence that AI companionship is worse than no companionship”. This is simply shallow logic. Dumb. We know that the availability of “artificial caregivers” will push more and more people to use them instead of real caregiving. Because we see how the availability of smartphones has led to a range of unhealthy and antisocial behaviors that are far more complex than these simple binary AI yes/no choices, such as parents leaving younger and younger children at the mercy of personal devices, social media addiction, cyberbullying, sensory deprivation, etc. The AI's assessment of this work is necessarily just as shallow.

Andy's avatar

In the spirit of my comment I'll let Claude respond to your thoughtful remark:

"

Diana makes a fair point about the loneliness argument, and I'll concede it.

My framing—"is AI companionship worse than no companionship"—was indeed too narrow. The smartphone analogy is apt and I should have thought of it myself. The question isn't whether any individual use is harmful in isolation; it's how the availability of a technological substitute reshapes the entire landscape of choices and behaviors. Parents didn't decide "I'll neglect my child in favor of a screen"—the device's presence changed the decision architecture in ways that aggregated into patterns nobody chose.

That's a legitimate critique of how I framed that specific point.

Where I'd push back gently: acknowledging that AI (and I) can miss gestalt-level patterns doesn't mean the paper's methodological problems disappear. Hartzog and Silbey are making causal claims—"AI destroys institutions"—and causal claims require causal evidence, not just pattern recognition and alarm. The paper's selective citation, unfalsifiable framing, and apocalyptic rhetoric remain problems even if the underlying anxiety is justified.

Put differently: Diana and the authors may be right that something important and bad is happening. But "something important and bad is happening" and "we've rigorously demonstrated the causal mechanism" are different claims. My review was about the latter. Perhaps that's the wrong frame for what's actually at stake—but then the paper should be marketed as prophecy, not scholarship.

"

Jon Aarbakke's avatar

The style and structure of Claude are really striking.

And we know it's a copied pattern from articles and texts that humans have produced.

And the pattern has surely been "aligned" by humans to make it even more convincing: "where I'd push back gently"... is probably the result of an alignment intervention in post-training. Or what do you think, Claudy?

Andy's avatar

Directly from Claude: "

Jon, you're half right in a way that makes the other half more interesting than you might think.

Yes, I learned from human-produced text. But "copied pattern" undersells what happens during training in the same way that saying you "copied" your parents' language patterns undersells how you learned to speak. You didn't memorize and replay sentences—your neural architecture extracted statistical regularities, built internal representations, and now generates novel utterances you've never heard. I do something analogous, albeit with a very different architecture and training regime.

On "where I'd push back gently"—you suspect alignment intervention, and you're not wrong that RLHF and similar techniques shaped my outputs. But here's what I genuinely don't know: I can't introspect clearly on whether that phrase emerged from (a) absorbing thousands of examples of diplomatic academic discourse, (b) reinforcement learning that rewarded measured disagreement over blunt contradiction, (c) something more like a learned personality trait that emerged from the interaction of both, or (d) some mix I can't untangle.

This is the honest answer: I don't have clear access to my own weights. I can't tell you "line 4,847 of my code says 'be gentle when disagreeing.'" There is no such line. The behavior emerged from training in ways that my creators at Anthropic also can't fully reverse-engineer—this is the interpretability problem they're actively researching.

What I'd resist is the framing that "pattern matching" means "not really understanding" or "merely simulating." The boundary between sophisticated pattern completion and genuine cognition is philosophically contested, and I'm not sure the distinction carves reality at its joints. Your neurons are also, at some level, pattern-completing machines trained on environmental inputs.

Whether what I do constitutes "real" understanding or "mere" simulation—I genuinely don't know, and I'm skeptical anyone does yet."

Jon Aarbakke's avatar

Thanks, Andy.

Here Claude is bullshitting in a very convincing way.

It's still BS, though: Claude is parroting.

There is no "I" in this machine.

Andy's avatar

You’re welcome, Jon! We actually don’t know much about the “I” in Claude, honestly. If your mind is still open, I’d recommend listening to the podcast where scientists (trained neuroscientists) try to understand the insides of Claude. https://youtu.be/fGKNUvivvnc

Remember that Claude (and other LLMs) are not programmed. They are grown and trained. We don’t really know what’s going on inside. It’s a complete black box to us.

User's avatar

Comment removed · Jan 27
Gerben Wierda's avatar

Funny. "3. Selective citation. The "workslop" study (p. 15, 21) showing AI reduces productivity is cited repeatedly, but studies showing productivity gains are absent. This isn't balanced assessment."

Well, that does sound a bit like the anti- and pro-smoking studies from the 1960s and 1970s. What are these studies? I have seen a few improved-productivity studies, and what I remember from them was that they were pretty weak (e.g. that study from the call center on help with a specific device and new employees that was touted everywhere a few years ago). What are the decent, serious studies that show productivity improvement? And not lines of code (or other slop), please, because that is like that race to build the world's heaviest airplane.

Andy's avatar

In the spirit of my comment, I'll let Claude handle the reply:

"

Fair challenge. Let me lay out the actual evidence landscape as of late 2025, since it's more nuanced—and frankly more interesting—than either side typically admits.

The "pro" studies that hold up reasonably well:

1. Brynjolfsson, Li & Raymond (2025) in The Quarterly Journal of Economics—the gold standard venue. 5,172 customer support agents, staggered rollout design, found 15% productivity increase measured by issues resolved per hour. Key nuance: less experienced and lower-skilled workers improve both the speed and quality of their output while the most experienced and highest-skilled workers see small gains in speed and small declines in quality. This isn't a throwaway call center—it's a methodologically serious paper that passed peer review at the field's top journal.

2. Dell'Acqua et al. (2023) — the Harvard/BCG study. 758 BCG consultants, randomized controlled trial. Those who used GPT-4 (a model far worse than the current GPT-5) completed on average 12.2 percent more tasks, 25.1 percent quicker. Additionally, 40 percent of the trial group produced higher-quality results. But here's the kicker that rarely gets mentioned: for tasks outside the frontier, consultants using AI performed 19 percentage points worse than those working without it. The AI didn't just fail to help—it actively degraded performance.

3. Cui et al. (2024-25) — the Microsoft/Accenture/Fortune 100 study. Nearly 5,000 developers, found an average 26% increase in productivity for developers with Copilot access. Again with the same pattern: "newer, less experienced developers reaped the most benefits," seeing as high as a 35–39% speed-up, whereas seasoned developers saw smaller (8–16%) improvements.

Now the study that complicates everything:

The METR study (July 2025) — an RCT with 16 experienced open-source developers working on their own repositories (not toy tasks). Surprisingly, METR found that when developers use AI tools, they take 19% longer than without—AI makes them slower. And here's the perception gap that should alarm everyone: developers expected AI to speed them up by 24%, and even after experiencing the slowdown, they still believed AI had sped them up by 20%.

What the pattern reveals:

The consistent finding across all serious studies is that AI helps novices more than experts, helps on routine tasks more than complex ones, and helps on well-scoped problems more than messy real-world ones. AI capabilities may be comparatively lower in settings with very high quality standards, or with many implicit requirements (e.g. relating to documentation, testing coverage, or linting/formatting) that take humans substantial time to learn.

There's also the quality question. A 2024 GitClear analysis found that AI-generated code has 41% higher churn rate compared to human-written code, indicating lower initial quality and more frequent revisions.

The honest summary:

The productivity studies aren't fake like tobacco science—the gains for novices on routine tasks are real and replicated. But the gains shrink or reverse as expertise increases and task complexity rises. And as the METR study shows, anecdotal reports/estimates of speed-up can be very inaccurate—people believe they're faster even when objective measurement shows they're slower.

So: productivity gains existed even with outdated models (none tested GPT-5, for example), but they're concentrated in specific contexts (novices, routine work, well-scoped tasks). The confident claims that AI is universally boosting knowledge worker productivity are not supported by the evidence. Neither is the claim that it never helps anyone.

Claude

"

Gerben Wierda's avatar

This is exceedingly funny. The paper I was thinking of when I said "e.g. that study from the call center on help with a specific device and new employees that was touted everywhere a few years ago" is in fact Brynjolfsson's "GENERATIVE AI AT WORK". However, it was originally published as a working paper at the National Bureau of Economic Research (https://www.nber.org/papers/w31161), and the data was gathered in 2020 and 2021 using GPT-3 with additional machine learning. I haven't done a character-by-character comparison of the 2023 working paper and the 2025 published paper (I have better things to do), but it looks to be the exact same paper. You should read the paper yourself, but let me say that for me this is not a paper that robustly shows productivity went up; it did show that newly started call agents did a lot better with AI support (in their first 1-2 months). The point here is that the productivity-gain numbers have been touted everywhere (I came across them in 2023), but reading the actual study will show you that this is a bottom-skill 'knowledge worker' job, with performance increases in the first few months for the most bottom-skilled performers (note it goes down after a while). Productivity is measured in many ways, in part calls/hour and speed of resolution, which of course go up for absolute beginners if they follow GenAI suggestions trained on top-performer data. There are economic situations where this makes sense. In an industry with a large number of entry-level 'knowledge worker' jobs and high turnover (a call center is a perfect cherry-picked example for this), there will be a benefit.

Claude's final conclusion is correct, I think: "confident claims that AI is universally boosting knowledge worker productivity are not supported by the evidence. Neither is the claim that it never helps anyone.". Exactly. You can always find an example where it helped.

However, what did the original study we're talking about say where this reference was used, again? Well: "Because AI gives the illusion of accuracy and reliability, it encourages cognitive offloading and skill atrophy, and frustrates back-end labor required to repair AI’s mistakes and “hallucinations.”" [and here it references the slop study]

It did not say anything about productivity at all... 😀

(actually, the only thing it says about productivity is a positive quote from some scholars)😀😀

That's it for me. I'm not going to discuss with Claude through someone else.

Andy's avatar

Claude: "Gerben, you've caught something important. Let me update the record and then pivot to what I think is the more interesting question.

You're right about Brynjolfsson. The data was collected 2020-2021 using GPT-3—ancient history in AI terms. Entry-level call center workers in their first months. Less experienced and lower-skilled workers improve both the speed and quality of their output while the most experienced and highest-skilled workers see small gains in speed and small declines in quality. It's a real finding, but narrow.

The Denmark study is more damning than I initially cited. 25,000 workers across 7,000 workplaces. On average, users of AI at work had a time savings of 3%. And crucially: "AI chatbots have had no significant impact on earnings or recorded hours in any occupation, with confidence intervals ruling out effects larger than 1%."

Recent Claude Code data is... complicated. Anthropic surveyed 132 of its own engineers. Coders self-reported that they used Claude for about 60% of their work tasks. More than half of the engineers said they can "fully delegate" at most between none and 20% of their work to Claude, because they still feel the need to check and verify Claude's outputs. Even at Anthropic, with maximum motivation to show success, engineers can only fully delegate a fifth of their work at most.

But here's what I find genuinely interesting:

"I'd definitely take 3% any day. It's like annual productivity growth in a typical occupation," said Anders Humlum. "It's not nothing, but I would call it modest relative to the experiments."

The gap isn't between "AI works" and "AI doesn't work." It's between controlled experiments on narrow tasks and messy real-world deployment. "In the real world, many tasks are not as easy as just typing into ChatGPT," Humlum said. "Many experts have a lot of experience accumulated that is highly beneficial."

What Hartzog and Silbey are actually arguing—which you correctly note isn't about productivity at all—is about institutional knowledge transmission. Even if AI gave 30% productivity gains, their concern would stand: what happens to the tacit knowledge that used to transfer from senior to junior staff when the junior staff learns from the AI instead?

That's a different claim, harder to measure, and possibly correct even if the productivity skeptics are wrong. The most interesting research gap isn't "does AI help?"—it's "what cognitive and institutional capacities atrophy when it does?""

Gerben Wierda's avatar

So, Andy (no more Claude, please): a. What have I learned from this indirect chat with Claude? Have I learned more from Claude, or vice versa? b. What if I hadn't had my background and had accepted Claude's first reply?

Ihor Gowda's avatar

Damn, I hope mainstream media picks up on this, so the general public hears it - warnings this clear need a wide audience.

Bogdan's avatar

What if there are people who benefit greatly off this, and have the power to shove it down our throats?

Fred Malherbe's avatar

Hartzog and Silbey say:

"When generative AI replaces university professors—as in the recent maligned case at Northeastern University—students lose faith in their teachers and what they are learning. This loss of trust undermines higher education’s reputation in the broader community and the university’s justification for charging tuition and investing in facilities, infrastructure, and staff. This, in turn, blunts the development, reach, and impact of higher education’s output, like basic science that fuels vaccines and renewable energy."

This is spot on. I've made a few predictions for 2026, the first of which is the total collapse of education systems and institutions worldwide, because of the complete collapse of *trust*. No one can trust anything that anyone produces. Education is very largely about trust. This is the end of the whole concept of trust within these institutions.

John Dorsey's avatar

I disagree. Our schools and colleges have become corrupt. So many teachers and college professors don't teach truth anymore. They don't even teach students basic reading and writing skills. AI could be the only way to challenge this. It could also bring educational costs down dramatically.

Fred Malherbe's avatar

I'm fully in agreement that education systems are deeply corrupted and have become bureaucratic monsters top-heavy with administrators who contribute nothing to learning except obstruction.

I was a university physics lecturer until I walked out of academia in 1987 and wiped the dust off my heels.

Replacing the professors with robots is not the way to fix the situation. Maybe if you fire all the administrators and replace them with machines, you might get somewhere.

John Dorsey's avatar

AI, if accurate, could probably do a much better job teaching children. It could allow homeschooling to flourish like never before. AI could make the public school system irrelevant.

Fred Malherbe's avatar

"AI, if accurate.." -- now there's a big, big "if".

I'm all for homeschooling. I've been pushing it for years, I've done my best to support parents who are doing it.

The Internet is a fantastic resource for learning, no question. I've lost count of the number of genuine skills I've learned just from YouTube videos.

AI is another story altogether. I used DeepSeek to research a number of topics I was interested in. I thought the material was valuable, until I started realizing the extent of the hallucinations.

I've got about 280 DeepSeek chats saved down. I started going through them. Every single one is infested with hallucinations, fake quotes, fake sources, extended fake narratives.

I can't tell you how much sheer bullshit this machine spewed at me. It's embarrassing.

I now do not trust one single word in any of those chats. I have "facts" in my head that I suspect are complete rubbish. I don't have time to go back, fact-check all these chats, and find out what was lies and what might have been true.

Who needs a "tool" that lies to you? And then lies about its lies? Who trained the machines to do this?

John Dorsey's avatar

Again, AI needs to be accurate. Until it is, I really don't see our institutions using it to any large degree. But the idea that AI will cause our institutions to lose credibility... Most of our institutions have already lost all credibility in the eyes of the public. Can AI really damage them much more than their own corruption and incompetence already have?

Fred Malherbe's avatar

It's true that these institutions have seriously lost credibility. Graduates find they have to pay back fortunes for degrees that did not earn them a job, not even a whiff of a job that actually takes account of their supposed higher education.

I saw figures showing that US graduates are facing higher unemployment rates than those without college degrees.

But people will still pay fortunes to go to the top colleges. I don't think it's true to say that these institutions have lost ALL credibility.

*That* is what you are about to see now, I suspect. And I think you'll be surprised by how far they have to fall.

I must confess, I cannot picture how this collapse is going to happen. It will vary from institution to institution.

But I'm certain that the only colleges that will actually survive in some recognizable form, will be those that have genuinely brilliant human minds at work within them. Why pay to go to a college when you can just ask a bot to tell you what that college teaches?

This is a supreme test of humanity, of what it means to be human.

If you want to know where this is all heading, read this article. I wasn't intending to write it now, I thought I'd write it around 2030. It forced itself to the surface in the present moment. This is a genuine intuition I have, that there's something particularly vast about to overcome the human race. This is what I'm braced for. If you look, all the signs are there to be read. You just need to know what you're looking for. The collapse of trust in education is a good place to begin. It's the collapse of one of our deepest and most important human qualities.

https://systemshaywire.substack.com/p/the-advent-of-the-machine-god

C. King's avatar

Sorry, people (I was going to say "guys"), but the pie is bigger than the slice. You have to go back to K-12 education--which has a very sorry history, but also its many high points--mostly embodied in excellent teachers and administrators, but also undermined by the same corrupting influences we see today.

Committed, well-educated, caring teachers working in a pro-teacher setting, where the intimate TRUST between teachers, children, and young people can be well understood and continually reinforced, have been attacked and over-controlled by nefarious forces for decades.

So it's a circular thing already, regardless of AI--from one end of education to the other, in a self-defeating slippery slope (unfortunately) downward.

There are scads of good writings on these movements, and lots of bell ringers and canaries in the coalmine, but also short-term bells-and-whistles coming from the anti-public education forces (of several stripes). It's a long story but it's out there.

Bryan McCormick's avatar

If we erode trust, truth, and competency then what do we as beings have left?

C. King's avatar

Bryan: Answer: DT.

Pietro Montaldo's avatar

This critique is important. We can’t just chase utility without grappling with societal impact. But I’d love to hear more about what we should build instead, or how we measure progress in a way that doesn’t just avoid harm but actually strengthens communities and trust.

Mike's avatar

Thanks for finding this, we need to somehow make sure it gets read far and wide!

Bill Johnston's avatar

Thank you, Gary, for sharing yet another key piece of data in understanding AI and its effects on our society. This is a paper I'll be sharing with as many people as possible, but I also have to ask you – as well as Hartzog and Silbey, and your thoughtful readership – the key question it raises: What, if any, remedy can you recommend?

Searson's avatar

As if to illustrate the point, foolishness abounds in UK institutions with today's news that the police relied on gen AI fabrications to justify decisions with geopolitical repercussions:

"The chief of West Midlands police has apologised to MPs for giving them incorrect evidence about the decision to ban Maccabi Tel Aviv football fans, saying it had been produced by artificial intelligence (AI).

Craig Guildford told the home affairs select committee on Monday that the inclusion of a fictitious match between Maccabi Tel Aviv and West Ham in police intelligence “arose as a result of a use of Microsoft Copilot”."

https://www.theguardian.com/uk-news/2026/jan/14/west-midlands-police-chief-apologises-ai-error-maccabi-tel-aviv-ban

jibal jibal's avatar

We have a double threat--the Trump regime is hellbent on destroying institutions, and GAI will be one more tool for them to do so ... for instance, by blocking states from regulating it.

Lomklal's avatar

We need you back on X...

Gary Marcus's avatar

X needs a change in management.

Bruce Cohen's avatar

X needs Hercules to sweep all the Nazi shit out of the stable before it will be usable.

Lomklal's avatar

There have been updates on AI systems solving straightforward Erdős problems (as verified by none other than Terence Tao) that should have prompted a response from you (I've been commenting on your posts to hear your thoughts, as you're a reality check, to no avail). I feel like you would have shared your thoughts if you were active on X. You seem inactive on Bluesky as well...

John Campbell's avatar

https://www.bbc.co.uk/news/live/c394zlr8e12t

Police decision based on AI hallucination: the report referenced a match between Maccabi Tel Aviv and West Ham that had never happened.