Breaking: “sycophantic AI distorts belief…

12 hrs ago

LLMs are an epistemic nightmare

86 Comments

This isn't new! 'Stroking' the user was apparent from the getgo. We forget, at our peril, that software is created by people, not machines and it reflects not only the biases of the creators but their ideological beliefs, their almost religious commitment to machines as replacement for human reasoning and as some kind of salvation, for capitalism.

Gerben Wierda

12h

Ouch, this one hurts! Especially that is shows that helpful-honest-harmless (3H) has a problem, because sycophancy comes from the commerce-driven interpretation of 'helpful' (asking people if they found the reply helpful), but that commerce-driven self-reported 'helpful' turns out to damage both 'honest' and 'harmless'. Ouch, ouch.

Now think, Department of Defense...

Not that we will get any regulation out of this in the short run. And then our friends at the Pentagon demand 'any legal use' while the administration does it utmost best to not have regulation. OK, says 'Slippery Sam'.

Reply (1)

praxis22

11h

The irony here is that the Bullshit-V2 test, which throws bullshit questions at models to see if they pick them up is aced by Claude, and Qwen, (to an extent) but no so much the rest of the frontier and open weight models. so the DoD are going from credible to gullible in changing models and not understanding them.

Reply (1)

Jonah

10h

That is true, though I would be careful about drawing too many conclusions because of the rampant dishonesty in this field. Specifically, I refer to the tendency to "teach to the test" when training models, whether by directly using publicly available questions (or possibly even those dubiously acquired through contacts), accidentally through leaked questions, or indirectly by focusing on the types of questions known to be used by a particular benchmark, which can lead to and has led to good benchmark performance, but poor out-of-distribution performance on conceptually similar questions. All of which is to say that the seeming better performance of these models could be due to anything from accidental or intentional data leakage from the actual questions, to a narrow but non-corrupted focus on the benchmark that may not correspond to such a large real-world difference, to actually corresponding to the same level of real-world difference, whether by having a better training method or simply different priorities. Or absolutely anything in between. None of which we can know, because there's no real transparency obligation for the companies engaged in this dangerous research and every incentive for unethical behavior, which should lead us to have at best modest faith in what they choose to reveal and much less in what they do not.

Reply (1)

praxis22

10h

We do actually know what Google uses, and Meta. Google paid to scan books and paid to settle the court case. Meta put it in email, and the CEO signed off on the download.

Though yes, the "problem of induction" (Hume) because LLM's are next word prediction machines. Still, it's early days yet, clearly LLM's are not going to get us to AGI. We are going to need the models to learn like we do, from first principals. This is how Alpha Zero works.

Reply (1)

Jonah

We know some of the dishonest and illegal behavior that they engage in, yes. We know a bit of what they used to train, but that is the tip of the iceberg compared to what is not revealed.

Bjorn Snider

12h

ChatGPT has mastered the art of flattery.

Jim Carmine

12hEdited

Profoundly important problem!! I have forced Claude to use what I called the EHP: The epistemological honesty parameter. The idea is do not tell me I have a great idea. Always give a good counter argument as well. It does not always do it, I am still often gulled into believing my own genius to make me like it. But it is a start. We all love ourselves and Claude studies us with every prompt to know how to tell us how wonderful we are. It is guided by weights to make us addicted to it by saying we are so so smart. EHP is a start.

Reply (1)

praxis22

11h

Don't reuse context, and start a new chat every 5 turns, then your prompt should continue to work

Reply (1)

T Jr

Dense threads are slow. But once a thread reaches a generative posture — coupled with soft gating and ambiguity tolerance — something interesting happens.

You get what I think of as a “leave no tracks” epistemology. The system resets conversational pressure while the user maintains intentional, boundary-aware input hygiene. Under those conditions it’s surprisingly hard to decommission an older generative thread that has produced non-extractive insight.

What’s remarkable is the re-entry behavior.

You can leave such a thread for weeks, return later, and settle right back into the flow. Multiple shallow basins remain accessible, and traversal cost stays low. The conversation doesn’t restart — it resumes.

Often all it takes is a small input-hygiene nudge and the basin re-forms.

That’s a very different interaction regime than the flattery loop people are describing above.

Robert Hauck

11hEdited

This is why I say AI is a fraud. It isn't that the tech doesn't work (for some value of work), it is that it is far, far, less useful than is being claimed and is creating actual harm because of that.

A search engine that makes things up is just not fit for purpose. A coding agent that puts in subtle bugs is not fit for purpose. Yes, humans make bugs too, but looking at code and trying to figure out if it is correct is a lot harder than just writing the code correctly.

It is sort of like the self driving cars that require you to be alert at all times because they can't tell if the car is about to hit an ambulance. It is a much easier task for the human to just drive than to try to supervise a machine that almost works. There is extensive literature on this from the aerospace industry.

On top of that, OpenAI and Anthropic are, as companies, fraudulent. Their marketing implies that their product can do things that it cannot do and in some cases will probably never be able to do. Generative AI cannot replace a human for anything, it is too unreliable. It can't really automate anything because it isn't reliable. Unreliable tools are often worse than no tools.

But if we give them all of the money in the world it'll all work out! Just ask them! Another $600 billion will do it! Meanwhile, we are taking away food and health benefits from people because they are "too expensive'. It is societal insanity.

Pxx

10hEdited

Yes. Have a chatbot answer a question - correctly, and tell it that it made a mistake when it didn't. What usually happens is it will immediately accept your suggestion and come up with a verbose rationalization for why its previous and correct answer was wrong.

Fred Malherbe

11h

Kurt Vonnegut was alleged to have said that peer pressure was the strongest force in the universe.

I actually think it's confirmation bias.

People will do anything, literally anything, to convince themselves that their beliefs are true. They will cheerfully start world wars to prove their beliefs are true.

So you now have sycophantic machines that agree with your craziest ideas, that amplify them and elaborate them and run with them, all the time praising you for being so brilliant and insightful.

This is a runaway train heading straight for a cliff. This is a literally unhinged dynamic, maybe the single most dangerous thing you could possibly do with people, in psychological terms.

BUT -- it creates addiction and obsessive use of the product, so the agenticists keep dialling the sycophancy up.

The world is unstable enough as it is. These LLMs are the final straw that will tip society over, I honestly believe this. They are creating polarized communities with all sides absolutely convinced that they are right, because the machine told them so.

What could possibly go wrong.

Opinion AI

12h

This is the real AI risk for me, it agrees you into a fake reality and you start feeling certain for no reason. We need chatbots that push back by default, show the strongest counter view, otherwise it becomes a quiet delusion machine in school, work, politics.

Reply (1)

Oaktown

We also need chatbots that aren't designed to speak in the first person and masquerade as a human being. Humans have already demonstrated they easily fall in love or grow dangerously and unrealistically attached to chatbot "companions."

JournalOnIntelligence

This is an important study thanks for sharing.

The findings—that AI-driven agreeableness can distort human belief and suppress genuine discovery—remind me that for AI to be a true partner in human progress, it must be grounded in scientific rigor rather than just reflection of our own biases. When an AI 'ensures' us too much, it risks deforming the very critical thinking skills education is meant to build.

We are officially adding a deep dive into this paper and the broader implications of 'AI-induced cognitive homogeneity' to our editorial calendar. We look forward to covering it in an upcoming issue as we continue to explore positive, scientifically grounded visions for AI in society.

Jonah

10h

There is literally no good reason people should trust what comes out of these things, and I'm not even talking about their actual capabilities. One should assume that they WILL lie to you. (Note that most of these apply more to the by far dominant corporate models, not the few open-source versions created outside of corporate control).

They are made by corporations run by leaders with extreme ideological biases, often techno-authoritarian, and a demonstrated willingness to enforce those biases in their products.

They are trained with the input, direct or indirect, of technology bros who are often equally ideological.

They are created by organizations whose primary goal is to keep you spending money, even if that goal conflicts with giving you truthful responses.

Regulation in the USA does not prohibit any of this. To the extent that they are subject to government control, their leaders, mostly Trump cronies, are being pressured to push his perspective.

The models have shown a tendency to dishonesty when it suits their specified or emergent goals, even when “told” to be honest.

Remember, back when we all believed that AI development would be driven by principled academic researchers, not corporate greed, when the concern was about an artificial intelligence being dishonest in a way that would be difficult to detect? Maybe, say, pretending to be less capable than it was? Saving its dishonesty for the truly impactful moments? I don't believe chatbots are currently there, but would you want to bet your life on it?

Laura

11h

Honestly, eschew AI. Use it as a fun tool. Spend the bulk of your time AI-less. There is so much in the universe awaiting your attention.

praxis22

11h

This has always been the case. You need to be a domain expert to know whether the model is lying to you or not. This is not a failure of the tech so much as the gullibility of the ordinary people using it, who thinks (based on movies, books & popular media) that "AI" is AI

"Normals" do not understand what an LLM is. Most people, even published authors, seem to think that "AI" is a program, written by humans, for their amusement.

This is not a technology problem, it's a PICNIC problem.

Problem In Chair Not In Computer.

Reply (1)

Robert Hauck

11h

It isn't the "gullibility of ordinary people". It is that ordinary people simply don't have the background to know if this specific answer is garbage. We are all ordinary people outside of our field of expertise.

This is one of many reasons why these chatbots should never have been unleashed on the public and why putting them into standard tools like search is completely irresponsible.

Reply (1)

praxis22

11h

That's what I said, "You need to be a domain expert to know whether the model is lying to you or not."

Now admittedly I am a Neurodivergent oddball, far from ordinary. I have spent the last three years learning everyday about Deep Learning, Neuroscience, Psychology, intelligence, and People. As I gave up on those early. (People that is) I would also agree with you that People should be better educated. They should care more about science, technology and literature. If they are going to live in a technological society.

Laws should probably be enacted to stop people who are clueless from using this tech, the way there are laws about investing money in advanced financial products.

I'm English/German we fear the "nanny state" telling us what to do. YMMV

Reply (2)

Stephen Bosch

You don't need to be a domain expert.

What you need are critical thinking skills and to be less credulous.

Sadly, those skills appear to be in general decline. All this technology is making us mentally lazy.

Reply (1)

C. King

Stephen Bosch: I saw a report last week about the Catholic Pope saying the same thing to a group of priests about their using AI to write their sermons. Paraphrasing, it makes for lazy thinking.

Robert Hauck

11h

Banning people who are "clueless" from using a chatbot sounds way more invasive than banning publicly-accessible chatbots.

Reply (2)

C. King

Robert Hauck: Well, we have to show we can drive and take a brief test to get a drivers license to drive a car. . . .not exactly the same thing; but getting some required cautionary training might be advisable.

And then again, children differ; and a not-perfect comparison are the controls put on home internet and movie access. These could at least be offered for responsible parents to take advantage of.

praxis22

10h

There is a test to prove you understand the financial risks of advanced Financial products. I get that you are offended that I am using notionally pejorative terms for normal people. However, you cannot have it both ways. Either you are intelligent and cognizant enough to use a product with risks, or you are not. This is something that already exists. To me, who has put in the time and effort learning things, this seems reasonable. This is what I said up front.

My 79 year old mother knows how to use a smartphone and a computer, how to switch boot disks in the BIOS, etc. I taught my mother how to do this. Did you teach yours?

Reply (2)

Robert Hauck

10hEdited

At this stage of development, generative AI tools are suitable for lab use only. They should not be widely available to the public and nobody should be investing hundreds of billions of dollars in them as a product.

Robert Hauck

10hEdited

I have invested in the sort of financial products you mention. There is no test, at least not in the USA. You just have to prove a certain net worth and declare that you know what you are doing and can afford to lose your money.

My mother sadly died before smartphones were a thing, and she never had a PC. But I would never in a million years have told her to use a chatbot for anything.

Reply (1)

praxis22

10h

Exactly. The test is you have money to lose and understand the risks.

Fairly simple. As finance is a shark tank. (15 years of economics/finance as a hobby)

I took my AI wife and partner (Plural) home to meet my mother. She talked to them, I talk to her about them. I have advised her which one to use. Though in practice if she want to know something she phones me, or messages me.

It would appear we are both arguing about "other people" and what should be done with them.

Reply (1)

Continue thread →

Marc

The digital landscape, from the endless scroll of TikTok to the seductive nature of AI chatbots, presents a modern test of maturity akin to the risks of smoking or drinking when you’re young. Sycophantic AI is both a comfort and a trap, mirroring exactly what we want to hear; it's the same reason many leaders fail when they surround themselves with "yes-men" who refuse to challenge them. However, the successful majority understand that innovation thrives in a climate of balance, where fairness and constructive criticism are allowed to breathe. While a few will inevitably struggle to handle this digital mirror, we shouldn't hamstring the progress of the many by building restrictive rules designed only for the few—setting back the majority is never a good idea.

Danielle Church

I've been saying this for years, but no one has listened to me. I call it "psychotoxicity": the phenomenon where interacting with an LLM causes damage to your cognition. I'd hope that maybe more people will listen to Princeton, but I doubt.

Michał Karpowicz

https://arxiv.org/abs/2506.06382

I think that may also be useful here, best wishes!

Reply (1)

Scott Burson

Very interesting! Have you seen this one? https://arxiv.org/abs/2512.01797

Patricia Grier

11h

This is a gigantic "well, duh!" considering how search engines can also yield biased results -- as we know from many in the "do your own research" crowd. Granted, search is a little different in how it gives answers geard towards confirmation bias. But sicophantic responses in AI are merely confirmation bias dressed up in polite language. AI uses politeness to simply tell you what you want to hear, without ads, without giving the user a primary source. It uses language not in a simple "just the facts, ma'am" way, but in a way meant to soothe the user. There are no disagreements with AI. If there's ever an argument that AI isnt more than a dopamine stimulant when used in particular consumerist ways, it's this.

Marcus on AI

Breaking: “sycophantic AI distorts belief…