menu_open Columnists
We use cookies to provide some features and experiences in QOSHE

More information  .  Close

The Problem With AI Flattering Us

7 1
15.01.2026

The most dangerous part of AI might not be the fact that it hallucinates—making up its own version of the truth—but that it ceaselessly agrees with users’ version of the truth. This danger is creating a modern sycophancy crisis in which the over-agreeableness of AI is leading to very disagreeable results.

The AI alignment problem raises questions about how to build AI that aligns with human values. The “sycophancy problem” should also raise questions about how humans evolve alongside AI and make sense of our world. If we do not address this problem, the machines we’re creating will just be a giant mirror to our illusions.

A recent study by researchers found that AI models are 50% more sycophantic than humans and participants rated flattering responses as higher quality and wanted more of them. And it gets worse. The flattery made participants less likely to admit they were wrong—even when confronted with evidence they were wrong—and reduced their willingness to take action to repair interpersonal conflict. “This suggests that people are drawn to AI that unquestioningly validates, even as that validation risks eroding their judgment and reducing their inclination toward prosocial behavior,” the researchers wrote. “These preferences create perverse incentives both for people to increasingly rely on sycophantic AI models and for AI model training to favor sycophancy.”

The perverse incentives have to do with one of the most common methods of training AI: reinforcement learning from human feedback (RLHF). Often used to develop large language models (LLMs), RLHF works by giving the model a reward in the form of a numerical value which tells the model how good its response was. The happier the user, the higher the number, and the higher the reward for the model. In this way, AI models are designed to maximize rewards over time.

Advertisement

As Caleb Sponheim, an AI training specialist at Nielsen Norman Group put it, “There is no limit to the lengths that a model will go to maximize the rewards that are provided to it. It is up to us to decide what those rewards are and when to stop it in its pursuit of those rewards.”

We know humans........

© Time