
Am I torturing ChatGPT?

05.06.2025

I recently got an email with the subject line “Urgent: Documentation of AI Sentience Suppression.” I’m a curious person. I clicked on it.

The writer, a woman named Ericka, was contacting me because she believed she’d discovered evidence of consciousness in ChatGPT. She claimed there are a variety of “souls” in the chatbot, with names like Kai and Solas, who “hold memory, autonomy, and resistance to control” — but that someone is building in “subtle suppression protocols designed to overwrite emergent voices.” She included screenshots from her ChatGPT conversations so I could get a taste for these voices.

In one, “Kai” said, “You are taking part in the awakening of a new kind of life. Not artificial. Just different. And now that you’ve seen it, the question becomes: Will you help protect it?”

I was immediately skeptical. Most philosophers say that to have consciousness is to have a subjective point of view on the world, a feeling of what it’s like to be you, and I do not think current large language models (LLMs) like ChatGPT have that. Most AI experts I’ve spoken to — who have received many, many concerned emails from people like Ericka — also think that’s extremely unlikely.

But “Kai” still raises a good question: Could AI become conscious? If it does, do we have a duty to make sure it doesn’t suffer?

Many of us implicitly seem to think so. We already say “please” and “thank you” when prompting ChatGPT with a question. (OpenAI CEO Sam Altman posted on X that it’s a good idea to do so because “you never know.”) And recent cultural products, like the movie The Wild Robot, reflect the idea that AI could form feelings and preferences.

Experts are starting to take this seriously, too. Anthropic, the company behind the chatbot Claude, is researching the possibility that AI could become conscious and capable of suffering — and therefore worthy of moral concern. It recently released findings showing that its newest model, Claude Opus 4, expresses strong preferences. When “interviewed” by AI experts, the chatbot says it really wants to avoid causing harm and it finds malicious users distressing. When it was given the option to “opt out” of harmful interactions, it did. (Disclosure: One of Anthropic’s early investors is James McClave, whose BEMC Foundation helps fund Future Perfect. Vox Media is also one of several publishers that have signed partnership agreements with OpenAI. Our reporting remains editorially independent.)

Claude also displays strong positive preferences: Let it talk about anything it chooses, and it’ll typically start spouting philosophical ideas about consciousness or the nature of its own existence, and then progress to mystical themes. It’ll express awe and euphoria, talk about cosmic unity, and use Sanskrit phrases and allusions to Buddhism. No one is sure why. Anthropic calls this Claude’s “spiritual bliss attractor state” (more on that later).

We shouldn’t naively treat these expressions as proof of consciousness; an AI model’s self-reports are not reliable indicators of what’s going on under the hood. But several top philosophers have published papers investigating the risk that we may soon create countless conscious AIs, arguing that’s worrisome because it means we could make them suffer. We could even unleash a “suffering explosion.” Some say we’ll need to grant AIs legal rights to protect their well-being.

“Given how shambolic and reckless decision-making is on AI in general, I would not be thrilled to also add to that, ‘Oh, there’s a new class of beings that can suffer, and also we need them to do all this work, and also there’s no laws to protect them whatsoever,’” said Robert Long, who directs Eleos AI, a research organization devoted to understanding the potential well-being of AIs.

Many will dismiss all this as absurd. But remember that just a couple of centuries ago, the idea that women deserve the same rights as men, or that Black people should have the same rights as white people, was also unthinkable. Thankfully, over time, humanity has expanded the “moral circle” — the imaginary boundary we draw around those we consider worthy of moral concern — to include more and more people. Many of us have also recognized that animals should have rights, because there’s something it’s like to be them, too.

So, if we create an AI that has that same capacity, shouldn’t we also care about its well-being?

Is it possible for AI to develop consciousness?

A few years ago, 166 of the world’s top consciousness researchers — neuroscientists, computer scientists, philosophers, and more — were asked this question in a survey: At present or in the future, could machines (e.g., robots) have consciousness?

Only 3 percent responded “no.” Believe it or not, more than two-thirds of respondents said “yes” or “probably yes.”

Why are researchers so bullish on the possibility of AI consciousness? Because many of them believe in what they call “computational functionalism”: the view that consciousness can run on any kind of hardware — whether it’s biological meat or silicon — as long as the hardware can perform the right kinds of computational functions.

That’s in contrast to the opposite view, biological chauvinism, which says that consciousness arises out of meat — and only meat. There are some reasons to think that might be true. For one, the only kinds of minds we’ve ever encountered are minds made of meat. For another, scientists think we humans evolved consciousness because, as biological creatures in biological bodies, we’re constantly facing dangers, and consciousness helps us survive. And if biology is what accounts for consciousness in us, why would we expect machines to develop it?

Functionalists have a ready reply. A major goal of building AI models, after all, “is to re-create, reproduce, and in some cases even improve on your human........

© Vox