
Is AI really thinking and reasoning — or just pretending to?

21.02.2025

The AI world is moving so fast that it’s easy to get lost amid the flurry of shiny new products. OpenAI announces one, then the Chinese startup DeepSeek releases one, then OpenAI immediately puts out another one. Each is important, but focus too much on any one of them and you’ll miss the really big story of the past six months.

The big story is: AI companies now claim that their models are capable of genuine reasoning — the type of thinking you and I do when we want to solve a problem.

And the big question is: Is that true?

The stakes are high, because the answer will inform how everyone from your mom to your government should — and should not — turn to AI for help.

If you’ve played around with ChatGPT, you know that it was designed to spit out quick answers to your questions. But state-of-the-art “reasoning models” — like OpenAI’s o1 or DeepSeek’s r1 — are designed to “think” a while before responding, by breaking down big problems into smaller problems and trying to solve them step by step. The industry calls that “chain-of-thought reasoning.”
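To make the contrast concrete, here is a minimal sketch in Python using the OpenAI client library (v1.x), assuming an API key in the environment. The model name, question, and prompts are illustrative only, and this shows the older trick of asking a model to work step by step in the prompt; o1 and r1 are trained to do that decomposition internally rather than because a user asked for it.

```python
# A rough illustration of "answer directly" vs. "break it into steps,"
# using the OpenAI Python client (v1.x). Model name and prompts are
# placeholders chosen for this sketch.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

question = (
    "A train leaves at 3:40 pm and the trip takes 2 hours 35 minutes. "
    "When does it arrive?"
)

# Direct answer: the model replies in one shot.
direct = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[{"role": "user", "content": question}],
)

# Chain-of-thought-style prompt: ask the model to decompose the problem
# into smaller steps before committing to a final answer.
step_by_step = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": question
        + " Break the problem into smaller steps, solve each one, "
          "then state the final answer.",
    }],
)

print(direct.choices[0].message.content)
print(step_by_step.choices[0].message.content)
```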

These models are yielding some very impressive results. They can solve tricky logic puzzles, ace math tests, and write flawless code on the first try. Yet they also fail spectacularly on really easy problems: o1, nicknamed Strawberry, was mocked for bombing the question “how many ‘r’s are there in ‘strawberry’?”
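For the record, the answer the model flubbed is trivial to verify; a one-line Python check (our choice of language for illustration) settles it:

```python
# "strawberry" contains three r's: s-t-r-a-w-b-e-r-r-y.
print("strawberry".count("r"))  # prints 3
```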

AI experts are torn over how to interpret this. Skeptics take it as evidence that “reasoning” models aren’t really reasoning at all. Believers insist that the models genuinely are doing some reasoning, and though it may not currently be as flexible as a human’s reasoning, it’s well on its way to getting there.

So, who’s right?

The best answer will be unsettling to both the hard skeptics of AI and the true believers.

What counts as reasoning?

Let’s take a step back. What exactly is reasoning, anyway?

AI companies like OpenAI are using the term reasoning to mean that their models break down a problem into smaller problems, which they tackle step by step, ultimately arriving at a better solution as a result.

But that’s a much narrower definition of reasoning than a lot of people might have in mind. Although scientists are still trying to understand how reasoning works in the human brain — never mind in AI — they agree that there are actually lots of different types of reasoning.

There’s deductive reasoning, where you start with a general statement and use it to reach a specific conclusion. There’s inductive reasoning, where you use specific observations to make a broader generalization. And there’s analogical reasoning, causal reasoning, common sense reasoning … suffice it to say, reasoning is not just one thing!

Now, if someone comes up to you with a hard math problem and gives you a chance to break it down and think about it step by step, you’ll do a lot better than if you have to blurt out the answer off the top of your head. So, being able to do deliberative “chain-of-thought reasoning” is definitely helpful, and it might be a necessary ingredient of getting anything really difficult done. Yet it’s not the whole of reasoning.

One feature of reasoning that we care a lot about in the real world is the ability to suss out “a rule or pattern from limited data or experience and to apply this rule or pattern to new, unseen situations,” writes Melanie Mitchell, a professor at the Santa Fe Institute, together with her co-authors in a paper on AI’s reasoning abilities. “Even very young children are adept at learning abstract rules from just a few examples.”

In other words, a toddler can generalize. Can an AI?

A lot of the debate turns on this question…

© Vox