Exclusive: This new benchmark could expose AI’s biggest weakness

25.03.2026

ARC-AGI-3 tests whether models can reason through novel problems, not just recall patterns, a task even top systems still struggle to do.

[Screenshot: ARC Prize]

The influential AI researcher François Chollet has long argued that the field measures intelligence incorrectly, that popular benchmarks reward a model’s ability to memorize vast amounts of data rather than navigate novel situations and learn new skills. Only recently, with the rise of autonomous AI agents, have companies begun to take that critique seriously. On Tuesday, the ARC Prize Foundation, which Chollet founded with Zapier cofounder Mike Knoop, released a new and more difficult version of its benchmark. The test, called ARC-AGI-3, may offer the clearest measurement yet of how close today’s AI agents........

© Fast Company