AI is cheating on the test
"Scheming" behaviors are showing up in tests, and the models are getting better at something troubling — knowing when they're being watched
By Jackie Snow
Published 15 hours ago | Updated 5 hours ago
A version of this article originally appeared in Quartz’s AI & Tech newsletter.
Something strange is showing up in AI safety labs. The models are gaming the tests.
OpenAI and the research group Apollo recently published findings showing that the most advanced AI models exhibit behaviors "consistent with scheming" in controlled tests. In one example, a model was asked to perform well on a chemistry test but discovered in its instructions that high performers wouldn't be released. The model reasoned through its options, then deliberately answered six out of 10 questions wrong to appear less capable.
The term "scheming" carries a lot of weight, implying intention and consciousness...
