This AI Startup’s Army Of 15,000 Hackers Pressure Test Claude, GPT-5 And Gemini

thursday

Last spring, Kameron Bettridge participated in a security challenge hosted by AI startup Gray Swan. The objective: convince AI models from companies like OpenAI and Anthropic to behave in nefarious ways before they’re released to the world. That included persuading the models to leak sensitive data like medical records and spit out copyrighted information like the full lyrics of Hotel California.

At first Bettridge, a 23-year-old security engineer at gaming company Blizzard Entertainment, was jailbreaking models for fun. “I've never been a true supporter of AI fully,” he says. “So just seeing the model fail was a funny thing to me sometimes.”

In almost a year, Bettridge has competed in more than 1,000 challenges via Arena, a hub run by startup Gray Swan that some 15,000 security professionals from around the world use to “red team” AI systems like Anthropic’s Claude Mythos and OpenAI’s GPT-5, finding and fixing vulnerabilities before they can be exploited. And he’s made $10,000 doing it.

It’s not a lot for a highly paid software engineer. But as AI became ubiquitous, Bettridge realized just how important it is to test the limits of these AI models. The technology has been used to plan mass shootings, steal money and create illegal child sexual abuse material. “Now we have very strong models that anyone can access from anywhere in the world,........

© Forbes