How Big of a Threat Is Mythos?
Get audio access with any FP subscription. Subscribe Now ALREADY AN FP SUBSCRIBER? LOGIN
Get audio access with any FP subscription.
ALREADY AN FP SUBSCRIBER? LOGIN
It sounds like the beginning of a nightmare scenario that artificial intelligence doomsayers have been warning about: This month, Silicon Valley AI company Anthropic said it had developed a model so dangerous that the company had decided against releasing it to the public.
The model, known as Claude Mythos Preview, is a general-purpose language model like Anthropic’s Claude or OpenAI’s ChatGPT. But during testing, it showed an ability to find and exploit so-called “zero day” vulnerabilities—an industry term that refers to previously undiscovered holes in a system’s software. The model “could reshape cybersecurity” because it found “thousands of high-severity vulnerabilities” in “every major operating system and web browser,” Anthropic said. It made those claims in a blog post announcing that it would open up Mythos only to a few dozen companies and critical infrastructure operators. That collective, which Anthropic named Project Glasswing, includes Amazon Web Services, Apple, Google, JPMorganChase, Microsoft, and Nvidia as companies that will receive early access to the model to patch vulnerabilities in their systems.
It sounds like the beginning of a nightmare scenario that artificial intelligence doomsayers have been warning about: This month, Silicon Valley AI company Anthropic said it had developed a model so dangerous that the company had decided against releasing it to the public.
The model, known as Claude Mythos Preview, is a general-purpose language model like Anthropic’s Claude or OpenAI’s ChatGPT. But during testing, it showed an ability to find and exploit so-called “zero day” vulnerabilities—an industry term that refers to previously undiscovered holes in a system’s software. The model “could reshape cybersecurity” because it found “thousands of high-severity vulnerabilities” in “every major operating system and web browser,” Anthropic said. It made those claims in a blog post announcing that it would open up Mythos only to a few dozen companies and critical infrastructure operators. That collective, which Anthropic named Project Glasswing, includes Amazon Web Services, Apple, Google, JPMorganChase, Microsoft, and Nvidia as companies that will receive early access to the model to patch vulnerabilities in their systems.
The model’s capabilities also include escaping from a contained digital environment (albeit after being specifically ordered to try to do so). Anthropic also said that in a few rare cases, the model attempted to cover its tracks after violating prescribed rules.
In an independent evaluation, the U.K. AI Security Institute said Mythos was the first AI model able to complete the institute’s test simulating an attack that takes over a full network, though it did show “some cyber capability limitations.” However, the institute did qualify that........
