Zach Anderson
Apr 10, 2026 23:18
Anthropic releases safety pointers as Undertaking Glasswing reveals frontier AI fashions can now discover and exploit vulnerabilities quicker than human defenders.
Anthropic dropped a sobering evaluation this week: inside two years, AI fashions will uncover huge numbers of software program vulnerabilities which have sat unnoticed in code for years—and chain them into working exploits. The corporate’s safety groups launched detailed defensive suggestions alongside Undertaking Glasswing, their initiative to deploy Claude Mythos Preview’s capabilities for cyber protection.
The mathematics right here is not difficult. If attackers can use frontier fashions to automate vulnerability discovery and exploit technology, the window between a patch dropping and a working exploit showing shrinks dramatically. Anthropic’s safety engineers have watched this occur in their very own testing.
What Their Analysis Really Discovered
In line with Anthropic’s technical findings, AI fashions excel at recognizing signatures of identified vulnerabilities in unpatched methods. Reversing a patch right into a working exploit—precisely the sort of mechanical evaluation these fashions deal with effectively—used to require specialised expertise. Now it is changing into automated.
The corporate famous that publicly obtainable fashions beneath Mythos functionality ranges can already discover critical vulnerabilities that conventional code critiques missed for prolonged intervals. Mozilla Firefox vulnerabilities found via AI scanning function one documented instance.
The Defensive Playbook
Anthropic’s suggestions prioritize controls that maintain even towards attackers with limitless persistence and AI help. Friction-based safety measures—additional pivot hops, fee limits, non-standard ports—lose effectiveness when adversaries can grind via tedious steps robotically.
Their prime priorities:
Patch velocity issues greater than ever. Web-facing functions ought to obtain patches inside 24 hours of an exploit changing into obtainable. The CISA Recognized Exploited Vulnerabilities catalog must be handled as an emergency queue. Anthropic recommends utilizing EPSS (Exploit Prediction Scoring System) for prioritizing every little thing else.
Put together for 10x vulnerability report quantity. Over the subsequent two years, consumption and triage processes will face stress they’ve by no means skilled. Organizations nonetheless working weekly spreadsheet conferences will not maintain tempo.
Scan your individual code with frontier fashions earlier than attackers do. This was Anthropic’s single most emphasised suggestion. Legacy code that predates present evaluate practices—particularly code whose authentic authors have moved on—represents the highest-value goal for proactive scanning.
Zero Belief Will get Actual
The steerage pushes laborious towards hardware-bound credentials and identity-based service isolation. A compromised construct server should not attain manufacturing databases. A compromised laptop computer should not contact construct infrastructure.
Static API keys, embedded credentials, and shared service-account passwords are described as “among the many first issues an attacker with model-assisted code evaluation will discover.”
For Smaller Operations
Organizations with out devoted safety groups received particular recommendation: allow automated updates all over the place, want managed companies over self-hosting, use passkeys or {hardware} safety keys, and activate free safety tooling from code hosts like GitHub’s Dependabot and CodeQL.
Open-source maintainers ought to count on elevated vulnerability report quantity—some beneficial, some automated noise. Publishing a SECURITY.md with clear consumption processes helps separate sign from spam.
Anthropic dedicated to updating this steerage as Undertaking Glasswing progresses. For enterprises monitoring SOC 2 and ISO 27001 compliance, most suggestions map on to current controls. The distinction now could be urgency.
Picture supply: Shutterstock








