OpenAI Launches Safety Bug Bounty Program Targeting AI Agent Vulnerabilities

Felix Pinkston
Mar 25, 2026 17:33

OpenAI expands its safety efforts with a brand new Security Bug Bounty program targeted on agentic dangers, immediate injection assaults, and knowledge exfiltration in AI merchandise.

OpenAI has launched a public Security Bug Bounty program geared toward figuring out AI abuse and security dangers throughout its product suite, marking a big enlargement of the corporate’s strategy to securing more and more autonomous AI programs. This system, introduced March 25, 2026, particularly targets vulnerabilities in agentic AI merchandise that would result in real-world hurt.

The brand new initiative enhances OpenAI’s current Safety Bug Bounty by accepting submissions that pose significant abuse and security dangers even once they do not qualify as conventional safety vulnerabilities. Researchers who establish points may have their submissions triaged by each Security and Safety groups, with experiences routed between applications primarily based on scope.

Agentic Dangers Take Middle Stage

This system’s scope reveals OpenAI’s rising concern about AI brokers working with rising autonomy. Key focus areas embody third-party immediate injection assaults the place malicious textual content can hijack a consumer’s agent—together with Browser, ChatGPT Agent, and related merchandise—to carry out dangerous actions or leak delicate data. To qualify for rewards, such assaults have to be reproducible at the very least 50% of the time.

Different in-scope vulnerabilities embody agentic merchandise performing disallowed actions on OpenAI’s web site at scale, publicity of proprietary data associated to mannequin reasoning, and bypasses of anti-automation controls or account belief alerts.

What’s Out of Scope

Normal jailbreaks will not qualify for this program. OpenAI explicitly excludes common content-policy bypasses with out demonstrable security influence—getting a mannequin to make use of impolite language or return simply searchable data would not depend. Nonetheless, the corporate runs periodic personal campaigns targeted on particular hurt sorts, together with current applications focusing on biorisk content material in ChatGPT Agent and GPT-5.

The corporate will think about edge instances on a case-by-case foundation if researchers establish flaws that create direct paths to consumer hurt with actionable remediation steps.

Business Implications

This launch alerts that main AI builders are taking agentic security critically as these programs achieve capabilities to browse the net, execute code, and work together with exterior providers. The Mannequin Context Protocol (MCP) dangers talked about in this system scope recommend OpenAI is especially targeted on how brokers work together with third-party instruments and knowledge sources.

For the broader AI ecosystem, this program establishes a framework that different corporations might observe as autonomous brokers change into extra prevalent. Researchers desirous about taking part can apply via OpenAI’s Bugcrowd portal, with the corporate emphasizing its dedication to working alongside moral hackers to safe AI programs earlier than vulnerabilities will be exploited at scale.

Picture supply: Shutterstock

Source link