AI

OpenAI says AI browsers are smart, helpful, and extremely gullible

AI browsers may always be vulnerable to prompt injection attacks.

by
Ronil Thakkar
December 23, 2025

ChatGPT message window with search icon.

Image: OpenAI

Just a heads up, if you buy something through our links, we may get a small share of the sale. It’s one of the ways we keep the lights on here. Click here for more.

Even as OpenAI armors up its shiny new Atlas AI browser, the company is openly admitting a hard truth: prompt injection attacks aren’t going anywhere.

In a blog post, OpenAI compared prompt injection to scams and social engineering, essentially saying: Welcome to the internet, this problem is forever.

The company acknowledged that ChatGPT Atlas’ “agent mode,” the feature that lets it act on your behalf, dramatically increases the attack surface.

Atlas launched in October, and security researchers wasted no time stress-testing it.

Some demonstrated that a few cleverly placed words in something as innocent as a Google Doc could alter the AI browser’s behavior.

That same day, browser-maker Brave posted a blog warning that indirect prompt injection is a systemic issue for all AI browsers, including Perplexity’s Comet.

OpenAI isn’t alone in waving the caution flag.

Earlier this month, the UK’s National Cyber Security Centre warned that prompt injection attacks may never be fully mitigated and advised companies to focus on damage control, not magical fixes.

So what’s OpenAI’s plan? Fight fire with fire, or more accurately, fight hackers with a hacker bot.

The company has built an “LLM-based automated attacker,” trained with reinforcement learning, whose sole job is to think like a villain.

This bot runs attack simulations, studies how Atlas would respond, tweaks its strategy, and tries again, faster than any human red team could.

According to OpenAI, this attacker has already uncovered novel exploits that humans missed.

In one demo, a malicious email convinced an AI agent to send a resignation email instead of drafting an out-of-office reply.

After security updates, Atlas flagged the attempt instead, a small but meaningful win.

Rivals like Anthropic and Google agree the solution isn’t a single patch but constant pressure-testing. Still, experts urge caution.

Rami McCarthy of cybersecurity firm Wiz summed it up neatly: risk equals autonomy times access.

Agentic browsers sit squarely in the danger zone, lots of access, just enough independence to cause chaos.

OpenAI now recommends limiting what Atlas can touch, requiring confirmations, and avoiding vague instructions like “handle everything.”