Meet OpenAI’s Operator, an AI agent that navigates the web for you

Source: Venture Beat

OpenAI has unveiled Operator, its first semi-autonomous AI agent, which is designed to “operate” a web browser on a user’s behalf, much as a person would. The agent uses the cursor to point and click, types on its own, browses the web and performs actions on various websites, such as booking restaurant reservations through OpenTable and assembling orders on Instacart and DoorDash. Operator is not confined to the ChatGPT interface or OpenAI’s application programming interface (API).

“This product is the beginning of our step into agents,” said CEO and cofounder Sam Altman in a demo livestreamed on the company’s YouTube channel today at 1 pm ET.

OpenAI president and fellow cofounder Greg Brockman wrote on X: “2025 is the year of agents.”

The preview, now available to paying U.S. subscribers of OpenAI’s ChatGPT Pro ($200 per month) plan, aims to demonstrate the potential of agentic AI while gathering critical feedback to refine its capabilities.

Operator doesn’t take over your web browser, though. Instead, you visit a separate, new website, operator.chatgpt.com, and are presented with a prompt input box similar to ChatGPT’s.

Typing a request into this box — “find me tickets for the LA Lakers game tonight” — will trigger Operator to open a separate, virtual browser running in the cloud on OpenAI servers. Then, the agent can execute tasks like filling out forms, managing online reservations, even booking tickets to sporting events and concerts, and navigating other common workflows. The user watches the cursor move on its own on the cloud-based browser in real time. If the agent encounters a problem, it will stop and message the user via a text output, similar to ChatGPT’s responses.

Also, below the virtual browser, the user will see suggestions of actions Operator can take on their behalf.

Yet the user can take control at any time, similar to semi-autonomous driving systems in modern cars. Operator also asks the user to input their own payment credentials when it reaches a purchase screen on another website. Finally, users can save workflows they expect to reuse and run them again later.

Operator is powered by what OpenAI calls computer-using agent (CUA) technology, a new variant of GPT-4o trained specifically to use computers.

Bridging AI and GUIs

Operator stands apart from other automation tools by mimicking human interaction with graphical user interfaces (GUIs).

Instead of relying on specialized APIs, the system leverages screenshots for visual input and uses virtual mouse and keyboard actions to complete tasks.

The underlying CUA model combines GPT-4o’s vision capabilities with reinforcement learning, enabling the agent to perceive, reason, and act on screen.
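The perceive, reason, act cycle described above can be sketched as a simple loop. This is a minimal illustration and not OpenAI's implementation: the `Action` type, the `FakeBrowser` class and the `choose_action` policy are all hypothetical stand-ins, and a scripted rule takes the place of the CUA model, which would reason over real screenshots.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Action:
    """A hypothetical GUI action: 'click', 'type', or 'done'."""
    kind: str
    target: Optional[str] = None
    text: Optional[str] = None


@dataclass
class FakeBrowser:
    """Stand-in for a cloud browser; the 'screenshot' is a text description."""
    search_box: str = ""
    submitted: bool = False
    log: List[str] = field(default_factory=list)

    def screenshot(self) -> str:
        return f"search_box={self.search_box!r} submitted={self.submitted}"

    def apply(self, action: Action) -> None:
        # Record the action, then update the simulated page state.
        self.log.append(action.kind)
        if action.kind == "type":
            self.search_box = action.text or ""
        elif action.kind == "click" and action.target == "search_button":
            self.submitted = True


def choose_action(screenshot: str, goal: str) -> Action:
    """Hypothetical policy; a real agent would query a vision model here."""
    if "submitted=True" in screenshot:
        return Action("done")
    if f"search_box={goal!r}" in screenshot:
        return Action("click", target="search_button")
    return Action("type", target="search_box", text=goal)


def run_agent(browser: FakeBrowser, goal: str, max_steps: int = 10) -> bool:
    """Loop: perceive (screenshot), reason (choose_action), act (apply)."""
    for _ in range(max_steps):
        observation = browser.screenshot()
        action = choose_action(observation, goal)
        if action.kind == "done":
            return True
        browser.apply(action)
    return False  # step budget exhausted; a real agent would hand back control
```

The step budget in `run_agent` is an assumption added for the sketch; it mirrors the article's point that the agent stops and messages the user when it gets stuck rather than continuing indefinitely.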

This approach allows Operator to handle diverse tasks, including ecommerce browsing, travel planning, and even repetitive tasks like creating playlists or managing shopping lists. Notable benchmarks illustrate its effectiveness:

87% success rate on WebVoyager, a test of live website navigation

58.1% success rate on WebArena, which simulates real-world ecommerce and content management scenarios

But there’s already tough competition: Just yesterday, Chinese tech firm ByteDance (TikTok’s parent company) launched its own AI agent for controlling web browsers and performing actions on a user’s behalf. Called UI-TARS, it’s fully open source and boasts similarly impressive benchmark performance (though it does not appear to have been compared directly on the same benchmarks). That means OpenAI’s Operator will need to be significantly better or more reliable to justify the relatively high ($200/month) cost of accessing it through ChatGPT Pro subscriptions.

Already being tested in enterprise web navigation use cases

OpenAI is partnering with several businesses to ensure Operator meets real-world needs. Companies including Instacart, DoorDash and Etsy are already testing the technology for use cases ranging from grocery delivery to personalized shopping.

Brett Keller, CEO of Priceline, remarked on its utility for travel planning, calling it “a significant step in making travel more seamless and personalized.”

For public-sector applications, the City of Stockton is exploring ways to use Operator to simplify civic engagement. Jamil Niazi, the city’s director of information technology, highlighted AI’s potential to make enrolling in services easier for residents.

Yet there are limitations. Tech publication Every got an early preview, has been testing it for the past week, and found that:

“One of the peculiarities of Operator’s design is that it doesn’t use your browser. Instead, it uses a browser in one of OpenAI’s data centers that you can watch and interact with remotely. The upside of this design decision is that you can use Operator wherever and whenever — for example, on any mobile device.

“The downside is that many sites like Reddit already block AI agents from browsing so they can’t be accessed by Operator. In this research preview mode, Operator is also blocked by OpenAI from accessing certain resource-intensive sites like Figma or competitor-owned sites like YouTube for performance or legal reasons.”

Safety measures

Given its ability to act on users’ behalf, Operator has been developed with robust safety features:

User control: Operator requests confirmation for sensitive actions, such as making purchases or sending emails.

Watch mode: Ensures user supervision for critical tasks, particularly on sensitive sites like email or financial platforms.

Misuse prevention: The system is trained to refuse harmful requests and includes safeguards against adversarial attacks, such as malicious prompts embedded in websites.
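The user-control safeguard above amounts to a gate placed in front of certain actions. The following is a minimal sketch under stated assumptions: the action names in `SENSITIVE_ACTIONS` and the `perform` and `confirm` callbacks are hypothetical, and the article does not describe how OpenAI implements this check.

```python
from typing import Callable

# Hypothetical set of action kinds that require explicit user approval.
SENSITIVE_ACTIONS = {"purchase", "send_email"}


def execute_with_confirmation(
    action: str,
    perform: Callable[[str], str],
    confirm: Callable[[str], bool],
) -> str:
    """Run perform(action) only if it is non-sensitive or the user approves."""
    if action in SENSITIVE_ACTIONS and not confirm(action):
        return f"skipped: {action}"
    return perform(action)
```

In this sketch a declined confirmation skips the action instead of raising an error, so the surrounding task could continue or hand control back to the user.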

OpenAI has also incorporated features to protect user privacy, including options to clear browsing data and opt out of data sharing for model improvements.

Enterprise edition coming

OpenAI envisions a broader role for Operator in both individual and enterprise settings. Over time, the company plans to expand access to Plus, Team, and Enterprise users, eventually integrating Operator into ChatGPT.

There are also plans to make the underlying CUA technology available via an API, enabling developers to create custom computer-using agents.

Despite its potential, Operator remains a work in progress. OpenAI has been transparent about its limitations, such as difficulties with complex interfaces or unfamiliar workflows. Early user feedback will play a pivotal role in improving the system’s accuracy, reliability and safety.

As OpenAI refines Operator through real-world use, it is seeking to transform AI from a passive tool into an active participant in the digital ecosystem. Whether it’s simplifying everyday tasks or innovating business workflows, OpenAI is positioning Operator as the next step in making AI accessible, practical, and secure.


