OpenAI Operator research preview is now out
I am pleased to have been given an early look at this new project. I think that in less than a year’s time many of us will be using an updated version for many ordinary tasks: “Operator is one of our first agents, which are AIs capable of doing work for you independently—you give it a task and it will execute it.” And:
Operator is powered by a new model called Computer-Using Agent (CUA). Combining GPT-4o’s vision capabilities with advanced reasoning through reinforcement learning, CUA is trained to interact with graphical user interfaces (GUIs)—the buttons, menus, and text fields people see on a screen.
Operator can “see” (through screenshots) and “interact” (using all the actions a mouse and keyboard allow) with a browser, enabling it to take action on the web without requiring custom API integrations.
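To make that screenshot-and-keyboard loop concrete, here is a minimal sketch of the architecture the quote describes, using Playwright to drive a browser. The `propose_action` function is a hypothetical stand-in for a CUA-style vision model, not OpenAI's actual API; everything else is plain Playwright.

```python
# Sketch of the loop: the agent "sees" via screenshots and "acts" via
# mouse and keyboard, with no site-specific API integration.
from dataclasses import dataclass
from playwright.sync_api import sync_playwright


@dataclass
class Action:
    kind: str          # "click", "type", or "done"
    x: int = 0         # screen coordinates for clicks
    y: int = 0
    text: str = ""     # text to type


def propose_action(screenshot: bytes, task: str) -> Action:
    """Hypothetical stand-in for the Computer-Using Agent model:
    given a screenshot and a task, return the next GUI action.
    A real agent would send the screenshot to a vision model here."""
    return Action(kind="done")


def run_task(task: str, start_url: str, max_steps: int = 20) -> None:
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(start_url)
        for _ in range(max_steps):
            shot = page.screenshot()                 # the agent "sees"
            action = propose_action(shot, task)
            if action.kind == "click":
                page.mouse.click(action.x, action.y)  # acts via mouse...
            elif action.kind == "type":
                page.keyboard.type(action.text)       # ...and keyboard
            else:
                break                                 # task reported done
        browser.close()


if __name__ == "__main__":
    run_task("find the cheapest flight to Boston", "https://example.com")
```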
Here is the associated OpenAI blog post. Exciting times.