Gemini 2.5 Computer Use: 70%+ Accuracy, ~225s Latency 

GigaNectar Team

Gemini 2.5 Computer Use — Preview Access, Benchmarks, Loop & Safety

Availability: The model is in public preview via the Gemini API in Google AI Studio and Vertex AI.

What it does: Through the computer_use tool, the model analyzes a screenshot of the interface and emits UI actions: clicking, typing, scrolling, dragging, filling in forms and dropdowns, and operating behind logins. Client code executes each action, and the loop repeats until the task is complete or the run is terminated.

Benchmarks & latency: Reported results include 70%+ accuracy at roughly 225s latency on Online‑Mind2Web as measured on the Browserbase harness, with leading results also reported on WebVoyager and AndroidWorld. The source announcement provides the benchmark table and the scatterplot.

Where to learn more: See the Google DeepMind announcement for the model flow, safety notes, demos, and evaluation references.

70%+: accuracy on Online‑Mind2Web (Browserbase harness)

≈ 225s: latency reported for the same harness

Leads: Online‑Mind2Web • WebVoyager • AndroidWorld

Preview: accessible in AI Studio and Vertex AI

1) Inputs

User task, screenshot of the UI, recent actions, optional allow/deny list for functions.

2) Propose Action

The model returns a function call (for example click, type, or scroll). Some actions may request user confirmation before they are executed.

3) Execute

Client code runs the action in the browser or app.

4) Update Context

New screenshot and current URL are sent back to continue or finish the loop.
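
The four steps above can be sketched as a small client-side loop. This is a minimal illustration under stated assumptions, not the Gemini API: Action, Observation, run_agent_loop, and the propose/execute/observe/confirm callbacks are hypothetical names invented here, and the model is replaced by a scripted stand-in so the example runs without network access.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Action:
    """One UI action proposed by the model (hypothetical shape, not the real API)."""
    name: str                          # e.g. "click", "type", "scroll"
    args: dict = field(default_factory=dict)
    needs_confirmation: bool = False   # set when the action is flagged as sensitive


@dataclass
class Observation:
    """What the client sends back after each step."""
    screenshot: bytes
    url: str


def run_agent_loop(
    task: str,
    propose: Callable[[str, Observation, list, frozenset], Optional[Action]],
    execute: Callable[[Action], None],
    observe: Callable[[], Observation],
    confirm: Callable[[Action], bool],
    excluded: frozenset = frozenset(),
    max_steps: int = 20,
) -> list:
    """Run the loop until the model stops, an action is refused, or max_steps is hit."""
    history: list = []
    obs = observe()                                      # 1) inputs
    for _ in range(max_steps):
        action = propose(task, obs, history, excluded)   # 2) propose action
        if action is None:                               # model signals it is finished
            break
        if action.name in excluded:                      # enforce the deny list client-side
            break
        if action.needs_confirmation and not confirm(action):
            break                                        # user declined a sensitive action
        execute(action)                                  # 3) execute
        history.append(action)
        obs = observe()                                  # 4) update context
    return history


def demo() -> list:
    """Drive the loop with a scripted stand-in for the model and a fake browser."""
    script = iter([
        Action("click", {"x": 120, "y": 48}),
        Action("type", {"text": "hello"}),
        Action("click", {"x": 300, "y": 400}, needs_confirmation=True),
        None,  # the stand-in model reports that the task is done
    ])
    executed = []
    return run_agent_loop(
        task="fill in the form",
        propose=lambda task, obs, history, excluded: next(script),
        execute=executed.append,
        observe=lambda: Observation(screenshot=b"", url="https://example.com"),
        confirm=lambda action: True,
    )


if __name__ == "__main__":
    print([a.name for a in demo()])
```

In a real client, propose would call the model with the task, the latest screenshot, and the action history, while execute and observe would be backed by a browser automation layer. The max_steps cap is a design choice added here so that a run cannot loop indefinitely.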

Demo 1

“From https://tinyurl.com/pet-care-signup, get all details for any pet with a California residency and add them as a guest in my spa CRM at https://pet-luxe-spa.web.app/. Then, set up a follow up visit appointment with the specialist Anima Lavar for October 10th anytime after 8am. The reason for the visit is the same as their requested treatment.”

Demo 2

“My art club brainstormed tasks ahead of our fair. The board is chaotic and I need your help organizing the tasks into some categories I created. Go to sticky-note-jam.web.app and ensure notes are clearly in the right sections. Drag them there if not.”

Quick Quiz

Q1. Which Gemini API tool emits the UI actions?

Q2. Reported Online‑Mind2Web accuracy band?

Q3. What is required for certain sensitive actions?

This section covered preview access, the action loop, referenced benchmarks and latency, demo prompts, safety controls, and links for further reading.
