Google is developing screen automation capabilities for Gemini that will let the AI assistant control Android apps directly on your device. Code analysis of the Google app beta version 17.4 reveals a feature codenamed “bonobo” that enables Gemini to book rides, place food orders, and complete multi-step tasks through direct app interaction.
The feature uses screen automation technology built into Android 16 QPR3, allowing Gemini to navigate app interfaces the same way humans do. This development comes as Google expands its agentic AI capabilities, following the recent introduction of Auto Browse in Chrome for desktop browsers.
Google has included privacy warnings in the code strings, stating users remain responsible for actions taken by Gemini. Screenshots of app interactions will be reviewed by trained reviewers when Keep Activity is enabled. The company advises against using screen automation for emergencies or tasks involving sensitive information.
How Screen Automation Functions
The screen automation feature allows Gemini to interact with apps installed on your Android device by viewing and manipulating on-screen elements. When you give a command like “book an Uber to the airport” or “order pad thai from my usual restaurant,” Gemini opens the relevant app, navigates through menus, selects options, and completes the task.
This capability differs from traditional API integrations. Instead of apps building specific connections to Gemini, the AI visually processes what appears on screen and simulates user inputs like taps and swipes. The approach is flexible, because Gemini can work with apps whose developers never built any Gemini support, but it is also brittle, because app interfaces change frequently.
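Android already exposes the building blocks for this kind of control through its accessibility APIs, which let a privileged service read the foreground window's element tree and act on it. The Kotlin sketch below illustrates the general technique of finding an on-screen element by its visible label and clicking it. It is a minimal sketch of the platform mechanism, not Google's implementation (which has not been published); the class name, the "Confirm pickup" label, and the overall flow are placeholders, and a real service would also need to be declared in the manifest and enabled by the user.

```kotlin
import android.accessibilityservice.AccessibilityService
import android.view.accessibility.AccessibilityEvent
import android.view.accessibility.AccessibilityNodeInfo

// Minimal sketch of label-based screen control via Android's accessibility APIs.
// Class name and target label are placeholders; Gemini's actual pipeline is not public.
class ScreenAutomationService : AccessibilityService() {

    override fun onAccessibilityEvent(event: AccessibilityEvent?) {
        // A real agent would derive the next step from the user's request and the
        // current screen state; this demo just tries to tap a "Confirm pickup" button.
        findAndTap("Confirm pickup")
    }

    override fun onInterrupt() {
        // Required callback; nothing to clean up in this sketch.
    }

    private fun findAndTap(label: String): Boolean {
        val root = rootInActiveWindow ?: return false
        // Matches visible text and content descriptions anywhere in the window.
        for (node in root.findAccessibilityNodeInfosByText(label)) {
            // Climb to the nearest clickable ancestor, then perform a click.
            var target: AccessibilityNodeInfo? = node
            while (target != null && !target.isClickable) {
                target = target.parent
            }
            if (target?.performAction(AccessibilityNodeInfo.ACTION_CLICK) == true) {
                return true
            }
        }
        return false
    }
}
```

For elements that expose no clickable node, an agent of this kind would typically fall back to coordinate-based input with dispatchGesture, and it would pause for user confirmation on sensitive steps, which matches the supervision model Google describes.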
Technical Requirements
- Android 16 QPR3 or later operating system
- Google app beta version 17.4 or newer
- Google Labs access for experimental features
- Keep Activity setting for screenshot review
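For a rough sense of how those requirements translate to code, the Kotlin sketch below checks the two pieces that can be probed programmatically: the OS API level (Android 16 reports API level 36) and the installed Google app version (package com.google.android.googlequicksearchbox). The function name is illustrative, and the check cannot detect a QPR3 build, Labs access, or the Keep Activity setting, which have no public API.

```kotlin
import android.content.Context
import android.content.pm.PackageManager
import android.os.Build

// Rough compatibility probe for the published requirements (illustrative only).
// Assumes Android 16 == API level 36; QPR3, Labs access, and Keep Activity
// cannot be detected programmatically, so this is a partial check at best.
fun meetsScreenAutomationRequirements(context: Context): Boolean {
    // Android 16 or later.
    val osOk = Build.VERSION.SDK_INT >= 36

    // Google app (com.google.android.googlequicksearchbox) 17.4 or newer.
    val appOk = try {
        val info = context.packageManager
            .getPackageInfo("com.google.android.googlequicksearchbox", 0)
        val parts = (info.versionName ?: "0.0").split(".")
        val major = parts.getOrNull(0)?.toIntOrNull() ?: 0
        val minor = parts.getOrNull(1)?.toIntOrNull() ?: 0
        major > 17 || (major == 17 && minor >= 4)
    } catch (e: PackageManager.NameNotFoundException) {
        false
    }

    return osOk && appOk
}
```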
Google has confirmed the feature will initially work with “certain apps” only. Early indications suggest ride-hailing services and food delivery platforms will receive priority support. The limited rollout allows Google to refine the technology before broader deployment across the Android ecosystem.
Privacy and User Responsibility
Google’s code strings include explicit warnings about privacy implications. When Gemini interacts with apps through screen automation, the system captures screenshots of the process. If users have Keep Activity enabled in their Google account settings, trained reviewers will examine these screenshots to improve the service.
The company states: “When Gemini interacts with an app, screenshots are reviewed by trained reviewers and used to improve Google services if Keep Activity is on.” Users can disable this data collection by turning off Keep Activity, though this limits Google’s ability to refine the feature based on real usage patterns.
Google explicitly advises users: “Don’t enter login or payment information into Gemini chats. Avoid using screen automation for emergencies or tasks involving sensitive information.” The warnings acknowledge that AI systems can misinterpret instructions or make errors during execution.
User Responsibility Notice
Google’s warning states: “Gemini can make mistakes. You’re responsible for what it does on your behalf, so supervise it closely.”
- Users remain responsible for all actions Gemini takes on their behalf
- Manual supervision required during task execution
- Stop button available to halt automation at any time
- Screenshot review occurs when Keep Activity is enabled
- Payment information should not be shared in Gemini chats
Release Timeline and Availability
The Android 16 QPR3 stable release is scheduled for March 2026, providing the technical foundation for screen automation. Google has been testing the operating system’s screen automation permissions since January 2026, with QPR3 Beta 2 including new settings under “Special app access” that allow apps to “help you complete tasks by interacting with other apps’ screen content.”
When Google will officially announce or enable the bonobo feature remains unconfirmed. Based on patterns from similar features like Chrome’s Auto Browse capability, screen automation may initially be restricted to paid subscription tiers. Chrome Auto Browse launched exclusively for Google AI Pro and AI Ultra subscribers in the United States, with Pro costing $20 monthly and Ultra priced at $250 monthly.
The feature will initially work with a limited set of applications. App developers frequently update their interfaces, which can break screen automation that relies on recognizing specific visual elements. Google’s cautious rollout strategy allows the company to refine the technology with popular apps before expanding to the broader Android ecosystem. Related developments in AI-powered device control include Apple’s work on enhanced Siri capabilities.
Comparison to Desktop Auto Browse
Google introduced Auto Browse for Chrome on desktop in January 2026, powered by Gemini 2.0 technology. The feature allows AI to navigate websites, fill forms, and complete multi-step workflows within the browser. Desktop Auto Browse currently handles 20 requests per day for Pro subscribers and 200 requests for Ultra tier users.
Screen automation on Android represents an extension of this agentic AI approach to mobile devices. While Chrome Auto Browse operates within a web browser environment, the Android implementation must handle native mobile apps with varied interfaces and interaction patterns. The technical challenges differ significantly, as mobile apps don’t follow standardized structures like web pages do.
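One way to picture that difference: where Chrome’s Auto Browse can walk a DOM, an Android agent sees the foreground app as a tree of AccessibilityNodeInfo objects with no stable selectors. The Kotlin sketch below, with an illustrative ScreenElement record and describeScreen function (both hypothetical names, not any Google API), flattens that tree into the kind of "what is on screen" summary an agent could reason over.

```kotlin
import android.graphics.Rect
import android.view.accessibility.AccessibilityNodeInfo

// Illustrative record of one visible element: its widget class, label, and position.
data class ScreenElement(val className: String, val text: String, val bounds: Rect)

// Flatten the accessibility tree into simple "what's on screen" records.
// There is no DOM here: agents rely on visible text, content descriptions, and bounds.
fun describeScreen(root: AccessibilityNodeInfo): List<ScreenElement> {
    val elements = mutableListOf<ScreenElement>()
    fun visit(node: AccessibilityNodeInfo?) {
        if (node == null) return
        val label = node.text?.toString() ?: node.contentDescription?.toString()
        if (!label.isNullOrBlank()) {
            val bounds = Rect()
            node.getBoundsInScreen(bounds) // screen coordinates of the element
            elements.add(ScreenElement(node.className?.toString() ?: "", label, bounds))
        }
        for (i in 0 until node.childCount) {
            visit(node.getChild(i))
        }
    }
    visit(root)
    return elements
}
```

Because that summary can change with every app update, an agent built this way has to re-derive it after each action, which is one plausible reason Google is starting with “certain apps” only.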
Both features raise similar privacy considerations. Users must grant permission for the AI to access login information stored in Google Password Manager. The company has implemented safeguards, including user confirmation requests before completing certain actions and the ability to manually take over a task at any point. These protections aim to reduce unintended consequences while keeping the convenience of automated task completion.
Integration with Project Astra
The screen automation capability connects to Google’s broader Project Astra initiative, which aims to create a universal AI assistant that understands and interacts with the world through multiple modalities. At Google I/O 2024, the company demonstrated Project Astra controlling Android apps, including scrolling through Chrome and clicking elements in the YouTube app.
Project Astra combines visual understanding, natural language processing, and action execution into a unified system. The technology being tested in screen automation represents a practical implementation of these research concepts. Google has been integrating Astra capabilities into consumer products gradually, with Gemini Live receiving video understanding features that originated from Project Astra research.
Additional code strings in the Google app beta reference a “Likeness” feature codenamed “wasabi,” which relates to 3D avatar creation for video calls. Android XR uses similar avatar technology, suggesting Google is building a cohesive AI ecosystem across form factors including phones, headsets, and smart glasses. The integration of screen automation with these broader initiatives points to Google’s long-term strategy for AI-powered device interaction.
Looking Ahead
Google’s screen automation for Gemini on Android remains in development. The technical picture so far includes the feature’s “bonobo” codename, its foundation in Android 16 QPR3, and privacy safeguards centered on screenshot review. The March 2026 timeline corresponds to the expected stable release of Android 16 QPR3.
Google’s approach involves initial testing through Labs features before potential wider deployment. Users interested in screen automation capabilities should monitor announcements around the Android 16 QPR3 launch and check device compatibility requirements. The feature’s development continues alongside other agentic AI initiatives including Chrome Auto Browse and Project Astra integration.






