Google has released Gemini 2.5 Flash, a new AI model that lets users decide how much “thinking” their AI should do and pay only for the reasoning they actually use. The mid-April 2025 launch marks a shift toward pricing AI by how much reasoning it performs.
The model introduces what Google calls a “thinking budget,” allowing developers to set limits on how deeply the AI reasons through problems. When you need simple answers, you can turn thinking off completely. For complex problems, you can dial up the reasoning power.
“We know cost and latency matter for a number of developer use cases, and so we want to offer developers the flexibility to adapt the amount of thinking the model does, depending on their needs,” said Tulsee Doshi, Product Director for Gemini Models at Google DeepMind.
The Cost of Thinking
The price difference between responses generated with and without thinking is significant. Input costs stay at $0.15 per million tokens either way, but output costs jump from $0.60 per million tokens with thinking off to $3.50 with thinking on, nearly six times as much.
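To see what that gap means in practice, here is a back-of-the-envelope sketch in Python. The request sizes are invented for illustration, and the calculation simply applies the per-million-token rates quoted above; it does not model how thinking tokens themselves are metered.

```python
# Rough cost comparison at the published rates (USD per million tokens).
# The request sizes below are made-up illustration values.
INPUT_RATE = 0.15            # input, with or without thinking
OUTPUT_RATE_PLAIN = 0.60     # output, thinking off
OUTPUT_RATE_THINKING = 3.50  # output, thinking on

def request_cost(input_tokens: int, output_tokens: int, thinking: bool) -> float:
    """Estimated cost in USD for a single request."""
    output_rate = OUTPUT_RATE_THINKING if thinking else OUTPUT_RATE_PLAIN
    return (input_tokens * INPUT_RATE + output_tokens * output_rate) / 1_000_000

# Example: a 10,000-token prompt that produces 2,000 output tokens.
print(f"thinking off: ${request_cost(10_000, 2_000, thinking=False):.4f}")  # $0.0027
print(f"thinking on:  ${request_cost(10_000, 2_000, thinking=True):.4f}")   # $0.0085
```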
This pricing model reflects how much more computing power is needed when the AI analyzes complex problems step-by-step rather than providing quick responses. Developers can set a thinking budget anywhere from 0 to 24,576 tokens.
What makes this system clever is that the model doesn’t always use its full budget. It automatically judges how much thinking a task requires. A simple question like “How many provinces does Canada have?” needs minimal reasoning, while engineering calculations trigger deeper thinking processes.
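In the Gemini API, the budget is exposed through a thinking configuration option. The sketch below uses the google-genai Python SDK; the preview model name is an assumption based on the launch, so check the current documentation before relying on it. Setting the budget to 0 turns thinking off entirely.

```python
from google import genai
from google.genai import types

# The client reads the API key from the GOOGLE_API_KEY environment variable.
client = genai.Client()

response = client.models.generate_content(
    # Preview model name at launch; assumed here, verify against current docs.
    model="gemini-2.5-flash-preview-04-17",
    contents="A beam spans 6 m and carries a uniform load of 4 kN/m. "
             "What is the maximum bending moment?",
    config=types.GenerateContentConfig(
        # Cap internal reasoning at 1,024 tokens; any value from 0 to 24,576 works.
        # thinking_budget=0 disables thinking and bills output at the cheaper rate.
        thinking_config=types.ThinkingConfig(thinking_budget=1024),
    ),
)
print(response.text)
```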
How Good Is It?
Despite being designed for speed and cost savings, Gemini 2.5 Flash performs well on difficult tasks. It scored 12.1% on Humanity’s Last Exam (HLE), a tough benchmark that tests advanced reasoning. That score beat competitors such as Claude 3.7 Sonnet (8.9%) and DeepSeek R1 (8.6%), though it fell short of OpenAI’s recently launched o4-mini (14.3%).
The model can process up to one million tokens of context, meaning it can handle extremely long documents or conversations. It works with text, images, video, and audio inputs.
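As a sketch of the multimodal side, the same SDK accepts mixed content parts in a single request. The file name and MIME type below are placeholders, and the preview model name is again an assumption.

```python
from google import genai
from google.genai import types

client = genai.Client()  # API key from GOOGLE_API_KEY

# Placeholder image; any PNG or JPEG bytes work the same way.
with open("chart.png", "rb") as f:
    image_bytes = f.read()

response = client.models.generate_content(
    model="gemini-2.5-flash-preview-04-17",  # assumed preview model name
    contents=[
        types.Part.from_bytes(data=image_bytes, mime_type="image/png"),
        "Write a one-sentence caption for this image.",
    ],
)
print(response.text)
```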
What Can It Do?
Gemini 2.5 Flash is built for tasks that need a balance of smarts and speed: summarizing documents, answering questions in chat systems, pulling specific information from texts, and creating captions for images and videos.
The model’s thinking capabilities shine when handling multi-step problems like complex math or analyzing detailed research questions. When these abilities aren’t needed, users can turn thinking off to save money and speed up response times.
Availability and Industry Impact
Currently available in preview through the Gemini API in Google AI Studio and Vertex AI, the model is also accessible in the Gemini app as “2.5 Flash (Experimental).” Google has not announced when it will be generally available for full production use.
For businesses, this release offers a way to control AI costs while still accessing advanced capabilities when needed. The approach reflects a maturing AI market where companies need to carefully manage expenses as they integrate AI into daily operations.
This release comes during a busy week for Google, which also rolled out Veo 2 video generation capabilities and announced free Gemini Advanced access for U.S. college students until spring 2026.