Claude 3.7 Sonnet: Hybrid AI Model Excels in Coding

Anthropic has officially released Claude 3.7 Sonnet, its most advanced AI model to date and the market’s first hybrid reasoning model. This new version allows users to choose between quick responses and extended, step-by-step thinking that is visible to the user.

The model maintains the same pricing as previous versions at $3 per million input tokens and $15 per million output tokens, including tokens used for thinking. Claude 3.7 Sonnet is available across all Claude plans—Free, Pro, Team, and Enterprise—though the extended thinking mode is not included in the free tier.

Two Ways of Thinking in One Model

What sets Claude 3.7 Sonnet apart is its dual approach to reasoning. Unlike competitors who offer separate models for quick answers and complex problem-solving, Anthropic has integrated both capabilities into a single model.

“Just as humans use a single brain for both quick responses and deep reflection, we believe reasoning should be an integrated capability of frontier models rather than a separate model entirely,” Anthropic stated in its press release.

In standard mode, Claude 3.7 Sonnet functions as an improved version of Claude 3.5 Sonnet. When switched to extended thinking mode, it self-reflects before answering, improving performance on math, physics, coding, and other complex tasks.

API users gain additional control through a customizable “thinking budget,” allowing them to specify how many tokens (up to 128K) the model can use for reasoning. This feature helps users balance speed, cost, and answer quality based on their specific needs.

Coding Capabilities Take Center Stage

Early testing shows Claude 3.7 Sonnet excels particularly in coding and front-end web development. According to Anthropic’s benchmarks, the model achieved state-of-the-art performance on SWE-bench Verified, reaching 70.3% accuracy when evaluating the model’s ability to solve real-world software issues.

Several tech companies have reported positive results with the new model. Cursor noted it is “best-in-class for real-world coding tasks,” while Cognition found it “far better than any other model at planning code changes and handling full-stack updates.” Canva’s evaluations showed the model “consistently produced production-ready code with superior design taste and drastically reduced errors.”

Claude Code: A New Tool for Developers

Alongside Claude 3.7 Sonnet, Anthropic introduced Claude Code—a command-line tool currently available as a limited research preview. This agentic coding tool allows developers to delegate engineering tasks directly from their terminal.

Claude Code can search and read code, edit files, write and run tests, commit and push code to GitHub, and use command line tools while keeping users informed throughout the process.

“In early testing, Claude Code completed tasks in a single pass that would normally take 45+ minutes of manual work, reducing development time and overhead,” Anthropic reported.

Safety Improvements

Claude 3.7 Sonnet also makes “more nuanced distinctions between harmful and benign requests,” reducing unnecessary refusals by 45% compared to its predecessor. This means the model is less likely to reject legitimate requests while maintaining safety standards.

The system card for this release details Anthropic’s Responsible Scaling Policy evaluations and addresses emerging risks, particularly prompt injection attacks. It also examines potential safety benefits from reasoning models, including improved decision transparency.

Competitive Landscape

Claude 3.7 Sonnet enters a market with several new reasoning models, including OpenAI’s o1 and o3 series, Google’s Gemini 2.0 Flash Thinking, and DeepSeek’s R1. According to self-reported benchmarks from Anthropic, Claude 3.7 Sonnet outperforms competing models in coding tasks.

When comparing prices, Claude 3.7 Sonnet ($3 per million input tokens, $15 per million output tokens) is more expensive than some alternatives. For comparison, OpenAI’s GPT-4o Mini costs $1.10 per million input tokens and $4.40 per million output tokens.

FAQ

Q: What makes Claude 3.7 Sonnet different from previous models?
A: Claude 3.7 Sonnet is the first hybrid reasoning model that combines quick responses with extended thinking in one model. It shows its reasoning process and allows users to control how much thinking the model does before answering.

Q: How much does Claude 3.7 Sonnet cost?
A: The model costs $3 per million input tokens and $15 per million output tokens, including thinking tokens. This pricing is the same as previous Claude models.

Q: Is the extended thinking mode available for free users?
A: No, the extended thinking mode is available on all Claude plans (Pro, Team, and Enterprise) except the free tier.

Q: What is Claude Code?
A: Claude Code is an agentic coding tool that allows developers to delegate engineering tasks to Claude directly from their terminal. It can search code, edit files, run tests, and push code to GitHub. It’s currently available as a limited research preview.

Q: How does Claude 3.7 Sonnet compare to competitors like OpenAI’s models?
A: According to Anthropic’s benchmarks, Claude 3.7 Sonnet outperforms competitors in coding tasks and certain reasoning benchmarks. It shows particularly strong results in SWE-bench Verified and TAU-bench for real-world software development tasks.Q: How has the safety of Claude improved in this version?
A: Claude 3.7 Sonnet makes more nuanced distinctions between harmful and benign requests, reducing unnecessary refusals by 45% compared to previous versions. It also includes new defenses against prompt injection attacks.

Two Ways of Thinking in One Model

Coding Capabilities Take Center Stage

Claude Code: A New Tool for Developers

Similar Posts

Safety Improvements

Competitive Landscape

FAQ

Leave a comment Cancel reply

News, Technology

Microsoft Ends 40-Year Blue Screen Era: New Black Design Promises 2-Second Recovery Times

AI, Apps, News

Google Photos Solves AI Search Latency: Results Appear Instantly While Gemini Works in Background

AI, Apps, News

Google’s Doppl App Creates AI Videos of You Wearing Any Outfit as Virtual Try-On Market Heads to $49B

AI, Apps

Perplexity AI Adds Task Scheduling to WhatsApp: No Downloads Required for 3 Billion Users

AI, News

Google Launches Gemini CLI with 1,000 Free Daily AI Requests

News, Technology

Apple iCloud 5-Hour Outage Affects 9 Services with 900+ User Complaints and No Explanation

Claude 3.7 Sonnet: Hybrid AI Model Excels in Coding

Two Ways of Thinking in One Model

Coding Capabilities Take Center Stage

Claude Code: A New Tool for Developers

Similar Posts

Safety Improvements

Competitive Landscape

FAQ

Share this:

Leave a comment Cancel reply

most recent

News, Technology

Microsoft Ends 40-Year Blue Screen Era: New Black Design Promises 2-Second Recovery Times

AI, Apps, News

Google Photos Solves AI Search Latency: Results Appear Instantly While Gemini Works in Background

AI, Apps, News

Google’s Doppl App Creates AI Videos of You Wearing Any Outfit as Virtual Try-On Market Heads to $49B

AI, Apps

Perplexity AI Adds Task Scheduling to WhatsApp: No Downloads Required for 3 Billion Users

AI, News

Google Launches Gemini CLI with 1,000 Free Daily AI Requests

News, Technology

Apple iCloud 5-Hour Outage Affects 9 Services with 900+ User Complaints and No Explanation