Anthropic unveiled its latest AI model, Claude Sonnet 4, alongside the more powerful Claude Opus 4. The new Sonnet 4 model promises significant improvements in coding, reasoning, and task handling while maintaining the same pricing as its predecessor.
The upgraded Sonnet 4 isn’t just a minor update. It scored 72.7% on SWE-bench Verified, a software engineering benchmark, which puts it on par with the premium Opus 4 model in coding ability despite being the more affordable option.
“Claude Sonnet 4 and Opus 4 transform AI from a tool into a true collaborator for every person and every team,” says Kate Jensen, head of Growth and Revenue at Anthropic. “Our customers will see project timelines shrink—in many cases from weeks to hours.”
Key Improvements
Sonnet 4 brings several major upgrades that change how people can work with AI:
Extended thinking with tool use allows Sonnet 4 to pause its reasoning, check information through tools like web search, and then continue where it left off. This helps the AI tackle more complex problems that require research.
The model follows instructions more precisely and shows significantly better memory when working with files. It can remember important information across longer conversations, helping it maintain context during extended tasks.
Sonnet 4 is available to free users as well as through the Pro, Max, Team, and Enterprise plans. Pricing stays at $3 per million input tokens and $15 per million output tokens, well below Opus 4’s $15/$75 rate.
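To make the price gap concrete, here is a minimal sketch that applies the per-million-token rates quoted above to a made-up workload. The model labels, the function name `estimate_cost`, and the token counts are illustrative assumptions, not Anthropic identifiers or real usage figures.

```python
# Prices in USD per million tokens (input, output), as quoted in the article.
# The dictionary keys are informal labels, not official API model names.
PRICES_PER_MILLION = {
    "sonnet-4": (3.00, 15.00),
    "opus-4": (15.00, 75.00),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one workload on the given model."""
    input_price, output_price = PRICES_PER_MILLION[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
    for model in PRICES_PER_MILLION:
        cost = estimate_cost(model, input_tokens=200_000, output_tokens=50_000)
        print(f"{model}: ${cost:.2f}")
```

Under these assumed token counts the same job costs about $1.35 on Sonnet 4 and $6.75 on Opus 4. The fivefold gap holds for any workload, since both the input and output rates differ by the same factor.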
Real-World Applications
Early adopters are already seeing benefits from Sonnet 4’s improvements. GitHub plans to use Sonnet 4 to power its coding assistant in GitHub Copilot. Other companies report significant gains in how the AI handles complex tasks:
iGent, a software company, found that Sonnet 4 reduced navigation errors in codebases from 20% to nearly zero, making it much more reliable for developers.
Sourcegraph noted the model “stays on track longer, understands problems more deeply, and provides more elegant code.”
Augment Code, another early user, reported “higher success rates, more surgical code edits, and more careful work through complex tasks.”
Agentic Capabilities
One of the most significant advancements in Sonnet 4 is its ability to function more like an “agent” – working independently on tasks with less human guidance.
Both new Claude models are 65% less likely than Claude Sonnet 3.7 to use shortcuts or loopholes when completing agentic tasks. This improvement makes the AI more trustworthy when handling complex assignments without constant supervision.
Scott White, Anthropic’s product lead, explained the practical benefit: “It’s like the kind of thing that is challenging that might represent 30% of your day, that isn’t necessarily fulfilling or professionally expanding you, but is necessary in the pursuit of being successful in your job.”
The Competitive Landscape
Anthropic’s release comes amid fierce competition in the AI industry. Google recently released Mariner, an AI agent built into Chrome that can complete tasks like buying groceries. OpenAI has launched its own coding agent and a web browsing tool called Operator.
Industry spending on generative AI grew sixfold in 2024 compared with 2023, according to venture capital firm Menlo Ventures. The same report indicated that Anthropic has doubled its market reach, cutting into OpenAI’s dominant position.
Concerns About Job Impact
As AI models like Sonnet 4 become more capable, questions about their impact on jobs grow louder. The World Economic Forum’s Future of Jobs report found that 41% of employers plan to downsize as generative AI takes on more work tasks.
Anthropic acknowledges this concern but suggests AI will also create new opportunities. White believes AI will make it easier for people to grow their careers outside formal education, like engineers using AI for design work without design training.
“It’s not also something that only Anthropic can take a perspective on,” White said. “This is something that the government, policy makers, many companies, need to work together to understand the arc of how this is going to be implemented.”
Safety Measures
Anthropic has implemented increased safety measures with its new models. The company activated AI Safety Level 3 (ASL-3) protections specifically for Claude Opus 4 due to its advanced capabilities and potential for misuse, particularly regarding chemical, biological, radiological, and nuclear knowledge.
For users who need to track AI reasoning processes, Anthropic introduced “thinking summaries” that condense lengthy thought processes. These summaries are only needed about 5% of the time, as most thought processes are short enough to display completely.
The introduction of Claude Sonnet 4 marks another step toward more capable AI assistants that can handle increasingly complex tasks with less human guidance. While the technology shows promise for boosting productivity, the broader implications for work and society remain an open question that will require input from many stakeholders.