OpenAI has released its GPT-4.1 family of models with a laser focus on coding capabilities and dramatic price cuts that could intensify competition in the AI market. The new lineup—GPT-4.1, GPT-4.1 Mini, and GPT-4.1 Nano—offers improved performance at lower costs, putting pressure on competitors like Anthropic, Google, and xAI.
Big Context Windows, Smaller Price Tags
All three new models boast a massive 1-million-token context window, nearly eight times GPT-4o's 128,000-token limit. This expanded capacity allows developers to process entire codebases, lengthy documentation, and complete meeting transcripts without losing context.
The most striking change? The pricing:
- GPT-4.1: $2.00 per million input tokens, $8.00 per million output tokens
- GPT-4.1 Mini: $0.40 per million input tokens, $1.60 per million output tokens
- GPT-4.1 Nano: $0.10 per million input tokens, $0.40 per million output tokens
OpenAI has increased the prompt caching discount to 75% (up from 50%), making repeated queries much cheaper. For median queries, GPT-4.1 costs 26% less than GPT-4o.
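To put those numbers in context, here is a minimal cost-estimation sketch in Python. It is illustrative only: it uses the list prices above and assumes the 75% caching discount means cached input tokens are billed at a quarter of the normal input rate.

```python
# Illustrative cost estimate based on the list prices above.
# Assumes the 75% prompt-caching discount applies to cached input tokens.

PRICES = {  # USD per 1M tokens: (input, output)
    "gpt-4.1":      (2.00, 8.00),
    "gpt-4.1-mini": (0.40, 1.60),
    "gpt-4.1-nano": (0.10, 0.40),
}

def estimate_cost(model, input_tokens, output_tokens, cached_input_tokens=0):
    """Rough per-request cost in USD; cached input billed at 25% of the input rate."""
    in_rate, out_rate = PRICES[model]
    uncached = input_tokens - cached_input_tokens
    return (uncached * in_rate
            + cached_input_tokens * in_rate * 0.25
            + output_tokens * out_rate) / 1_000_000

# Example: a 100k-token prompt (80k of it cached) with a 2k-token reply on GPT-4.1
print(f"${estimate_cost('gpt-4.1', 100_000, 2_000, cached_input_tokens=80_000):.4f}")
```

On those assumptions, that request comes to roughly $0.10, with the caching discount cutting the input bill nearly in half.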
“We’ve optimized GPT-4.1 for real-world use based on direct feedback to improve in areas that developers care most about: frontend coding, making fewer extraneous edits, following formats reliably, adhering to response structure and ordering, consistent tool usage, and more,” an OpenAI spokesperson told TechCrunch.
Coding Performance Jumps
In coding benchmarks, GPT-4.1 scored 54.6% on SWE-bench Verified, a significant improvement over GPT-4o’s 33.2% and GPT-4.5 Preview’s 38.0%. While this falls short of Google’s Gemini 2.5 Pro (63.8%) and Anthropic’s Claude 3.7 Sonnet (62.3%), OpenAI’s focus on practical developer workflows appears to be paying off.
Qodo found GPT-4.1 provided better suggestions in 55% of real GitHub pull requests when compared to other leading models. In the area of frontend development specifically, websites built by GPT-4.1 were preferred over those created by GPT-4o 80% of the time in human evaluations.
Varun Mohan, CEO of Windsurf, reported that GPT-4.1 performed 60 percent better than GPT-4o on the company's internal benchmarks. "We found that GPT-4.1 has substantially fewer cases of degenerate behavior," Mohan said, noting that the new model spends less time reading and editing irrelevant files by mistake.
Competitors Feel the Squeeze
The pricing structure puts pressure on Anthropic’s Claude models:
- Claude 3.7 Sonnet: $3.00 per million input tokens, $15.00 per million output tokens
- Claude 3.5 Haiku: $0.80 per million input tokens, $4.00 per million output tokens
- Claude 3 Opus: $15.00 per million input tokens, $75.00 per million output tokens
Google’s Gemini pricing also looks expensive by comparison, especially with its tiered structure that increases costs for longer contexts:
- Gemini 2.5 Pro (≤200k tokens): $1.25 per million input tokens, $10.00 per million output tokens
- Gemini 2.5 Pro (>200k tokens): $2.50 per million input tokens, $15.00 per million output tokens
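For a rough comparison: a single request with 300,000 input tokens and 5,000 output tokens costs about $0.64 on GPT-4.1 ($0.60 for input plus $0.04 for output at its flat rate). On Gemini 2.5 Pro, assuming the whole prompt is billed at the higher tier once it passes 200,000 tokens, the same request comes to roughly $0.83 ($0.75 for input plus about $0.08 for output).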
Availability and Integration
The GPT-4.1 family is available exclusively through OpenAI's API, not directly in ChatGPT. According to OpenAI, the models have a knowledge cutoff of June 2024 and an expanded output limit of 32,768 tokens (up from GPT-4o's 16,384).
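For developers, switching to the new models looks like any other model swap in the API. The sketch below uses OpenAI's Python SDK; the model identifiers follow OpenAI's announced naming, and the exact parameters (such as the output-token cap) should be checked against the official documentation.

```python
# Minimal sketch of calling a GPT-4.1 model through OpenAI's Chat Completions API.
# Illustrative only; verify model names and parameters against OpenAI's docs.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4.1-mini",  # or "gpt-4.1" / "gpt-4.1-nano"
    messages=[
        {"role": "system", "content": "You are a concise code reviewer."},
        {"role": "user", "content": "Review this diff for bugs:\n..."},
    ],
    max_tokens=32_768,  # the expanded output ceiling reported for GPT-4.1
)
print(response.choices[0].message.content)
```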
OpenAI has integrated these models into GitHub Copilot and GitHub Models, making them accessible within developers’ existing workflows.
Technical Limitations
OpenAI acknowledges some limitations. GPT-4.1 becomes less reliable as the number of input tokens increases—its accuracy on the OpenAI-MRCR test dropped from around 84% with 8,000 tokens to 50% with 1 million tokens. The company also notes that GPT-4.1 tends to be more “literal” than GPT-4o, sometimes requiring more specific prompts.
Latency for GPT-4.1 is approximately fifteen seconds with 128,000 tokens of context and around a minute with a full million. The Mini and Nano versions are faster, with Nano typically returning the first token in under five seconds for queries with 128,000 input tokens.
AI Market Implications
OpenAI’s pricing moves could force competitors to adjust their strategies. Windsurf is already offering an unprecedented free, unlimited GPT-4.1 trial for a week, betting that developers won’t want to return to pricier options once they experience the cost savings.
For companies managing tight AI budgets, especially startups and smaller teams, GPT-4.1’s combination of improved performance and lower costs presents a compelling value proposition that could accelerate AI adoption and integration.
In a recent TED interview, OpenAI reported significant user growth, with the GPT-4.1 family potentially contributing to continued expansion as competition in AI pricing intensifies.