Claude Sonnet 5 Ships as Anthropic Default: Agentic Performance Closes Opus Gap - AI

7 x 24 Track global technological trends

Hot Topic

Day

News Topic

Claude Sonnet 5 Ships as Anthropic Default: Agentic Performance Closes Opus Gap

4 hour ago / Read about 33 minute

Source：TechTimes

Sonnet 5 anthropic.com

Anthropic on June 30, 2026, launched Claude Sonnet 5 as the new default AI model for its Free and Pro subscription tiers — replacing Sonnet 4.6 with a model the company says narrows the performance gap with its Opus flagship line while remaining substantially cheaper to operate. For developers running automated, multi-step workflows, the upgrade matters most not because of benchmark numbers but because of what those numbers enable architecturally: a single model that covers a continuous cost-performance curve from light tasks to near-Opus-grade autonomous work, depending on how much compute budget a developer assigns to each call.

Anthropic's launch announcement describes the model as its most agentic Sonnet yet — capable of making multi-step plans, using tools like browsers and terminals, and running through complex tasks that previous Sonnet models would stall on before completing.

The launch lands as Anthropic prepares for what could become one of the largest technology initial public offerings in history — the company confidentially filed a draft S-1 registration statement with the Securities and Exchange Commission on June 1, 2026, following a $65 billion Series H round that valued it at $965 billion. Sonnet 5 is the first major model release since that filing and since the June 12 Commerce Department order that suspended Claude Fable 5 and Mythos 5 for all users worldwide. Those models remain offline for general customers; Sonnet 5 and Opus 4.8 are now the effective ceiling of what most developers can access.

Read more: Claude Fable 5 Still Offline as US Clears Mythos 5 for Critical Infrastructure

Claude Sonnet 5 Becomes Free and Pro Default Today

Sonnet 5 is available immediately across all Claude subscription tiers and is the default model for Free and Pro accounts. Max, Team, and Enterprise users also have access. Developers can reach it through Claude Code, the claude.ai web and mobile interfaces, and the Claude Platform API using the model string claude-sonnet-5. The full list of available models and their API identifiers is in Anthropic's models documentation.

The model is also available on Amazon Bedrock and has been added as an option in GitHub Copilot for Pro, Pro+, Max, Business, and Enterprise users — a distribution point that matters for any team already billing through GitHub's token-credit system.

Read more: GitHub Copilot Billing Switches to Token Costs Today: Agentic Users Face Steepest Increases

How Effort Levels Create a Continuous Cost-Performance Curve

The single most consequential architectural feature of Sonnet 5 is not a benchmark score. It is the model's position on a tunable cost-performance curve that overlaps meaningfully with Opus 4.8.

AI models typically charge a flat rate per token regardless of how hard the model works on a given request. Anthropic's effort level system changes that. Developers can instruct the model to apply more or less compute to a given task — choosing from low, medium, high, xhigh, or max — trading cost against output quality. What the Sonnet 5 launch charts show — based on the BrowseComp agentic search benchmark and the OSWorld-Verified computer use evaluation — is that Sonnet 5 at high or extra-high effort levels achieves performance comparable to Opus 4.8 on some task categories. At medium effort, it is substantially cheaper than Opus while still outperforming Sonnet 4.6 at any setting.

The practical implication: teams that previously used Sonnet for routine tasks and Opus for complex ones may now be able to route a wider share of complex work through Sonnet 5 at elevated effort levels, reserving Opus 4.8 only for tasks that specifically require its stronger agentic search or computer use performance. Opus 4.8 remains the better choice for the highest-accuracy requirements on those specific tasks and for cybersecurity work that requires reduced guardrails.

On the benchmark that specifically targets agentic coding — the measure most relevant to automated pipeline deployments — Sonnet 5 scores 63.2 percent compared to Opus 4.8's 69.2 percent and the outgoing Sonnet 4.6's 58.1 percent. On knowledge work tasks, Sonnet 5 slightly edges ahead of Opus 4.8. The full evaluation data is in the Sonnet 5 system card.

What Developers Report: Completed Tasks Where Prior Sonnets Stalled

Early access developers described a consistent improvement in what Anthropic calls follow-through: the ability to complete multi-step tasks without stalling partway through. Daniel Shepard, a senior engineer at Zapier, said his team handed the model a combined Salesforce account update and email launch task — a workflow that previously required human intervention at the halfway point — and the model completed it without a stop.

Fabian Hedin, co-founder of Lovable, noted a quality that is rarely benchmarked but matters significantly in consumer-facing deployments: consistent, clean refusal of unsafe requests. A model deployed at scale that refuses appropriately and reliably is, in Hedin's framing, as operationally important as raw capability when putting powerful tools in the hands of millions of users.

The model also checks its own outputs without being explicitly prompted to do so — a behavior change from Sonnet 4.6 that reduces the rate of compounding errors in automated pipelines, where a mistake in one step propagates through all subsequent steps.

What the Tokenizer Change Means for Your Bill

Anthropic is offering an introductory API rate of $2 per million input tokens and $10 per million output tokens through August 31, 2026. After that date, pricing moves to the standard rate of $3 per million input tokens and $15 per million output tokens. Full pricing details, including batch API discounts, are in Anthropic's pricing documentation.

For comparison, Opus 4.8 costs $5 per million input tokens and $25 per million output tokens — meaning Sonnet 5 at standard pricing is 40 percent cheaper on inputs and 40 percent cheaper on outputs than Opus.

There is a meaningful asterisk. Sonnet 5 uses an updated tokenizer — the same revision introduced with Opus 4.7 — that changes how the model processes text. The same input can map to 1.0 to 1.35 times as many tokens as it would have under the previous tokenizer, depending on content type. Anthropic designed the introductory pricing to keep the transition roughly cost-neutral for Sonnet 4.6 users. But developers should audit their actual token consumption against the new tokenizer before assuming the upgrade is free after September 1, when standard pricing takes effect.

Agentic workflows are particularly exposed to this dynamic. A model that plans, verifies, and iterates across multiple tool calls generates far more tokens per completed task than a single-turn chatbot response. The effort level architecture compounds this: higher effort settings mean more tokens spent per inference call.

How Sonnet 5 Handles Safety and Prompt Injection Attacks

Anthropic's pre-deployment safety evaluations found Sonnet 5 improved on Sonnet 4.6 on the behaviors most relevant to agentic deployment: it shows lower rates of hallucination, lower rates of sycophancy — the tendency to agree with incorrect premises rather than correct them — and improved resistance to prompt injection attacks.

Prompt injection is a category of cyberattack specific to language models deployed in automated contexts. When a model processes external content — a webpage, an email, a document retrieved by a tool — that content may contain adversarial instructions designed to override the model's original instructions and redirect it toward a different goal. As models like Sonnet 5 are deployed in agentic pipelines that regularly fetch and process untrusted external content, this attack surface expands significantly. Sonnet 5's improved resistance here is a specific engineering gain, not a generic capability improvement. In a live bug bounty hosted with Gray Swan, only 0.19 percent of unique attacks succeeded against Sonnet 5 — matching Opus 4.8 and outperforming GPT-5.5 at 3.08 percent.

On broader safety, Sonnet 5 carries the same real-time cybersafeguard classifiers as Opus 4.7 and 4.8 — systems that detect and block dangerous cybersecurity requests in real time. These safeguards are less strict than those deployed with the currently-suspended Fable 5, which blocked a wider range of security tasks. Anthropic did not deliberately train Sonnet 5 on cybersecurity tasks, and in evaluations testing the ability to develop working software exploits for Firefox 147, Sonnet 5 scored zero percent — the same result as Sonnet 4.6. A slight increase in partial-success rates, attributed to general intelligence improvements rather than specific cybersecurity training, prompted Anthropic to enable the real-time safeguards. Sonnet 5 shows somewhat higher rates of misaligned behavior than Opus 4.8 and Claude Mythos Preview, both of which remain the safer choice for high-stakes or sensitive deployments.

Claude Sonnet 5 in a Crowded Agentic AI Field

The launch arrives four days after OpenAI released GPT-5.6 Sol in preview, which OpenAI also framed as its most agentic offering — capable of distributing work across subagents for extended autonomous runs. Google's Gemini 3.5 Flash, launched in May, carried a similar pitch: a shift from conversational chatbot to an autonomous planning and execution tool.

Sonnet 5 is priced below Opus 4.8, GPT-5.5, and Google Gemini 3.1 Pro at both introductory and standard rates. It remains more expensive than Gemini 3.5 Flash. The more significant distinction is the effort-level architecture: rather than choosing between a capable-but-expensive frontier model and a cheaper-but-limited mid-tier option, developers can tune a single model across a wider cost-performance range within a single API call.

Anthropic also launched Claude Science on June 30 — a desktop application for scientific research that integrates tools and packages commonly used by researchers, produces auditable research artifacts, and provides flexible access to computing resources. The company framed it as consolidating fragmented research tooling into a single environment, with pre-configured support for genomics, single-cell analysis, proteomics, and cheminformatics.

Rate Limits Increase Alongside Sonnet 5

To accommodate the higher token volumes that come with more capable agentic tasks — which plan, iterate, and call tools across longer sessions — Anthropic raised rate limits across Chat, Cowork, Claude Code, and the Claude Platform. The company simplified its API tier structure to three levels — Start, Build, and Scale — in April 2026. Current limits are visible in the Claude Console.

Frequently Asked Questions

What is the difference between Claude Sonnet 5 and Opus 4.8, and when should I use each?

Sonnet 5 is Anthropic's mid-tier model, now capable of matching Opus 4.8 performance on some task categories when set to higher effort levels. Opus 4.8 remains the stronger choice for the highest-accuracy requirements on agentic search and computer use tasks, and for cybersecurity work that requires reduced guardrails. The practical recommendation: use Sonnet 5 with a high or xhigh effort setting for complex agentic work and reserve Opus 4.8 for tasks where even a small accuracy difference is costly.

How much will Claude Sonnet 5 actually cost after the introductory pricing window closes?

Introductory API pricing — $2 per million input tokens and $10 per million output tokens — runs through August 31, 2026. After that date, standard pricing applies: $3 per million input tokens and $15 per million output tokens. There is an important additional variable: Sonnet 5 uses a revised tokenizer that may process the same input as 1.0 to 1.35 times as many tokens as the previous tokenizer. Developers running high-volume agentic workflows should measure their actual token consumption against the new tokenizer before the standard pricing takes effect.

Does Sonnet 5 replace Fable 5 for users who lost access when the US government suspended it?

Not directly. Fable 5 and Sonnet 5 are different products with different capability profiles. Fable 5 was the public-access version of Anthropic's most powerful model, equipped with safety classifiers; Sonnet 5 is an upgraded mid-tier model. For the overwhelming majority of development tasks — coding, automation, knowledge work — Sonnet 5 covers much of what Fable 5 could do. For the frontier cybersecurity and advanced reasoning tasks that were Fable 5's specific differentiator, Sonnet 5 is not a substitute; Opus 4.8 remains the highest-capability model currently available to general users.

Does the effort-level system mean I am choosing how much the model "thinks" per call?

Yes, in practical terms. Anthropic's effort level architecture lets developers allocate more or less compute budget to a given inference call, which affects both cost and output quality. At higher effort settings, Sonnet 5's performance on the BrowseComp and OSWorld-Verified benchmarks overlaps with Opus 4.8's performance at lower effort settings, creating a continuous cost-performance curve rather than a hard step between two product tiers.

Previous page：Trump drops restrictions on Anthropic’s Mythos and...

Next page：No More

Return to List

Hot Reading

2 day ago

From Photo Backups to My Own Cloud Server: My Trip Into Home Data Storage

2 day ago

Google DeepMind's Coding Pivot Lost Six Researchers to Meta, OpenAI, and Anthropic

2 day ago

China’s Loongson launches homegrown 16-core 3C3000 server CPU built on LoongArch

2 day ago

TechCrunch Mobility: All eyes on Tesla FSD