What's new in Claude Opus 4.8?

It builds on Claude Opus 4.7 with improvements across benchmarks and is a more effective collaborator, with gains in long-horizon agentic coding, effort calibration, and tool triggering. The model ID is `claude-opus-4-8` and regular pricing is unchanged at $5/$25 per MTok.

When and where is it available?

It launched on May 28, 2026, on the Pro, Max, Team, and Enterprise plans and on the Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.

Does it affect my existing Opus 4.7 code?

API constraints are unchanged from Claude Opus 4.7 (sampling parameters unsupported, adaptive thinking only), so code that runs on 4.7 needs no changes. Behavior changes like fewer wasted thinking tokens and better tool triggering may warrant prompt updates.

What performance gains were reported?

On Super-Agent it is the only model to complete every case end-to-end, it set the highest recorded score on the Legal Agent Benchmark, and it scored 84% on Online-Mind2Web. It is also around four times less likely than its predecessor to let flaws in its own code pass unremarked.

On the Claude API, set `speed: "fast"` to get up to 2.5x higher output tokens per second from the same model at premium pricing, available as a research preview.

Where can I find the official announcement?

The announcement is at anthropic.com/news/claude-opus-4-8, and the technical details are in "What's new in Claude Opus 4.8" on platform.claude.com.

Claude Opus 4.8 — ClaudeKit

Overview

Anthropic introduced Claude Opus 4.8 on May 28, 2026. It builds on Claude Opus 4.7 with improvements across benchmarks and is a more effective collaborator, available today for the same price ($5/$25 per MTok). The model ID is claude-opus-4-8.

Key improvements

Long-horizon agentic coding

On long coding sessions, prior Opus models could drift off task as context grew or lose the thread after compaction. Claude Opus 4.8 improves long-context handling, runs fewer compactions, and recovers better after one. It is also around four times less likely than its predecessor to allow flaws in code it has written to pass unremarked.
Reasoning effort calibration

Behavior at a given effort level could vary across domains. Claude Opus 4.8 is calibrated for more reliable behavior at each effort level across a range of domains. The effort parameter default is high on all surfaces, including the Claude API and Claude Code; an explicit setting is unchanged.
Better tool triggering

Some users reported Claude Opus 4.7 skipping a tool call the task required. Claude Opus 4.8 is less likely to skip required tool calls and calls tools meaningfully more efficiently, using fewer steps.
Adaptive thinking

With adaptive thinking enabled, Claude Opus 4.8 triggers reasoning only when it judges the turn needs it — responding directly on simple lookups and short agentic steps, and reasoning before answering on complex multi-step problems. This reduces wasted thinking tokens versus Claude Opus 4.7 at the same effort level.

New features

Mid-conversation system messages

Updating instructions in a long conversation meant restating the full system prompt. Claude Opus 4.8 accepts role: "system" messages immediately after a user turn in the messages array, so you can append updated instructions while preserving prompt cache hits on earlier turns. No beta header is required.
Fast mode (research preview on the Claude API)

Set speed: "fast" to get up to 2.5x higher output tokens per second from the same model at premium pricing.
Refusal stop details

The stop_details object on refusal responses is now publicly documented. In addition to the refusal stop reason, it describes the category of refusal, making it easier to tell apart different classes of declined request and route the user to the right next step.
Lower prompt cache minimum

The minimum cacheable prompt length is 1,024 tokens, lower than on Claude Opus 4.7, so prompts that were too short to cache before can now create cache entries with no code changes.

Benchmarks

Area	Result
Super-Agent	Only model to complete every case end-to-end, beating prior Opus models and GPT-5.5 at parity on cost
CursorBench	Exceeds prior Opus models across every effort level
Legal Agent Benchmark	Highest score recorded; first model to break 10% overall on the all-pass standard
Online-Mind2Web	84% (a meaningful jump over both Opus 4.7 and GPT-5.5)

Pricing & availability

Item	Detail
Input / output (regular)	$5 / $25 per MTok (unchanged from Opus 4.7)
Model ID	`claude-opus-4-8`
Context window	1M tokens by default on the Claude API, Amazon Bedrock, and Vertex AI (200k on Microsoft Foundry)
Max output	128k tokens
Plans	Pro, Max, Team, Enterprise
Platforms	claude.ai, Claude API, Amazon Bedrock, Google Cloud Vertex AI, Microsoft Foundry

Notes

Sampling parameters unsupported — setting temperature, top_p, or top_k to a non-default value returns a 400 error, same as on Claude Opus 4.7. Use prompting to guide behavior.
Adaptive thinking is the only thinking mode — extended thinking budgets (thinking: {type: "enabled", budget_tokens: N}) return a 400 error. Use adaptive thinking and the effort parameter to control thinking depth.
No breaking API changes — these constraints are unchanged from Claude Opus 4.7, so code that runs on 4.7 needs no changes; behavior changes (fewer wasted thinking tokens, better tool triggering, better compaction handling) may warrant prompt updates.
Migration — see the official migration guide for moving from Claude Opus 4.7; Claude Code and Agent SDK users can apply the migration steps automatically with the Claude API skill.

Claude Opus 4.8