Claude Opus 4.8
Modest version bump, real frontier gains — tops the intelligence index at the same price as 4.7.
Read the full Claude Opus 4.8 analysisContext
1M
Max output
128K
Input /1M
$5.00
Output /1M
$25.00
Live pricing via OpenRouter
Best for
- Hard agentic coding and codebase-scale migrations
- Long-horizon autonomous work with a clear up-front spec
- Economically valuable knowledge work (leads GDPval-AA at 1890)
Watch out
Loses Terminal-Bench 2.1 to GPT-5.5; narrates more and asks more by default than 4.7, so prompts may need re-tuning. GPQA/USAMO numbers are vendor-claimed.
For creators. Single-pass long-form drafts (128K output) and full-archive synthesis (1M context). Re-baseline any style prompts written against 4.7’s clipped voice — 4.8 is warmer by default.
Benchmarks
| swe bench verified | 88.6 |
| swe bench pro | 69.2 |
| terminal bench 2 1 | 74.6 |
| gdpval aa | 1890 |
| osworld | 83.4 |
| humanitys last exam tools | 57.9 |
| gpqa diamond | 93.6 |
| usamo 2026 | 96.7 |
Capabilities
- Dynamic workflows in Claude Code (up to 1,000 subagents, 16 concurrent)
- Effort control on claude.ai (defaults to high)
- Fast mode at ~2.5x speed, 3x cheaper than prior fast modes
- Mid-conversation system messages / real-time messages-array updates
- Adaptive thinking with low/medium/high/xhigh/max effort levels
- 1M context at standard pricing (no long-context premium), 128K output
Compare Claude Opus 4.8
Claude Opus 4.8 vs GPT-5.5
Opus 4.8 leads aggregate intelligence and SWE-Bench Pro; GPT-5.5 wins computer-use, terminal-agent loops, and native voice. The split is real enough to run both.
DeepSeek V4 vs Claude Opus 4.8
Opus 4.8 is the stronger model outright; DeepSeek V4 is the open-weight value play — frontier-class coding at roughly a third of the price, and you can own the weights.