Anthropic · GA
Claude Opus 4.6
The reasoning + long-context flagship. THE model for high-stakes synthesis.
| Context window | 1M tokens |
| Max output | 128K tokens |
| Input price (per 1M tokens) | $5.00 |
| Output price (per 1M tokens) | $25.00 |
Live pricing via OpenRouter
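The list prices above translate into a per-request cost as follows. This is a minimal sketch, assuming the $5.00 and $25.00 per-million-token rates apply flat across the whole request; the function name and the sample token counts are illustrative, and live pricing may differ.

```python
# Rates copied from this card (USD per 1M tokens); live pricing may differ.
INPUT_USD_PER_MTOK = 5.00
OUTPUT_USD_PER_MTOK = 25.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming flat per-token rates."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 200K tokens in, 8K tokens out.
    print(f"${estimate_cost_usd(200_000, 8_000):.2f}")  # $1.20
```

Because output tokens cost five times as much as input tokens at these rates, long generations tend to dominate the bill faster than long prompts do.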
Best for
- Abstract reasoning (#1 ARC-AGI-2, 68.8%)
- Computer-use agents (#1 OSWorld, 72.7%)
- 1M-context research synthesis and long agent sessions
- Parallel agent orchestration (Agent Teams)
Watch out
Premium pricing. For high-volume routine steps, route to a cheaper model.
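One way to act on this advice is a small router that sends only high-stakes steps to Opus 4.6. The sketch below is illustrative: the task categories, the model identifier strings (including the `cheaper-model` placeholder), and the 200K-token threshold are all assumptions, not names or limits from any provider's API.

```python
from dataclasses import dataclass

# Hypothetical identifiers for illustration; substitute your provider's real model IDs.
PREMIUM_MODEL = "claude-opus-4.6"
BUDGET_MODEL = "cheaper-model"

# Assumed task categories that justify premium pricing, loosely based on "Best for".
HIGH_STAKES_KINDS = {"abstract_reasoning", "computer_use", "long_context_synthesis"}


@dataclass
class Step:
    kind: str          # e.g. "classification" or "abstract_reasoning"
    input_tokens: int  # estimated prompt size in tokens


def pick_model(step: Step, long_context_threshold: int = 200_000) -> str:
    """Route high-stakes or very long-context steps to the premium model."""
    if step.kind in HIGH_STAKES_KINDS:
        return PREMIUM_MODEL
    # Assumed cutoff: prompts above this size go to the long-context model.
    if step.input_tokens > long_context_threshold:
        return PREMIUM_MODEL
    return BUDGET_MODEL
```

For example, `pick_model(Step("classification", 2_000))` returns the budget model, while any step tagged `"computer_use"` goes to the premium one regardless of size.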
For creators. The reasoning brain behind serious creator OS builds and multi-file code generation.
Benchmarks
| Terminal-Bench 2 | 65.4% |
| ARC-AGI-2 | 68.8% |
| OSWorld | 72.7% |
| BigLaw Bench | 90.2% |
| MRCR v2 (1M) | 76% |
Capabilities
- Adaptive thinking (auto-determines reasoning depth)
- 1M token context window (beta)
- 128K output tokens (2x the previous limit)
- Agent Teams (parallel Claude Code agents)
- Compaction API (server-side context summarization)
- Enhanced agentic coding and debugging
- Lowest over-refusal rate in Claude family
- Fine-grained tool streaming (GA)
- Data residency controls (US-only option)
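The context and output limits listed above can be combined into a quick pre-flight check before sending a long prompt. This is a rough sketch under two assumptions that the card does not state: that "1M" and "128K" mean 1,000,000 and 128,000 tokens, and that the reserved output budget counts against the same context window as the prompt. The function name is illustrative.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # "1M" context (beta), read as 1,000,000 (assumption)
MAX_OUTPUT_TOKENS = 128_000        # "128K" max output, read as 128,000 (assumption)


def fits_in_context(prompt_tokens: int, max_output_tokens: int = MAX_OUTPUT_TOKENS) -> bool:
    """Return True if the prompt plus the reserved output budget fits in the window.

    Assumes prompt and output share one context window.
    """
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if max_output_tokens > MAX_OUTPUT_TOKENS:
        raise ValueError("requested output exceeds the 128K output limit")
    return prompt_tokens + max_output_tokens <= CONTEXT_WINDOW_TOKENS
```

Under these assumptions, reserving the full output budget leaves 872,000 tokens for the prompt; requesting a smaller output budget frees up more room for input.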
Compare Claude Opus 4.6
Gemini 3.5 Flash vs Claude Opus 4.6
Different tiers, different jobs. Flash wins cost-sensitive agentic coding (76.2% on Terminal-Bench 2.1 at a fraction of the cost); Opus 4.6 wins high-stakes reasoning (68.8% on ARC-AGI-2) and computer use.
Claude Opus 4.6 vs GPT-5.2 Pro
Opus 4.6 for reasoning and long-context depth; GPT-5.2 Pro for native voice and the broadest multimodal + integration footprint.
Sources
- https://www.anthropic.com/news/claude-opus-4-6
- https://platform.claude.com/docs/en/about-claude/models/whats-new-claude-4-6
- https://siliconangle.com/2026/02/05/anthropic-rolls-claude-opus-4-6-1-million-token-context-support/
- https://thenewstack.io/anthropics-opus-4-6-is-a-step-change-for-the-enterprise/