DeepSeek · GA

DeepSeek V4

Open-weight, frontier-class coding at roughly one-sixth the price of Opus-class models: MIT-licensed, 1M context, self-hostable.

Context: 1M tokens

Max output: 384K tokens

Input price (per 1M tokens): $0.44

Output price (per 1M tokens): $0.87

Live pricing via OpenRouter
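
To make the per-token prices concrete, here is a minimal sketch of a per-request cost estimate. It uses the listed $0.44 and $0.87 per 1M tokens; the helper name and the example token counts are illustrative assumptions, and live prices may differ from the figures shown here.

```python
# Listed prices from this page, in USD per 1M tokens (live pricing may differ).
INPUT_PRICE_PER_M = 0.44
OUTPUT_PRICE_PER_M = 0.87


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Assumed workload: one agent step with 200K tokens in and 8K tokens out.
    print(f"${estimate_cost(200_000, 8_000):.4f}")
```

Because input is billed at roughly half the output rate, long-context prompts with short completions are where the listed pricing is most favorable.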

Best for

  • Budget coding agents at scale
  • Open-weight self-hosting and fine-tuning
  • Cost-anchor for routing decisions
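
One way to read "cost-anchor for routing decisions" is sketched below: send work to the cheap model by default and escalate only for task types where this page says V4 trails the closed frontier. Everything here is an assumption for illustration: the model labels are not API identifiers, the task categories are made up, and the premium price is simply six times the listed V4 price, mirroring the page's "~1/6 the price" claim rather than any real price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str            # illustrative label, not an API model identifier
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# DeepSeek V4 prices as listed on this page.
ANCHOR = Model("deepseek-v4", 0.44, 0.87)
# Hypothetical premium model at 6x the anchor price (assumption, not a quote).
PREMIUM = Model("premium-model", 0.44 * 6, 0.87 * 6)

# Assumed task categories where escalation is worth the extra cost.
HARD_TASKS = {"agentic_swe", "security", "hard_reasoning"}


def route(task_type: str) -> Model:
    """Pick the cheap anchor unless the task is in an escalation category."""
    return PREMIUM if task_type in HARD_TASKS else ANCHOR


def request_cost(model: Model, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request on the given model."""
    total = input_tokens * model.input_per_m + output_tokens * model.output_per_m
    return total / 1_000_000


if __name__ == "__main__":
    for task in ("code_completion", "security"):
        chosen = route(task)
        print(task, "->", chosen.name, f"${request_cost(chosen, 50_000, 4_000):.4f}")
```

A real router would likely score difficulty per request rather than use fixed categories; the fixed set keeps the sketch small.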

Watch out

Trails the closed frontier (Opus 4.8, GPT-5.5) on agentic SWE, security, and hardest reasoning; CAISI puts it ~8 months behind. Many spec-sheet numbers are vendor-claimed, and the compressed 1M context can degrade exact long-context retrieval.

For creators. Run a capable coding agent on your own hardware (Flash, 284B) without per-token API costs, or use the cheap Pro API for reasoning work.

Benchmarks

SWE-bench Verified: 80.6
AA Intelligence Index: 52
LiveCodeBench: 93.5
GPQA Diamond: 90.1
MMLU-Pro: 87.5
AIME 2025: 87.5
Terminal-Bench 2.0: 67.9
Humanity's Last Exam: 37.7

Capabilities

  • Open-weight MIT-licensed frontier-class coding (80.6% SWE-bench Verified)
  • 1M-token context via hybrid CSA+HCA attention (~10% KV cache vs V3.2)
  • Aggressive cost efficiency (~1/6 the price of Opus-class models)
  • Dual variants: Pro for reasoning, Flash for high-throughput
  • Self-hostable; NVFP4 quantization available
