Verified June-2026 pricing for every route to GPT-5.5, Claude Opus 4.8 and Gemini 3.1 Pro — subscriptions, raw APIs, and gateways. The cheapest path per use case, with the exact break-even math.

Pick the lowest-cost route to each frontier model based on how you actually use it.
The cheapest route depends entirely on how much you use the model. If you chat in a browser a few hours a day, a single $20 subscription (ChatGPT Plus, Claude Pro, or Google AI Pro) beats the API by a wide margin — you'd burn through $20 of tokens in an afternoon at API rates. If you build agents, scripts, or anything that calls the model programmatically, route through OpenRouter or Vercel AI Gateway — both charge the provider's list price with no inference markup, and one API key reaches all three frontier models. For high-volume background jobs, DeepSeek V4 Flash at $0.14/$0.28 per million tokens does 80% of the work at 3% of the cost. That's the whole answer. Below is the math that proves it.
Prices in this piece were verified the first week of June 2026. They will move. The reasoning won't.
This is the foundation, because every other route is priced relative to it. The three flagships, at standard list rates per million tokens:
| Model | Input / 1M | Output / 1M | Notes |
|---|---|---|---|
| GPT-5.5 | $5.00 | $30.00 | 2x increase over GPT-5.4 (April 2026) |
| GPT-5.5 Pro | $30.00 | $180.00 | The expensive reasoning tier |
| Claude Opus 4.8 | $5.00 | $25.00 | 1M-token context at flat rate |
| Gemini 3.1 Pro | $2.00 | $12.00 | Doubles past 200K-token context |
Two things stand out. Gemini 3.1 Pro is roughly half the price of its rivals on input and a third cheaper on output — Google is buying market share. And GPT-5.5's output token, at $30, is now the most expensive of the three flagships after OpenAI doubled the GPT-5 line in April.
Output tokens are where the bill lives. A typical agentic task reads a lot and writes a little, so the input rate matters for context-heavy work, but a chatty assistant that generates long responses gets billed mostly on output. Watch the right number for your workload.
Almost always, if you're a human typing in a browser.
The consumer subscriptions, verified June 2026:
| Plan | Price/mo | What you get |
|---|---|---|
| ChatGPT Plus | $20 | GPT-5.5, generous daily caps |
| ChatGPT Pro | $200 | Unlimited advanced reasoning |
| Claude Pro | $20 | Opus 4.8, daily message caps |
| Claude Max | $100 / $200 | 5x / 20x Pro usage |
| Google AI Pro | $19.99 | Gemini 3.1 Pro, 2TB storage |
| Google AI Ultra | $249.99 | Highest model access |
Here's the break-even. A heavy interactive session — long documents pasted in, multi-turn back-and-forth — can run 2-3 million tokens in a day. At Opus 4.8 output rates, 3M output tokens alone is $75. One day. The $20 Claude Pro subscription covers that and 29 more days, subject to message caps that most individual users never hit.
The rule: if a human is in the loop reading every response, buy the subscription. You are physically incapable of consuming enough tokens by hand to make per-token API billing cheaper, and the subscription bundles the app, memory, file uploads, and image generation you'd otherwise wire up yourself.
The subscription stops winning the moment the calls become automated. A cron job doesn't get tired. That's the other half of this guide.
When the model runs without you watching — and especially when it runs sometimes.
The subscription is a flat $20 whether you use it or not. The API is $0 when idle. If you have a side project that fires a few thousand calls a week, you might spend $3 a month on the API versus $20 for a seat you barely touch. API billing wins on bursty, automated, or production workloads.
It also wins when you need a model in code. Subscriptions don't give you an endpoint — you can't point a Next.js route or a Python agent at ChatGPT Plus. The moment you're building rather than chatting, you need an API key regardless of cost.
Three levers cut the API bill hard, and they stack:
A gateway. You do not want three billing relationships, three SDKs, and three sets of keys for a project that needs to A/B GPT-5.5 against Opus 4.8 against Gemini. You want one key that reaches everything, and you do not want to pay extra for the convenience.
Two options matter in June 2026, and a price war has driven both to zero markup on inference:
For most builders the two are interchangeable on price. Choose OpenRouter for catalog breadth and provider-routing fallbacks; choose Vercel's gateway if your stack already lives there. Either way you've consolidated three vendors into one line item with no tax on top. If you want the full reasoning behind a gateway-first stack, I broke it down in the AI superpowers stack for 2026.
Different products solving a different problem.
Poe bundles dozens of models behind one $20/month consumer subscription with a points system. It's a fine browser experience if you want to switch between Opus, GPT-5.5, and Gemini in one chat window without three logins. But points convert to tokens at a rate that, for heavy use, costs more than going direct — Poe is paying the API bill and adding margin. Treat it as a convenience subscription, not a cost-optimization play.
Perplexity isn't really a model-access route — it's an answer engine with live web search wrapped around frontier models. Perplexity Pro is $20/month. You use it when you want cited, search-grounded answers, not when you want raw model access. On honesty: Perplexity does run a public affiliate program (powered by Dub — roughly $10 flat per paid signup plus 10% recurring), so unlike OpenRouter, a writer linking it can earn a commission. I'm naming that so you can weigh any Perplexity recommendation — including a neutral one like this — with that incentive in view.
Very cheap, if you stop paying frontier prices for non-frontier work.
Most tasks in a real pipeline — classification, extraction, formatting, routine summarization — don't need Opus 4.8 or GPT-5.5. They need a competent model that costs almost nothing. The cheap tier in June 2026:
| Model | Input / 1M | Output / 1M | Use it for |
|---|---|---|---|
| Gemini 3.1 Flash-Lite | $0.10 | $0.40 | Cheapest proprietary; classification, routing |
| DeepSeek V4 Flash | $0.14 | $0.28 | Bulk processing, drafts, extraction |
| Gemini 2.5 Flash | $0.15 | $0.60 | Fast summarization with web grounding |
| DeepSeek V4 Pro | $1.74 | $3.48 | Reasoning at a fraction of flagship cost |
The math is brutal in the cheap models' favor. DeepSeek V4 Flash output at $0.28 is 107x cheaper than GPT-5.5's $30. Run the routing logic on Flash-Lite, the heavy synthesis on a flagship, and you'll cut a pipeline's bill by 80-90% with no perceptible quality loss on the routed-away tasks.
This is the single biggest lever in the entire guide: route by task difficulty, not by habit. The flagship is for the 10% of work that genuinely needs it. A gateway makes this trivial — same key, swap the model string. If you're building an agent that does this routing for you, the pattern is in build your own Jarvis with Claude Code.
Match the route to the job:
| If you... | Cheapest route | Why |
|---|---|---|
| Chat in a browser daily | $20 subscription (Plus/Pro/AI Pro) | You can't out-type the flat fee |
| Want all 3 flagships in one app | Poe $20 or rotate free tiers | Convenience over cost |
| Build agents or scripts | OpenRouter or Vercel AI Gateway | One key, zero markup |
| Run high-volume background jobs | DeepSeek V4 Flash + Batch API | 100x cheaper for routine work |
| Need cited, searched answers | Perplexity Pro $20 | Search-grounded, not raw access |
| Run a mixed production pipeline | Gateway + task-based routing | Flagship only where it earns it |
Most serious operators end up with two lines on the bill: one $20 subscription for hands-on work, and one gateway key for everything programmatic. That covers the entire spectrum for roughly $20-40 a month plus usage. The models will keep changing names and prices; this shape holds.
For the deeper comparison of what each flagship is actually good at — not just what it costs — see the frontier model landscape for 2026. And if you want the creator-focused build of this whole stack, that's what GenCreator is.
Is the API ever cheaper than a $20 subscription for a single human user? Rarely. A person chatting in a browser tops out around 1-3 million tokens a day, and at flagship output rates a single heavy day can exceed the full $20 monthly subscription cost. Unless your usage is genuinely sporadic — a few sessions a month — the subscription wins for interactive work. The API wins the instant the calls become automated.
Does OpenRouter cost more than going direct to OpenAI or Anthropic? No markup on inference — the catalog price equals the provider's list price. There's a 5.5% fee when you buy pay-as-you-go credits with a card, which you avoid entirely by bringing your own provider key. For the convenience of one key across 300+ models, it's effectively free. Vercel AI Gateway is the same story at 0% markup with $5/month in free credits.
What's the single cheapest way to run a high-volume AI task? DeepSeek V4 Flash ($0.14 input / $0.28 output per million tokens) through the Batch API for another 50% off, with prompt caching on top. For routine classification, extraction, and drafting, that's roughly 1-3% of what a frontier flagship costs, with no meaningful quality drop on those task types.
Should I pay for ChatGPT Pro or Claude Max at $200/month? Only if you're hitting the daily caps on the $20 tier constantly, or you need the highest-volume reasoning access for sustained professional work. Most users never approach those limits. Before upgrading to a $200 seat, check whether the work is actually programmatic — if so, an API key plus task-based model routing will likely cost less than $200 and scale better.
Which gateway has an affiliate program — and does it bias these recommendations? OpenRouter has no public affiliate program, so no writer earns a cut for recommending it; I use it because it's the best default, full stop. Perplexity does run a public affiliate program (Dub-powered, ~$10 flat plus 10% recurring). I name both so you can weight any recommendation against the incentive behind it. Nothing in this guide is sponsored.
Will these prices be accurate next month? No. Frontier pricing moves every few weeks — the GPT-5.5 doubling and the DeepSeek price collapse both happened in 2026 alone. The verified numbers here are a June-2026 snapshot. The decision framework — subscription for humans, gateway for builders, cheap models for bulk — is what survives. Re-check the live pricing pages before committing to a high-volume contract.
Step-by-step guide to setting up ACOS, creating your first agent, and shipping real products with AI.
Start buildingDownload AI architecture templates, multi-agent blueprints, and prompt engineering patterns.
Browse templatesConnect with creators and architects shipping AI products. Weekly office hours, shared resources, direct access.
Join the circleRead on FrankX.AI — AI Architecture, Music & Creator Intelligence
Weekly field notes on AI systems, production patterns, and builder strategy.