Updated 2026-06-26

Compare AI & LLM API prices in one place

Side-by-side API pricing for every major model — GPT-5, Claude, Gemini, Llama, Mistral, DeepSeek and more. Sort by input or output cost per token, see the context window, and find the cheapest model that does the job.

Output price gap
up to 500×
Qwen3 235B A22B vs Claude Fable 5 — same task, wildly different bill
Cheapest output
$0.100
Qwen3 235B A22B · per 1M tokens
Biggest context
2M
Grok 4 Fast
Tracked
43
models · 14 providers

LLM API pricing comparison

All prices in USD per 1,000,000 tokens. Click any column to sort — the table defaults to cheapest reference cost first. Click a model to see its full provider pricing.

Model Provider Input /1M Output /1M Cost / call* Context Type
Gemini 1.5 Flash-8B
Cheapest capable model
Google $0.037 $0.150 $0.0002 1M Fast / cheap
Command R7B
Cheapest Cohere tier
Cohere $0.037 $0.150 $0.0002 128K Fast / cheap
Qwen3 235B A22B
Open weights, very cheap
Alibaba (Qwen) $0.090 $0.100 $0.0002 256K Fast / cheap open
MiMo v2.5
Open, cheap, 1M context
Xiaomi (MiMo) $0.105 $0.280 $0.0004 1M Fast / cheap open
Step 3.5 Flash
Cheapest Step tier
StepFun $0.090 $0.300 $0.0004 256K Fast / cheap
Grok 3 mini xAI (Grok) $0.100 $0.300 $0.0004 128K Fast / cheap
Llama 3.1 8B
Open weights
Meta (Llama) $0.200 $0.200 $0.0004 128K Open weights open
DeepSeek-V4 Flash
Non-thinking mode, very cheap
DeepSeek $0.140 $0.280 $0.0004 128K Fast / cheap open
Llama 4 Scout
Open weights (Groq)
Meta (Llama) $0.110 $0.340 $0.0005 128K Open weights open
Gemini 2.0 Flash Google $0.100 $0.400 $0.0005 1M Fast / cheap
Grok 4 Fast
2M context
xAI (Grok) $0.200 $0.500 $0.0007 2M Fast / cheap
Mistral Small 4 Mistral $0.150 $0.600 $0.0007 128K Fast / cheap
Command R Cohere $0.150 $0.600 $0.0007 128K Balanced
GLM-4.5 Air
Cheap open tier
Zhipu AI (GLM) $0.130 $0.850 $0.0010 128K Fast / cheap open
Llama 4 Maverick
Open weights (Together AI)
Meta (Llama) $0.270 $0.850 $0.0011 500K Open weights open
Codestral
Code-specialised
Mistral $0.300 $0.900 $0.0012 32K Balanced
MiniMax-M2
Open weights
MiniMax $0.260 $1.00 $0.0013 200K Balanced open
DeepSeek-V4 Pro
Thinking mode
DeepSeek $0.435 $0.870 $0.0013 128K Reasoning open
MiMo v2.5 Pro
Open weights, 1M context
Xiaomi (MiMo) $0.435 $0.870 $0.0013 1M Balanced open
Step 3.7 Flash
Fast tier
StepFun $0.200 $1.15 $0.0014 256K Fast / cheap
GPT-5.4 nano
Cheapest OpenAI tier
OpenAI $0.200 $1.25 $0.0015 400K Fast / cheap
MiniMax-M3
Open weights, 1M context
MiniMax $0.300 $1.20 $0.0015 1M Flagship open
Grok Code Fast 1
Code-specialised
xAI (Grok) $0.200 $1.50 $0.0017 256K Balanced
Mistral Large 3
Flagship
Mistral $0.500 $1.50 $0.0020 256K Flagship
Qwen3 Coder
Open, code-specialised, 1M ctx
Alibaba (Qwen) $0.220 $1.80 $0.0020 1M Balanced open
Llama 3.3 70B
Open weights
Meta (Llama) $1.04 $1.04 $0.0021 128K Open weights open
GLM-4.6
Popular open model
Zhipu AI (GLM) $0.430 $1.74 $0.0022 200K Balanced open
Gemini 2.5 Flash Google $0.300 $2.50 $0.0028 1M Balanced
Kimi K2
Open weights
Moonshot AI (Kimi) $0.570 $2.30 $0.0029 128K Balanced open
Kimi K2 Thinking
Open-weight reasoning
Moonshot AI (Kimi) $0.600 $2.50 $0.0031 256K Reasoning open
GLM-5.2
Open weights, 1M context
Zhipu AI (GLM) $0.950 $3.00 $0.0040 1M Flagship open
Qwen3 Max
Flagship
Alibaba (Qwen) $0.780 $3.90 $0.0047 256K Flagship
GPT-5.4 mini OpenAI $0.750 $4.50 $0.0053 400K Balanced
Claude Haiku 4.5
Fastest, near-frontier
Anthropic $1.00 $5.00 $0.0060 200K Fast / cheap
Mistral Medium 3.5 Mistral $1.50 $7.50 $0.0090 256K Balanced
Gemini 2.5 Pro
1M context (≤200k tier)
Google $1.25 $10.00 $0.011 1M Flagship
Command A
Flagship, RAG-tuned
Cohere $2.50 $10.00 $0.013 256K Flagship
GPT-5.4
Balanced flagship
OpenAI $2.50 $15.00 $0.017 400K Flagship
Claude Sonnet 4.6
Best speed/intelligence; caching cuts input ~90%
Anthropic $3.00 $15.00 $0.018 1M Balanced
Grok 4
Flagship
xAI (Grok) $3.00 $15.00 $0.018 256K Flagship
Claude Opus 4.8
Top Opus reasoning/agentic
Anthropic $5.00 $25.00 $0.030 1M Flagship
GPT-5.5
Flagship
OpenAI $5.00 $30.00 $0.035 400K Flagship
Claude Fable 5
Most capable widely released
Anthropic $10.00 $50.00 $0.060 1M Flagship

*Reference cost of one call with 1,000 input + 1,000 output tokens — a neutral yardstick. Use the calculator for your real usage. Cheapest row highlighted. Tick any models to build a shareable price card.

Go deeper

Pricing by provider

Best LLM for…

Frequently asked questions

How is LLM API pricing calculated?

Almost every LLM API charges per token, split into an input (prompt) price and a usually higher output (completion) price, quoted per 1,000,000 tokens. Your bill is (input tokens × input price) + (output tokens × output price). A token is roughly ¾ of a word.

Which LLM API is cheapest?

For high-volume work the cheapest capable models are Google Gemini 1.5 Flash-8B and 2.0 Flash, OpenAI GPT-5.4 nano, and DeepSeek-V4 Flash. The lowest sticker price is not always the cheapest in practice — a weaker model that needs retries or escalation can cost more overall.

What is the difference between input and output token pricing?

Input tokens are what you send (the prompt, system message, context, documents). Output tokens are what the model generates. Output is typically 2–5× more expensive than input, so capping max output length is the fastest way to cut cost.

Are these prices up to date?

Prices are list prices last verified on 2026-06-26 and link to each provider's official pricing page. The AI market moves fast — always confirm the current price with the provider before committing to volume.

How we keep prices honest. Every number is an official list price last verified on 2026-06-26, and each provider page links to the source. Spotted a stale price? The whole catalog lives in one file so it is fast to refresh.