gpt-oss-120b — Available Providers
Last refreshed 2026-06-30. Refreshes weekly.
Compare pricing and deployment options across 11 providers.
Monthly cost ranking
Ranked for 1M input and 0.2M output tokens per month. Cache and batch discounts are applied only when the provider row has sourced prices.
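The profile cost can be reproduced from the per-1M-token prices in the provider matrix. The sketch below is a minimal illustration, not the page's actual implementation: the function name `profile_cost` is hypothetical, and rounding half-up to whole cents is an assumption that happens to match the listed values.

```python
from decimal import Decimal, ROUND_HALF_UP

# Traffic profile from this page, in millions of tokens per month.
INPUT_MTOK = Decimal("1")
OUTPUT_MTOK = Decimal("0.2")

def profile_cost(input_price: str, output_price: str) -> Decimal:
    """Monthly USD cost from prices quoted per 1M tokens (hypothetical helper)."""
    raw = Decimal(input_price) * INPUT_MTOK + Decimal(output_price) * OUTPUT_MTOK
    # Assumption: displayed costs are rounded half-up to whole cents.
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(profile_cost("0.039", "0.18"))  # OpenRouter row -> 0.08
print(profile_cost("0.090", "0.36"))  # GCP Vertex AI row -> 0.16
```

Decimal arithmetic is used so that borderline values such as 0.075 round the same way every time, which binary floats do not guarantee.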
Traffic profile: US default
Cheapest: Open Inference. Free at the selected traffic profile.
Cache-heavy: Vercel AI Gateway. Cache read is sourced at $0.25/1M.
Recipe-ready: GCP Vertex AI. Install, auth, and call snippets are available from curated provider snippet data.
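For a cache-heavy workload, a sourced cache-read price lowers the effective input cost. The sketch below assumes a simple blended model in which a fraction of input tokens is billed at the cache-read price and the rest at the normal input price. This model, the function name `cached_profile_cost`, and the `cached_fraction` parameter are all assumptions for illustration. The page does not state how its cache discount is applied.

```python
from decimal import Decimal

def cached_profile_cost(input_price: str, cache_read_price: str,
                        output_price: str, cached_fraction: str,
                        input_mtok: str = "1", output_mtok: str = "0.2") -> Decimal:
    """Monthly USD cost when part of the input is served from cache.

    Prices are per 1M tokens; volumes are in millions of tokens.
    Assumption: cached input tokens are billed at the cache-read price.
    """
    f = Decimal(cached_fraction)
    if not 0 <= f <= 1:
        raise ValueError("cached_fraction must be between 0 and 1")
    inp = Decimal(input_mtok)
    fresh = (1 - f) * inp * Decimal(input_price)   # uncached input tokens
    cached = f * inp * Decimal(cache_read_price)   # cache-read input tokens
    output = Decimal(output_mtok) * Decimal(output_price)
    return fresh + cached + output

# Vercel AI Gateway prices from the matrix, with a hypothetical 50% cache hit rate.
print(cached_profile_cost("0.35", "0.25", "0.75", "0.5"))
```

With `cached_fraction` set to `"0"`, the result falls back to the uncached profile cost of $0.50 for the same row.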
Provider matrix
Region, SLA, and compliance rows are hidden until curated source fields exist.

| Provider | Type | Profile cost (monthly) | Input / Output ($/1M tokens) | Cache ($/1M tokens) | Deploy | Recipe |
|---|---|---|---|---|---|---|
| Open Inference | inference | Free | Free / Free | - | Serverless | Docs only |
| OpenRouter | aggregator | $0.08 | $0.039 / $0.18 | - | Serverless | Docs only |
| Novita AI | inference | $0.10 | $0.050 / $0.25 | - | Serverless | Docs only |
| GCP Vertex AI | hyperscaler | $0.16 | $0.090 / $0.36 | - | Serverless | Snippets |
| Fireworks AI | inference | $0.27 | $0.15 / $0.60 | - | Serverless | Snippets |
| GroqCloud | inference | $0.27 | $0.15 / $0.60 | - | Serverless | Docs only |
| Together AI | inference | $0.27 | $0.15 / $0.60 | - | Serverless | Snippets |
| Replicate API | marketplace | $0.32 | $0.18 / $0.72 | - | Serverless | Snippets |
| Cloudflare Workers AI | inference | $0.50 | $0.35 / $0.75 | - | Serverless | Docs only |
| Vercel AI Gateway | gateway | $0.50 | $0.35 / $0.75 | read $0.25 | Serverless | Snippets |
| NVIDIA NIM | inference | Not enough pricing | - / - | - | Serverless | Docs only |
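The row order can be reproduced by sorting providers on their cost for the 1M input / 0.2M output profile. The sketch below uses the prices from the matrix. Placing providers without sourced prices last and breaking ties alphabetically are assumptions that happen to match the listed order. The names `ROWS`, `cost`, and `rank` are hypothetical.

```python
from decimal import Decimal

# (provider, input $/1M, output $/1M); None means no sourced price.
ROWS = [
    ("Vercel AI Gateway", "0.35", "0.75"),
    ("NVIDIA NIM", None, None),
    ("Together AI", "0.15", "0.60"),
    ("OpenRouter", "0.039", "0.18"),
    ("Cloudflare Workers AI", "0.35", "0.75"),
    ("Open Inference", "0", "0"),
    ("GroqCloud", "0.15", "0.60"),
    ("Replicate API", "0.18", "0.72"),
    ("Novita AI", "0.050", "0.25"),
    ("Fireworks AI", "0.15", "0.60"),
    ("GCP Vertex AI", "0.090", "0.36"),
]

def cost(input_price, output_price):
    """Unrounded monthly cost for 1M input and 0.2M output tokens, or None."""
    if input_price is None or output_price is None:
        return None
    return Decimal(input_price) + Decimal(output_price) * Decimal("0.2")

def rank(rows):
    """Sort by cost; unpriced rows go last, ties break on provider name."""
    def key(row):
        name, inp, out = row
        c = cost(inp, out)
        return (c is None, c if c is not None else Decimal(0), name)
    return [row[0] for row in sorted(rows, key=key)]

for position, name in enumerate(rank(ROWS), start=1):
    print(position, name)
```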
gpt-oss-120b operational data note: this page ranks providers using sourced token, cache, and batch prices only. Region, SLA, compliance, and latency claims are intentionally omitted until a curated matrix is added.