gpt-oss-120b — Available Providers

Name: gpt-oss-120b
Brand: OpenAI
SKU: gpt-oss-120b
Price: 0.350 USD

Last refreshed 2026-06-30. Refresh cadence: weekly.

Compare pricing and deployment options across 11 providers.

Monthly cost ranking

Ranked for 1M input and 0.2M output tokens per month. Cache and batch discounts are applied only when the provider row has sourced prices.
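As a rough sketch of how the ranking figures can be reproduced, the snippet below multiplies each provider's per-1M-token input and output prices by the profile volumes (1M input, 0.2M output). The prices are copied from the provider matrix on this page. The function name `profile_cost` and the half-up rounding to cents are assumptions made for illustration, not the page's documented method.

```python
from decimal import Decimal, ROUND_HALF_UP

# Traffic profile from the page: 1M input and 0.2M output tokens per month.
INPUT_MTOK = Decimal("1")
OUTPUT_MTOK = Decimal("0.2")

# (input, output) prices in USD per 1M tokens, copied from the provider matrix.
PRICES = {
    "OpenRouter": (Decimal("0.039"), Decimal("0.18")),
    "Novita AI": (Decimal("0.050"), Decimal("0.25")),
    "GCP Vertex AI": (Decimal("0.090"), Decimal("0.36")),
    "Fireworks AI": (Decimal("0.15"), Decimal("0.60")),
    "Replicate API": (Decimal("0.18"), Decimal("0.72")),
    "Cloudflare Workers AI": (Decimal("0.35"), Decimal("0.75")),
}


def profile_cost(input_price, output_price):
    """Monthly cost in USD, rounded to cents (half-up rounding is an assumption)."""
    raw = input_price * INPUT_MTOK + output_price * OUTPUT_MTOK
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Sort providers from cheapest to most expensive at this profile.
ranking = sorted(
    ((name, profile_cost(i, o)) for name, (i, o) in PRICES.items()),
    key=lambda row: row[1],
)

for name, cost in ranking:
    print(f"{name}: ${cost}")
```

Under these assumptions the computed values agree with the profile costs listed for these six providers, for example 0.039 + 0.18 × 0.2 = 0.075, shown as $0.08 for OpenRouter.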

Traffic profile: US default

Cost bars (eight lowest profile costs): Free, $0.08, $0.10, $0.16, $0.27, $0.27, $0.27, $0.32

Lowest cost: Open Inference. Free at the selected traffic profile.

Cache-heavy: Vercel AI Gateway. Cache read is sourced at $0.25 per 1M tokens.

Recipe-ready: GCP Vertex AI. Install, auth, and call snippets are available from curated provider snippet data.

Provider matrix

Region, SLA, and compliance rows are hidden until curated source fields exist.

Provider	Type	Profile cost (USD/month)	Input / Output (USD per 1M tokens)	Cache (USD per 1M tokens)	Deploy	Recipe
Open Inference	inference	Free	Free / Free	-	Serverless	Docs only
OpenRouter	aggregator	$0.08	$0.039 / $0.18	-	Serverless	Docs only
Novita AI	inference	$0.10	$0.050 / $0.25	-	Serverless	Docs only
GCP Vertex AI	hyperscaler	$0.16	$0.090 / $0.36	-	Serverless	Snippets
Fireworks AI	inference	$0.27	$0.15 / $0.60	-	Serverless	Snippets
GroqCloud	inference	$0.27	$0.15 / $0.60	-	Serverless	Docs only
Together AI	inference	$0.27	$0.15 / $0.60	-	Serverless	Snippets
Replicate API	marketplace	$0.32	$0.18 / $0.72	-	Serverless	Snippets
Cloudflare Workers AI	inference	$0.50	$0.35 / $0.75	-	Serverless	Docs only
Vercel AI Gateway	gateway	$0.50	$0.35 / $0.75	read $0.25	Serverless	Snippets
NVIDIA NIM	inference	Not enough pricing	- / -	-	Serverless	Docs only

gpt-oss-120b operational data note: this page ranks sourced token, cache, and batch fields only. Region, SLA, compliance, and latency claims are intentionally omitted until a curated matrix is added.
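To illustrate how a sourced cache-read price could feed into a profile cost, here is a minimal sketch using the Vercel AI Gateway row (input $0.35, cache read $0.25, output $0.75 per 1M tokens). The page does not say how cache discounts are weighted, so the `cached_fraction` parameter and the function `cache_adjusted_cost` are hypothetical. The listed $0.50 profile cost corresponds to a cached fraction of zero.

```python
from decimal import Decimal

# Vercel AI Gateway row, USD per 1M tokens (from the provider matrix).
INPUT_PRICE = Decimal("0.35")
CACHE_READ_PRICE = Decimal("0.25")
OUTPUT_PRICE = Decimal("0.75")


def cache_adjusted_cost(cached_fraction, input_mtok=Decimal("1"), output_mtok=Decimal("0.2")):
    """Monthly USD cost when part of the input bills at the cache-read price.

    `cached_fraction` is a hypothetical knob, not a field on the page.
    """
    if not Decimal("0") <= cached_fraction <= Decimal("1"):
        raise ValueError("cached_fraction must be between 0 and 1")
    cached = input_mtok * cached_fraction
    uncached = input_mtok - cached
    return uncached * INPUT_PRICE + cached * CACHE_READ_PRICE + output_mtok * OUTPUT_PRICE


for fraction in (Decimal("0"), Decimal("0.5"), Decimal("1")):
    print(f"cached {fraction}: ${cache_adjusted_cost(fraction):.2f}")
```

Under this assumption, even a fully cached input stream only lowers the Vercel AI Gateway cost from $0.50 to $0.40 at this profile, which stays above the uncached cost of every cheaper row in the matrix.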

gpt-oss-120b

Context: 131k tokens

Parameters: 120B

Released: 2025-08-05
