Llama 4 Maverick 17B Instruct FP8 — Available Providers
Last refreshed 2026-06-29. Refreshed weekly.
Compare pricing and deployment options across 11 providers.
Monthly cost ranking
Providers are ranked by estimated monthly cost for 1M input and 0.2M output tokens per month, with prices quoted in USD per 1M tokens. Cache and batch discounts are applied only when the provider row has sourced prices.
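The profile cost can be reproduced with simple arithmetic. The sketch below is a minimal illustration, not the page's actual ranking code: the function name `monthly_cost` and its parameters are hypothetical, and it assumes the listed prices are in USD per 1M tokens (which is consistent with the DeepInfra figures of $0.15 input, $0.60 output, and $0.27 total).

```python
def monthly_cost(input_price, output_price, input_mtok=1.0, output_mtok=0.2):
    """Estimated monthly cost in USD.

    input_price / output_price: USD per 1M tokens (assumed unit).
    input_mtok / output_mtok: monthly volume in millions of tokens;
    the defaults mirror the page's 1M input / 0.2M output profile.
    """
    return input_price * input_mtok + output_price * output_mtok


# DeepInfra row: $0.15 input, $0.60 output.
deepinfra = monthly_cost(0.15, 0.60)
print(f"DeepInfra: ${deepinfra:.2f}")  # DeepInfra: $0.27
```

Changing `input_mtok` and `output_mtok` shows how the ranking could shift for workloads that are more output-heavy than this profile.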
Traffic profile: US default
Cheapest: DeepInfra. $0.27 at the selected traffic profile.
Batch jobs: AWS Bedrock. Batch pricing of $0.12 / $0.485 (input / output) is sourced for queued workloads.
Recipe-ready: DeepInfra. Install, auth, and call snippets are available from curated provider snippet data.
Provider matrix
Region, SLA, and compliance rows are hidden until curated source fields exist.

| Provider | Type | Profile cost | Input / Output | Batch | Deploy | Recipe |
|---|---|---|---|---|---|---|
| DeepInfra | inference | $0.27 | $0.15 / $0.60 | - | Serverless | Snippets |
| OpenRouter | aggregator | $0.27 | $0.15 / $0.60 | - | Serverless | Docs only |
| AWS Bedrock | hyperscaler | $0.43 | $0.24 / $0.97 | $0.12 / $0.485 | Serverless | Snippets |
| Vercel AI Gateway | gateway | $0.43 | $0.24 / $0.97 | - | Serverless | Snippets |
| Novita AI | inference | $0.44 | $0.27 / $0.85 | - | Serverless | Docs only |
| Together AI | inference | $0.44 | $0.27 / $0.85 | - | Serverless | Snippets |
| GCP Vertex AI | hyperscaler | $0.58 | $0.35 / $1.15 | - | Serverless | Snippets |
| Microsoft Foundry | hyperscaler | $0.63 | $0.35 / $1.41 | - | Serverless, Provisioned | Docs only |
| Fireworks AI | inference | Not enough pricing | - / - | - | Serverless | Snippets |
| Inceptron | inference | Not enough pricing | - / - | - | Serverless | Docs only |
| NVIDIA NIM | inference | Not enough pricing | - / - | - | Serverless | Docs only |
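The ordering of the matrix can be checked by recomputing each row's profile cost and keeping rows without sourced prices out of the ranking. The following is a rough sketch under stated assumptions: `ProviderRow`, `profile_cost`, and `rank` are hypothetical names, prices are taken from the table and assumed to be in USD per 1M tokens, and the traffic profile is 1M input and 0.2M output tokens per month.

```python
from typing import NamedTuple, Optional


class ProviderRow(NamedTuple):
    name: str
    input_price: Optional[float]   # USD per 1M input tokens (assumed unit)
    output_price: Optional[float]  # USD per 1M output tokens (assumed unit)


ROWS = [
    ProviderRow("DeepInfra", 0.15, 0.60),
    ProviderRow("OpenRouter", 0.15, 0.60),
    ProviderRow("AWS Bedrock", 0.24, 0.97),
    ProviderRow("Vercel AI Gateway", 0.24, 0.97),
    ProviderRow("Novita AI", 0.27, 0.85),
    ProviderRow("Together AI", 0.27, 0.85),
    ProviderRow("GCP Vertex AI", 0.35, 1.15),
    ProviderRow("Microsoft Foundry", 0.35, 1.41),
    ProviderRow("Fireworks AI", None, None),
    ProviderRow("Inceptron", None, None),
    ProviderRow("NVIDIA NIM", None, None),
]


def profile_cost(row, input_mtok=1.0, output_mtok=0.2):
    """Return the monthly cost in USD, or None when prices are not sourced."""
    if row.input_price is None or row.output_price is None:
        return None
    return row.input_price * input_mtok + row.output_price * output_mtok


def rank(rows):
    """Split rows into a cost-sorted list and a list of unpriced providers."""
    priced, unpriced = [], []
    for row in rows:
        cost = profile_cost(row)
        if cost is None:
            unpriced.append(row.name)
        else:
            priced.append((round(cost, 2), row.name))
    return sorted(priced), unpriced


ranked, unpriced = rank(ROWS)
for cost, name in ranked:
    print(f"{name}: ${cost:.2f}")
print("Not enough pricing:", ", ".join(unpriced))

# AWS Bedrock batch prices ($0.12 / $0.485), assuming all traffic can be queued.
bedrock_batch = profile_cost(ProviderRow("AWS Bedrock (batch)", 0.12, 0.485))
print(f"AWS Bedrock (batch): ${bedrock_batch:.2f}")
```

Ties, such as DeepInfra and OpenRouter at $0.27, are broken alphabetically here; the page's own tie-breaking rule is not stated. Under this profile the Bedrock batch prices work out to roughly $0.22, which applies only if the whole workload tolerates queued processing.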
Operational data note for Llama 4 Maverick 17B Instruct FP8: this page ranks providers on sourced token, cache, and batch prices only. Region, SLA, compliance, and latency claims are intentionally omitted until a curated matrix is added.