Mixtral 8x7B — Available Providers
Last refreshed 2026-05-16. Refresh cadence: weekly.
Compare pricing and deployment options across 18 providers.
Monthly cost ranking
Ranked for 1M input and 0.2M output tokens per month. Cache and batch discounts are applied only when the provider row has sourced prices.
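The arithmetic behind the ranking can be sketched as follows. This is a minimal illustration, assuming the listed prices are US dollars per 1M tokens (the profile costs in the provider matrix are consistent with that reading) and ignoring cache and batch discounts. The function name `profile_cost` and its parameters are hypothetical, not part of any provider API.

```python
def profile_cost(input_price, output_price, input_mtok=1.0, output_mtok=0.2):
    """Estimated monthly cost in USD for one provider.

    Prices are assumed to be USD per 1M tokens; volumes are in millions
    of tokens per month. Cache and batch discounts are not modeled.
    """
    return input_price * input_mtok + output_price * output_mtok


# Input / output prices taken from the provider matrix on this page.
print(f"SiliconFlow:       ${profile_cost(0.20, 0.20):.2f}")
print(f"Mistral AI Studio: ${profile_cost(0.15, 0.45):.2f}")
print(f"Vultr:             ${profile_cost(0.55, 2.75):.2f}")
```

Because output volume is one fifth of input volume in this profile, a provider with a low input price and a higher output price can still match a flat-priced provider.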
Traffic profile
US default
Cheapest
SiliconFlow
$0.24 at the selected traffic profile, tied with Mistral AI Studio.
Recipe-ready
Replicate API
Install, auth, and call snippets are available from curated provider snippet data.
Provider matrix
Region, SLA, and compliance rows are hidden until curated source fields exist.

| Provider | Type | Profile cost (USD/month) | Input / Output (USD per 1M tokens) | Deploy | Recipe |
|---|---|---|---|---|---|
| SiliconFlow | inference | $0.24 | $0.20 / $0.20 | Serverless | Docs only |
| Mistral AI Studio | lab | $0.24 | $0.15 / $0.45 | Serverless | Docs only |
| Bitdeer AI | provider | $0.29 | $0.18 / $0.54 | Serverless | Docs only |
| Microsoft Foundry | hyperscaler | $0.32 | $0.27 / $0.27 | Provisioned | Docs only |
| Lepton AI API | inference | $0.36 | $0.30 / $0.30 | Serverless | Docs only |
| Replicate API | marketplace | $0.40 | $0.20 / $1.00 | Serverless | Snippets |
| OctoAI API (Deprecated) | inference | $0.54 | $0.45 / $0.45 | Serverless | Docs only |
| AWS Bedrock | hyperscaler | $0.59 | $0.45 / $0.70 | Serverless | Snippets |
| Fireworks AI | inference | $0.60 | $0.50 / $0.50 | Serverless | Snippets |
| GCP Vertex AI | hyperscaler | $0.64 | $0.40 / $1.20 | Serverless | Snippets |
| DeepInfra | inference | $0.65 | $0.54 / $0.54 | Serverless | Snippets |
| Databricks Foundation Model Serving | platform | $0.70 | $0.50 / $1.00 | Serverless | Docs only |
| IBM watsonx | platform | $0.72 | $0.60 / $0.60 | Serverless | Docs only |
| Vultr | provider | $1.10 | $0.55 / $2.75 | Serverless | Docs only |
| Alibaba Cloud PAI-EAS | platform | Not enough pricing | - / - | Serverless | Docs only |
| Baseten API | inference | Not enough pricing | - / - | Serverless | Docs only |
| NVIDIA NIM | inference | Not enough pricing | - / - | Provisioned | Docs only |
| Scale AI GenAI Platform | platform | Not enough pricing | - / - | Serverless | Docs only |
Mixtral 8x7B operational data note: this page ranks sourced token, cache, and batch fields only. Region, SLA, compliance, and latency claims are intentionally omitted until a curated matrix is added.
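One way to reproduce the ordering, including the unranked "Not enough pricing" rows, is sketched below. It assumes prices are USD per 1M tokens and uses the 1M input / 0.2M output profile. The helper names `cost_or_none` and `rank`, and the use of `None` for unsourced prices, are illustrative choices rather than the page's actual implementation. Only a few rows from the matrix are included.

```python
from typing import Optional

# (provider, input price, output price) in USD per 1M tokens.
# None marks a price that has not been sourced.
ROWS = [
    ("Replicate API", 0.20, 1.00),
    ("NVIDIA NIM", None, None),
    ("SiliconFlow", 0.20, 0.20),
    ("AWS Bedrock", 0.45, 0.70),
]


def cost_or_none(inp, out, input_mtok=1.0, output_mtok=0.2) -> Optional[float]:
    """Monthly cost in USD, or None when either price is missing."""
    if inp is None or out is None:
        return None
    return inp * input_mtok + out * output_mtok


def rank(rows):
    """Sort priced providers by cost; append unpriced ones in input order."""
    priced, unpriced = [], []
    for name, inp, out in rows:
        cost = cost_or_none(inp, out)
        (unpriced if cost is None else priced).append((name, cost))
    priced.sort(key=lambda row: row[1])
    return priced + unpriced


for name, cost in rank(ROWS):
    label = "Not enough pricing" if cost is None else f"${cost:.2f}"
    print(f"{name}: {label}")
```

Keeping unpriced rows out of the sort, instead of assigning them a placeholder cost, avoids implying a ranking that the sourced data does not support.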