
DeepInfra Models — Pricing & Benchmarks

58 models available

DeepInfra hosts 58 AI models in this catalog. The lowest listed input price is $0.02 per 1M input tokens, for Mistral NeMo Instruct (2407). LLM Reference lets you compare these models across all 63 providers without switching tabs.

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context |
| --- | --- | --- | --- |
| Mistral NeMo Instruct (2407) | $0.02 | $0.04 | 128K |
| Qwen2.5-7B-Instruct | $0.03 | $0.03 | 128K |
| Qwen3-9B | $0.04 | $0.2 | 256K |
| CodeGemma 1.1 7B | $0.05 | $0.15 | |
| DeepInfra Google Gemma 2B | $0.05 | $0.15 | 8K |
| DeepInfra Google Gemma 7B | $0.05 | $0.15 | 8K |
| DeepInfra Llama 3 8B Instruct | $0.05 | $0.15 | 8K |
| DeepInfra Phi 3 Mini 4K Instruct | $0.05 | $0.15 | 4K |
| Gemma 1.1 7B Instruct | $0.05 | $0.15 | 8K |
| Llama 3 8B Instruct | $0.05 | $0.15 | 8K |
| LLaVA 1.5 7B | $0.05 | $0.15 | |
| Mistral 7B Instruct v0.2 | $0.05 | $0.15 | 32K |
| Mistral 7B v0.1 | $0.05 | $0.15 | 8K |
| OpenChat 3.6 8B | $0.05 | $0.15 | 8K |
| Qwen2-7B | $0.05 | $0.15 | 128K |
| WizardLM-2 7B | $0.05 | $0.15 | |
| Llama 2 7B Chat | $0.07 | $0.07 | 4K |
| Llama 4 Scout 17B-16E Instruct | $0.08 | $0.30 | 328K |
| DeepInfra Stable LM 2 12B | $0.1 | $0.3 | 4K |
| Nemotron 3 Super-120B-A12B | $0.1 | $0.5 | 1M |
| Qwen2.5-14B-Instruct | $0.10 | $0.10 | 128K |
| Llama 2 13B Chat | $0.13 | $0.13 | 4K |
| Phi-3 Medium 4K | $0.14 | $0.41 | 4K |
| Dolphin 2.6 Mixtral 8x7B | $0.15 | $0.45 | |
| Llama 4 Maverick 17B Instruct FP8 | $0.15 | $0.60 | 1M |
| Mixtral 8x7B Instruct v0.1 | $0.15 | $0.45 | 33K |
| Qwen2-57B-A14B | $0.16 | $0.16 | |
| CodeLlama 34B | $0.20 | $0.45 | 100K |
| DeepInfra StarCoder2 15B | $0.2 | $0.6 | 16K |
| Phind CodeLlama 34B V2 | $0.20 | $0.45 | 8K |
| Qwen2.5-Coder-32B | $0.20 | $0.20 | |
| StarCoder2 15B | $0.20 | $0.60 | 8K |
| Yi 34B | $0.25 | $0.38 | 200K |
| Qwen3.5-27B | $0.26 | $2.6 | 262K |
| DeepSeek V3 | $0.32 | $0.89 | 64K |
| Qwen2.5-72B-Instruct | $0.36 | $0.40 | 128K |
| Llama 3.1 70B Instruct | $0.40 | $0.40 | 128K |
| airoboros L2 70B 2.2.1 | $0.45 | $0.65 | |
| CodeLlama 70B | $0.45 | $0.65 | 16K |
| DeepInfra CodeLlama 70B Instruct | $0.45 | $0.65 | 100K |
| DeepInfra Llama 3 70B Instruct | $0.45 | $0.65 | 8K |
| DeepInfra Phi 3 Small 128K Instruct | $0.45 | $0.65 | 128K |
| DeepInfra Qwen1.5-72B-Chat | $0.45 | $0.65 | 33K |
| Llama 3 70B Instruct | $0.45 | $0.65 | 8K |
| Qwen2-72B | $0.45 | $0.65 | 128K |
| Mixtral 8x7B | $0.54 | $0.54 | 32K |
| DBRX Instruct | $0.60 | $1.20 | 32K |
| DeepInfra DBRX Instruct | $0.6 | $1.2 | 33K |
| Llama 2 70B Chat | $0.64 | $0.64 | 4K |
| Mixtral 8x22B v0.1 | $0.65 | $0.65 | 64K |
| WizardLM-2 8x22B | $0.65 | $0.65 | |
| Zephyr ORPO 141B | $0.65 | $0.65 | |
| DeepSeek R1 Distill Llama 70B | $0.70 | $0.80 | 128K |
| Nemotron 4 340B | $4.20 | $4.20 | 4K |
| Command R | | | 128K |
| Command R+ | | | 128K |
| DeepSeek R1 | | | 128K |
| Mistral Small | | | 32K |

Pricing Overview

Cheapest: $0.02/1M
Most expensive: $4.20/1M
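
Because prices are quoted per 1M tokens, the cost of a request is each token count multiplied by its rate and divided by one million. The following is a minimal sketch of that arithmetic, not an official calculator: the function name `estimate_cost` and the dictionary `PRICES_PER_1M` are hypothetical, and the three entries are simply copied from the listing on this page, so they may be out of date.

```python
from decimal import Decimal

# Assumption: prices copied by hand from the listing on this page,
# in USD per 1M tokens, as (input, output). They are not fetched live.
PRICES_PER_1M = {
    "Mistral NeMo Instruct (2407)": (Decimal("0.02"), Decimal("0.04")),
    "Llama 3.1 70B Instruct": (Decimal("0.40"), Decimal("0.40")),
    "Nemotron 4 340B": (Decimal("4.20"), Decimal("4.20")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request at the listed per-1M rates."""
    input_price, output_price = PRICES_PER_1M[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # 2M input tokens and 500K output tokens on the cheapest listed model.
    print(estimate_cost("Mistral NeMo Instruct (2407)", 2_000_000, 500_000))
```

`Decimal` is used instead of `float` so that small per-token amounts add up without binary rounding error. Models listed without prices (for example Command R) are deliberately left out of the dictionary, so looking them up raises `KeyError` rather than returning a guessed cost.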

About DeepInfra

DeepInfra offers serverless AI inference with a simple API, supporting hundreds of models across text generation, embeddings, and more. Pricing is pay-per-token, with no upfront commitments.
