
DeepInfra Models — Pricing & Benchmarks

60 models available

This catalog lists 60 AI models hosted by DeepInfra. The lowest listed input price is $0.02 per 1M tokens, shared by Llama 3 8B Instruct, Llama 3.1 8B Instruct, and Mistral NeMo Instruct (2407). LLM Reference lets you compare these models across all 80 providers without switching tabs.

| Model | Input (per 1M tokens) | Output (per 1M tokens) | Context |
|---|---|---|---|
| Llama 3 8B Instruct | $0.02 | $0.05 | 8k |
| Llama 3.1 8B Instruct | $0.02 | $0.05 | 128k |
| Mistral NeMo Instruct (2407) | $0.02 | $0.04 | 128k |
| Qwen2.5-7B-Instruct | $0.03 | $0.03 | 128k |
| Qwen3-9B | $0.04 | $0.2 | 256k |
| CodeGemma 1.1 7B | $0.05 | $0.15 | 8k |
| DeepInfra Google Gemma 2B | $0.05 | $0.15 | 8k |
| DeepInfra Google Gemma 7B | $0.05 | $0.15 | 8k |
| DeepInfra Llama 3 8B Instruct | $0.05 | $0.15 | 8k |
| DeepInfra Phi 3 Mini 4K Instruct | $0.05 | $0.15 | 4k |
| Gemma 1.1 7B Instruct | $0.05 | $0.15 | 8k |
| LLaVA 1.5 7B | $0.05 | $0.15 | 4k |
| Mistral 7B Instruct v0.2 | $0.05 | $0.15 | 32k |
| Mistral 7B v0.1 | $0.05 | $0.15 | 8k |
| OpenChat 3.6 8B | $0.05 | $0.15 | 8k |
| Qwen2-7B | $0.05 | $0.15 | 128k |
| WizardLM-2 7B | $0.05 | $0.15 | |
| Llama 2 7B Chat | $0.07 | $0.07 | 4k |
| Llama 4 Scout 17B-16E Instruct | $0.08 | $0.3 | 10m |
| DeepInfra Stable LM 2 12B | $0.1 | $0.3 | 4k |
| Nemotron 3 Super-120B-A12B | $0.1 | $0.5 | 1.05m |
| Qwen2.5-14B-Instruct | $0.1 | $0.1 | 128k |
| Llama 2 13B Chat | $0.13 | $0.13 | 4k |
| Phi-3 Medium 4K | $0.14 | $0.41 | 4k |
| Dolphin 2.6 Mixtral 8x7B | $0.15 | $0.45 | 32k |
| Llama 4 Maverick 17B Instruct FP8 | $0.15 | $0.6 | 1m |
| Mixtral 8x7B Instruct v0.1 | $0.15 | $0.45 | 33k |
| Qwen2-57B-A14B | $0.16 | $0.16 | |
| CodeLlama 34B | $0.2 | $0.45 | 100k |
| DeepInfra StarCoder2 15B | $0.2 | $0.6 | 16k |
| Phind CodeLlama 34B V2 | $0.2 | $0.45 | 8k |
| Qwen2.5-Coder-32B | $0.2 | $0.2 | 128k |
| StarCoder2 15B | $0.2 | $0.6 | 8k |
| Yi 34B | $0.25 | $0.38 | 200k |
| Qwen3.5-27B | $0.26 | $2.60 | 262k |
| DeepSeek V3 | $0.32 | $0.89 | 64k |
| Qwen2.5-72B-Instruct | $0.36 | $0.4 | 128k |
| Llama 3.1 70B Instruct | $0.4 | $0.4 | 128k |
| airoboros L2 70B 2.2.1 | $0.45 | $0.65 | |
| CodeLlama 70B | $0.45 | $0.65 | 16k |
| DeepInfra CodeLlama 70B Instruct | $0.45 | $0.65 | 100k |
| DeepInfra Llama 3 70B Instruct | $0.45 | $0.65 | 8k |
| DeepInfra Phi 3 Small 128K Instruct | $0.45 | $0.65 | 128k |
| DeepInfra Qwen1.5-72B-Chat | $0.45 | $0.65 | 33k |
| Llama 3 70B Instruct | $0.45 | $0.65 | 8k |
| Qwen2-72B | $0.45 | $0.65 | 128k |
| DeepSeek R1 0528 | $0.5 | $2.15 | 130k |
| Mixtral 8x7B | $0.54 | $0.54 | 32k |
| DBRX Instruct | $0.6 | $1.20 | 32k |
| DeepInfra DBRX Instruct | $0.6 | $1.20 | 33k |
| Llama 2 70B Chat | $0.64 | $0.64 | 4k |
| Mixtral 8x22B v0.1 | $0.65 | $0.65 | 64k |
| WizardLM-2 8x22B | $0.65 | $0.65 | |
| Zephyr ORPO 141B | $0.65 | $0.65 | |
| DeepSeek R1 Distill Llama 70B | $0.7 | $0.8 | 128k |
| Nemotron 4 340B | $4.20 | $4.20 | 4k |
| Command R | | | 128k |
| Command R+ | | | 128k |
| DeepSeek R1 | | | 128k |
| Mistral Small | | | 32k |
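
Because prices are quoted per 1M tokens, the cost of a single request comes from scaling each price by its token count. The following is a minimal sketch of that arithmetic, not part of any DeepInfra SDK: the function name `request_cost` and the token counts are hypothetical, and the prices are the listed figures for Llama 3.1 8B Instruct ($0.02 input, $0.05 output per 1M tokens).

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_1m: float, output_price_per_1m: float) -> float:
    """Return the USD cost of one request, given prices quoted per 1M tokens."""
    total = input_tokens * input_price_per_1m + output_tokens * output_price_per_1m
    return total / 1_000_000


# Listed prices for Llama 3.1 8B Instruct: $0.02 input, $0.05 output per 1M tokens.
# The token counts below are made-up illustration values.
cost = request_cost(input_tokens=200_000, output_tokens=50_000,
                    input_price_per_1m=0.02, output_price_per_1m=0.05)
print(f"${cost:.4f}")  # prints $0.0065
```

Input and output tokens are billed at different rates for most models in the list, so an estimate built from a single blended price can be noticeably off for output-heavy workloads.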


Pricing Overview

Cheapest: $0.02/1M
Most expensive: $4.20/1M

About DeepInfra

DeepInfra offers serverless AI inference with a simple API, supporting hundreds of models across text generation, embeddings, and more. Pricing is pay-per-token with no upfront commitments.
