
Fireworks AI Models — Pricing & Benchmarks

224 models available

Fireworks AI hosts 224 AI models in this catalog. The lowest listed input price is $0.07 per 1M input tokens, for gpt-oss-20b. LLM Reference lets you compare these models across all 80 providers without switching tabs.

Model | Input (per 1M) | Output (per 1M) | Context
gpt-oss-20b | $0.07 | $0.3 | 131k
CodeGemma 2B | $0.1 | $0.1 | 8k
Cogito v1 Preview Llama 3B | $0.1 | $0.1 | 128k
DeepSeek Coder 1.3B | $0.1 | $0.1 | 4k
DeepSeek R1 Distill Qwen-1.5B | $0.1 | $0.1 | 128k
Fireworks Zephyr-7B-beta | $0.1 | $0.1 | 8k
Gemma 2B Instruct | $0.1 | $0.1 | 2k
Gemma 3 1B Instruct | $0.1 | $0.1 | 32k
Llama 3.2 1B | $0.1 | $0.1 | 128k
Llama 3.2 1B Instruct | $0.1 | $0.1 | 128k
Llama 3.2 3B | $0.1 | $0.1 | 128k
Llama 3.2 3B Instruct | $0.1 | $0.1 | 128k
Llama Guard 3 1B | $0.1 | $0.1 | 128k
Phi-2 | $0.1 | $0.1 | 2k
Phi-3 Mini 128K | $0.1 | $0.1 | 128k
Qwen2.5-0.5B-Instruct | $0.1 | $0.1 | 128k
Qwen2.5-1.5B-Instruct | $0.1 | $0.1 | 128k
Qwen2.5-Coder-1.5B-Instruct | $0.1 | $0.1 | 32k
Qwen2.5-Coder-3B-Instruct | $0.1 | $0.1 | 32k
Qwen3-0.6B | $0.1 | $0.1 | 40k
Qwen3-1.7B | $0.1 | $0.1 | 40k
Stable Code 3B | $0.1 | $0.1 | 16k
Stable LM 2 Zephyr 1.6B | $0.1 | $0.1 | -
Stable LM Zephyr 3B | $0.1 | $0.1 | -
StarCoder2 3B | $0.1 | $0.1 | 8k
Fireworks Llama-3-8B-Instruct | $0.15 | $0.15 | 8k
Fireworks Solar-10.7B-Instruct-v1.0 | $0.15 | $0.15 | 4k
gpt-oss-120b | $0.15 | $0.6 | 131k
Qwen3-VL-30B-A3B | $0.15 | $0.6 | 128k
Chronos Hermes 13B V2 | $0.2 | $0.2 | 4k
CodeGemma 7B | $0.2 | $0.2 | 8k
CodeGemma 7B Instruct | $0.2 | $0.2 | 8k
CodeLlama 13B | $0.2 | $0.2 | 100k
CodeLlama 13B Instruct | $0.2 | $0.2 | 16k
CodeLlama 13B Python | $0.2 | $0.2 | 100k
CodeLlama 7B | $0.2 | $0.2 | 100k
CodeLlama 7B Instruct | $0.2 | $0.2 | 16k
CodeLlama 7B Python | $0.2 | $0.2 | 100k
CodeQwen1.5 7B | $0.2 | $0.2 | 64k
Cogito v1 Preview Llama 8B | $0.2 | $0.2 | 128k
Cogito v1 Preview Qwen-14B | $0.2 | $0.2 | 128k
DeepSeek Coder 6.7B | $0.2 | $0.2 | 4k
DeepSeek Coder 7B V1.5 | $0.2 | $0.2 | 16k
DeepSeek Coder V2 Lite Instruct | $0.2 | $0.2 | 128k
DeepSeek R1 0528 Distill Qwen3-8B | $0.2 | $0.2 | 160k
DeepSeek R1 0528 Qwen3-8B | $0.2 | $0.2 | 160k
DeepSeek R1 Distill Llama 8B | $0.2 | $0.2 | 128k
DeepSeek R1 Distill Qwen-14B | $0.2 | $0.2 | 128k
DeepSeek R1 Distill Qwen-7B | $0.2 | $0.2 | 128k
DeepSeek V2 Lite Chat | $0.2 | $0.2 | 32k
ELYZA Japanese Llama 2 7B | $0.2 | $0.2 | -
Gemma 2 9B | $0.2 | $0.2 | 8k
Gemma 2 9B Instruct | $0.2 | $0.2 | 8k
Gemma 3 12B Instruct | $0.2 | $0.2 | 128k
Gemma 3 4B Instruct | $0.2 | $0.2 | 128k
Gemma 7B | $0.2 | $0.2 | 8k
Gemma 7B Instruct | $0.2 | $0.2 | 8k
GLM-4 9B | $0.2 | $0.2 | 131k
GLM-Z1 9B | $0.2 | $0.2 | -
Hermes 2 Pro Mistral 7B | $0.2 | $0.2 | 32k
Japanese Stable VLM | $0.2 | $0.2 | -
Japanese StableLM Gamma 7B | $0.2 | $0.2 | 8k
Llama 2 13B | $0.2 | $0.2 | 4k
Llama 2 13B Chat | $0.2 | $0.2 | 4k
Llama 2 7B | $0.2 | $0.2 | 4k
Llama 2 7B Chat | $0.2 | $0.2 | 4k
Llama 3 8B | $0.2 | $0.2 | 8k
Llama 3 8B Instruct | $0.2 | $0.2 | 8k
Llama 3.1 8B Instruct | $0.2 | $0.2 | 128k
Llama 3.2 11B Vision Instruct | $0.2 | $0.2 | 128k
Llama Guard 2 8B | $0.2 | $0.2 | 8k
Llama Guard 3 8B | $0.2 | $0.2 | 8k
Llama Guard 7B | $0.2 | $0.2 | 2k
Mistral 7B Instruct v0.1 | $0.2 | $0.2 | 8k
Mistral 7B Instruct v0.2 | $0.2 | $0.2 | 32k
Mistral 7B Instruct v0.3 | $0.2 | $0.2 | 32k
Mistral 7B OpenOrca | $0.2 | $0.2 | 8k
Mistral 7B v0.1 | $0.2 | $0.2 | 8k
Mistral NeMo (2407) | $0.2 | $0.2 | 128k
MythoMax L2 13B | $0.2 | $0.2 | 4k
Nous Capybara 7B V1.9 | $0.2 | $0.2 | -
Nous Hermes Llama 2 13B | $0.2 | $0.2 | -
Nous Hermes Llama 2 7B | $0.2 | $0.2 | -
OpenChat 3.5 (0106) | $0.2 | $0.2 | 8k
OpenHermes 2 Mistral 7B | $0.2 | $0.2 | 32k
OpenHermes 2.5 Mistral 7B | $0.2 | $0.2 | 32k
Phi-3 Vision | $0.2 | $0.2 | 128k
Pythia 12B | $0.2 | $0.2 | 2k
Qwen-14B | $0.2 | $0.2 | 32k
Qwen2-7B | $0.2 | $0.2 | 128k
Qwen2.5-14B | $0.2 | $0.2 | 128k
Qwen2.5-14B-Instruct | $0.2 | $0.2 | 128k
Qwen2.5-7B | $0.2 | $0.2 | 128k
Qwen2.5-7B-Instruct | $0.2 | $0.2 | 128k
Qwen2.5-Coder-14B-Instruct | $0.2 | $0.2 | 128k
Qwen2.5-Coder-7B-Instruct | $0.2 | $0.2 | 128k
Qwen3-14B | $0.2 | $0.2 | 40k
Qwen3-4B | $0.2 | $0.2 | 40k
Qwen3-8B | $0.2 | $0.2 | 128k
Snorkel Mistral PairRM | $0.2 | $0.2 | 32k
SOLAR 10.7B | $0.2 | $0.2 | 4k
StarCoder | $0.2 | $0.2 | 8k
StarCoder2 15B | $0.2 | $0.2 | 8k
StarCoder2 7B | $0.2 | $0.2 | 8k
Toppy M 7B | $0.2 | $0.2 | 4k
Yi 6B | $0.2 | $0.2 | 200k
Yi 9B | $0.2 | $0.2 | 4k
Zephyr 7B Beta | $0.2 | $0.2 | -
Fireworks CodeLlama-34b-Instruct | $0.3 | $0.3 | 16k
MiniMax M2.7 | $0.3 | $1.20 | 205k
MiniMax-M1-240k | $0.3 | $1.20 | 240k
MiniMax-M2-240k | $0.3 | $1.20 | 240k
MiniMax-M2-240k-0905 | $0.3 | $1.20 | 240k
MiniMax-M2-80k-0905 | $0.3 | $1.20 | 80k
MiniMax-M2.5 | $0.3 | $1.20 | -
Fireworks Yi-34B-Chat | $0.4 | $0.4 | 4k
DeepSeek Coder V2 Lite | $0.5 | $0.5 | 128k
Dolphin 2.6 Mixtral 8x7B | $0.5 | $0.5 | 32k
Firefunction V1 | $0.5 | $0.5 | 8k
Mixtral 8x7B | $0.5 | $0.5 | 32k
Nous Hermes 2 Mixtral 8x7B | $0.5 | $0.5 | 32k
Phi 3.5 MoE Instruct | $0.5 | $0.5 | 128k
Phi 4 Reasoning | $0.5 | $0.5 | 128k
Phi 4 Reasoning Plus | $0.5 | $0.5 | 128k
Qwen3-30B-A3B | $0.5 | $0.5 | 128k
DeepSeek Prover V2 | $0.56 | $1.68 | 160k
DeepSeek R1 | $0.56 | $1.68 | 128k
DeepSeek R1 0528 | $0.56 | $1.68 | 130k
DeepSeek R1 Basic | $0.56 | $1.68 | 160k
DeepSeek V2.5 | $0.56 | $1.68 | 128k
DeepSeek V3 | $0.56 | $1.68 | 64k
DeepSeek V3 0324 | $0.56 | $1.68 | 160k
DeepSeek V3.1 | $0.56 | $1.68 | 64k
DeepSeek V3.2 | $0.56 | $1.68 | 160k
GLM 4.7 | $0.6 | $2.20 | 200k
GLM-4.7 | $0.6 | $2.20 | 128k
Kimi K2 Instruct | $0.6 | $2.50 | 131k
Kimi K2 Instruct 0905 | $0.6 | $2.50 | 131k
Kimi K2 Thinking | $0.6 | $2.50 | 256k
Kimi K2.5 | $0.6 | $3.00 | 256k
Fireworks Qwen-72B-Chat | $0.8 | $0.8 | 33k
CodeLlama 34B | $0.9 | $0.9 | 100k
CodeLlama 34B Instruct | $0.9 | $0.9 | 16k
CodeLlama 34B Python | $0.9 | $0.9 | 100k
CodeLlama 70B | $0.9 | $0.9 | 16k
CodeLlama 70B Instruct | $0.9 | $0.9 | 16k
CodeLlama 70B Python | $0.9 | $0.9 | 16k
Cogito v1 Preview Llama 70B | $0.9 | $0.9 | 128k
Cogito v1 Preview Qwen-32B | $0.9 | $0.9 | 128k
DeepSeek Coder 33B | $0.9 | $0.9 | 16k
DeepSeek Coder 33B Instruct | $0.9 | $0.9 | 16k
DeepSeek R1 Distill Llama 70B | $0.9 | $0.9 | 128k
DeepSeek R1 Distill Qwen-32B | $0.9 | $0.9 | 128k
Dolphin 2.9.2 Qwen2-72B | $0.9 | $0.9 | 128k
FARE-20B | $0.9 | $0.9 | 128k
Firefunction V2 | $0.9 | $0.9 | 32k
FireLLaVA 13B | $0.9 | $0.9 | 4k
Gemma 2 27B Instruct | $0.9 | $0.9 | 8k
Gemma 3 27B Instruct | $0.9 | $0.9 | 128k
GLM Z1 32B | $0.9 | $0.9 | 128k
GLM Z1 Rumination 32B | $0.9 | $0.9 | 128k
GLM-4.5 | $0.9 | $0.9 | 128k
GLM-4.5-Air | $0.9 | $0.9 | 128k
GLM-4.6 | $0.9 | $0.9 | 198k
GLM-4.7 Flash | $0.9 | $0.9 | 198k
Japanese StableLM 70B | $0.9 | $0.9 | 4k
KAT Coder | $0.9 | $0.9 | 256k
KAT Dev 32B | $0.9 | $0.9 | -
KAT Dev 72B Exp | $0.9 | $0.9 | -
Llama 2 70B | $0.9 | $0.9 | 4k
Llama 2 70B Chat | $0.9 | $0.9 | 4k
Llama 3 70B Instruct | $0.9 | $0.9 | 8k
Llama 3.1 70B Instruct | $0.9 | $0.9 | 128k
Llama 3.2 90B Vision Instruct | $0.9 | $0.9 | 128k
Llama 3.3 70B | $0.9 | $0.9 | 8k
LLaVA 1.6 Hermes Yi 34B | $0.9 | $0.9 | 200k
MiniMax M2 | $0.9 | $0.9 | 197k
MiniMax-M1-80k | $0.9 | $0.9 | 80k
MiniMax-M2-80k | $0.9 | $0.9 | 80k
Mistral Large | $0.9 | $0.9 | 32k
Mistral NeMo Instruct (2407) | $0.9 | $0.9 | 128k
Mistral Small 3.1 24B Instruct | $0.9 | $0.9 | 128k
Nous Capybara 34B | $0.9 | $0.9 | 200k
Nous Hermes 2 Yi 34B | $0.9 | $0.9 | 200k
Nous Hermes Llama 2 70B | $0.9 | $0.9 | -
Phi 3.5 Mini Instruct | $0.9 | $0.9 | 128k
Phi 4 Multimodal Instruct | $0.9 | $0.9 | 128k
Phi-4 14B | $0.9 | $0.9 | 16k
Phi-4 Mini | $0.9 | $0.9 | 128k
Phind CodeLlama 34B Python V1 | $0.9 | $0.9 | 8k
Phind CodeLlama 34B V1 | $0.9 | $0.9 | 8k
Phind CodeLlama 34B V2 | $0.9 | $0.9 | 8k
Qwen-72B | $0.9 | $0.9 | 32k
Qwen1.5-72B | $0.9 | $0.9 | 32k
Qwen2-72B | $0.9 | $0.9 | 128k
Qwen2-VL-72B-Instruct | $0.9 | $0.9 | 32k
Qwen2.5-32B | $0.9 | $0.9 | 128k
Qwen2.5-32B-Instruct | $0.9 | $0.9 | 128k
Qwen2.5-72B | $0.9 | $0.9 | 128k
Qwen2.5-72B-Instruct | $0.9 | $0.9 | 128k
Qwen2.5-Coder-32B | $0.9 | $0.9 | 128k
Qwen2.5-Coder-32B-Instruct | $0.9 | $0.9 | 128k
Qwen3-32B | $0.9 | $0.9 | 40k
Yi 34B | $0.9 | $0.9 | 200k
Yi 34B 200K | $0.9 | $0.9 | 200k
Yi Large | $0.9 | $0.9 | 32k
Kimi K2.6 | $0.95 | $4.00 | 262k
GLM-5 | $1.00 | $3.20 | 200k
DBRX Instruct | $1.20 | $1.20 | 32k
DeepSeek Coder V2 | $1.20 | $1.20 | 128k
DeepSeek Coder V2 Instruct | $1.20 | $1.20 | 128k
ERNIE 4.5 | $1.20 | $1.20 | 8k
Mixtral 8x22B Instruct v0.1 | $1.20 | $1.20 | 64k
Mixtral 8x22B v0.1 | $1.20 | $1.20 | 64k
Qwen3-235B-A22B | $1.20 | $1.20 | 128k
GLM-5.1 | $1.40 | $4.40 | 200k
Fireworks DBRX-Instruct | $1.50 | $1.50 | 33k
DeepSeek V4 Pro | $1.74 | $3.48 | 1m
Llama 3.1 405B Instruct | $3.00 | $3.00 | 128k
Llama 4 Maverick 17B Instruct FP8 | - | - | 1m
Llama 4 Scout 17B-16E Instruct | - | - | 10m
Mistral Small | - | - | 32k
Nemotron 3 Super-120B-A12B | - | - | 1.05m
Qwen3-Coder-480B-A35B-Instruct | - | - | 262k

Pricing Overview

Cheapest input price: $0.07/1M
Most expensive input price: $3.00/1M
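
Because every price here is quoted per 1M tokens, the cost of a single request is the input token count times the input price plus the output token count times the output price, divided by 1,000,000. The following is a minimal Python sketch of that arithmetic. The `PRICES` dictionary and the `estimate_cost` function are illustrative names, not part of any Fireworks AI or LLM Reference API, and the three entries are prices as read from the table above; actual billing may differ.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), as read from the table above.
# Illustrative subset only; names and structure are assumptions.
PRICES = {
    "gpt-oss-20b": (Decimal("0.07"), Decimal("0.30")),
    "DeepSeek V3": (Decimal("0.56"), Decimal("1.68")),
    "Llama 3.1 405B Instruct": (Decimal("3.00"), Decimal("3.00")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request from per-1M-token prices."""
    input_price, output_price = PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # Compare the same workload (10,000 input, 2,000 output tokens) across models.
    for name in PRICES:
        print(name, estimate_cost(name, 10_000, 2_000))
```

`Decimal` is used instead of `float` so that small per-request amounts do not pick up binary rounding noise when summed over many requests.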

About Fireworks AI

The Fireworks AI Platform is a generative AI service that lets developers and businesses build, customize, and deploy AI models at scale. It supports a range of open-source models, including Meta's Llama and Stability AI's Stable Diffusion, for tasks such as natural language processing and image generation. Its serverless architecture allows quick deployment without extensive infrastructure management and is billed on a pay-as-you-go basis. Users can fine-tune models with parameter-efficient techniques to tailor them to specific business needs while maintaining performance. The platform is optimized for high throughput and low latency and can handle trillions of inferences daily. It also offers tools for model maintenance and iteration and is designed for easy integration and customization, so organizations can scale AI-powered products without managing the underlying model infrastructure themselves.
