LLM Reference
Fireworks AI

Fireworks AI Models — Pricing & Benchmarks

230 models available

Fireworks AI hosts 230 AI models in this catalog. The lowest listed input price belongs to Firefunction V2, which is free and is the catalog's only free-tier model. LLM Reference lets you compare these models across all 63 providers without switching tabs.

Model | Input (per 1M) | Output (per 1M) | Context
Firefunction V2 | Free | Free | 32K
gpt-oss-20b | $0.07 | $0.3 | 131K
CodeGemma 2B | $0.10 | $0.10 |
Cogito v1 Preview Llama 3B | $0.1 | $0.1 | 128K
DeepSeek Coder 1.3B | $0.10 | $0.10 | 4K
DeepSeek R1 Distill Qwen-1.5B | $0.1 | $0.1 | 128K
Fireworks Zephyr-7B-beta | $0.1 | $0.1 | 8K
Gemma 2B Instruct | $0.1 | $0.1 | 2K
Gemma 3 1B Instruct | $0.1 | $0.1 | 32K
Llama 3.2 1B | $0.1 | $0.1 | 128K
Llama 3.2 1B Instruct | $0.1 | $0.1 | 128K
Llama 3.2 3B | $0.1 | $0.1 | 128K
Llama 3.2 3B Instruct | $0.1 | $0.1 | 128K
Llama Guard 3 1B | $0.1 | $0.1 |
Phi-2 | $0.10 | $0.10 |
Phi-3 Mini 128K | $0.10 | $0.10 | 128K
Qwen2.5-0.5B-Instruct | $0.1 | $0.1 | 128K
Qwen2.5-1.5B-Instruct | $0.1 | $0.1 | 128K
Qwen2.5-Coder-1.5B-Instruct | $0.1 | $0.1 |
Qwen2.5-Coder-3B-Instruct | $0.1 | $0.1 |
Qwen3-0.6B | $0.1 | $0.1 | 40K
Qwen3-1.7B | $0.1 | $0.1 | 40K
Qwen3-Coder-3B-Instruct | $0.1 | $0.1 | 256K
Stable Code 3B | $0.10 | $0.10 | 16K
Stable LM 2 Zephyr 1.6B | $0.10 | $0.10 |
Stable LM Zephyr 3B | $0.10 | $0.10 |
StarCoder2 3B | $0.10 | $0.10 | 8K
Fireworks Llama-3-8B-Instruct | $0.15 | $0.15 | 8K
Fireworks Solar-10.7B-Instruct-v1.0 | $0.15 | $0.15 | 4K
gpt-oss-120b | $0.15 | $0.6 | 131K
Qwen3-VL-30B-A3B | $0.15 | $0.6 |
Chronos Hermes 13B V2 | $0.20 | $0.20 |
CodeGemma 7B | $0.2 | $0.2 |
CodeGemma 7B Instruct | $0.20 | $0.20 |
CodeLlama 13B | $0.20 | $0.20 | 100K
CodeLlama 13B Instruct | $0.2 | $0.2 | 16K
CodeLlama 13B Python | $0.20 | $0.20 | 100K
CodeLlama 7B | $0.20 | $0.20 | 100K
CodeLlama 7B Instruct | $0.2 | $0.2 | 16K
CodeLlama 7B Python | $0.20 | $0.20 | 100K
CodeQwen1.5 7B | $0.20 | $0.20 |
Cogito v1 Preview Llama 8B | $0.2 | $0.2 | 128K
Cogito v1 Preview Qwen-14B | $0.2 | $0.2 | 128K
DeepSeek Coder 6.7B | $0.20 | $0.20 | 4K
DeepSeek Coder 7B V1.5 | $0.20 | $0.20 |
DeepSeek Coder V2 Lite Instruct | $0.2 | $0.2 | 128K
DeepSeek R1 0528 Distill Qwen3-8B | $0.2 | $0.2 | 160K
DeepSeek R1 0528 Qwen3-8B | $0.2 | $0.2 | 160K
DeepSeek R1 Distill Llama 8B | $0.2 | $0.2 | 128K
DeepSeek R1 Distill Qwen-14B | $0.2 | $0.2 | 128K
DeepSeek R1 Distill Qwen-7B | $0.2 | $0.2 | 128K
DeepSeek V2 Lite Chat | $0.2 | $0.2 | 32K
ELYZA Japanese Llama 2 7B | $0.20 | $0.20 |
Gemma 2 9B | $0.2 | $0.2 | 8K
Gemma 2 9B Instruct | $0.2 | $0.2 | 8K
Gemma 3 12B Instruct | $0.2 | $0.2 | 128K
Gemma 3 4B Instruct | $0.2 | $0.2 | 128K
Gemma 7B | $0.2 | $0.2 | 8K
Gemma 7B Instruct | $0.20 | $0.20 | 8K
GLM-4 9B | $0.2 | $0.2 | 131K
GLM-Z1 9B | $0.2 | $0.2 |
Hermes 2 Pro Mistral 7B | $0.20 | $0.20 | 32K
Japanese Stable VLM | $0.20 | $0.20 |
Japanese StableLM Gamma 7B | $0.20 | $0.20 |
Llama 2 13B | $0.2 | $0.2 | 4K
Llama 2 13B Chat | $0.2 | $0.2 | 4K
Llama 2 7B | $0.2 | $0.2 | 4K
Llama 2 7B Chat | $0.20 | $0.20 | 4K
Llama 3 8B | $0.2 | $0.2 | 8K
Llama 3 8B Instruct | $0.2 | $0.2 | 8K
Llama 3.1 8B Instruct | $0.2 | $0.2 | 128K
Llama 3.2 11B Vision Instruct | $0.2 | $0.2 | 128K
Llama Guard 2 8B | $0.20 | $0.20 | 8K
Llama Guard 3 8B | $0.2 | $0.2 | 8K
Llama Guard 7B | $0.20 | $0.20 | 2K
Mistral 7B Instruct v0.1 | $0.2 | $0.2 | 8K
Mistral 7B Instruct v0.2 | $0.2 | $0.2 | 32K
Mistral 7B Instruct v0.3 | $0.2 | $0.2 | 32K
Mistral 7B OpenOrca | $0.20 | $0.20 | 8K
Mistral 7B v0.1 | $0.20 | $0.20 | 8K
Mistral NeMo (2407) | $0.2 | $0.2 | 128K
MythoMax L2 13B | $0.2 | $0.2 |
Nous Capybara 7B V1.9 | $0.20 | $0.20 |
Nous Hermes Llama 2 13B | $0.20 | $0.20 |
Nous Hermes Llama 2 7B | $0.20 | $0.20 |
OpenChat 3.5 (0106) | $0.20 | $0.20 | 8K
OpenHermes 2 Mistral 7B | $0.20 | $0.20 | 32K
OpenHermes 2.5 Mistral 7B | $0.20 | $0.20 | 32K
Phi-3 Vision | $0.2 | $0.2 | 128K
Pythia 12B | $0.20 | $0.20 |
Qwen-14B | $0.20 | $0.20 | 32K
Qwen2-7B | $0.2 | $0.2 | 128K
Qwen2.5-14B | $0.2 | $0.2 | 128K
Qwen2.5-14B-Instruct | $0.2 | $0.2 | 128K
Qwen2.5-7B | $0.2 | $0.2 | 128K
Qwen2.5-7B-Instruct | $0.2 | $0.2 | 128K
Qwen2.5-Coder-14B-Instruct | $0.2 | $0.2 |
Qwen2.5-Coder-7B-Instruct | $0.2 | $0.2 |
Qwen3-14B | $0.2 | $0.2 | 40K
Qwen3-4B | $0.2 | $0.2 | 40K
Qwen3-8B | $0.2 | $0.2 | 128K
Qwen3-Coder-14B-Instruct | $0.2 | $0.2 | 256K
Qwen3-Coder-8B-Instruct | $0.2 | $0.2 | 256K
Snorkel Mistral PairRM | $0.20 | $0.20 |
SOLAR 10.7B | $0.2 | $0.2 |
StarCoder | $0.2 | $0.2 | 8K
StarCoder2 15B | $0.20 | $0.20 | 8K
StarCoder2 7B | $0.2 | $0.2 | 8K
Toppy M 7B | $0.20 | $0.20 | 4K
Yi 6B | $0.20 | $0.20 | 200K
Yi 9B | $0.2 | $0.2 |
Zephyr 7B Beta | $0.20 | $0.20 |
Fireworks CodeLlama-34b-Instruct | $0.3 | $0.3 | 16K
MiniMax M2.7 | $0.30 | $1.20 | 205K
MiniMax-M1-240k | $0.3 | $1.2 | 240K
MiniMax-M2-240k | $0.3 | $1.2 | 240K
MiniMax-M2-240k-0905 | $0.3 | $1.2 | 240K
MiniMax-M2-80k-0905 | $0.3 | $1.2 | 80K
MiniMax-M2.5 | $0.3 | $1.2 |
Fireworks Yi-34B-Chat | $0.4 | $0.4 | 4K
DeepSeek Coder V2 Lite | $0.5 | $0.5 | 128K
Dolphin 2.6 Mixtral 8x7B | $0.50 | $0.50 |
Firefunction V1 | $0.50 | $0.50 | 8K
Mixtral 8x7B | $0.5 | $0.5 | 32K
Nous Hermes 2 Mixtral 8x7B | $0.50 | $0.50 |
Phi 3.5 MoE Instruct | $0.5 | $0.5 | 128K
Phi 4 Reasoning | $0.5 | $0.5 | 128K
Phi 4 Reasoning Plus | $0.5 | $0.5 | 128K
Qwen3-30B-A3B | $0.5 | $0.5 |
DeepSeek Prover V2 | $0.56 | $1.68 | 160K
DeepSeek R1 | $0.56 | $1.68 | 128K
DeepSeek R1 0528 | $0.56 | $1.68 | 160K
DeepSeek R1 Basic | $0.56 | $1.68 | 160K
DeepSeek V2.5 | $0.56 | $1.68 | 128K
DeepSeek V3 | $0.56 | $1.68 | 64K
DeepSeek V3 0324 | $0.56 | $1.68 | 160K
DeepSeek V3.1 | $0.56 | $1.68 | 64K
DeepSeek V3.2 | $0.56 | $1.68 | 160K
GLM 4.7 | $0.60 | $2.20 | 200K
GLM-4.7 | $0.60 | $2.20 |
Kimi K2 Instruct | $0.60 | $2.50 |
Kimi K2 Instruct 0905 | $0.6 | $2.5 | 256K
Kimi K2 Thinking | $0.6 | $2.5 | 256K
Kimi K2.5 | $0.60 | $3.00 | 256K
Fireworks Qwen-72B-Chat | $0.8 | $0.8 | 33K
CodeLlama 34B | $0.90 | $0.90 | 100K
CodeLlama 34B Instruct | $0.9 | $0.9 | 16K
CodeLlama 34B Python | $0.90 | $0.90 | 100K
CodeLlama 70B | $0.90 | $0.90 | 16K
CodeLlama 70B Instruct | $0.9 | $0.9 | 16K
CodeLlama 70B Python | $0.90 | $0.90 | 16K
Cogito v1 Preview Llama 70B | $0.9 | $0.9 | 128K
Cogito v1 Preview Qwen-32B | $0.9 | $0.9 | 128K
DeepSeek Coder 33B | $0.90 | $0.90 | 16K
DeepSeek Coder 33B Instruct | $0.9 | $0.9 |
DeepSeek R1 Distill Llama 70B | $0.9 | $0.9 | 128K
DeepSeek R1 Distill Qwen-32B | $0.9 | $0.9 | 128K
Dolphin 2.9.2 Qwen2-72B | $0.90 | $0.90 | 128K
FARE-20B | $0.9 | $0.9 | 128K
FireLLaVA 13B | $0.9 | $0.9 |
Gemma 2 27B Instruct | $0.9 | $0.9 | 8K
Gemma 3 27B Instruct | $0.9 | $0.9 | 128K
GLM Z1 32B | $0.9 | $0.9 | 128K
GLM Z1 Rumination 32B | $0.9 | $0.9 | 128K
GLM-4.5 | $0.9 | $0.9 | 128K
GLM-4.5-Air | $0.9 | $0.9 | 128K
GLM-4.6 | $0.9 | $0.9 | 198K
GLM-4.7 Flash | $0.9 | $0.9 | 198K
Japanese StableLM 70B | $0.90 | $0.90 |
KAT Coder | $0.9 | $0.9 | 256K
KAT Dev 32B | $0.9 | $0.9 | 128K
KAT Dev 72B Exp | $0.9 | $0.9 | 128K
Llama 2 70B | $0.9 | $0.9 | 4K
Llama 2 70B Chat | $0.9 | $0.9 | 4K
Llama 3 70B Instruct | $0.9 | $0.9 | 8K
Llama 3.1 70B Instruct | $0.9 | $0.9 | 128K
Llama 3.2 90B Vision Instruct | $0.9 | $0.9 | 128K
Llama 3.3 70B | $0.9 | $0.9 | 8K
LLaVA 1.6 Hermes Yi 34B | $0.90 | $0.90 | 200K
MiniMax M2 | $0.9 | $0.9 | 197K
MiniMax-M1-80k | $0.9 | $0.9 | 80K
MiniMax-M2-80k | $0.9 | $0.9 | 80K
Mistral Large | $0.9 | $0.9 | 32K
Mistral NeMo Instruct (2407) | $0.9 | $0.9 | 128K
Mistral Small 3.1 24B Instruct | $0.9 | $0.9 | 128K
Nous Capybara 34B | $0.90 | $0.90 | 200K
Nous Hermes 2 Yi 34B | $0.90 | $0.90 | 200K
Nous Hermes Llama 2 70B | $0.90 | $0.90 |
Phi 3.5 Mini Instruct | $0.9 | $0.9 | 128K
Phi 4 Multimodal Instruct | $0.9 | $0.9 | 128K
Phi-4 14B | $0.9 | $0.9 |
Phi-4 Mini | $0.9 | $0.9 |
Phind CodeLlama 34B Python V1 | $0.90 | $0.90 | 8K
Phind CodeLlama 34B V1 | $0.90 | $0.90 | 8K
Phind CodeLlama 34B V2 | $0.90 | $0.90 | 8K
Qwen-72B | $0.90 | $0.90 |
Qwen1.5-72B | $0.90 | $0.90 |
Qwen2-72B | $0.9 | $0.9 | 128K
Qwen2-VL-72B-Instruct | $0.9 | $0.9 | 32K
Qwen2.5-32B | $0.9 | $0.9 | 128K
Qwen2.5-32B-Instruct | $0.9 | $0.9 | 128K
Qwen2.5-72B | $0.9 | $0.9 | 128K
Qwen2.5-72B-Instruct | $0.9 | $0.9 | 128K
Qwen2.5-Coder-32B | $0.9 | $0.9 |
Qwen2.5-Coder-32B-Instruct | $0.9 | $0.9 |
Qwen3-32B | $0.9 | $0.9 | 40K
Qwen3-Coder-32B-Instruct | $0.9 | $0.9 | 256K
Qwen3-Coder-72B-Instruct | $0.9 | $0.9 | 256K
Yi 34B | $0.90 | $0.90 | 200K
Yi 34B 200K | $0.9 | $0.9 | 200K
Yi Large | $0.90 | $0.90 | 32K
Kimi K2.6 | $0.95 | $4.00 | 262K
Kimi K2.5 | $0.99 | $4.94 | 256K
GLM-5 | $1.00 | $3.20 | 200K
DBRX Instruct | $1.2 | $1.2 | 32K
DeepSeek Coder V2 | $1.2 | $1.2 | 128K
DeepSeek Coder V2 Instruct | $1.2 | $1.2 | 128K
ERNIE 4.5 | $1.2 | $1.2 | 8K
Mixtral 8x22B Instruct v0.1 | $1.2 | $1.2 | 64K
Mixtral 8x22B v0.1 | $1.2 | $1.2 | 64K
Qwen3-235B-A22B | $1.2 | $1.2 | 128K
Qwen3-Coder-480B-A35B-Instruct | $1.2 | $1.2 | 256K
GLM-5.1 | $1.40 | $4.40 | 200K
Fireworks DBRX-Instruct | $1.5 | $1.5 | 33K
DeepSeek V4 Pro | $1.74 | $3.48 | 1M
Llama 3.1 405B Instruct | $3 | $3 | 128K
Llama 4 Maverick 17B Instruct FP8 | | | 1M
Llama 4 Scout 17B-16E Instruct | | | 328K
Mistral Small | | | 32K
Nemotron 3 Super-120B-A12B | | | 1M

Pricing Overview

Cheapest: $0.07/1M
Most expensive: $3.00/1M
1 free tier model
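
To make the per-1M figures concrete, here is a minimal sketch of estimating the cost of a single request. It assumes "per 1M" means per one million tokens and that input and output tokens are billed separately at their listed rates. Neither assumption is stated on this page. The function name `estimate_cost` and the token counts are hypothetical, and the three price pairs are copied from the table above.

```python
def estimate_cost(input_tokens, output_tokens, input_price_per_1m, output_price_per_1m):
    """Estimate the dollar cost of one request from per-1M prices.

    Assumes prices are quoted per one million tokens and that input and
    output are billed separately (an assumption, not confirmed by the page).
    """
    return (
        input_tokens * input_price_per_1m
        + output_tokens * output_price_per_1m
    ) / 1_000_000


# (input price, output price) in dollars per 1M, as listed in the table.
PRICES = {
    "DeepSeek V3.2": (0.56, 1.68),
    "Kimi K2.6": (0.95, 4.00),
    "Llama 3.1 405B Instruct": (3.00, 3.00),
}

# Hypothetical workload: a 2,000-token prompt with a 500-token reply.
for model, (input_price, output_price) in PRICES.items():
    cost = estimate_cost(2_000, 500, input_price, output_price)
    print(f"{model}: ${cost:.6f} per request")
```

Because output prices are often several times the input price, a model that looks cheap on the input column can cost more on reply-heavy workloads, so it helps to run the estimate with token counts close to your own traffic.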

About Fireworks AI

The Fireworks AI Platform is a generative AI platform that lets developers and businesses build, customize, and deploy AI models at scale. It supports a range of open-source models, including Meta's Llama and Stable Diffusion, for tasks such as natural language processing and image generation. Its serverless architecture allows quick deployment without extensive infrastructure management and is billed on a pay-as-you-go basis. Users can fine-tune models with parameter-efficient techniques to tailor them to specific business needs while maintaining performance. The platform is optimized for high throughput and low latency and can handle trillions of inferences daily. It also offers tools for model maintenance and iteration, and is designed for easy integration and customization as organizations scale their AI-powered solutions.
