
Microsoft Foundry Models — Pricing & Benchmarks

124 models available · Microsoft

Microsoft Foundry hosts 124 AI models in this catalog. The lowest listed input price is $0.04 per 1M input tokens, for Mistral Ministral 3B. LLM Reference lets you compare these models across all 63 providers in one place.

| Model | Input (per 1M) | Output (per 1M) | Context |
|---|---|---|---|
| Mistral Ministral 3B | $0.04 | $0.04 | |
| Prompt Guard 86M | $0.05 | $0.05 | 512 |
| DeciCoder 1B | $0.07 | $0.07 | |
| Dolly 2.0 12B | $0.07 | $0.07 | |
| Phi-1.5 | $0.07 | $0.07 | |
| Phi-2 | $0.07 | $0.07 | |
| Qwen2-1.5B | $0.07 | $0.07 | |
| Embed English v3.0 | $0.1 | | 512 |
| Embed Multilingual v3.0 | $0.1 | | 512 |
| Mistral Small 2503 | $0.1 | $0.3 | 33K |
| Embed v4.0 | $0.12 | | 128k |
| Mistral 7B v0.1 | $0.14 | $0.14 | 8K |
| Qwen2-7B | $0.15 | $0.15 | 128K |
| Llama 4 Scout 17B-16E Instruct | $0.2 | $0.78 | 328K |
| Grok 3 Mini | $0.25 | $1.27 | 131k |
| Mixtral 8x7B | $0.27 | $0.27 | 32K |
| Phi-3 Mini 4k | $0.28 | $0.84 | 4K |
| Phi-3 Vision | $0.28 | $0.84 | 128K |
| Codestral 2501 | $0.3 | $0.9 | 262K |
| Llama 3.1 8B Instruct | $0.3 | $0.61 | 128K |
| Mistral NeMo Instruct (2407) | $0.3 | $0.3 | 128K |
| Phi-3 Mini 128K | $0.3 | $0.9 | 128K |
| Phi-3 Small 8K | $0.32 | $0.96 | 8K |
| Llama 4 Maverick 17B Instruct FP8 | $0.35 | $1.41 | 1M |
| Phi-3 Small 128K | $0.35 | $1.05 | 128K |
| MAI-Transcribe-1 | $0.36 | | |
| Dolphin 2.9 Llama 3 8B | $0.37 | $1.10 | |
| Hermes 2 Pro Llama 3 8B | $0.37 | $1.10 | |
| Llama 3 8B Gradient 262K | $0.37 | $1.10 | 262K |
| Llama 3 8B Instruct | $0.37 | $1.1 | 8K |
| Llama 3.2 11B Vision Instruct | $0.37 | $0.37 | 128K |
| Llama Guard 3 8B | $0.37 | $1.10 | 8K |
| Nous Llama 3 8B | $0.37 | $1.10 | 8K |
| NVIDIA Llama 3 ChatQA 8B | $0.37 | $1.10 | |
| Mistral Medium 2505 | $0.4 | $2 | 128k |
| Phi-3 Medium 4K | $0.45 | $1.35 | 4K |
| Command R | $0.5 | $1.5 | 128K |
| Jamba-Instruct | $0.5 | $0.7 | 256K |
| Phi-3 Medium 128K | $0.5 | $1.5 | 128K |
| CodeLlama 7B | $0.52 | $0.67 | 100K |
| CodeLlama 7B Python | $0.52 | $0.67 | 100K |
| DeciLM 7B | $0.52 | $0.67 | |
| Falcon 7B | $0.52 | $0.67 | |
| Llama 2 7B Chat | $0.52 | $0.67 | 4K |
| Orca 2 7B | $0.52 | $0.67 | |
| SOLAR 10.7B | $0.52 | $0.67 | |
| Llama 3.3 70B Instruct (free) | $0.71 | $0.71 | 66K |
| CodeLlama 13B | $0.81 | $0.94 | 100K |
| CodeLlama 13B Python | $0.81 | $0.94 | 100K |
| Fugaku-LLM 13B | $0.81 | $0.94 | |
| Llama 2 13B Chat | $0.81 | $0.94 | 4K |
| Orca 2 13B | $0.81 | $0.94 | |
| WizardLM 13B V1.1 | $0.81 | $0.94 | |
| Claude Haiku 4.5 | $1 | $5 | 200k |
| Mistral Small | $1 | $3 | 32K |
| Qwen2-72B | $1.00 | $2.00 | 128K |
| Smaug 72B | $1.00 | $2.00 | |
| Command R 08-2024 | $1.5 | $2 | 131K |
| Qwen1.5-110B | $1.50 | $2.50 | |
| CodeLlama 34B | $1.54 | $1.77 | 100K |
| CodeLlama 34B Python | $1.54 | $1.77 | 100K |
| Falcon 40B | $1.54 | $1.77 | |
| Llama 2 70B Chat | $1.54 | $1.77 | 4K |
| Arctic | $2.00 | $2.00 | 4K |
| Mixtral 8x22B v0.1 | $2.00 | $6.00 | 64K |
| Llama 3.2 90B Vision Instruct | $2.04 | $2.04 | 128K |
| Command A (03-2025) | $2.5 | $10 | 256k |
| Command R+ 08-2024 | $2.5 | $10 | 131K |
| Llama 3.1 70B Instruct | $2.68 | $3.54 | 128K |
| DBRX Instruct | $2.70 | $2.70 | 32K |
| Claude Sonnet 4.5 | $3 | $15 | 200K |
| Claude Sonnet 4.6 | $3 | $15 | 1M |
| Command R+ | $3 | $15 | 128K |
| Grok-3 | $3 | $15 | 1M |
| Mistral Large 2 (2407) | $3 | $9 | 128K |
| Jais 30B | $3.2 | $9.71 | |
| CodeLlama 70B | $3.78 | $11.34 | 16K |
| CodeLlama 70B Python | $3.78 | $11.34 | 16K |
| Llama 3 70B Instruct | $3.78 | $11.34 | 8K |
| Llama 3 TenyxChat 70B | $3.78 | $11.34 | |
| NVIDIA Llama 3 ChatQA 70B | $3.78 | $11.34 | |
| Mistral Large | $4 | $12 | 32k |
| Claude Opus 4.5 | $5 | $25 | 200K |
| Claude Opus 4.6 | $5 | $25 | 1M |
| Claude Opus 4.7 | $5 | $25 | 1M |
| MAI-Image-2 | $5.00 | $33.00 | |
| Llama 3.1 405B Instruct | $5.33 | $16 | 128K |
| Claude Opus 4.1 | $15 | $75 | 200k |
| MAI-Voice-1 | $22.00 | | |
| Bria 2.3 Fast | | | |
| Claude 3.5 Sonnet | | | 200K |
| Claude Mythos Preview | | | 1M |
| DeepSeek R1 | | | 128K |
| DeepSeek R1 0528 | | | 160K |
| DeepSeek V3 | | | 64k |
| DeepSeek V3 0324 | | | 160K |
| DeepSeek V3.1 | | | 64K |
| DeepSeek V3.2 | | | 160K |
| DeepSeek V3.2 Speciale | | | 164K |
| DeepSeek V4 Flash | | | 1M |
| FLUX.1.1 [pro] | | | |
| Grok 4 | | | 256k |
| Grok 4 Fast Non-Reasoning | | | |
| Grok 4 Fast Reasoning | | | |
| Grok 4.3 | | | 1M |
| Grok Code Fast 1 | | | 262K |
| Kimi K2.5 | | | 256K |
| Kimi K2.6 | | | 262K |
| MAI-Image-2e | | | 33K |
| Mistral Document AI 2505 | | | |
| Mistral Document AI 2512 | | | |
| Mistral Large 3 675B Instruct | | | 128K |
| Phi 4 Multimodal Instruct | | | 128K |
| Phi 4 Reasoning | | | 128K |
| Phi-4 14B | | | |
| Rerank English V3 | | | 4K |
| Rerank Multilingual V3 | | | 4K |
| Rerank v3.5 | | | 4K |
| Rerank v4.0 Fast | | | 32k |
| Rerank v4.0 Pro | | | 32k |
| Stable Diffusion 3.5 Large | | | |
| Stable Image Core | | | |
| Stable Image Ultra | | | |
| TimeGEN-1 | | | |

Pricing Overview

Cheapest: $0.04/1M
Most expensive: $22.00/1M
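
Because prices are quoted per 1M tokens, a rough cost estimate is simple arithmetic: multiply each token count by its rate and divide by one million. The sketch below is illustrative only. It uses three input/output price pairs copied from the table above, and it assumes billing is strictly linear in token count, with no caching discounts, minimum charges, or provisioned-capacity pricing. The `estimate_cost` helper and the 200,000-input / 50,000-output workload are hypothetical and are not part of LLM Reference or Microsoft Foundry.

```python
# Listed prices in USD per 1M tokens: (input, output), copied from the table.
PRICES = {
    "Mistral Ministral 3B": (0.04, 0.04),
    "Claude Haiku 4.5": (1.00, 5.00),
    "Claude Opus 4.1": (15.00, 75.00),
}

TOKENS_PER_PRICE_UNIT = 1_000_000


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a workload, assuming linear per-token billing."""
    input_price, output_price = PRICES[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Hypothetical workload: 200k input tokens and 50k output tokens.
    for model in PRICES:
        cost = estimate_cost(model, input_tokens=200_000, output_tokens=50_000)
        print(f"{model}: ${cost:.4f}")
```

For this hypothetical workload the estimates come to about $0.01, $0.45, and $6.75 respectively, which shows how much the choice of model tier matters compared with the cheapest and most expensive rates listed here.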

About Microsoft Foundry

Microsoft Foundry is a platform-as-a-service for enterprise AI operations. It offers several deployment options: Serverless APIs (pay-as-you-go), Global Standard (shared managed capacity), Provisioned Throughput Units (reserved capacity), batch processing, and bring-your-own-model deployments. The platform provides a unified control plane for models, agents, tools, and observability. Its Agent Service supports building and deploying AI agents with built-in tracing, monitoring, and governance, and its evaluation and monitoring tools assess model performance, safety, and groundedness. Existing Azure OpenAI resources can be upgraded to Foundry through a non-destructive migration that keeps current deployments in place while adding multi-provider model access and the platform's other capabilities.
