Microsoft Foundry Models — Pricing & Benchmarks

135 models available · Microsoft

Microsoft Foundry hosts 135 AI models in this catalog. The lowest listed input price is Mistral Ministral 3B at $0.04/1M input tokens. LLM Reference lets you compare these models across all 80 providers without switching tabs.

| Model | Input (per 1M) | Output (per 1M) | Context |
|---|---|---|---|
| Mistral Ministral 3B | $0.04 | $0.04 | |
| Prompt Guard 86M | $0.05 | $0.05 | 512 |
| DeciCoder 1B | $0.07 | $0.07 | 4k |
| Dolly 2.0 12B | $0.07 | $0.07 | |
| Phi-1.5 | $0.07 | $0.07 | 2k |
| Phi-2 | $0.07 | $0.07 | 2k |
| Qwen2-1.5B | $0.07 | $0.07 | |
| Cohere Embed English v3.0 | $0.1 | | 512 |
| Cohere Embed Multilingual v3.0 | $0.1 | | 512 |
| Mistral Small 2503 | $0.1 | $0.3 | 33k |
| Cohere Embed v4.0 | $0.12 | | 128k |
| Phi 4 Reasoning Plus | $0.125 | $0.5 | 128k |
| Mistral 7B v0.1 | $0.14 | $0.14 | 8k |
| Qwen2-7B | $0.15 | $0.15 | 128k |
| Llama 4 Scout 17B-16E Instruct | $0.2 | $0.78 | 10m |
| Grok 3 Mini | $0.25 | $1.27 | 131k |
| Mixtral 8x7B | $0.27 | $0.27 | 32k |
| Phi-3 Mini 4k | $0.28 | $0.84 | 4k |
| Phi-3 Vision | $0.28 | $0.84 | 128k |
| Codestral 2501 | $0.3 | $0.9 | 262k |
| Llama 3.1 8B Instruct | $0.3 | $0.61 | 128k |
| Mistral NeMo Instruct (2407) | $0.3 | $0.3 | 128k |
| Phi-3 Mini 128K | $0.3 | $0.9 | 128k |
| Phi-3 Small 8K | $0.32 | $0.96 | 8k |
| Llama 4 Maverick 17B Instruct FP8 | $0.35 | $1.41 | 1m |
| Phi-3 Small 128K | $0.35 | $1.05 | 128k |
| MAI-Transcribe-1 | $0.36 | | |
| Dolphin 2.9 Llama 3 8B | $0.37 | $1.10 | 8k |
| Hermes 2 Pro Llama 3 8B | $0.37 | $1.10 | 8k |
| Llama 3 8B Gradient 262K | $0.37 | $1.10 | 262k |
| Llama 3 8B Instruct | $0.37 | $1.10 | 8k |
| Llama 3.2 11B Vision Instruct | $0.37 | $0.37 | 128k |
| Llama Guard 3 8B | $0.37 | $1.10 | 8k |
| Nous Llama 3 8B | $0.37 | $1.10 | 8k |
| NVIDIA Llama 3 ChatQA 8B | $0.37 | $1.10 | 8k |
| Mistral Medium 2505 | $0.4 | $2.00 | 128k |
| Phi-3 Medium 4K | $0.45 | $1.35 | 4k |
| Command R | $0.5 | $1.50 | 128k |
| Jamba-Instruct | $0.5 | $0.7 | 256k |
| Phi-3 Medium 128K | $0.5 | $1.50 | 128k |
| CodeLlama 7B | $0.52 | $0.67 | 100k |
| CodeLlama 7B Python | $0.52 | $0.67 | 100k |
| DeciLM 7B | $0.52 | $0.67 | 8k |
| Falcon 7B | $0.52 | $0.67 | |
| Llama 2 7B Chat | $0.52 | $0.67 | 4k |
| Orca 2 7B | $0.52 | $0.67 | 4k |
| SOLAR 10.7B | $0.52 | $0.67 | 4k |
| Llama 3.3 70B Instruct (free) | $0.71 | $0.71 | 66k |
| MAI-Code-1-Flash | $0.75 | $4.50 | 256k |
| CodeLlama 13B | $0.81 | $0.94 | 100k |
| CodeLlama 13B Python | $0.81 | $0.94 | 100k |
| Fugaku-LLM 13B | $0.81 | $0.94 | 4k |
| Llama 2 13B Chat | $0.81 | $0.94 | 4k |
| Orca 2 13B | $0.81 | $0.94 | 4k |
| WizardLM 13B V1.1 | $0.81 | $0.94 | 2k |
| Claude Haiku 4.5 | $1.00 | $5.00 | 200k |
| Mistral Small | $1.00 | $3.00 | 32k |
| Qwen2-72B | $1.00 | $2.00 | 128k |
| Smaug 72B | $1.00 | $2.00 | 32k |
| Command R 08-2024 | $1.50 | $2.00 | 131k |
| Qwen1.5-110B | $1.50 | $2.50 | 32k |
| CodeLlama 34B | $1.54 | $1.77 | 100k |
| CodeLlama 34B Python | $1.54 | $1.77 | 100k |
| Falcon 40B | $1.54 | $1.77 | |
| Llama 2 70B Chat | $1.54 | $1.77 | 4k |
| Arctic | $2.00 | $2.00 | 4k |
| Mixtral 8x22B v0.1 | $2.00 | $6.00 | 64k |
| Llama 3.2 90B Vision Instruct | $2.04 | $2.04 | 128k |
| Command A (03-2025) | $2.50 | $10.00 | 256k |
| Command R+ 08-2024 | $2.50 | $10.00 | 131k |
| Llama 3.1 70B Instruct | $2.68 | $3.54 | 128k |
| DBRX Instruct | $2.70 | $2.70 | 32k |
| Claude Sonnet 4.5 | $3.00 | $15.00 | 200k |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 1m |
| Command R+ | $3.00 | $15.00 | 128k |
| Grok-3 | $3.00 | $15.00 | 131k |
| Mistral Large 2 (2407) | $3.00 | $9.00 | 128k |
| Jais 30B | $3.20 | $9.71 | 2k |
| CodeLlama 70B | $3.78 | $11.34 | 16k |
| CodeLlama 70B Python | $3.78 | $11.34 | 16k |
| Llama 3 70B Instruct | $3.78 | $11.34 | 8k |
| Llama 3 TenyxChat 70B | $3.78 | $11.34 | |
| NVIDIA Llama 3 ChatQA 70B | $3.78 | $11.34 | 8k |
| Mistral Large | $4.00 | $12.00 | 32k |
| Claude Opus 4.5 | $5.00 | $25.00 | 200k |
| Claude Opus 4.6 | $5.00 | $25.00 | 1m |
| Claude Opus 4.7 | $5.00 | $25.00 | 1m |
| MAI-Image-2 | $5.00 | $33.00 | |
| Llama 3.1 405B Instruct | $5.33 | $16.00 | 128k |
| Claude Fable 5 | $10.00 | $50.00 | 1m |
| Claude Opus 4.1 | $15.00 | $75.00 | 200k |
| MAI-Voice-1 | $22.00 | | |
| Bria 2.3 Fast | | | |
| Claude 3.5 Sonnet | | | 200k |
| Claude Mythos Preview | | | 1m |
| Claude Opus 4.8 | | | 1m |
| Claude Sonnet 5 | | | 1m |
| Cohere Rerank v3.5 | | | 4k |
| Cohere Rerank v4.0 Fast | | | 32k |
| Cohere Rerank v4.0 Pro | | | 32k |
| DeepSeek R1 | | | 128k |
| DeepSeek R1 0528 | | | 130k |
| DeepSeek V3 | | | 64k |
| DeepSeek V3 0324 | | | 160k |
| DeepSeek V3.1 | | | 64k |
| DeepSeek V3.2 | | | 160k |
| DeepSeek V3.2 Speciale | | | 164k |
| DeepSeek V4 Flash | | | 1m |
| FLUX.1.1 [pro] | | | |
| Grok 4 | | | 256k |
| Grok 4 Fast Non-Reasoning | | | 2m |
| Grok 4 Fast Reasoning | | | 2m |
| Grok 4.3 | | | 1m |
| Grok Code Fast 1 | | | 262k |
| Kimi K2.5 | | | 256k |
| Kimi K2.6 | | | 262k |
| MAI-Code-1 | | | |
| MAI-Image-2.5 | | | 32k |
| MAI-Image-2.5-Flash | | | 32k |
| MAI-Image-2e | | | 33k |
| MAI-Thinking-1 | | | 256k |
| MAI-Transcribe-1.5 | | | |
| MAI-Voice-2 | | | |
| Mistral Document AI 2505 | | | |
| Mistral Document AI 2512 | | | |
| Mistral Large 3 675B Instruct | | | 128k |
| Phi 4 Multimodal Instruct | | | 128k |
| Phi 4 Reasoning | | | 128k |
| Phi-4 14B | | | 16k |
| Rerank English V3 | | | 4k |
| Rerank Multilingual V3 | | | 4k |
| Stable Diffusion 3.5 Large | | | |
| Stable Image Core | | | |
| Stable Image Ultra | | | |
| TimeGEN-1 | | | |
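
Because prices are quoted per 1M tokens, turning a listed rate into a workload estimate is simple arithmetic. The sketch below is illustrative only: `estimate_cost` is a hypothetical helper, the token counts are made up, and it assumes billing scales linearly with token count, which this page does not confirm. The rates are the Claude Haiku 4.5 row ($1.00 input, $5.00 output).

```python
from decimal import Decimal

# Listed prices are per this many tokens.
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens, output_tokens, input_price, output_price):
    """Estimate USD cost for a workload.

    Prices are USD per 1M tokens, passed as strings to avoid float rounding.
    Assumes strictly linear per-token billing (an assumption, not a documented rule).
    """
    total = (Decimal(input_tokens) * Decimal(input_price)
             + Decimal(output_tokens) * Decimal(output_price))
    return total / TOKENS_PER_UNIT


# Hypothetical workload: 200,000 input tokens and 50,000 output tokens
# at the Claude Haiku 4.5 rates from the table.
cost = estimate_cost(200_000, 50_000, "1.00", "5.00")
print(f"${cost:.2f}")  # prints "$0.45"
```

Rows with no listed output price (embedding, transcription, and voice models, for example) would need a different formula, since the table gives only one rate for them.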

Pricing Overview

Cheapest: $0.04/1M
Most expensive: $22.00/1M
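
The two overview figures appear to be the minimum and maximum of the first (input) price column: $0.04 is the Mistral Ministral 3B rate and $22.00 is the MAI-Voice-1 rate, whereas the highest listed output price is $75.00 (Claude Opus 4.1). The page does not state this, so treat it as an assumption. A minimal sketch using a handful of rows copied from the table:

```python
# (model, input price in USD per 1M) for a small sample of rows from the table.
# Only models with a listed input price are included.
rows = [
    ("Mistral Ministral 3B", 0.04),
    ("Claude Haiku 4.5", 1.00),
    ("Claude Opus 4.1", 15.00),
    ("MAI-Voice-1", 22.00),
]

# Assumption: the overview ranks models by input price.
cheapest = min(rows, key=lambda row: row[1])
most_expensive = max(rows, key=lambda row: row[1])

print(f"Cheapest: {cheapest[0]} at ${cheapest[1]:.2f}/1M")
print(f"Most expensive: {most_expensive[0]} at ${most_expensive[1]:.2f}/1M")
```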

About Microsoft Foundry

Microsoft Foundry is a platform-as-a-service for enterprise AI operations. It offers several deployment options: Serverless APIs (pay-as-you-go), Global Standard (shared managed capacity), Provisioned Throughput Units (reserved capacity), batch processing, and bring-your-own-model deployments. The platform provides a unified control plane for models, agents, tools, and observability. Its Agent Service supports building and deploying AI agents with built-in tracing, monitoring, and governance. Evaluation and monitoring tools assess model performance, safety, and groundedness. Foundry also supports non-destructive migration from Azure OpenAI: existing deployments are maintained while gaining multi-provider model access and advanced platform capabilities.
