LLM Reference
Microsoft Foundry

Phi-3 Mini 128K on Microsoft Foundry

Phi-3 · Microsoft Research

Serverless · Provisioned · Open Source

Get Started with Phi-3 Mini 128K on Microsoft Foundry

Microsoft Foundry offers access to Phi-3 Mini 128K with a 128K context window. Microsoft Foundry is a unified enterprise AI platform that expands significantly beyond Azure OpenAI: it functions as a multi-provider hosting and deployment platform for LLMs, supporting models from OpenAI, Anthropic, DeepSeek, xAI, Meta, Mistral, NVIDIA, and others. Foundry integrates agent services, evaluation, observability, and governance into a single Azure control plane. Key capabilities include a multi-provider model catalog, Model Router for intelligent prompt routing, Foundry Agent Service for building and deploying AI agents with built-in tracing and monitoring, and enterprise-grade governance with RBAC, compliance, and regional deployments. For a broader model catalog, including Claude, DeepSeek, Grok, Llama, Mistral, and NVIDIA Nemotron, Foundry is the recommended platform over Azure OpenAI.
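As a rough sketch of what a request to a serverless deployment looks like, the snippet below builds an OpenAI-style chat-completions payload, which is the wire format commonly accepted by such endpoints. The model name, system prompt, and field defaults here are illustrative assumptions, not values taken from this page.

```python
import json

def build_chat_request(prompt: str,
                       model: str = "Phi-3-mini-128k-instruct",
                       max_tokens: int = 512) -> dict:
    """Build an OpenAI-style chat-completions payload (illustrative sketch;
    check your deployment's API reference for the exact schema)."""
    return {
        "model": model,
        "messages": [
            # The system prompt here is a placeholder assumption.
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Summarize this document.")
print(json.dumps(payload, indent=2))
```

The payload would then be POSTed to the deployment's chat-completions endpoint with the usual API-key or Entra ID authentication.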

Pricing

Type            Price (per 1M tokens)
Input tokens    $0.30
Output tokens   $0.90
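At these rates, per-request cost is a simple linear function of token counts. A minimal estimator using the prices above (the example request sizes are hypothetical):

```python
# Serverless rates from the table above, in USD per 1M tokens.
INPUT_PER_M = 0.30
OUTPUT_PER_M = 0.90

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed serverless rates."""
    return (input_tokens / 1_000_000) * INPUT_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PER_M

# e.g. a 100K-token prompt (near the context limit) with a 2K-token answer:
print(f"${estimate_cost(100_000, 2_000):.4f}")  # → $0.0318
```

Note that long-context use skews spend toward the cheaper input side: filling most of the 128K window costs only a few cents per call.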

Capabilities

Vision · Multimodal · Reasoning · Function Calling · Tool Use · Structured Outputs · Code Execution

About Phi-3 Mini 128K

Phi-3 Mini-128K-Instruct, developed by Microsoft, is a 3.8-billion-parameter large language model known for its lightweight, open-source design. Despite its modest size, it excels at reasoning tasks, particularly math and logic, and shows strong code generation capabilities. A standout feature is its ability to handle up to 128,000 tokens of context, allowing it to process extensive documents and codebases efficiently. While it has limitations in factual knowledge and focuses primarily on English, it strikes a balance between performance and efficiency, making it well suited to resource-constrained environments. The model is available on platforms such as Azure AI Studio and Hugging Face, and benefits from training on high-quality synthetic and publicly available data, with fine-tuning to improve instruction adherence and safety.
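To gauge whether a document fits in the 128K window before sending it, a crude pre-check can use the common ~4-characters-per-token heuristic for English text. This is an approximation of my own, not anything from this page; a real deployment should count tokens with the model's actual tokenizer.

```python
CONTEXT_TOKENS = 128_000   # Phi-3 Mini 128K context window
CHARS_PER_TOKEN = 4        # rough heuristic for English text (assumption)

def fits_in_context(text: str, reserved_output_tokens: int = 1_000) -> bool:
    """Rough pre-check: does `text` plus room for the reply fit in 128K tokens?"""
    estimated_tokens = len(text) / CHARS_PER_TOKEN
    return estimated_tokens + reserved_output_tokens <= CONTEXT_TOKENS

# A ~400K-character document estimates to ~100K tokens, which fits;
# a ~600K-character one estimates to ~150K tokens, which does not.
print(fits_in_context("x" * 400_000), fits_in_context("x" * 600_000))  # → True False
```

Documents that fail the check can be chunked or summarized before being sent.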

Model Specs

Released        2024-04-23
Parameters      3.8B
Context         128K
Architecture    Decoder Only

GPU-Hour Providers (1)