LLM Reference
Microsoft Foundry

Phi-3 Medium 128K on Microsoft Foundry

Phi-3 · Microsoft Research

Serverless · Provisioned · Open Source

Get Started with Phi-3 Medium 128K on Microsoft Foundry

Microsoft Foundry offers access to Phi-3 Medium 128K with a 128K-token context window. Foundry is a unified enterprise AI platform that extends well beyond Azure OpenAI: it is a multi-provider hosting and deployment platform for LLMs, supporting models from OpenAI, Anthropic, DeepSeek, xAI, Meta, Mistral, NVIDIA, and others, and it integrates agent services, evaluation, observability, and governance into a single Azure control plane. Key capabilities include a multi-provider model catalog, Model Router for intelligent prompt routing, Foundry Agent Service for building and deploying AI agents with built-in tracing and monitoring, and enterprise-grade governance with RBAC, compliance, and regional deployments. For a broader model catalog, including Claude, DeepSeek, Grok, Llama, Mistral, and NVIDIA Nemotron, Foundry is the recommended platform over Azure OpenAI.
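Serverless deployments expose an OpenAI-style chat-completions endpoint. The sketch below, using only the Python standard library, shows the general shape of a request; the endpoint URL, API key, and authorization header style are placeholders and assumptions — copy the actual values and auth scheme from your own Foundry deployment page.

```python
import json
import urllib.request

# Placeholder values -- substitute the endpoint and key from your Foundry deployment.
ENDPOINT = "https://<your-deployment>.inference.ai.azure.com/v1/chat/completions"
API_KEY = "<your-api-key>"

def build_payload(user_prompt, max_tokens=512):
    """Assemble an OpenAI-style chat-completions request body."""
    return {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": user_prompt},
        ],
        "max_tokens": max_tokens,
        "temperature": 0.7,
    }

def complete(prompt):
    """POST a chat request and return the assistant's reply text."""
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Auth header style varies by deployment type; verify in the portal.
            "Authorization": f"Bearer {API_KEY}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

# complete("Summarize the Phi-3 model family.")  # requires a live endpoint
```

The same request body works with the `azure-ai-inference` SDK or any OpenAI-compatible client; only the transport differs.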

Pricing

Input tokens: $0.50 per 1M
Output tokens: $1.50 per 1M
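Per-request cost follows directly from the per-million-token prices above. A minimal helper:

```python
# Prices per 1M tokens, from the pricing table above.
INPUT_PRICE = 0.50
OUTPUT_PRICE = 1.50

def request_cost(input_tokens, output_tokens):
    """Dollar cost of a single request at the listed per-1M-token rates."""
    return (input_tokens / 1e6) * INPUT_PRICE + (output_tokens / 1e6) * OUTPUT_PRICE

# e.g. summarizing a 100K-token document into a 1K-token answer:
# request_cost(100_000, 1_000) -> 0.05 + 0.0015 = $0.0515
```

At these rates, even a request that fills the whole 128K context costs only a few cents of input.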

Capabilities

Vision · Multimodal · Reasoning · Function Calling · Tool Use · Structured Outputs · Code Execution
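Function calling and tool use follow the OpenAI-style `tools` schema: you pass JSON-schema tool definitions with the request, and the model replies with a tool call (a function name plus JSON-encoded arguments) that your code executes. A minimal sketch — the `get_weather` tool and its return value are purely illustrative:

```python
import json

# Illustrative tool definition in the OpenAI-style "tools" array format.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def get_weather(city):
    # Stub implementation; a real tool would call a weather API.
    return {"city": city, "temp_c": 21}

TOOLS = {"get_weather": get_weather}

def dispatch(tool_call):
    """Execute the function named in a model tool call and return its result."""
    name = tool_call["function"]["name"]
    args = json.loads(tool_call["function"]["arguments"])
    return TOOLS[name](**args)

# A tool call as it would appear in the model's response:
call = {"function": {"name": "get_weather", "arguments": '{"city": "Oslo"}'}}
# dispatch(call) -> {"city": "Oslo", "temp_c": 21}
```

The tool result is then appended to the conversation as a `tool`-role message and the model is called again to produce the final answer.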

About Phi-3 Medium 128K

Phi-3 Medium 128K is an open-source, 14-billion-parameter language model from Microsoft, designed for efficient operation in resource-limited environments. Noted for state-of-the-art performance on reasoning tasks, it excels at language understanding, code generation, and logical reasoning, and its long context window of up to 128,000 tokens makes it well suited to applications such as summarizing lengthy documents. Its dense decoder-only Transformer architecture has been refined with supervised fine-tuning and preference optimization to improve instruction following. Phi-3 Medium 128K is also optimized for diverse hardware platforms, ensuring broad accessibility and performance.
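When feeding long documents into the 128K window, it helps to estimate token counts up front. The check below uses a rough chars-per-token heuristic (an assumption, not the model's real tokenizer — use the actual tokenizer for exact counts):

```python
CONTEXT_TOKENS = 128_000
CHARS_PER_TOKEN = 4  # rough heuristic for English text, not a real tokenizer

def fits_in_context(text, reserved_for_output=1_024):
    """Estimate whether `text` plus the output budget fits in the context window."""
    estimated_tokens = len(text) / CHARS_PER_TOKEN
    return estimated_tokens + reserved_for_output <= CONTEXT_TOKENS
```

Documents that fail this check need to be chunked or summarized hierarchically before being sent to the model.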

Model Specs

Released: 2024-05-21
Parameters: 14B
Context: 128K
Architecture: Decoder-only

GPU-Hour Providers (1)