Router profile
Azure AI Foundry Model Router
Microsoft
Microsoft Azure AI Foundry's native model router that uses a trained ML model to route each prompt in real time to the optimal Azure-hosted model, with Balanced/Cost/Quality mode selection and automatic failover.
Type
Router
Lead directory segment
Pricing model
Passthrough
Model count pending
Hosting
Provider-native
No self-host flag
Data retention
Retains data
Verify for production policy
At a glance
- Decision mechanism
- Predictive learned
- Optimizes for
- CostQualityReliability
- Routing scope
- Cross-tier
- Decision timing
- Pre-generation
- Deployment path
- Proxy in path
- Openness
- Provider-native
- API compatibility
- OpenAI
Routes to these providers
Microsoft Foundry offers a comprehensive platform-as-a-service for enterprise AI operations. It provides multiple deployment options including Serverless APIs (pay-as-you-go), Global Standard (shared managed capacity), Provisioned Throughput Units (reserved capacity), batch processing, and bring-your-own model deployments. The platform features a unified control plane for models, agents, tools, and observability. Its Agent Service enables building and deploying AI agents with built-in tracing, monitoring, and governance. Evaluation and monitoring tools assess model performance, safety, and groundedness. Foundry supports seamless upgrades from Azure OpenAI with non-destructive migration, maintaining existing deployments while unlocking multi-provider model access and advanced platform capabilities.
Azure OpenAI Service hosts OpenAI's GPT-4o, GPT-4, GPT-3.5, and embedding models on Microsoft Azure with enterprise SLAs. Deployments run in customer-selected regions with private networking, role-based access control, and capacity options spanning Standard pay-per-token, Provisioned Throughput Units (PTUs) for reserved capacity, Global Standard shared capacity, and Batch processing. Azure OpenAI sits inside the wider Microsoft Foundry / Azure AI Studio control plane, which adds an evaluation, monitoring, and Agent Service layer on top of the base model APIs. For workloads that need non-OpenAI models (Claude, DeepSeek, Grok, Llama, Mistral, NVIDIA Nemotron), Microsoft Foundry is the broader catalog; Azure OpenAI is the OpenAI-specific entry point. The service is API-compatible with the OpenAI SDK in most flows, so teams typically swap base URLs and authentication rather than rewriting calls.
Pricing & data handling
No separate routing fee; pay per underlying Azure model tokens. Three modes: Balanced (broad distribution by complexity), Cost (cheapest capable model), Quality (frontier models preferred). Supports model subsets and automatic failover. Current version: 2025-11-18.
- Retention
- Retains data
- Self-host
- Not indicated
- Last checked
- 2026-06-08
Sources & freshness
- homepage, status, type, modes, pricing_model · checked 2026-06-08
- how_it_works · checked 2026-06-08
- model_catalog_entry · checked 2026-06-08
Last reviewed 2026-06-08.
Compare & related routers
Compare Azure AI Foundry Model Router against another router without mixing model rows into the same view.
Compare with AIRouterCommercial LLM router that analyzes incoming requests and routes to the optimal model for cost/quality/latency via a drop-in OpenAI-compatible API, with a privacy-preserving embedding mode that avoids sending prompt content.
AWS Bedrock's native intelligent prompt router that routes prompts between Anthropic Claude model tiers (Haiku/Sonnet) based on predicted task complexity, with no extra per-routing charge.
AI-powered LLM router that analyzes each prompt in real-time to select the optimal model, targeting 20–97% cost reduction while maintaining quality; San Francisco startup reportedly nearing $1.3B valuation.
Commercial LLM router that dynamically routes each query to the best-suited model with load balancing and fallback handling, charging 3% of underlying AI spend.