Router profile

Azure AI Foundry Model Router

Microsoft

RouterAging · 2026-06-08

Microsoft Azure AI Foundry's native model router that uses a trained ML model to route each prompt in real time to the optimal Azure-hosted model, with Balanced/Cost/Quality mode selection and automatic failover.

Type

Router

Lead directory segment

Pricing model

Passthrough

Model count pending

Hosting

Provider-native

No self-host flag

Data retention

Retains data

Verify for production policy

At a glance

Decision mechanism: Predictive learned
Optimizes for: CostQualityReliability
Routing scope: Cross-tier
Decision timing: Pre-generation
Deployment path: Proxy in path
Openness: Provider-native
API compatibility: OpenAI

Routes to these providers

Microsoft Foundry

Microsoft Foundry offers a comprehensive platform-as-a-service for enterprise AI operations. It provides multiple deployment options including Serverless APIs (pay-as-you-go), Global Standard (shared managed capacity), Provisioned Throughput Units (reserved capacity), batch processing, and bring-your-own model deployments. The platform features a unified control plane for models, agents, tools, and observability. Its Agent Service enables building and deploying AI agents with built-in tracing, monitoring, and governance. Evaluation and monitoring tools assess model performance, safety, and groundedness. Foundry supports seamless upgrades from Azure OpenAI with non-destructive migration, maintaining existing deployments while unlocking multi-provider model access and advanced platform capabilities.

Azure OpenAI

Azure OpenAI Service hosts OpenAI's GPT-4o, GPT-4, GPT-3.5, and embedding models on Microsoft Azure with enterprise SLAs. Deployments run in customer-selected regions with private networking, role-based access control, and capacity options spanning Standard pay-per-token, Provisioned Throughput Units (PTUs) for reserved capacity, Global Standard shared capacity, and Batch processing. Azure OpenAI sits inside the wider Microsoft Foundry / Azure AI Studio control plane, which adds an evaluation, monitoring, and Agent Service layer on top of the base model APIs. For workloads that need non-OpenAI models (Claude, DeepSeek, Grok, Llama, Mistral, NVIDIA Nemotron), Microsoft Foundry is the broader catalog; Azure OpenAI is the OpenAI-specific entry point. The service is API-compatible with the OpenAI SDK in most flows, so teams typically swap base URLs and authentication rather than rewriting calls.

Pricing & data handling

No separate routing fee; pay per underlying Azure model tokens. Three modes: Balanced (broad distribution by complexity), Cost (cheapest capable model), Quality (frontier models preferred). Supports model subsets and automatic failover. Current version: 2025-11-18.

Retention: Retains data
Self-host: Not indicated
Last checked: 2026-06-08

Sources & freshness

homepage, status, type, modes, pricing_model · checked 2026-06-08
how_it_works · checked 2026-06-08
model_catalog_entry · checked 2026-06-08

Last reviewed 2026-06-08.

Compare & related routers

Compare Azure AI Foundry Model Router against another router without mixing model rows into the same view.

Compare with LiteLLM

AIRouter

Commercial LLM router that analyzes incoming requests and routes to the optimal model for cost/quality/latency via a drop-in OpenAI-compatible API, with a privacy-preserving embedding mode that avoids sending prompt content.

Amazon Bedrock Intelligent Prompt Routing

AWS Bedrock's native intelligent prompt router that routes prompts between Anthropic Claude model tiers (Haiku/Sonnet) based on predicted task complexity, with no extra per-routing charge.

Martian

AI-powered LLM router that analyzes each prompt in real-time to select the optimal model, targeting 20–97% cost reduction while maintaining quality; San Francisco startup reportedly nearing $1.3B valuation.

Neutrino AI

Commercial LLM router that dynamically routes each query to the best-suited model with load balancing and fallback handling, charging 3% of underlying AI spend.