LLM Reference

Router profile

Azure AI Foundry Model Router

Microsoft

Visit Azure AI Foundry Model Router
RouterFresh · 2026-06-08
Microsoft Azure AI Foundry's native model router that uses a trained ML model to route each prompt in real time to the optimal Azure-hosted model, with Balanced/Cost/Quality mode selection and automatic failover.

Type

Router

Lead directory segment

Pricing model

Passthrough

Model count pending

Hosting

Provider-native

No self-host flag

Data retention

Retains data

Verify for production policy

At a glance

Decision mechanism
Predictive learned
Optimizes for
CostQualityReliability
Routing scope
Cross-tier
Decision timing
Pre-generation
Deployment path
Proxy in path
Openness
Provider-native
API compatibility
OpenAI

Routes to these providers

Microsoft Foundry

Microsoft Foundry offers a comprehensive platform-as-a-service for enterprise AI operations. It provides multiple deployment options including Serverless APIs (pay-as-you-go), Global Standard (shared managed capacity), Provisioned Throughput Units (reserved capacity), batch processing, and bring-your-own model deployments. The platform features a unified control plane for models, agents, tools, and observability. Its Agent Service enables building and deploying AI agents with built-in tracing, monitoring, and governance. Evaluation and monitoring tools assess model performance, safety, and groundedness. Foundry supports seamless upgrades from Azure OpenAI with non-destructive migration, maintaining existing deployments while unlocking multi-provider model access and advanced platform capabilities.

Azure OpenAI

Azure OpenAI Service hosts OpenAI's GPT-4o, GPT-4, GPT-3.5, and embedding models on Microsoft Azure with enterprise SLAs. Deployments run in customer-selected regions with private networking, role-based access control, and capacity options spanning Standard pay-per-token, Provisioned Throughput Units (PTUs) for reserved capacity, Global Standard shared capacity, and Batch processing. Azure OpenAI sits inside the wider Microsoft Foundry / Azure AI Studio control plane, which adds an evaluation, monitoring, and Agent Service layer on top of the base model APIs. For workloads that need non-OpenAI models (Claude, DeepSeek, Grok, Llama, Mistral, NVIDIA Nemotron), Microsoft Foundry is the broader catalog; Azure OpenAI is the OpenAI-specific entry point. The service is API-compatible with the OpenAI SDK in most flows, so teams typically swap base URLs and authentication rather than rewriting calls.

Pricing & data handling

No separate routing fee; pay per underlying Azure model tokens. Three modes: Balanced (broad distribution by complexity), Cost (cheapest capable model), Quality (frontier models preferred). Supports model subsets and automatic failover. Current version: 2025-11-18.

Retention
Retains data
Self-host
Not indicated
Last checked
2026-06-08

Sources & freshness

Last reviewed 2026-06-08.

Compare & related routers

Compare Azure AI Foundry Model Router against another router without mixing model rows into the same view.

Compare with AIRouter
AIRouter

Commercial LLM router that analyzes incoming requests and routes to the optimal model for cost/quality/latency via a drop-in OpenAI-compatible API, with a privacy-preserving embedding mode that avoids sending prompt content.

Amazon Bedrock Intelligent Prompt Routing

AWS Bedrock's native intelligent prompt router that routes prompts between Anthropic Claude model tiers (Haiku/Sonnet) based on predicted task complexity, with no extra per-routing charge.

Martian

AI-powered LLM router that analyzes each prompt in real-time to select the optimal model, targeting 20–97% cost reduction while maintaining quality; San Francisco startup reportedly nearing $1.3B valuation.

Neutrino AI

Commercial LLM router that dynamically routes each query to the best-suited model with load balancing and fallback handling, charging 3% of underlying AI spend.