17 matches · Cost objective

Cheapest LLM gateway options

Cost-focused routing can lower spend by sending simple requests to cheaper models or cheaper provider routes. Use this page for gateways and routers whose verified objective includes cost optimization.
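
As a rough illustration of the idea above, the sketch below sends short prompts without "hard" keywords to a cheaper route and everything else to a stronger one. The model labels, prices, keyword list, and word threshold are all hypothetical placeholders, not values from any listed router, and real routers typically use trained classifiers rather than a keyword heuristic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str                # hypothetical model label
    usd_per_1k_tokens: float  # hypothetical blended price


# Assumed routes and prices, for illustration only.
CHEAP = Route("small-model", 0.0005)
STRONG = Route("large-model", 0.0100)

# Assumed keywords that hint at a harder request.
HARD_HINTS = ("prove", "refactor", "step by step", "analyze")


def pick_route(prompt: str, max_simple_words: int = 60) -> Route:
    """Return the cheap route for short prompts with no hard-task hints."""
    text = prompt.lower()
    looks_hard = any(hint in text for hint in HARD_HINTS)
    is_short = len(text.split()) <= max_simple_words
    return CHEAP if (is_short and not looks_hard) else STRONG


def estimated_cost(route: Route, tokens: int) -> float:
    """Estimate spend in USD for a given token count on a route."""
    return route.usd_per_1k_tokens * tokens / 1000
```

Calling `pick_route("What is the capital of France?")` returns the cheap route under these assumptions, while a prompt containing "refactor" or more than 60 words falls through to the stronger one.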

Filtered router list

BerriAI

Editorial pick

Gateway · Open source · Self-hosted · Cost · Reliability

~100 models · Free OSS

Open-source Python SDK and proxy server that unifies 100+ LLM APIs behind a single OpenAI-compatible interface, with load balancing, cost tracking, and configurable failover.

Aging

OpenRouter, Inc.

Editorial pick

Hybrid · Commercial · Hosted SaaS · Cost · Quality

~400 models · Passthrough

Unified hybrid gateway to 400+ models from 60+ providers via a single OpenAI-compatible API, with optional auto-routing that selects the best model per prompt.

Aging

Portkey AI

Editorial pick

Gateway · Open source · Hosted SaaS · Reliability · Cost

~1.6K models · Subscription

Production AI gateway routing to 1,600+ LLMs with failover, load balancing, semantic caching, and guardrails; Apache 2.0 core is fully self-hostable with the complete feature set.

Aging

Heureka Labs UG

Router · Commercial · Hosted SaaS · Cost · Quality

Passthrough + fee

Commercial LLM router that analyzes incoming requests and routes to the optimal model for cost/quality/latency via a drop-in OpenAI-compatible API, with a privacy-preserving embedding mode that avoids sending prompt content.

Aging

Amazon Bedrock Intelligent Prompt Routing

Amazon Web Services

Router · Provider-native · Provider-native · Cost · Quality

Passthrough

AWS Bedrock's native intelligent prompt router that routes prompts between Anthropic Claude model tiers (Haiku/Sonnet) based on predicted task complexity, with no extra per-routing charge.

Aging

Azure AI Foundry Model Router

Microsoft

Router · Provider-native · Provider-native · Cost · Quality

Passthrough

Microsoft Azure AI Foundry's native model router that uses a trained ML model to route each prompt in real time to the optimal Azure-hosted model, with Balanced/Cost/Quality mode selection and automatic failover.

Aging

Helicone

Gateway · Open source · Hosted SaaS · Reliability · Cost

Subscription

Observability-first AI gateway with routing, caching, rate limiting, and request tracing; Apache 2.0 open-source core with a managed hosted tier for logging and analytics.

Aging

Kong AI Gateway

Kong Inc.

Gateway · Commercial · Self-hosted · Cost · Latency

Subscription

Multi-LLM AI gateway built on Kong Gateway 3.x, adding semantic routing, load balancing, guardrails, and MCP traffic analytics as plugins over Kong's existing API management platform.

Aging

Martian, Inc.

Router · Commercial · Hosted SaaS · Cost · Quality

Passthrough + fee

AI-powered LLM router that analyzes each prompt in real time to select the optimal model, targeting 20–97% cost reduction while maintaining quality; San Francisco startup reportedly nearing a $1.3B valuation.

Aging

Neutrino AI

Router · Commercial · Hosted SaaS · Cost · Quality

Passthrough + fee

Commercial LLM router that dynamically routes each query to the best-suited model with load balancing and fallback handling, charging 3% of underlying AI spend.

Aging

Not Diamond

Editorial pick

Router · Commercial · Hosted SaaS · Cost · Quality

Enterprise quote

Predictive model router that selects the best LLM for each query. The vendor claims up to 25% accuracy gains and 10x cost reduction; it powers OpenRouter's auto mode and is positioned specifically for coding agents.

Aging

OpenAI Auto Routing (GPT-5 Auto)

OpenAI

Router · Provider-native · Provider-native · Quality · Cost

Passthrough

OpenAI's native auto-routing mode (GPT-5 Auto) that dynamically routes each API request between GPT-5 and GPT-5 Instant based on prompt complexity, with no extra charge beyond model token costs.

Aging

Requesty

Hybrid · Commercial · Hosted SaaS · Cost · Latency

~400 models · Passthrough + fee

AI gateway to 400+ LLM providers with intelligent routing, caching, guardrails, and governance; flat 5% markup on model costs with no subscription fee.

Aging

Respan (formerly Keywords AI)

Hybrid · Commercial · Hosted SaaS · Cost · Quality

~250 models · Subscription

Unified LLM engineering platform (gateway + observability + evals + prompt management) routing across 250+ models; previously Keywords AI, rebranded February 2026.

Aging

LMSYS (lm-sys)

Router · Open source · Self-hosted · Cost · Quality

Free OSS

Open-source LLM routing framework from LMSYS that routes simpler queries to a cheaper weak model and harder ones to a stronger frontier model, achieving 35–85% cost reduction on benchmarks.

Aging

Unify AI

Router · Commercial · Hosted SaaS · Cost · Quality

Subscription

Benchmark-driven LLM router using a neural scorer and live runtime benchmarks refreshed every 10 minutes to route each request to the optimal endpoint across 100+ providers.

Aging

vLLM Semantic Router

Red Hat / vLLM Project

Router · Open source · Self-hosted · Cost · Latency

Free OSS

Open-source Mixture-of-Models router that semantically classifies each request and routes it to the best backend (local, private, or frontier) by cost, latency, privacy, or safety, deployed as an Envoy External Processor.

Aging
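
Several listings above price as "Passthrough + fee", for example 3% of underlying AI spend or a flat 5% markup on model costs. The sketch below works through that arithmetic. It assumes the fee is applied to the model spend that remains after routing, which the listings do not confirm, and the $1,000 figure in the usage note is a hypothetical monthly spend.

```python
def billed_cost(model_spend_usd: float, fee_rate: float) -> float:
    """Total bill when a router adds a percentage fee on top of model spend."""
    return model_spend_usd * (1.0 + fee_rate)


def break_even_savings(fee_rate: float) -> float:
    """Fraction of baseline spend that routing must save to offset the fee.

    With baseline spend B, savings fraction s, and fee rate f, the bill is
    B * (1 - s) * (1 + f). Setting that equal to B gives s = f / (1 + f).
    """
    return fee_rate / (1.0 + fee_rate)
```

Under these assumptions, $1,000 of model spend bills at about $1,030 with a 3% fee and about $1,050 with a 5% markup, and a 5% markup pays for itself only if routing trims model spend by roughly 4.8% or more.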

Related decision paths

These links keep the SEO view tied to the single router store, target-provider graph, and model decision pages.

Full routers directory

Compare every verified gateway, router, and hybrid row in the shared store.

Same filter in /routers

Use the directory controls to add openness, hosting, API, and routing-scope filters.

Google AI Studio

Inspect the provider that one or more listed routers can target.

Cheap LLM leaderboard

Compare direct model price and quality watermarks.

LLM cost optimization routers

Use the broader cost-routing query view.

Frontier pricing pulse

Watch current output pricing before changing routes.

Machine-readable source

This page is generated from data/seed/router.json. Agents can consume the same catalog through /api/routers.
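
A minimal sketch of how an agent might consume that catalog and reproduce this page's cost filter is shown below. The response shape is an assumption: the code presumes `/api/routers` returns a JSON array of objects with a `name` string and an `objectives` list, and the base URL passed to `load_routers` is a placeholder. Check the actual payload before relying on these field names.

```python
import json
from urllib.request import urlopen


def load_routers(base_url: str) -> list[dict]:
    """Fetch the catalog; assumes the endpoint returns a JSON array."""
    with urlopen(f"{base_url}/api/routers") as resp:
        return json.load(resp)


def with_objective(records: list[dict], objective: str) -> list[dict]:
    """Keep records whose (assumed) `objectives` list contains the objective."""
    wanted = objective.lower()
    return [
        record
        for record in records
        if wanted in (str(o).lower() for o in record.get("objectives", []))
    ]


# Hypothetical records in the assumed shape, used here instead of a live call.
sample = [
    {"name": "Router A", "objectives": ["Cost", "Quality"]},
    {"name": "Router B", "objectives": ["Latency"]},
]
cost_focused = [r["name"] for r in with_objective(sample, "cost")]
```

Records without an `objectives` field are skipped rather than raising, so the filter degrades gracefully if the real schema differs.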