LLM Reference

Best LLMs by Use Case

Last refreshed 2026-05-18. Refreshed weekly.

Find the best large language model for your specific use case. Each category ranks models by task-fit signals, freshness, and pricing from tracked providers.

Best LLMs for Code Generation

Compare coding-capable models by sourced software-engineering benchmarks, context window, provider coverage, and tracked token pricing.

Best LLMs for RAG

Compare models for RAG, document QA, retrieval-heavy assistants, and long-context grounding by context window, document benchmarks, tool support, and pricing.

Best AI Agents & Agentic Models

Compare frontier models for multi-step AI agent workflows across SWE-bench Verified, tau-bench, and MultiChallenge. Built for coding agents, tool-use agents, and long-horizon task automation.

Best LLMs for Classification

Compare models for routing, moderation, extraction, safety labels, and structured classification by sourced benchmark coverage and pricing.

Best Open Source LLMs

Top open-weight language models you can run locally or self-host, ranked by sourced capability signals, parameter scale, and release freshness.

Best Multimodal / Vision LLMs

Top multimodal models that understand images, video, and documents, ranked by vision benchmarks, capabilities, pricing, and context window.

Best LLMs for Reasoning & Math

Top AI models for complex reasoning, math, and step-by-step problem solving, ranked by sourced reasoning benchmarks and release freshness.

Best Small Language Models (SLMs)

Efficient small language models, each under 10B parameters with strong benchmark scores, for edge deployment, cost-sensitive workloads, or on-device inference.

Best LLMs for Function Calling & Tool Use

Top AI models for agentic workflows, tool use, and API integration. Compare models with native function calling support and structured outputs.

Cheapest LLM APIs You Can Call Right Now

LLM APIs ranked strictly by lowest tracked input price, with an MMLU or GPQA quality watermark beside each row.

Best Long Context LLMs

AI models with the largest context windows for processing long documents, codebases, and extended conversations. Sorted by context window size.

Best Mainstream LLM APIs, Ranked

The best mainstream APIs, ranked by capability first (GPQA Diamond, with MMLU as a fallback), then by lowest tracked input price.
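
As a rough illustration of the ordering described above, the sketch below sorts models by GPQA Diamond score, falls back to MMLU when no GPQA Diamond score is tracked, and breaks ties by input price. The `Model` fields, the price unit, the sample rows, and the choice to place GPQA-scored models ahead of MMLU-only ones are assumptions made for this example, not the site's documented method.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Model:
    # Hypothetical record shape; field names are illustrative only.
    name: str
    input_price: float                    # assumed unit: USD per 1M input tokens
    gpqa_diamond: Optional[float] = None  # percent, if tracked
    mmlu: Optional[float] = None          # percent, if tracked


def capability_key(m: Model) -> Tuple[int, float, float]:
    """Sort key: benchmark tier, then score (descending), then price (ascending)."""
    if m.gpqa_diamond is not None:
        return (0, -m.gpqa_diamond, m.input_price)
    if m.mmlu is not None:
        # Assumption: MMLU-only models rank below any GPQA-scored model.
        return (1, -m.mmlu, m.input_price)
    return (2, 0.0, m.input_price)        # no capability signal at all


def rank_by_capability(models: List[Model]) -> List[Model]:
    return sorted(models, key=capability_key)


# Made-up sample rows, not real benchmark or pricing data.
sample = [
    Model("model-a", 2.00, gpqa_diamond=70.0),
    Model("model-b", 1.00, gpqa_diamond=70.0),
    Model("model-c", 0.50, mmlu=90.0),
    Model("model-d", 3.00, gpqa_diamond=80.0),
]
print([m.name for m in rank_by_capability(sample)])
# prints ['model-d', 'model-b', 'model-a', 'model-c']
```

Price only decides the order between `model-a` and `model-b`, which share a GPQA Diamond score. The cheaper `model-c` still ranks last because it has only an MMLU score.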

Best LLMs for Enterprise

Production-grade LLMs available on enterprise provider surfaces with function calling and structured outputs, ranked by MMLU and context window.

Best Free LLMs You Can Use Right Now

Free-to-use large language models ranked with zero-dollar hosted tiers first, then open-weight models you can self-host without token fees.

Best LLMs for Writing

Top language models for long-form writing, essays, and creative prose. Ranked by Chatbot Arena human-preference scores with MMLU as a fallback.

Best LLMs for Marketing

Top language models for marketing copy, ad creative, email, social posts, and brand-voice content. Ranked by Chatbot Arena human-preference scores with MMLU as a fallback.

Best LLMs for Customer Support

Function-calling models for support bots, ranked by tau-bench service-task performance with BFCL as a fallback, subject to a cost gate of $25 per 1,000 conversations.
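
One way to picture the cost gate and the benchmark fallback is the sketch below. It is a minimal example under stated assumptions: the average token counts per conversation, the per-million-token price unit, the `SupportModel` fields, and the sample rows are all hypothetical, and the site may compute conversation cost differently.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

COST_GATE_USD_PER_1K = 25.0  # gate taken from the category description

# Hypothetical workload profile: average tokens in one support conversation.
AVG_INPUT_TOKENS = 6_000
AVG_OUTPUT_TOKENS = 1_500


@dataclass
class SupportModel:
    # Hypothetical record shape; field names are illustrative only.
    name: str
    input_price: float                 # assumed unit: USD per 1M input tokens
    output_price: float                # assumed unit: USD per 1M output tokens
    tau_bench: Optional[float] = None  # percent, if tracked
    bfcl: Optional[float] = None       # percent, if tracked


def cost_per_1k_conversations(m: SupportModel) -> float:
    """Estimated USD cost of 1,000 conversations under the assumed profile."""
    per_conversation = (
        AVG_INPUT_TOKENS * m.input_price + AVG_OUTPUT_TOKENS * m.output_price
    ) / 1_000_000
    return per_conversation * 1_000


def benchmark_key(m: SupportModel) -> Tuple[int, float]:
    if m.tau_bench is not None:
        return (0, -m.tau_bench)
    if m.bfcl is not None:
        # Assumption: BFCL-only models rank below tau-bench-scored ones.
        return (1, -m.bfcl)
    return (2, 0.0)


def rank_support_models(models: List[SupportModel]) -> List[SupportModel]:
    # Apply the cost gate first, then rank the survivors by benchmark.
    eligible = [
        m for m in models
        if cost_per_1k_conversations(m) <= COST_GATE_USD_PER_1K
    ]
    return sorted(eligible, key=benchmark_key)


# Made-up sample rows, not real benchmark or pricing data.
sample = [
    SupportModel("model-p", 3.0, 15.0, tau_bench=80.0),  # about $40.50 per 1k
    SupportModel("model-c", 0.5, 1.5, bfcl=70.0),        # about $5.25 per 1k
    SupportModel("model-m", 1.0, 4.0, tau_bench=60.0),   # about $12.00 per 1k
]
print([m.name for m in rank_support_models(sample)])
# prints ['model-m', 'model-c']
```

In this sample, `model-p` has the highest tau-bench score but is dropped because its estimated cost exceeds the gate. The gate acts as a filter rather than as a ranking signal.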