LLM Reference

Best Free LLMs You Can Use Right Now (2026)

Last refreshed 2026-06-04. Next refresh: weekly.

Free-to-use large language models ranked with zero-dollar hosted tiers first, then open-weight models you can self-host without token fees.

Need the self-host and license-first view? See the open-source leaderboard, which ranks open-weight models separately from hosted free-tier availability.

Verdict

Use Nemotron 3 Nano Omni for free-tier work today.

Gemma 4 E4B IT is the runner-up, 2.4 points back on MMLU-Pro (69.4% vs. 71.8%).

Researched 21d ago.

How we rank

Free leaders prioritize models with a current zero-dollar hosted tier, then fall back to open-weight models you can self-host without token fees.

  1. Eligibility: Models qualify if at least one current public/self-serve provider row lists both input and output token pricing as $0, or if the model is open-weight/self-host-capable under the same license flags used by /best/open-source.
  2. Primary ranking: Zero-cost hosted availability comes first. Within the hosted-free and open-weight fallback groups, rows sort by MMLU-Pro, then GPQA Diamond, then MMLU, then newer release. Podium cards keep hosted $0 routes ahead of open-weight fallback rows; a fallback card is labeled as self-host/no-token-fee rather than paid hosted output.
  3. Variant collapse: We keep one row per model family (`familySlug` + parameter tier). When headline scores tie within ±0.5 pt (±10 Elo on Chatbot Arena), we pick the canonical SKU by lowest tracked input price, then GA over preview or limited access, then newest `release`. A folded sibling within the benchmark noise band can show a "Tied within margin" chip on that score cell.
  4. Open-weight fallback: Open-weight fallback rows represent local or self-hosted use where the model itself has no token fee; you still need to budget your own compute.
  5. Related view: Use /best/open-source when the decision is license and self-host posture first; this free page answers what you can use at zero dollars right now.
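
The eligibility, primary-ranking, and variant-collapse rules above can be sketched in a few lines of Python. This is a minimal illustration, not the site's actual implementation: the `Row` fields, the function names, the placement of rows with missing scores, and the choice to measure the ±0.5 pt tie band from a family's best MMLU-Pro score are all assumptions made for the example, as are the `example-8b` variants and the `paid-closed-example` row.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Row:
    name: str
    hosted_free: bool = False   # some provider lists $0 input and $0 output
    open_weight: bool = False   # self-host-capable under the license flags
    mmlu_pro: Optional[float] = None
    gpqa_diamond: Optional[float] = None
    mmlu: Optional[float] = None
    input_price: float = 0.0    # lowest tracked input $/1M
    ga: bool = True             # False for preview or limited access
    release: date = date.min


def eligible(row: Row) -> bool:
    """Rule 1: a $0 hosted route or an open-weight license qualifies."""
    return row.hosted_free or row.open_weight


def _desc(score: Optional[float]) -> float:
    # Assumption: a missing score sorts after every real score.
    return float("inf") if score is None else -score


def rank_key(row: Row):
    """Rule 2: hosted-free group first, then scores, then newer release."""
    return (
        0 if row.hosted_free else 1,
        _desc(row.mmlu_pro),
        _desc(row.gpqa_diamond),
        _desc(row.mmlu),
        -row.release.toordinal(),
    )


def rank(rows):
    return sorted((r for r in rows if eligible(r)), key=rank_key)


def canonical_sku(variants, margin: float = 0.5) -> Row:
    """Rule 3: among variants tied within the margin, prefer the lowest
    input price, then GA, then the newest release.
    Assumes every variant has an MMLU-Pro score, and that the tie band
    is measured from the family's best score."""
    best = max(v.mmlu_pro for v in variants)
    tied = [v for v in variants if best - v.mmlu_pro <= margin]
    return min(
        tied,
        key=lambda v: (v.input_price, 0 if v.ga else 1, -v.release.toordinal()),
    )


# Scores are taken from the ranking on this page; the flags are illustrative.
rows = [
    Row("Qwen3.5-397B-A17B", open_weight=True, mmlu_pro=87.8),
    Row("Gemma 4 E4B IT", hosted_free=True, mmlu_pro=69.4),
    Row("CoBuddy", hosted_free=True),
    Row("Nemotron 3 Nano Omni", hosted_free=True, mmlu_pro=71.8),
    Row("paid-closed-example", mmlu_pro=90.0),  # hypothetical, not eligible
]
ranked_names = [r.name for r in rank(rows)]

# Hypothetical variants of one family, used only to exercise the tie-break.
family = [
    Row("example-8b-preview", mmlu_pro=70.2, input_price=0.05, ga=False),
    Row("example-8b", mmlu_pro=70.0, input_price=0.05, ga=True),
    Row("example-8b-turbo", mmlu_pro=69.0, input_price=0.01),
]
canonical = canonical_sku(family)
```

Under these assumptions a hosted-free row with no benchmark score still ranks ahead of a higher-scoring open-weight fallback row, which matches the order of the table on this page. In the family example, the cheaper `example-8b-turbo` falls outside the tie band, so the GA variant wins over the preview at the same price.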

| # | Model | Tags | Capability signal | Input $/1M | Output $/1M |
|---|-------|------|-------------------|------------|-------------|
| 1 | Nemotron 3 Nano Omni | | MMLU-Pro 71.8% | Free | Free |
| 2 | Gemma 4 E4B IT | Tools | MMLU-Pro 69.4% | Free | Free |
| 3 | Gemma 4 E2B IT | Tools | MMLU-Pro 60% | Free | Free |
| 4 | CoBuddy | Reasoning, Tools | not listed | Free | Free |
| 5 | Laguna XS.2 | | not listed | Free | Free |
| 6 | Laguna M.1 | | not listed | Free | Free |
| 7 | Gemma 3 | | not listed | Free | Free |
| 8 | Gemma 3n | | not listed | Free | Free |
| 9 | MedGemma | Vision, Tools | not listed | Free | Free |
| 10 | MedSigLIP | Vision, Tools | not listed | Free | Free |
| 11 | T5Gemma | Tools | not listed | Free | Free |
| 12 | Qwen3.5-397B-A17B | Reasoning, Tools | MMLU-Pro 87.8% | $0.39 | $2.34 |
| 13 | DeepSeek V4 Pro | Reasoning, Tools | MMLU-Pro 87.5% | $0.43 | $0.87 |
| 14 | Kimi K2.5 | Tools | MMLU-Pro 87.1% | $0.44 | $2.00 |
| 15 | DeepSeek V4 Flash | Reasoning, Tools | MMLU-Pro 86.2% | $0.10 | $0.20 |
| 16 | Qwen3.6-27B | Reasoning, Vision, Tools | MMLU-Pro 86.2% | $0.32 | $3.20 |
| 17 | Mistral Large 3 675B Instruct | | MMLU-Pro 85.5% | $0.50 | $1.50 |
| 18 | Qwen3.6-35B-A3B | Tools | MMLU-Pro 85.2% | $0.15 | $1.00 |
| 19 | DeepSeek R1 0528 | Reasoning | MMLU-Pro 85% | $0.50 | $1.68 |
| 20 | Kimi K2.6 | Reasoning, Vision, Tools | MMLU-Pro 84.6% | $0.73 | $3.40 |
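
The Input $/1M and Output $/1M columns are prices per one million tokens, so a workload's hosted cost is a weighted sum of its token counts. The sketch below shows that arithmetic; the function name and the token volumes are hypothetical, and the prices are the DeepSeek V4 Flash figures listed above. It does not cover self-hosted compute, which you budget separately.

```python
def hosted_cost_usd(input_tokens: int, output_tokens: int,
                    input_per_1m: float, output_per_1m: float) -> float:
    """Estimate hosted token cost from per-million-token prices."""
    return (input_tokens / 1_000_000) * input_per_1m \
         + (output_tokens / 1_000_000) * output_per_1m


# Hypothetical workload: 2M input tokens and 500K output tokens.
flash_cost = hosted_cost_usd(2_000_000, 500_000, 0.10, 0.20)

# A row marked Free/Free has a $0 price on both sides.
free_cost = hosted_cost_usd(2_000_000, 500_000, 0.0, 0.0)

print(f"DeepSeek V4 Flash: ${flash_cost:.2f}")
print(f"Free tier: ${free_cost:.2f}")
```

Because output prices in the table are often several times the input price, the split between input and output tokens matters as much as the total volume.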

Honorable mentions

Next seats in this ranking. Lines below are from each model's stored description in LLMReference seed data—spot-check the model page before relying on a capability claim.

  • CoBuddy is a Baidu Qianfan code generation model optimized for coding tasks and AI agent workflows. OpenRouter lists the free variant with a 131K context window, native tool support, reasoning support, and FP8 quantization for high-throughput inference.


  • Poolside Laguna XS.2 — small-tier coding-focused language model, available as a free preview on OpenRouter.


  • Poolside Laguna M.1 — medium-tier coding-focused language model, available as a free preview on OpenRouter.
