LLM Reference

Best LLM for translation

Last refreshed 2026-06-30. Next refresh: weekly.

Compare multilingual LLMs and dedicated translation models for text, document, and live speech translation. Models are ranked by context length and general-language benchmarks until translation-specific leaderboard rows land in seed data.

Verdict

Use Llama 4 Scout 17B-16E Instruct for this task today.

MiniMax-01 is the runner-up: it holds No. 2 on the Pick signal, behind the current leader.

Researched 23 days ago.

How we rank

Translation picks are methodology-forward until dedicated translation benchmark rows land in seed data. We surface tagged translation models, Qwen-MT specialists, and long-context chat models teams commonly route for multilingual work.

  1. Eligibility: Models with translation use-case tags, translation specialization, Qwen-MT family membership, or ≥128K context on general chat-completion routes (excluding embeddings and media-only specialists).
  2. Primary ranking: Declared context window (wider first), then MMLU when present, then newer release. Realtime speech translation models appear in the table but are labeled separately from text-generation LLMs.
  3. Benchmark gap: No translation-quality leaderboard is standardized in seed yet; do not treat this page as a definitive translation-quality ranking until WMT or vendor-neutral multilingual rows are sourced.
  4. Pricing column: Lowest tracked provider input/output where a public rate card exists.

#   Model                             Tags                        Input $/1M  Output $/1M
1   RWKV-7 Goose 2.9B                 -                           -           -
2   RWKV-7 Goose 1.5B                 -                           -           -
3   RWKV-7 Goose 0.4B                 -                           -           -
4   RWKV-7 Goose 0.1B                 -                           -           -
5   RWKV-6 Finch 14B                  -                           -           -
6   RWKV-6 Finch 7B                   -                           -           -
7   RWKV-6 Finch 3B                   -                           -           -
8   RWKV-6 Finch 1.6B                 -                           -           -
9   LTM-2-mini                        -                           -           -
10  Llama 4 Scout 17B-16E Instruct    Vision                      $0.08       $0.22
11  LTM-1                             -                           -           -
12  MiniMax-01                        Vision                      $0.20       $1.10
13  Gemini 3.5 Pro                    Preview, Reasoning, Vision  -           -
14  Grok 4.20 Multi-Agent             Reasoning, Vision, Tools    $1.25       $2.50
15  Gemini 1.5 Pro 002                -                           -           -
16  Gemini 1.5 Pro Experimental 0827  -                           -           -
17  Gemini 1.5 Pro Experimental 0801  -                           -           -
18  Gemini 1.5 Pro                    -                           $1.25       $5.00
19  GPT-5.5                           Reasoning, Vision, Tools    $5.00       $30.00
20  GPT-5.5 Pro                       Reasoning, Vision, Tools    $30.00      $180.00
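
The eligibility and primary-ranking rules described under "How we rank" can be sketched in a few lines of Python. This is a minimal illustration, not LLMReference's actual implementation: the `Model` fields, the `is_eligible` and `rank` helpers, and the sample entries A through G are all hypothetical. Two readings are assumptions as well: the media-only exclusion is applied only to the long-context route, and a model without an MMLU score sorts after scored models with the same context window.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

MIN_CONTEXT = 128_000  # the >=128K threshold from the eligibility rule


@dataclass
class Model:
    # Hypothetical record shape; the real seed schema is not shown on this page.
    name: str
    context_window: int               # declared context window, in tokens
    released: date
    mmlu: Optional[float] = None      # None when no MMLU score is stored
    translation_tagged: bool = False  # tag, specialization, or Qwen-MT family
    media_only: bool = False          # embeddings or media-only specialist


def is_eligible(m: Model) -> bool:
    # Assumption: the exclusion applies only to the long-context route.
    long_context_chat = m.context_window >= MIN_CONTEXT and not m.media_only
    return m.translation_tagged or long_context_chat


def rank(models: list[Model]) -> list[Model]:
    def key(m: Model):
        # Negate each field so an ascending sort puts wider, higher, newer first.
        # A missing MMLU maps to +inf after negation, so it sorts last in a tie.
        mmlu = m.mmlu if m.mmlu is not None else float("-inf")
        return (-m.context_window, -mmlu, -m.released.toordinal())

    return sorted((m for m in models if is_eligible(m)), key=key)


# Made-up entries, used only to exercise the tie-breaks.
example = [
    Model("A", 1_000_000, date(2025, 1, 1), mmlu=80.0),
    Model("B", 1_000_000, date(2025, 6, 1)),                  # no MMLU stored
    Model("C", 1_000_000, date(2024, 1, 1), mmlu=85.0),
    Model("D", 32_000, date(2025, 1, 1), translation_tagged=True),
    Model("E", 64_000, date(2025, 1, 1)),                     # below threshold
    Model("F", 200_000, date(2025, 1, 1), media_only=True),   # excluded
    Model("G", 1_000_000, date(2025, 2, 1), mmlu=80.0),
]
ranked = rank(example)
print([m.name for m in ranked])
```

Because context window is the first sort key, a small model with a very large declared window outranks a stronger model with a narrower one, which is consistent with the caveat that this table is not a translation-quality ranking.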

Honorable mentions

Next seats in this ranking. Lines below are from each model's stored description in LLMReference seed data—spot-check the model page before relying on a capability claim.

  • Stable Gemini 1.5 Pro release (February variant) optimized for complex reasoning and high-quality multimodal analysis. Supports 2M context for extended document and video processing.

  • Updated Pro experimental variant with refinements to reasoning depth and creative task performance.

  • Experimental Pro variant with enhanced reasoning and multimodal understanding for complex problem-solving tasks.

Frequently asked questions

Which LLM is best for translation?

Llama 4 Scout 17B-16E Instruct is the current LLMReference top pick for translation. The verdict is based on the stored category signal "Pick", where it is marked as the current leader. Output pricing starts at $0.22 per 1M tokens. Review the linked model and provider pages before production use because availability and pricing can change.

How does Llama 4 Scout 17B-16E Instruct compare to MiniMax-01 for translation?

Llama 4 Scout 17B-16E Instruct leads MiniMax-01 in the visible shortlist: on the Pick signal it is the current leader and MiniMax-01 is No. 2. Per the pricing cards, output pricing starts at $0.22 per 1M tokens for Llama 4 Scout 17B-16E Instruct and at $1.10 per 1M tokens for MiniMax-01.
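
To see what the rate difference means for a concrete job, the following sketch estimates cost from the listed input and output rates for the two models. The rates are the ones shown on this page; the `job_cost` helper and the token counts are hypothetical, and real bills depend on the provider, tokenizer, and current rate card.

```python
from decimal import Decimal

# (input, output) rates in USD per 1M tokens, as listed on this page.
RATES = {
    "Llama 4 Scout 17B-16E Instruct": (Decimal("0.08"), Decimal("0.22")),
    "MiniMax-01": (Decimal("0.20"), Decimal("1.10")),
}


def job_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one job from per-1M-token rates."""
    input_rate, output_rate = RATES[model]
    total = input_rate * input_tokens + output_rate * output_tokens
    return total / Decimal(1_000_000)


# Hypothetical workload: 2M source tokens in, 2M translated tokens out.
for name in RATES:
    print(name, job_cost(name, 2_000_000, 2_000_000))
```

For this assumed workload the estimate comes to $0.60 on Llama 4 Scout 17B-16E Instruct and $2.60 on MiniMax-01. `Decimal` is used instead of floats so that small per-token rates do not pick up rounding noise.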

How does LLMReference rank LLMs for translation?

LLMReference ranks LLMs for translation from stored model, benchmark, freshness, and pricing data. The current methodology summary is: Translation picks are methodology-forward until dedicated translation benchmark rows land in seed data. We surface tagged translation models, Qwen-MT specialists, and long-context chat models teams commonly route for multilingual work.

How often is this list updated?

The LLM rankings on this page are updated daily as new benchmark scores, provider availability, and pricing data are tracked. The "Last refreshed" date at the top of the page shows the most recent refresh.

How do you decide which models appear in the top 3?

The podium picks are driven by the primary benchmark signal for this category (shown in the Methodology section), filtered to non-deprecated models with confirmed API availability. In ties, we prefer the more recently released model.

Are preview or beta models included?

Preview models appear in the "Watch list" section but are not in the main ranked podium unless the category explicitly allows it (e.g., /best/coding and /best/agents, where preview models often lead benchmarks).

Can I compare two specific models head-to-head?

Yes — use the Compare tool at llmreference.com/compare for a side-by-side breakdown of context window, pricing, benchmarks, and provider availability.

Is the pricing data real-time?

Pricing is tracked from provider documentation and updated regularly. It reflects the best available public data, not live API quotes — always verify before billing.