LLM Reference

The transcription leaderboard · for creatives

Best for transcription

4 editor picks · 6 eligible models · Speech to text, low WER on the messy stuff.

EDITOR'S CHOICE · Research date unavailable

Whisper large-v3-turbo

OpenAI
Excellent

Low WER on messy, real-world audio.

Lowest WER on noisy real-world audio with the broadest language coverage; cheap to self-host.

The numbers
Pricing: see model page
Context: stt
Pros
  • Best accuracy on noisy audio
  • 98+ languages
  • Open weights
Cons
  • Diarization needs add-ons

Also worth picking

The runners-up

ranked by editorial pick order
Editorial tiers: Excellent · Strong · Solid

# · Model (provider) · Editor's note
#2 · Nova-3 (Deepgram) · Cheapest credible STT with the fastest streaming latency in production.
#3 · AssemblyAI (AssemblyAI) · Best speaker diarization and PII redaction out of the box.
#4 · Flux (Deepgram) · Deepgram's conversational ASR tuned for low-latency voice agents.

Eligibility

6 models are eligible for this board

A model is eligible when it is tagged with useCases: [stt]. Pins must come from this pool.
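
The eligibility rule can be sketched in a few lines of Python. This is a minimal illustration, not the site's actual implementation: the catalog shape, the function names eligible_pool and invalid_pins, and the entry "example-chat-model" are all hypothetical. Only the useCases field and the stt tag come from the rule as stated here, and the sketch treats "pins" as the editor picks, which is an assumption.

```python
# Hypothetical catalog: the shape and the last entry are invented for illustration.
# Only the "useCases" field name and the "stt" tag are taken from the page.
CATALOG = [
    {"name": "Whisper large-v3-turbo", "useCases": ["stt"]},
    {"name": "Nova-3", "useCases": ["stt"]},
    {"name": "AssemblyAI", "useCases": ["stt"]},
    {"name": "Flux", "useCases": ["stt"]},
    {"name": "example-chat-model", "useCases": ["chat"]},  # hypothetical, not eligible
]


def eligible_pool(catalog, use_case="stt"):
    """Return the models tagged with the given use case."""
    return [m for m in catalog if use_case in m.get("useCases", [])]


def invalid_pins(pins, pool):
    """Return the pinned names that are NOT in the eligible pool."""
    pool_names = {m["name"] for m in pool}
    return [name for name in pins if name not in pool_names]


if __name__ == "__main__":
    pool = eligible_pool(CATALOG)
    print("eligible:", [m["name"] for m in pool])
    # A pin outside the pool is reported rather than silently accepted.
    print("rejected pins:", invalid_pins(["Nova-3", "example-chat-model"], pool))
```

Checking pins against the pool, rather than against the whole catalog, is what keeps a model without the stt tag from appearing on this board.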
