OpenAI Whisper Models by OpenAI
Details
Researcher: OpenAI
License: Proprietary
Commercial use: Allowed with conditions
Models: 4
Released: 2022–2024
Capabilities
Multimodal: All models
OpenAI's Whisper family provides multilingual automatic speech recognition and translation models exposed through the OpenAI Audio API.
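As a rough illustration of what calling the Audio API can look like, here is a minimal sketch using the official `openai` Python package. The helper name `transcribe` and its parameters are hypothetical, and the model identifier `whisper-1` is an assumption about the API-side name; the variant names listed on this page are catalog labels and may not be valid API model IDs.

```python
from pathlib import Path


def transcribe(client, audio_path, model="whisper-1", language=None):
    """Send one audio file to the transcription endpoint and return its text.

    `client` is expected to behave like `openai.OpenAI()`. The default model
    ID "whisper-1" is an assumption, not something stated on this page.
    """
    kwargs = {"model": model}
    if language:
        # Optional ISO-639-1 hint such as "en"; omit it to let the model detect.
        kwargs["language"] = language
    with Path(audio_path).open("rb") as audio_file:
        result = client.audio.transcriptions.create(file=audio_file, **kwargs)
    return result.text


# Example usage (requires `pip install openai` and OPENAI_API_KEY to be set):
#   from openai import OpenAI
#   print(transcribe(OpenAI(), "meeting.mp3"))
```

Passing the client in as an argument keeps the helper easy to exercise with a stub, and avoids hard-coding credentials in the example.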
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| Whisper large-v3-turbo | Use when the workload needs speech recognition on audio with multimodal inputs. | 2024-01 | speech recognition, multimodal inputs, audio | Current |
| Whisper Base | Use when the workload needs speech recognition and multimodal inputs in a 74M-parameter model. | 2022-12 | speech recognition, 74M parameters, multimodal inputs | Current |
| Whisper Tiny EN | Use when the workload needs speech recognition and multimodal inputs in a 39M-parameter model. | 2022-12 | speech recognition, 39M parameters, multimodal inputs | Current |
| Whisper | Use when the workload needs speech recognition on audio with multimodal inputs. | 2022-09 | speech recognition, multimodal inputs, audio | Current |
Release Timeline
3 release groups
- 2024-01 (1 current)
  - Whisper large-v3-turbo: Current; speech recognition, multimodal inputs, audio
- 2022-12 (2 current)
  - Whisper Base: Current; speech recognition, 74M parameters, multimodal inputs
  - Whisper Tiny EN: Current; speech recognition, 39M parameters, multimodal inputs
- 2022-09 (1 current)
  - Whisper: Current; speech recognition, multimodal inputs, audio
Specifications (4 models)
| Model | Released | Parameters | Multimodal |
|---|---|---|---|
| Whisper large-v3-turbo | 2024-01 | — | Yes |
| Whisper Base | 2022-12 | 74M | Yes |
| Whisper Tiny EN | 2022-12 | 39M | Yes |
| Whisper | 2022-09 | — | Yes |
Frequently Asked Questions
- What is OpenAI Whisper used for?
- OpenAI Whisper is used for audio work: multilingual automatic speech recognition and speech translation. The family description and the listed model capabilities point to those workloads as the best fit.
- How does OpenAI Whisper compare to GPT Realtime 2?
- OpenAI Whisper is strongest where you need audio transcription and translation, while GPT Realtime 2, also by OpenAI, is the closest related family to check for realtime voice. OpenAI Whisper has 4 listed variants, while GPT Realtime 2 reaches up to 131k context, so compare the specs and pricing tables before choosing a production model.
- Which OpenAI Whisper model should I use?
- If price is the main constraint, start with the pricing table, but note that the local data does not include complete provider pricing for OpenAI Whisper. For the most capable and most recent listed choice, evaluate Whisper large-v3-turbo.
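
The selection logic in the last answer can be sketched as a small lookup over the specifications listed on this page. This is only an illustration: the `Variant` class and the helper names `latest` and `smallest_known` are hypothetical, and the data is copied from the Specifications table, where parameter counts are missing for two variants.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    name: str
    released: str                    # "YYYY-MM", as listed on this page
    params_millions: Optional[int]   # None where the page lists no count


# Values copied from the Specifications table.
VARIANTS = [
    Variant("Whisper large-v3-turbo", "2024-01", None),
    Variant("Whisper Base", "2022-12", 74),
    Variant("Whisper Tiny EN", "2022-12", 39),
    Variant("Whisper", "2022-09", None),
]


def latest(variants):
    """Return the most recently released variant.

    "YYYY-MM" strings sort chronologically, so a plain string max works.
    """
    return max(variants, key=lambda v: v.released)


def smallest_known(variants):
    """Return the variant with the fewest listed parameters, or None."""
    known = [v for v in variants if v.params_millions is not None]
    return min(known, key=lambda v: v.params_millions) if known else None


print(latest(VARIANTS).name)
print(smallest_known(VARIANTS).name)
```

Release date is only a proxy for capability here; the page gives no accuracy or latency figures, so any real choice should be checked against the workload.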





