OpenAI Whisper Models by OpenAI

OpenAI, Proprietary, Audio

4 models, 2022–2024

Details

Researcher: OpenAI
License: Proprietary
Commercial use: conditional
Models: 4
Released: 2022–2024

Capabilities

Multimodal: all models

About

OpenAI's Whisper family provides multilingual automatic speech recognition and translation models exposed through the OpenAI Audio API.
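As a rough illustration of what calling the Audio API for transcription or translation might look like, here is a minimal sketch using the official `openai` Python package. The model id `whisper-1`, the list of accepted file extensions, and the 25 MB upload limit are assumptions taken from OpenAI's public API documentation rather than from this page, and the helper names `check_audio_file` and `transcribe` are hypothetical.

```python
from pathlib import Path

# Assumption: accepted formats and upload limit as described in OpenAI's
# Audio API documentation; verify against the current docs before relying on them.
SUPPORTED_SUFFIXES = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def check_audio_file(path: str, size_bytes: int) -> list[str]:
    """Return a list of problems that would likely make the upload fail."""
    problems = []
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_SUFFIXES:
        problems.append(f"unsupported extension: {suffix or '(none)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file exceeds the 25 MB upload limit")
    return problems


def transcribe(path: str, translate_to_english: bool = False) -> str:
    """Send one audio file to the Audio API and return the recognized text."""
    # Imported lazily so the validation helper works without the package.
    from openai import OpenAI  # requires `pip install openai`

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    # Transcription keeps the spoken language; translation outputs English.
    endpoint = (
        client.audio.translations if translate_to_english
        else client.audio.transcriptions
    )
    with open(path, "rb") as audio:
        # Assumption: "whisper-1" is the hosted model id for this family.
        result = endpoint.create(model="whisper-1", file=audio)
    return result.text
```

Running `check_audio_file` before the upload avoids a network round trip for files the service would likely reject anyway.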

Current Variants

Use-when guidance is based on each model's tracked capabilities, context window, release date, and replacement status.

Current OpenAI Whisper variants with use-when guidance and lifecycle status
Model	Use when	Released	Signals	Status
Whisper large-v3-turbo	Use when the workload needs speech recognition on audio input and the most recent listed release.	2024-01	speech recognition, multimodal inputs, audio	Current
Whisper Base	Use when the workload needs speech recognition in a small 74M-parameter model.	2022-12	speech recognition, 74M parameters, multimodal inputs	Current
Whisper Tiny EN	Use when the workload needs English speech recognition in the smallest listed model (39M parameters).	2022-12	speech recognition, 39M parameters, multimodal inputs	Current
Whisper	Use when the workload needs general speech recognition on audio input.	2022-09	speech recognition, multimodal inputs, audio	Current

Release Timeline

3 release groups

2024-01 (1 current): Whisper large-v3-turbo
2022-12 (2 current): Whisper Base, Whisper Tiny EN
2022-09 (1 current): Whisper

Specifications (4 models)

OpenAI Whisper model specifications comparison
Model	Released	Parameters	Multimodal
Whisper large-v3-turbo	2024-01	—	Yes
Whisper Base	2022-12	74M	Yes
Whisper Tiny EN	2022-12	39M	Yes
Whisper	2022-09	—	Yes
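One way to turn the specification table into a selection rule is sketched below. The catalog entries copy the table as listed; the `english_only` flag is an assumption inferred from the "EN" suffix, and the names `Variant`, `CATALOG`, and `pick_variant` are hypothetical helpers, not part of any OpenAI tooling.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    name: str
    released: str                   # YYYY-MM, as listed in the table
    params_millions: Optional[int]  # None where the table lists no value
    english_only: bool              # assumption: inferred from the "EN" suffix


CATALOG = [
    Variant("Whisper large-v3-turbo", "2024-01", None, False),
    Variant("Whisper Base", "2022-12", 74, False),
    Variant("Whisper Tiny EN", "2022-12", 39, True),
    Variant("Whisper", "2022-09", None, False),
]


def pick_variant(max_params_millions: Optional[int] = None,
                 needs_multilingual: bool = True) -> Variant:
    """Pick a listed variant from a size budget and a language requirement."""
    candidates = [
        v for v in CATALOG if not (needs_multilingual and v.english_only)
    ]
    if max_params_millions is None:
        # No size budget: prefer the most recent release (YYYY-MM sorts as text).
        return max(candidates, key=lambda v: v.released)
    # With a budget, only variants with a listed parameter count can qualify.
    sized = [
        v for v in candidates
        if v.params_millions is not None
        and v.params_millions <= max_params_millions
    ]
    if not sized:
        raise ValueError("no listed variant fits the parameter budget")
    # Take the largest model that still fits the budget.
    return max(sized, key=lambda v: v.params_millions)
```

Variants without a listed parameter count are never returned under a size budget, since the table gives no basis for judging whether they fit.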

Frequently Asked Questions

What is OpenAI Whisper used for?

OpenAI Whisper is used for multilingual speech recognition and speech translation on audio input. The family description and the listed model capabilities point to those workloads as the best fit.

How does OpenAI Whisper compare to GPT Realtime 2?

Whisper is strongest for transcribing audio, while GPT Realtime 2, also from OpenAI, is the closest related family to check for realtime voice. Whisper has 4 listed variants and GPT Realtime 2 lists a context window of up to 131k, so compare the specs and pricing tables before choosing a production model.

Which OpenAI Whisper model should I use?

If price is the main constraint, start with the pricing table, but note that the local data does not include complete provider pricing for OpenAI Whisper. For the most recent listed choice, evaluate Whisper large-v3-turbo.

Models (4)

Whisper large-v3-turbo: 2024-01, Multimodal, Open Source
Whisper Base: 2022-12, 74M, Multimodal, Open Source
Whisper Tiny EN: 2022-12, 39M, Multimodal, Open Source
Whisper: 2022-09, Multimodal, Open Source
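Since the models above are tagged Open Source, they can presumably also be run locally. The sketch below assumes the open-source `openai-whisper` Python package (which also needs `ffmpeg` installed) and its checkpoint names. The mapping from the names on this page to checkpoint names is an assumption, and the generic "Whisper" entry is left unmapped because the page does not say which checkpoint it refers to. `checkpoint_for` and `transcribe_locally` are hypothetical helper names.

```python
# Assumption: checkpoint names as used by the open-source `openai-whisper`
# package; the mapping from this page's model names is a guess by name.
CHECKPOINTS = {
    "Whisper large-v3-turbo": "large-v3-turbo",
    "Whisper Base": "base",
    "Whisper Tiny EN": "tiny.en",
}


def checkpoint_for(catalog_name: str) -> str:
    """Translate a name from the list above into a local checkpoint name."""
    try:
        return CHECKPOINTS[catalog_name]
    except KeyError:
        raise ValueError(f"no known checkpoint for {catalog_name!r}") from None


def transcribe_locally(catalog_name: str, audio_path: str) -> str:
    """Load the checkpoint and transcribe one audio file on this machine."""
    # Imported lazily so the mapping helper works without the package.
    import whisper  # requires `pip install openai-whisper` and ffmpeg

    model = whisper.load_model(checkpoint_for(catalog_name))
    result = model.transcribe(audio_path)
    return result["text"]
```

The first call to `whisper.load_model` downloads the checkpoint, so the smaller variants are the quicker ones to try.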
