LLM Reference

GPT Realtime Whisper

gpt-realtime-whisper

Proprietary

About

GPT Realtime Whisper is OpenAI's streaming speech-to-text model, released May 7, 2026. It transcribes audio incrementally while the speaker is still talking instead of waiting for an utterance to finish, which makes it suitable for live captions, meeting notes, classroom transcripts, and real-time agent pipelines. The model is exposed through the /v1/realtime/transcription_sessions endpoint and is priced at $0.017 per minute rather than per token.
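Because pricing is per minute rather than per token, a rough cost estimate depends only on audio duration. The following is a minimal sketch of that arithmetic using the $0.017 rate quoted above. The function name `estimate_cost_usd` is hypothetical, and the sketch assumes billing scales linearly with duration, since the source does not state how partial minutes are rounded.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-minute rate quoted in the description above (USD).
PRICE_PER_MINUTE_USD = Decimal("0.017")


def estimate_cost_usd(audio_seconds: float) -> Decimal:
    """Estimate transcription cost for a given audio duration.

    Assumption: cost is linear in duration, with no rounding up to whole
    minutes. The actual billing granularity is not stated in the source.
    """
    if audio_seconds < 0:
        raise ValueError("audio_seconds must be non-negative")
    minutes = Decimal(str(audio_seconds)) / Decimal(60)
    cost = minutes * PRICE_PER_MINUTE_USD
    # Round to a hundredth of a cent for display.
    return cost.quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    examples = [
        ("30-minute meeting", 30 * 60),
        ("8-hour captioning day", 8 * 3600),
    ]
    for label, seconds in examples:
        print(f"{label}: ${estimate_cost_usd(seconds)}")
```

Under these assumptions, a 30-minute meeting comes to $0.51 and eight hours of continuous captioning to $8.16.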

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution, Prompt Caching, Batch API, Audio, Fine-tuning

Providers (1)

Provider | Input (per 1M) | Output (per 1M) | Type
OpenAI API | - | - | Serverless

API Versions

gpt-realtime-whisper

Rankings

Specifications

Released: 2026-05-07
Architecture: transformer
Specialization: transcription
License: Proprietary
Training: pretrained

Created by

OpenAI

Cutting-edge research and development.

San Francisco, California, United States
Founded 2015
