GPT Realtime 2
gpt-realtime-2
Proprietary, Multimodal
About
GPT Realtime 2 is OpenAI's second-generation real-time voice model, released May 7, 2026. It is a GPT-5-class speech-to-speech model for voice agents with five reasoning intensity levels, parallel tool calls, spoken preambles, and recovery behavior on failed tasks. The model supports audio and text interaction through the Realtime API with a 128K token context window. Audio token pricing is $32 per 1M input tokens, $0.40 per 1M cached input tokens, and $64 per 1M output tokens.
Capabilities
Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution, Prompt Caching, Batch API, Audio, Fine-tuning
Providers (1)
| Provider | Input (USD per 1M tokens) | Output (USD per 1M tokens) | Type |
|---|---|---|---|
| OpenAI API | $32 | $64 | Serverless |
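The listed rates can be turned into a rough per-session cost estimate. The sketch below is illustrative only: the function name `estimate_audio_cost` is hypothetical, the rates are the audio token prices quoted on this page ($32 input, $0.40 cached input, $64 output per 1M tokens), and it assumes that cached input tokens are a subset of the input tokens billed at the cached rate instead of the standard rate. Actual billing may differ.

```python
from decimal import Decimal

# Audio token rates in USD per 1M tokens, as quoted on this page.
INPUT_PER_M = Decimal("32")
CACHED_INPUT_PER_M = Decimal("0.40")
OUTPUT_PER_M = Decimal("64")
MILLION = Decimal(1_000_000)


def estimate_audio_cost(input_tokens, output_tokens, cached_input_tokens=0):
    """Estimate the USD cost of a session from audio token counts.

    Assumption (not confirmed by the page): cached_input_tokens is the
    portion of input_tokens served from the cache, billed at the cached
    rate instead of the standard input rate.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached tokens cannot exceed total input tokens")

    uncached_tokens = input_tokens - cached_input_tokens
    total = (
        uncached_tokens * INPUT_PER_M
        + cached_input_tokens * CACHED_INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / MILLION


# Example: 10,000 input tokens (4,000 of them cached) and 2,000 output tokens.
cost = estimate_audio_cost(10_000, 2_000, cached_input_tokens=4_000)
print(f"Estimated cost: ${cost}")
```

`Decimal` is used rather than floating point so that small per-token amounts add up without rounding drift.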
API Versions
gpt-realtime-2
Specifications
Family: GPT Realtime 2
Released: 2026-05-07
Context: 131K tokens
Architecture: Decoder Only
Specialization: General
License: Proprietary
Training: Pretrained
Created by OpenAI
Cutting-edge research and development.
San Francisco, California, United States
Founded 2015