GPT Realtime 2 Models by OpenAI
OpenAI · Proprietary
3 models · 2026 · Up to 131K context · From $32/1M input
About
GPT Realtime 2 is OpenAI's second-generation real-time voice model family, released May 7, 2026. The family includes gpt-realtime-2 for voice reasoning, gpt-realtime-translate for live speech-to-speech translation, and gpt-realtime-whisper for streaming speech-to-text transcription. Each model uses its own Realtime API endpoint, and the family supersedes the GPT-4o Realtime Preview generation for voice-agent workflows.
Specifications (3 models)
| Model | Released | Context (tokens) | Multimodal | Reasoning | Function Calling | Tool Use |
|---|---|---|---|---|---|---|
| GPT Realtime 2 | 2026-05 | 131K | Yes | Yes | Yes | Yes |
| GPT Realtime Translate | 2026-05 | — | Yes | No | No | No |
| GPT Realtime Whisper | 2026-05 | — | No | No | No | No |
Available From (1 provider)
Pricing
| Model | Provider | Input / 1M tokens | Output / 1M tokens | Type |
|---|---|---|---|---|
| GPT Realtime 2 | OpenAI API | $32 | $64 | Serverless |
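The per-million rates above translate into a per-workload estimate with simple arithmetic. The sketch below is a minimal illustration, not an official billing formula: it assumes every token is charged at the single listed input or output rate (the table does not break out audio versus text tokens), and the function name and the sample workload are hypothetical.

```python
# Prices per 1M tokens, copied from the pricing table above.
INPUT_USD_PER_MILLION = 32.0
OUTPUT_USD_PER_MILLION = 64.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the charge in USD for a workload (hypothetical helper).

    Assumes all tokens are billed at the flat listed rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


# Hypothetical workload: 250,000 input tokens and 100,000 output tokens.
print(f"${estimate_cost_usd(250_000, 100_000):.2f}")
```

Because output tokens are listed at twice the input rate, workloads that generate long responses will cost noticeably more than their input volume alone suggests.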
Frequently Asked Questions
- What is GPT Realtime 2 used for?
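- GPT Realtime 2 is used for real-time voice workloads: voice reasoning for voice agents (gpt-realtime-2), live speech-to-speech translation (gpt-realtime-translate), and streaming speech-to-text transcription (gpt-realtime-whisper). The family description and the listed model capabilities point to those workloads as the best fit.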
- How does GPT Realtime 2 compare to GPT-5.4?
- GPT Realtime 2 by OpenAI is strongest for real-time voice work such as translation, while GPT-5.4, also by OpenAI, is the closest related family and the one to check for cybersecurity. GPT Realtime 2 has 3 listed variants and reaches up to 131K context; GPT-5.4 reaches up to 1.1M context. Compare the specs and pricing tables before choosing a production model.
- Which GPT Realtime 2 model should I use?
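- GPT Realtime 2 through OpenAI API, at $32/1M input tokens, is the only variant with a listed price, so it is the default starting point on cost. It is also the most capable listed option, with 131K context and reasoning, tool use, function calling, and multimodal inputs. For live speech-to-speech translation or streaming transcription only, evaluate GPT Realtime Translate or GPT Realtime Whisper instead.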


