GPT Realtime 2 Models by OpenAI
OpenAI · Proprietary
3 models · 2026 · Up to 131k context · From $32/1M input tokens
Details
Researcher: OpenAI
License: Proprietary
Commercial use: Conditional
Models: 3
Released: 2026
Max context: 131k
Capabilities
Multimodal: 2 of 3 models
Reasoning: 1 of 3 models
Function Calling: 1 of 3 models
Tool Use: 1 of 3 models
About
GPT Realtime 2 is OpenAI's second-generation real-time voice model family, released May 7, 2026. The family includes gpt-realtime-2 for voice reasoning, gpt-realtime-translate for live speech-to-speech translation, and gpt-realtime-whisper for streaming speech-to-text transcription. Each model uses its own Realtime API endpoint, and the family supersedes the GPT-4o Realtime Preview generation for voice-agent workflows.
Current Variants
Use-when guidance is based on each model's tracked capabilities, context window, release date, and replacement status.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| GPT Realtime 2 | Use when the workload needs realtime voice, 131k context, and reasoning. | 2026-05 | realtime voice, 131k context, reasoning | Current |
| GPT Realtime Translate | Use when the workload needs translation, multimodal inputs, and audio. | 2026-05 | translation, multimodal inputs, audio | Current |
| GPT Realtime Whisper | Use when the workload needs speech recognition and audio. | 2026-05 | speech recognition, audio | Current |
Release Timeline
2026-05 (1 release group, 3 current)
- GPT Realtime 2: Current; realtime voice, 131k context, reasoning
- GPT Realtime Translate: Current; translation, multimodal inputs, audio
- GPT Realtime Whisper: Current; speech recognition, audio
Specifications (3 models)
| Model | Released | Context | Multimodal | Reasoning | Fn Calling | Tool Use |
|---|---|---|---|---|---|---|
| GPT Realtime 2 | 2026-05 | 131k | Yes | Yes | Yes | Yes |
| GPT Realtime Translate | 2026-05 | — | Yes | No | No | No |
| GPT Realtime Whisper | 2026-05 | — | No | No | No | No |
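The capability flags in the table above can be encoded as data and filtered by requirement. The following is a minimal sketch, not an official client: the `Variant` class and `find_variants` helper are hypothetical names introduced for illustration, and the values are copied from the table (context is kept as the listed string because the exact token count is not stated).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Variant:
    """One row of the specifications table (hypothetical helper type)."""
    name: str
    context: Optional[str]  # listed context window; None where the table has no value
    multimodal: bool
    reasoning: bool
    function_calling: bool
    tool_use: bool


# Values transcribed from the specifications table.
VARIANTS = [
    Variant("GPT Realtime 2", "131k", True, True, True, True),
    Variant("GPT Realtime Translate", None, True, False, False, False),
    Variant("GPT Realtime Whisper", None, False, False, False, False),
]


def find_variants(**required: bool) -> List[str]:
    """Return names of variants whose flags match every keyword given."""
    return [
        v.name
        for v in VARIANTS
        if all(getattr(v, flag) == wanted for flag, wanted in required.items())
    ]


if __name__ == "__main__":
    print(find_variants(multimodal=True))
    print(find_variants(reasoning=True, tool_use=True))
```

Passing `multimodal=True` returns the two multimodal variants, while requiring reasoning and tool use together narrows the result to GPT Realtime 2 alone.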
Available From (1 provider)
Pricing
| Model | Provider | Input / 1M tokens | Output / 1M tokens | Type |
|---|---|---|---|---|
| GPT Realtime 2 | OpenAI API | $32 | $64 | Serverless |
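To make the per-million pricing concrete, here is a small sketch that estimates the cost of a request from the listed rates. The `estimate_cost_usd` function and the token counts in the example are hypothetical. The table does not say whether audio and text tokens are billed at different rates, so this assumes a single flat rate for input and for output.

```python
# Listed prices from the pricing table, in USD per 1M tokens.
# Assumption: one flat rate per direction; any audio/text split is not modeled.
PRICE_PER_MILLION_USD = {
    "gpt-realtime-2": {"input": 32.0, "output": 64.0},
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD for one request at the listed rates."""
    prices = PRICE_PER_MILLION_USD[model]
    total = input_tokens * prices["input"] + output_tokens * prices["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical session: 50,000 input tokens and 10,000 output tokens.
    cost = estimate_cost_usd("gpt-realtime-2", input_tokens=50_000, output_tokens=10_000)
    print(f"${cost:.2f}")  # prints "$2.24"
```

The arithmetic is 50,000 × $32 / 1M = $1.60 for input plus 10,000 × $64 / 1M = $0.64 for output, for a total of $2.24.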
Frequently Asked Questions
- What is GPT Realtime 2 used for?
- The GPT Realtime 2 family is used for realtime voice, live translation, and speech recognition. The family description and the listed model capabilities both indicate that these workloads are the best fit.
- How does GPT Realtime 2 compare to GPT-5.4?
- GPT Realtime 2 is strongest for realtime voice workloads, while GPT-5.4, also by OpenAI, is the closest related family to consider for cybersecurity workloads. GPT Realtime 2 lists 3 variants and up to 131k context; GPT-5.4 reaches up to 1.05M context. Compare the specifications and pricing tables of both families before choosing a production model.
- Which GPT Realtime 2 model should I use?
- GPT Realtime 2 is the only variant with a listed price: $32/1M input tokens through the OpenAI API. It is also the most capable variant, with 131k context, reasoning, tool use, function calling, and multimodal inputs. Consider GPT Realtime Translate for live speech-to-speech translation and GPT Realtime Whisper for streaming speech-to-text transcription.






