LLM Reference

GPT Realtime 2 Models by OpenAI

OpenAI | Proprietary
3 models | 2026 | Up to 131k context | From $32/1M input

Details

Researcher: OpenAI
License: Proprietary
Commercial use: conditional
Models: 3
Released: 2026
Max context: 131k

Capabilities

Multimodal: 2 of 3 models
Reasoning: 1 of 3 models
Function Calling: 1 of 3 models
Tool Use: 1 of 3 models

About

GPT Realtime 2 is OpenAI's second-generation real-time voice model family, released May 7, 2026. The family includes gpt-realtime-2 for voice reasoning, gpt-realtime-translate for live speech-to-speech translation, and gpt-realtime-whisper for streaming speech-to-text transcription. Each model uses its own Realtime API endpoint, and the family supersedes the GPT-4o Realtime Preview generation for voice-agent workflows.

Current Variants

Use-when guidance is based on each model's tracked capabilities, context window, release date, and replacement status.

GPT Realtime 2 (2026-05)
Use when the workload needs realtime voice, 131k context, and reasoning.
Tags: realtime voice, 131k context, reasoning

GPT Realtime Translate (2026-05)
Use when the workload needs translation, multimodal inputs, and audio.
Tags: translation, multimodal inputs, audio

GPT Realtime Whisper (2026-05)
Use when the workload needs speech recognition and audio.
Tags: speech recognition, audio
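The use-when guidance above amounts to matching workload requirements against each variant's tags. The following is a minimal sketch of that lookup, not an official selection tool. The tag sets are copied from the listing, the pairing of tags with model IDs is inferred from the About text and the release timeline, and `VARIANT_TAGS` and `pick_variants` are hypothetical names introduced only for illustration.

```python
# Tags as listed for each variant; the model-ID pairing is an assumption
# inferred from the family description and the release timeline.
VARIANT_TAGS = {
    "gpt-realtime-2": {"realtime voice", "131k context", "reasoning"},
    "gpt-realtime-translate": {"translation", "multimodal inputs", "audio"},
    "gpt-realtime-whisper": {"speech recognition", "audio"},
}


def pick_variants(required):
    """Return the model IDs whose tags cover every required tag (sorted)."""
    needed = set(required)
    return sorted(
        model for model, tags in VARIANT_TAGS.items() if needed <= tags
    )


if __name__ == "__main__":
    print(pick_variants(["translation"]))
    print(pick_variants(["audio"]))
```

A requirement that no single variant covers (for example, reasoning together with speech recognition) returns an empty list, which signals that the workload may need more than one model.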

Release Timeline

1 release group

2026-05 (3 current)
GPT Realtime 2: realtime voice, 131k context, reasoning (Current)
GPT Realtime Translate: translation, multimodal inputs, audio (Current)
GPT Realtime Whisper: speech recognition, audio (Current)

Specifications (3 models)

GPT Realtime 2 model specifications comparison

Model                  | Released | Context | Multimodal | Reasoning | Fn Calling | Tool Use
GPT Realtime 2         | 2026-05  | 131k    | Yes        | Yes       | Yes        | Yes
GPT Realtime Translate | 2026-05  | -       | Yes        | No        | No         | No
GPT Realtime Whisper   | 2026-05  | -       | No         | No        | No         | No
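For readers who want to query the comparison programmatically, the table can be held as plain data. This is a sketch under stated assumptions: the values are transcribed from the table above, a missing context figure is stored as `None` rather than guessed, and `ModelSpec`, `SPECS`, and `count_with` are hypothetical names, not part of any OpenAI SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSpec:
    name: str
    released: str
    context: Optional[str]  # None where the table lists no context window
    multimodal: bool
    reasoning: bool
    fn_calling: bool
    tool_use: bool


# Rows transcribed from the specifications table.
SPECS = [
    ModelSpec("GPT Realtime 2", "2026-05", "131k", True, True, True, True),
    ModelSpec("GPT Realtime Translate", "2026-05", None, True, False, False, False),
    ModelSpec("GPT Realtime Whisper", "2026-05", None, False, False, False, False),
]


def count_with(capability: str) -> int:
    """Count the models for which the named boolean capability is True."""
    return sum(1 for spec in SPECS if getattr(spec, capability))


if __name__ == "__main__":
    for cap in ("multimodal", "reasoning", "fn_calling", "tool_use"):
        print(f"{cap}: {count_with(cap)} of {len(SPECS)} models")
```

The counts this produces agree with the Capabilities summary: multimodal on 2 of 3 models, and reasoning, function calling, and tool use on 1 of 3 each.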

Available From (1 provider)

Pricing

GPT Realtime 2 model pricing by provider

Model          | Provider   | Input / 1M | Output / 1M | Type
GPT Realtime 2 | OpenAI API | $32        | $64         | Serverless
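The listed rates translate into a per-request estimate with simple arithmetic. The sketch below assumes the rates apply uniformly per one million tokens, as the table states. It does not model any distinction between audio and text tokens, cached input, or other billing details, none of which this page lists. The function name `estimate_cost_usd` is hypothetical.

```python
# Listed GPT Realtime 2 rates via the OpenAI API, in USD per 1M tokens.
INPUT_USD_PER_M = 32.0
OUTPUT_USD_PER_M = 64.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in USD from token counts at the listed rates."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # 50,000 input tokens and 10,000 output tokens:
    # 0.05 * 32 + 0.01 * 64 = 1.60 + 0.64 = 2.24 USD
    print(f"${estimate_cost_usd(50_000, 10_000):.2f}")
```

Because output is priced at twice the input rate, sessions that produce long spoken responses cost more than their input volume alone suggests.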

Frequently Asked Questions

What is GPT Realtime 2 used for?
GPT Realtime 2 is used for realtime voice, translation, and speech recognition. The family description and the listed model capabilities point to those workloads as the best fit.

How does GPT Realtime 2 compare to GPT-5.4?
GPT Realtime 2 by OpenAI is strongest for realtime voice, while GPT-5.4 by OpenAI is the closest related family to check for cybersecurity. GPT Realtime 2 has 3 listed variants and reaches up to 131k context, while GPT-5.4 reaches up to 1.05M context. Compare the specifications and pricing tables before choosing a production model.

Which GPT Realtime 2 model should I use?
GPT Realtime 2 is the only variant with a listed price: $32/1M input tokens through the OpenAI API. It is also the most capable variant, with 131k context, reasoning, tool use, function calling, and multimodal inputs. Choose GPT Realtime Translate for live translation and GPT Realtime Whisper for streaming transcription.