LLM Reference

GPT Realtime 2

gpt-realtime-2

Proprietary, Multimodal

About

GPT Realtime 2 is OpenAI's second-generation real-time voice model, released May 7, 2026. It is a GPT-5-class speech-to-speech model for voice agents with five reasoning intensity levels, parallel tool calls, spoken preambles, and recovery behavior on failed tasks. The model supports audio and text interaction through the Realtime API with a 128K token context window. Audio token pricing is $32 per 1M input tokens, $0.40 per 1M cached input tokens, and $64 per 1M output tokens.

GPT Realtime 2 has a 128K-token context window.

GPT Realtime 2 input tokens cost $32 per 1M, and output tokens cost $64 per 1M.
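The per-token rates above can be turned into a rough per-session estimate. The following is a minimal sketch rather than an official billing formula: the function name `estimate_audio_cost` is hypothetical, and it assumes that cached input tokens are billed at the cached rate instead of the regular input rate, so `input_tokens` counts uncached tokens only.

```python
# Audio token prices in USD per 1M tokens, as listed on this page.
PRICE_PER_1M = {
    "input": 32.00,
    "cached_input": 0.40,
    "output": 64.00,
}


def estimate_audio_cost(input_tokens: int, cached_input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a session from its audio token counts.

    Assumption (not confirmed by the source): cached tokens are charged
    only at the cached rate, and input_tokens excludes them.
    """
    if min(input_tokens, cached_input_tokens, output_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * PRICE_PER_1M["input"]
        + cached_input_tokens * PRICE_PER_1M["cached_input"]
        + output_tokens * PRICE_PER_1M["output"]
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Example: 10,000 uncached input tokens and 2,000 output tokens.
    print(f"${estimate_audio_cost(10_000, 0, 2_000):.3f}")
```

Under these assumptions, 10,000 uncached input tokens cost $0.32 and 2,000 output tokens cost $0.128, for a total of about $0.448.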

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution, Prompt Caching, Batch API, Audio, Fine-tuning

Providers (1)

Provider | Input (per 1M) | Output (per 1M) | Type
OpenAI API | $32 | $64 | Serverless

API Versions

gpt-realtime-2

Specifications

Released: 2026-05-07
Context: 131K
Architecture: Decoder Only
Specialization: general
License: Proprietary
Training: pretrained

Created by OpenAI

Cutting-edge research and development.

San Francisco, California, United States
Founded 2015
