LLM Reference

GPT Realtime Models by OpenAI

OpenAI · Proprietary
3 models · 2025–2026 · Up to 32K context · From $0.6/1M input tokens

About

OpenAI realtime voice models for text, audio, and image inputs with text and audio outputs over the Realtime API.

Specifications (3 models)

GPT Realtime model specifications comparison
Model | Released | Context | Vision | Multimodal | Fn Calling | Tool Use
gpt-realtime-1.5 | 2026-05 | 32K | Yes | Yes | Yes | Yes
gpt-realtime | 2025-10 | 32K | No | Yes | No | No
gpt-realtime-mini | 2025-10 | 32K | No | Yes | No | No

Available From (1 provider)

Pricing

GPT Realtime model pricing by provider
Model | Provider | Input / 1M tokens | Output / 1M tokens | Type
gpt-realtime-mini | OpenAI API | $0.6 | $2.4 | Serverless
gpt-realtime-1.5 | OpenAI API | $4 | $16 | Serverless
gpt-realtime | OpenAI API | $4 | $16 | Serverless
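
The listed per-token prices can be turned into a rough cost estimate. The sketch below is illustrative only: the `PRICES` dictionary and the `estimate_cost` helper are hypothetical names, the figures are copied from the pricing table, and it assumes one flat rate per direction, which may not match how a provider actually bills different token types.

```python
# Prices in USD per 1M tokens, copied from the pricing table above.
# Assumption: a single flat input rate and output rate per model.
PRICES = {
    "gpt-realtime-mini": {"input": 0.6, "output": 2.4},
    "gpt-realtime-1.5": {"input": 4.0, "output": 16.0},
    "gpt-realtime": {"input": 4.0, "output": 16.0},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one workload (hypothetical helper)."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Compare the same workload across all three listed models.
    for name in PRICES:
        cost = estimate_cost(name, input_tokens=50_000, output_tokens=10_000)
        print(f"{name}: ${cost:.4f}")
```

For a workload of 50,000 input tokens and 10,000 output tokens, this works out to roughly $0.054 on gpt-realtime-mini and $0.36 on either of the other two models.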

Frequently Asked Questions

What is GPT Realtime used for?
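GPT Realtime is used for realtime voice applications that accept text, audio, and image input and return text or audio over the Realtime API. All three models are listed as multimodal, but only gpt-realtime-1.5 is listed with vision, function calling, and tool use, so agent and tool-driven workflows point to that model.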
How does GPT Realtime compare to GPT Realtime 2?
GPT Realtime by OpenAI is strongest where you need vision and multimodal work. GPT Realtime 2, also by OpenAI, is the closest related family and the one to check for translation. GPT Realtime has 3 listed variants and reaches up to 32K context, while GPT Realtime 2 reaches up to 131K context. Compare the specifications and pricing tables of both families before choosing a production model.
Which GPT Realtime model should I use?
For the lowest listed input price, start with gpt-realtime-mini through the OpenAI API at $0.6/1M input tokens. For the most capable and most recent model in the family, evaluate gpt-realtime-1.5, which has 32K context and is the only listed model with vision, function calling, and tool use.
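
The selection advice above amounts to filtering the listed models by required capability and then taking the cheapest match. A minimal sketch of that rule follows, assuming the values in the specifications and pricing tables; the `Model` class and `cheapest` function are hypothetical names introduced for illustration, not part of any OpenAI SDK.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    released: str       # YYYY-MM, from the specifications table
    vision: bool
    tool_use: bool      # listed together with function calling
    input_price: float  # USD per 1M input tokens, from the pricing table


# Values copied from the tables on this page.
MODELS = [
    Model("gpt-realtime-1.5", "2026-05", vision=True, tool_use=True, input_price=4.0),
    Model("gpt-realtime", "2025-10", vision=False, tool_use=False, input_price=4.0),
    Model("gpt-realtime-mini", "2025-10", vision=False, tool_use=False, input_price=0.6),
]


def cheapest(models: list[Model], need_vision: bool = False,
             need_tool_use: bool = False) -> Optional[Model]:
    """Return the lowest-priced model that meets the requirements, or None."""
    candidates = [
        m for m in models
        if (m.vision or not need_vision) and (m.tool_use or not need_tool_use)
    ]
    return min(candidates, key=lambda m: m.input_price) if candidates else None


if __name__ == "__main__":
    print(cheapest(MODELS).name)
    print(cheapest(MODELS, need_tool_use=True).name)
```

With no requirements the rule picks gpt-realtime-mini; once tool use or vision is required, gpt-realtime-1.5 is the only listed model that qualifies.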

Models (3)