GPT Realtime Models by OpenAI
OpenAI · Proprietary
3 models · 2025–2026 · Up to 32K ctx · From $0.6/1M input
About
OpenAI realtime voice models for text, audio, and image inputs with text and audio outputs over the Realtime API.
Specifications (3 models)
| Model | Released | Context | Vision | Multimodal | Fn Calling | Tool Use |
|---|---|---|---|---|---|---|
| gpt-realtime-1.5 | 2026-05 | 32K | Yes | Yes | Yes | Yes |
| gpt-realtime | 2025-10 | 32K | No | Yes | No | No |
| gpt-realtime-mini | 2025-10 | 32K | No | Yes | No | No |
Available From (1 provider)
Pricing
| Model | Provider | Input / 1M tokens | Output / 1M tokens | Type |
|---|---|---|---|---|
| gpt-realtime-mini | OpenAI API | $0.6 | $2.4 | Serverless |
| gpt-realtime-1.5 | OpenAI API | $4 | $16 | Serverless |
| gpt-realtime | OpenAI API | $4 | $16 | Serverless |
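To turn the listed rates into a per-session figure, the arithmetic can be sketched as below. This is a minimal sketch: the `PRICES` dictionary and the `estimate_cost` helper are hypothetical names introduced here, and the code assumes the single listed rate applies to every input or output token, which the table does not confirm.

```python
# USD per 1M tokens, copied from the pricing table above.
# Assumption: one flat rate per direction, regardless of modality.
PRICES = {
    "gpt-realtime-mini": {"input": 0.6, "output": 2.4},
    "gpt-realtime-1.5": {"input": 4.0, "output": 16.0},
    "gpt-realtime": {"input": 4.0, "output": 16.0},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request or session."""
    rates = PRICES[model]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Example: 20K input tokens and 5K output tokens on each model.
    for name in PRICES:
        print(f"{name}: ${estimate_cost(name, 20_000, 5_000):.4f}")
```

At these rates, the same token counts cost roughly 6 to 7 times more on gpt-realtime or gpt-realtime-1.5 than on gpt-realtime-mini.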
Frequently Asked Questions
- What is GPT Realtime used for?
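- GPT Realtime is used for realtime voice applications that take text, audio, and image inputs and return text or audio over the Realtime API. Per the specifications table, only gpt-realtime-1.5 is listed with vision, function calling, and tool use, so image-based tasks and agent workflows fit that variant; gpt-realtime and gpt-realtime-mini are listed as multimodal only.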
- How does GPT Realtime compare to GPT Realtime 2?
- GPT Realtime by OpenAI is strongest for vision and multimodal work, while GPT Realtime 2, also by OpenAI, is the closest related family and the one to check for translation. GPT Realtime has 3 listed variants with up to 32K context; GPT Realtime 2 reaches up to 131K context. Compare the specs and pricing tables of both families before choosing a production model.
- Which GPT Realtime model should I use?
- For the lowest listed input price, start with gpt-realtime-mini through the OpenAI API at $0.6/1M input tokens. For the most capable and most recent model in this family, evaluate gpt-realtime-1.5, which is listed with 32K context, tool use, function calling, and multimodal inputs.
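
The selection advice above amounts to filtering the specifications table by required capabilities and then taking the cheapest match. A minimal sketch follows, assuming the values listed on this page; the `MODELS` list and the `cheapest_model` helper are hypothetical names used only for illustration.

```python
from typing import Optional

# Capabilities and input prices (USD per 1M tokens) as listed on this page.
MODELS = [
    {"name": "gpt-realtime-1.5", "vision": True, "tool_use": True, "input_price": 4.0},
    {"name": "gpt-realtime", "vision": False, "tool_use": False, "input_price": 4.0},
    {"name": "gpt-realtime-mini", "vision": False, "tool_use": False, "input_price": 0.6},
]


def cheapest_model(need_vision: bool = False, need_tool_use: bool = False) -> Optional[str]:
    """Return the lowest-priced listed model that meets the requirements."""
    candidates = [
        m for m in MODELS
        if (m["vision"] or not need_vision) and (m["tool_use"] or not need_tool_use)
    ]
    if not candidates:
        return None  # no listed model satisfies the requirements
    return min(candidates, key=lambda m: m["input_price"])["name"]


if __name__ == "__main__":
    print(cheapest_model())                    # no special requirements
    print(cheapest_model(need_tool_use=True))  # requires tool use
```

With no requirements this picks gpt-realtime-mini on price; requiring vision or tool use leaves gpt-realtime-1.5 as the only listed candidate.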


