What is GPT Realtime used for?

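GPT Realtime is used for realtime voice applications with multimodal inputs. Vision, function calling, and tool use are each listed for only 1 of the 3 models, and the specifications table shows none of them for the two current variants, so check per-model support before planning agent workflows.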

How does GPT Realtime compare to GPT Realtime 2?

GPT Realtime and GPT Realtime 2 are both OpenAI families aimed at realtime voice, and GPT Realtime 2 is the closest related family to compare against. GPT Realtime has 3 listed variants and reaches up to 32k context, while GPT Realtime 2 reaches up to 131k context. Compare the specs and pricing tables before choosing a production model.

Which GPT Realtime model should I use?

For the lowest listed input price, start with gpt-realtime-mini through the OpenAI API at $0.6/1M input tokens. For the most capable current choice, evaluate gpt-realtime with 32k context and multimodal inputs.

GPT Realtime Models by OpenAI

OpenAI, Proprietary

3 models, 2025, up to 32k context, from $0.6/1M input tokens

Details

Researcher: OpenAI

License: Proprietary

Commercial use: conditional

Models: 3

Released: 2025

Max context: 32k

Capabilities

Vision: 1 of 3 models

Multimodal: all models

Function Calling: 1 of 3 models

Tool Use: 1 of 3 models

About

OpenAI realtime voice models for text, audio, and image inputs with text and audio outputs over the Realtime API.

Current Variants

Use-when guidance is based on each model's tracked capabilities, context window, release date, and replacement status.

2 in view, 1 retired

gpt-realtime (Current)

Use when the workload needs realtime voice, 32k context, and multimodal inputs.

Released 2025-10. Signals: realtime voice, 32k context, multimodal inputs.

gpt-realtime-mini (Current)

Use when the workload needs realtime voice, 32k context, and multimodal inputs.

Released 2025-10. Signals: realtime voice, 32k context, multimodal inputs.

Current GPT Realtime variants with use-when guidance and lifecycle status
Model	Use when	Released	Signals	Status
gpt-realtime	Use when the workload needs realtime voice, 32k context, and multimodal inputs.	2025-10	realtime voice, 32k context, multimodal inputs	Current
gpt-realtime-mini	Use when the workload needs realtime voice, 32k context, and multimodal inputs.	2025-10	realtime voice, 32k context, multimodal inputs	Current

Release Timeline

2 release groups

2025-12 (1 retired)

gpt-realtime-1.5: realtime voice, 32k context, tool use. Status: Replaced.

2025-10 (2 current)

gpt-realtime: realtime voice, 32k context, multimodal inputs. Status: Current.

gpt-realtime-mini: realtime voice, 32k context, multimodal inputs. Status: Current.

Replaced By

gpt-realtime-1.5 (Replaced): replaced by GPT Realtime 2.

Keep for legacy integrations; evaluate GPT Realtime 2 before new work.

Specifications (3 models; the table lists the 2 current variants)

GPT Realtime model specifications comparison
Model	Released	Context	Vision	Multimodal	Fn Calling	Tool Use
gpt-realtime	2025-10	32k	No	Yes	No	No
gpt-realtime-mini	2025-10	32k	No	Yes	No	No
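As a rough illustration of how the specifications table could be used to shortlist a variant, the sketch below encodes the two listed rows and filters them by context size and capability. The `ModelSpec` class and `matching_models` helper are hypothetical names introduced here, and treating "32k" as 32,000 tokens is an assumption, since the page does not give an exact figure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """One row of the specifications table (hypothetical container)."""
    name: str
    released: str
    context_tokens: int  # assumption: "32k" is read as 32,000 tokens
    vision: bool
    multimodal: bool
    function_calling: bool
    tool_use: bool


# Values copied from the specifications table above.
SPECS = [
    ModelSpec("gpt-realtime", "2025-10", 32_000, False, True, False, False),
    ModelSpec("gpt-realtime-mini", "2025-10", 32_000, False, True, False, False),
]


def matching_models(min_context=0, **required):
    """Return names of models with enough context and every required capability.

    Keyword arguments name ModelSpec fields, e.g. multimodal=True.
    """
    return [
        spec.name
        for spec in SPECS
        if spec.context_tokens >= min_context
        and all(getattr(spec, cap) == want for cap, want in required.items())
    ]


if __name__ == "__main__":
    print(matching_models(multimodal=True))
    print(matching_models(tool_use=True))
```

With the listed data, a filter on `multimodal=True` keeps both current variants, while a filter on `tool_use=True` returns an empty list, which matches the "No" entries in the table.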

Available From (1 provider)

OpenAI API

Pricing

GPT Realtime model pricing by provider
Model	Provider	Input / 1M	Output / 1M	Type
gpt-realtime-mini	OpenAI API	$0.6	$2.4	Serverless
gpt-realtime	OpenAI API	$4	$16	Serverless
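To make the price gap concrete, here is a minimal sketch that estimates cost from the listed per-1M-token rates. The `estimate_cost` and `cheapest_model` helpers are hypothetical, and the table does not say whether the rates apply to text or audio tokens, so treat the result as a rough comparison and not a billing figure.

```python
from decimal import Decimal

# Listed prices in USD per 1M tokens, copied from the pricing table above.
PRICES = {
    "gpt-realtime-mini": {"input": Decimal("0.6"), "output": Decimal("2.4")},
    "gpt-realtime": {"input": Decimal("4"), "output": Decimal("16")},
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(model, input_tokens, output_tokens):
    """Estimate USD cost for a token count using the listed rates."""
    price = PRICES[model]
    total = price["input"] * input_tokens + price["output"] * output_tokens
    return total / TOKENS_PER_UNIT


def cheapest_model(input_tokens, output_tokens):
    """Return the listed model with the lowest estimated cost."""
    return min(PRICES, key=lambda m: estimate_cost(m, input_tokens, output_tokens))


if __name__ == "__main__":
    for name in PRICES:
        print(name, estimate_cost(name, 500_000, 250_000))
```

For example, 500,000 input tokens and 250,000 output tokens come to $0.90 on gpt-realtime-mini and $6.00 on gpt-realtime at the listed rates.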

Models (3)