LLM Reference

gpt-realtime

Released
2025-10-06
Last refreshed
2026-05-10
Status
Researched 47d ago
Proprietary · Commercial use: conditional · Multimodal · Vision

gpt-realtime is worth evaluating for vision when its provider route and context window match the workload.

Use it for

  • Teams evaluating vision
  • Workloads that can use a 32k context window
  • Buyers comparing 1 tracked provider route

Do not use it for

  • Strict JSON or tool-calling flows

Specifications

Released: 2025-10-06
Context: 32k
Max output: 4,096
Architecture: Decoder Only
Knowledge cutoff: 2023-10
Specialization: realtime-voice
Openness: Proprietary
License: Proprietary (commercial use: conditional)
Training: Pretrained
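
As a rough illustration of how the context and max-output figures above might constrain a request, the sketch below checks whether a prompt plus its reply fits the window. It rests on two assumptions that this page does not confirm: that "32k" means 32,000 tokens (it could equally mean 32,768), and that output tokens count against the same context window as input tokens. The function name `plan_request` is hypothetical.

```python
# Figures taken from the Specifications on this page.
CONTEXT_WINDOW = 32_000  # assumption: "32k" read as 32,000 tokens
MAX_OUTPUT = 4_096       # listed max output tokens


def plan_request(prompt_tokens: int, desired_output_tokens: int) -> tuple[int, bool]:
    """Return (output_budget, fits) for a single request.

    output_budget is the desired output capped at MAX_OUTPUT.
    fits is True when prompt plus output budget stays within the window
    (assumption: input and output share one context window).
    """
    if prompt_tokens < 0 or desired_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    output_budget = min(desired_output_tokens, MAX_OUTPUT)
    fits = prompt_tokens + output_budget <= CONTEXT_WINDOW
    return output_budget, fits
```

For example, a 20,000-token prompt that asks for 8,000 output tokens is capped to 4,096 and still fits, while a 30,000-token prompt with the full 4,096-token reply does not.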
Created by
OpenAI

Cutting-edge research and development.

San Francisco, California, United States
Founded 2015
Pricing
Input / 1M: $4.00
Output / 1M: $16.00

Cheapest of 1 route · OpenAI API · cache read $0.400
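
To make the per-1M-token prices concrete, here is a minimal cost-estimate sketch using the three figures listed above. It assumes that cached input tokens are billed at the cache-read rate in place of the regular input rate, and that every token is billed at these listed rates; the page confirms neither, and the function name `estimate_cost_usd` is hypothetical.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the pricing figures on this page.
INPUT_PER_M = Decimal("4.00")
OUTPUT_PER_M = Decimal("16.00")
CACHE_READ_PER_M = Decimal("0.400")
MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int,
                      cached_input_tokens: int = 0) -> Decimal:
    """Estimate the cost of one request in USD.

    Assumption: cached_input_tokens is the part of input_tokens served from
    cache and is billed at the cache-read rate instead of the input rate.
    """
    if min(input_tokens, output_tokens, cached_input_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    if cached_input_tokens > input_tokens:
        raise ValueError("cached tokens cannot exceed input tokens")
    uncached = input_tokens - cached_input_tokens
    total = (uncached * INPUT_PER_M
             + cached_input_tokens * CACHE_READ_PER_M
             + output_tokens * OUTPUT_PER_M)
    return total / MILLION
```

Under these assumptions, a request with 10,000 input tokens (4,000 of them cached) and 2,000 output tokens comes to about $0.0576.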

About

Realtime model capable of text and audio inputs and outputs via the Realtime API.

gpt-realtime is a proprietary model in the GPT Realtime family. Its structured metadata lists a 32k-token context window, multimodal input, and audio support. This page tracks one provider route, the OpenAI API, priced at $4 input and $16 output per 1M tokens. No headline benchmark score is tracked for gpt-realtime yet.

Top use-case fit

Vision

Included by capability and metadata signals in the decision map.

Provider price ladder

Compare API pricing for 1 provider across input and output tokens, plus batch and cached reads when available.

Provider      Input / 1M   Output / 1M   Cache         Route
OpenAI API    $4.00        $16.00        read $0.400   Serverless

Available via routers & gateways (15)

Capabilities

  • Multimodal
  • Audio

Benchmark peer bars for Vision

No task-mapped benchmark peers are available for this model yet.

Migration checks

No linked migration route is available for this model yet.