llmreference

GPT Audio

gpt-audio

Researched 10d ago

Last refreshed 2026-05-11. Next refresh: weekly.

ProprietaryMultimodalLong contextVision

GPT Audio is worth evaluating for long context and vision when its provider route and context window match the workload.

Decision context: Long context task fit, 2 tracked provider routes, and research from 2026-05-10.

Use it for

  • Teams evaluating long context and vision
  • Workloads that can use a 128K context window
  • Buyers comparing 2 tracked provider routes

Do not use it for

  • Strict JSON or tool-calling flows

Cheapest output

$10.00

OpenAI API per 1M tokens

Provider routes

2

Tracked API hosts

Quality / dollar

Unknown

No task benchmark coverage yet

Freshness

2026-05-10

Researched 10d ago

fresh

Top use-case fit

Long context

Included by capability and metadata signals in the decision map.

Vision

Included by capability and metadata signals in the decision map.

Provider price ladder

ProviderInput / 1MOutput / 1MRoute
OpenAI API$2.50$10.00
Serverless
OpenRouter$2.50$10.00
Serverless

Benchmark peer barsfor Long context

No task-mapped benchmark peers are available for this model yet.

Migration checks

No linked migration route is available for this model yet.

About

Audio model for inputs/outputs via Chat Completions API. Replaces deprecated gpt-4o-audio-preview-2024-12-17.

GPT Audio has a 128K-token context window.

GPT Audio input tokens at $2.5/1M, output at $10/1M.

Capabilities

MultimodalAudio

Rankings

Specifications

FamilyGPT Audio
Released2024-10-01
Context128K
Max output16,384
ArchitectureDecoder Only
Knowledge cutoff2023-10
Specializationgeneral
LicenseProprietary

Created by

Cutting-edge research and development.

San Francisco, California, United States
Founded 2015
Website