LLM Reference

Whisper

Released
2023-03-01
Last refreshed
2026-06-07
Status
Researched today
Proprietary · Commercial use with conditions · Multimodal · Vision · Audio

Whisper is worth evaluating for speech transcription and speech-to-English translation when its single provider route (OpenAI API) and flat per-minute pricing match the workload.

Use it for

  • Teams evaluating speech transcription or translation
  • Buyers comparing 1 tracked provider route

Do not use it for

  • Strict JSON or tool-calling flows
Specifications
Released
2023-03-01
Architecture
transformer
Specialization
transcription
Openness
Proprietary
License
Proprietary · Commercial use with conditions
Training
pretrained
Created by

OpenAI

Cutting-edge research and development.

San Francisco, California, United States
Founded 2015
Pricing
Output / 1M
-
Input / 1M
-

Cheapest of 1 route · OpenAI API

About

Whisper 1 is OpenAI's general-purpose speech recognition API model, released in March 2023 and based on Whisper large-v2. It supports multilingual transcription across 50+ languages, speech translation into English, and language identification. It is priced at a flat $0.006 per minute of audio and is exposed via /v1/audio/transcriptions and /v1/audio/translations. For new applications, gpt-4o-transcribe offers better accuracy. API ID: whisper-1.
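Because the rate is flat per minute of audio, a rough budget can be worked out from audio duration alone. The following is a minimal sketch of that arithmetic, assuming the $0.006/min figure quoted above; the function name is hypothetical, and how the provider rounds partial minutes or seconds when billing is not stated on this page, so no rounding is applied.

```python
from decimal import Decimal

# Flat per-minute rate quoted on this page (USD per minute of audio).
PRICE_PER_MINUTE_USD = Decimal("0.006")


def estimate_cost_usd(audio_seconds):
    """Estimate the cost of transcribing `audio_seconds` of audio.

    Assumption: cost scales linearly with duration and no billing
    rounding is applied, since the page does not describe any.
    """
    minutes = Decimal(str(audio_seconds)) / Decimal(60)
    return minutes * PRICE_PER_MINUTE_USD


if __name__ == "__main__":
    # One hour of audio is 60 minutes at the flat rate.
    print(estimate_cost_usd(3600))
```

Decimal is used rather than float so that small per-minute amounts add up without binary rounding drift across many files.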

Whisper is a proprietary model in the OpenAI Transcribe family. The structured metadata tracks multimodal input and audio. This page tracks provider routes through the OpenAI API. No headline benchmark score is tracked for Whisper yet.

Top use-case fit

Vision

Included by capability and metadata signals in the decision map.

Provider price ladder

Compare API pricing across 1 provider for input and output tokens, batch, and cached reads when available.

Provider      Input / 1M    Output / 1M    Route
OpenAI API    -             -              Serverless · Partial

Capabilities

Multimodal · Audio

Benchmark peer bars for Vision

No task-mapped benchmark peers are available for this model yet.

Migration checks

No linked migration route is available for this model yet.

API versions

whisper-1
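
The whisper-1 ID is the value passed in the model field of a request to the /v1/audio/transcriptions endpoint named in the About section. The sketch below builds such a request with the Python standard library only, without sending it. The function name and the placeholder file name and key are hypothetical; the multipart layout with a model field and a file field is an assumption based on the endpoint's documented form upload, and optional fields are left out.

```python
import urllib.request
import uuid

API_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_transcription_request(audio_bytes, filename, api_key, model="whisper-1"):
    """Build (but do not send) a multipart/form-data POST request.

    Hypothetical helper for illustration: it sets only the `model`
    and `file` form fields and a bearer-token Authorization header.
    """
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode()
    body = b"\r\n".join([
        delimiter,
        b'Content-Disposition: form-data; name="model"',
        b"",
        model.encode(),
        delimiter,
        f'Content-Disposition: form-data; name="file"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        audio_bytes,
        delimiter + b"--",  # closing delimiter ends the form
        b"",
    ])
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


if __name__ == "__main__":
    # Placeholder bytes and key; nothing is sent over the network here.
    req = build_transcription_request(b"...", "clip.wav", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
```

To actually send the request, pass the returned object to urllib.request.urlopen with a real API key and real audio bytes. Keeping construction separate from sending makes the request easy to inspect before any billed call is made.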