LLM Reference

StepAudio 2.5 Realtime on StepFun


Serverless

Last refreshed 2026-05-27. Next refresh: weekly.

Why use StepAudio 2.5 Realtime on StepFun?

StepFun offers StepAudio 2.5 Realtime through its own API; per-token prices are not listed on this page. StepFun is a Chinese AI company that provides API access to its Step series of large language and multimodal models.

Input / 1M: -
Output / 1M: -
Cache: Not sourced
Batch: Not sourced

Setup recipe

Docs fallback
Install: Use the provider REST API or SDK
Auth: Create a provider API key
Call: model: step-2.5-realtime
Model ID: step-2.5-realtime

Request example

Curated snippets for this provider are not sourced yet. Use StepFun documentation with model ID step-2.5-realtime.
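
Until a curated snippet is available, the following is a minimal sketch of how the connection parameters for a realtime WebSocket session might be assembled. Only the model ID step-2.5-realtime comes from this page. The endpoint URL is a placeholder, the environment variable name STEPFUN_API_KEY is hypothetical, and the `model` query parameter and Bearer-token header are assumptions modeled on common realtime APIs; check each of them against StepFun's documentation before use. The sketch performs no network calls.

```python
import os
from urllib.parse import urlencode

MODEL_ID = "step-2.5-realtime"  # provider model ID listed on this page


def build_realtime_request(base_url: str, api_key: str, model: str = MODEL_ID):
    """Return (url, headers) for opening a realtime WebSocket session.

    ASSUMPTIONS (not confirmed by this page): the model is selected with a
    `model` query parameter and the key is sent as a Bearer token.
    """
    if not api_key:
        raise ValueError("API key is empty; create a provider API key first")
    url = f"{base_url}?{urlencode({'model': model})}"
    headers = {"Authorization": f"Bearer {api_key}"}
    return url, headers


if __name__ == "__main__":
    # PLACEHOLDER endpoint: replace with the WebSocket URL from StepFun's docs.
    base = "wss://example.invalid/v1/realtime"
    # STEPFUN_API_KEY is a hypothetical environment variable name.
    key = os.environ.get("STEPFUN_API_KEY", "sk-placeholder")
    url, headers = build_realtime_request(base, key)
    print(url)
```

The returned URL and headers can be handed to whichever WebSocket client you already use; keeping them in one small function makes it easy to correct the assumed parameter names in a single place.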

Gotchas

  • Use provider model ID "step-2.5-realtime", not the LLMReference slug "step-audio-2-5-realtime".
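
One way to avoid mixing up the two identifiers is to resolve names through a small lookup before building a request. The sketch below is illustrative only: the helper name `resolve_model_id` and the mapping table are hypothetical, and the single entry is taken from the gotcha above.

```python
# LLMReference slug -> provider model ID (the one pair named on this page).
SLUG_TO_MODEL_ID = {"step-audio-2-5-realtime": "step-2.5-realtime"}


def resolve_model_id(name: str) -> str:
    """Return the provider model ID for a slug; pass known model IDs through."""
    if name in SLUG_TO_MODEL_ID:
        return SLUG_TO_MODEL_ID[name]
    if name in SLUG_TO_MODEL_ID.values():
        return name
    raise KeyError(f"unknown model name: {name!r}")
```

Calling `resolve_model_id("step-audio-2-5-realtime")` yields the string to send to StepFun, while an unrecognized name fails loudly instead of reaching the API.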

Capabilities

Multimodal, Audio

About StepAudio 2.5 Realtime

StepAudio 2.5 Realtime is StepFun's end-to-end real-time conversational voice model. It takes speech input and produces speech output through a single unified architecture, with no intermediate ASR or TTS pipeline steps. Key capabilities include persona-consistent roleplay (via dedicated RLHF training on million-scale persona data), paralinguistic comprehension (detecting and responding to tone, emotion, and speaking rate), and low-latency dialogue. The model supports Chinese and English and is available via a WebSocket API under the model ID step-2.5-realtime. It is analogous in function to OpenAI's GPT Realtime models.

FAQ

What API model ID do I use for StepAudio 2.5 Realtime on StepFun?

Use the model ID step-2.5-realtime when calling StepFun's API.

Who created StepAudio 2.5 Realtime?

StepAudio 2.5 Realtime was created by StepFun as part of the StepAudio 2.5 model family.

Is StepAudio 2.5 Realtime open source?

No. StepAudio 2.5 Realtime is not open source; it is listed as a proprietary model.


Model Specs

Released: 2026-05-24
