StepAudio 2.5 TTS on StepFun

Name: StepAudio 2.5 TTS on StepFun
Brand: StepFun
SKU: step-audio-2-5-tts-stepfun

Serverless

Last refreshed 2026-06-29. Next refresh: weekly.

Why use StepAudio 2.5 TTS on StepFun?

StepFun offers StepAudio 2.5 TTS as a serverless API, priced at $0.85 per 10,000 characters of input text. StepFun is a Chinese AI company that provides API access to its Step series of large language and multimodal models.

Input / 1M

Output / 1M

Cache: Not sourced

Batch: Not sourced

Setup recipe

Docs fallback

Install: Use the provider REST API or SDK

Auth: Create a provider API key

Call: model: step-audio-2.5-tts

Model ID: step-audio-2.5-tts

Request example

Curated snippets for this provider are not sourced yet. Use StepFun documentation with model ID step-audio-2.5-tts.
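Until curated snippets are available, the following is a minimal sketch of what a request might look like. It assumes StepFun exposes an OpenAI-style speech endpoint at https://api.stepfun.com/v1/audio/speech that takes a bearer token and a JSON body with model, input, and voice fields. The URL, the field names, and the placeholder voice name are assumptions not taken from this page, and build_tts_request and synthesize are hypothetical helper names. Check all of them against StepFun's documentation. Only the model ID is sourced here.

```python
import json
import urllib.request

# Assumed endpoint (not confirmed by this page); verify in StepFun's docs.
ASSUMED_TTS_URL = "https://api.stepfun.com/v1/audio/speech"
MODEL_ID = "step-audio-2.5-tts"  # provider model ID listed on this page


def build_tts_request(text, api_key, voice="your-voice-id", url=ASSUMED_TTS_URL):
    """Build, but do not send, a POST request for speech synthesis.

    The payload field names ("model", "input", "voice") are assumptions.
    """
    payload = {"model": MODEL_ID, "input": text, "voice": voice}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(text, api_key, out_path):
    """Send the request and write the raw response bytes to out_path."""
    request = build_tts_request(text, api_key)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
```

Keeping request construction separate from sending makes it possible to inspect the payload and headers without spending credits or touching the network.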

Gotchas

Use provider model ID "step-audio-2.5-tts", not the LLMReference slug "step-audio-2-5-tts".
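One way to guard against this mix-up is a small lookup that translates the catalog slug into the provider model ID before a request is built. The sketch below is illustrative only: to_provider_model_id and the mapping table are hypothetical helpers, and only the two identifiers come from this page.

```python
# Catalog slug -> provider model ID; both values are taken from this page.
SLUG_TO_MODEL_ID = {"step-audio-2-5-tts": "step-audio-2.5-tts"}


def to_provider_model_id(name):
    """Return the provider model ID for a slug or an already-valid model ID."""
    if name in SLUG_TO_MODEL_ID.values():
        return name  # already the provider's model ID
    try:
        return SLUG_TO_MODEL_ID[name]
    except KeyError:
        raise ValueError(f"unknown model name: {name!r}") from None
```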

Capabilities

Multimodal, Audio

About StepAudio 2.5 TTS

StepAudio 2.5 TTS is StepFun's contextual text-to-speech model with fine-grained expressive control. Unlike tag-based TTS systems, it accepts plain natural-language instructions to control emotion, pacing, pauses, and delivery. It supports zero-shot voice cloning with full timbre and emotion control, and it handles Chinese and English. Pricing is $0.85 per 10,000 characters of input text. The model is available through the StepFun API (model: step-audio-2.5-tts) and is part of the unified StepAudio 2.5 architecture described in arXiv:2605.23463.
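The per-character price translates into a simple cost estimate. The sketch below assumes billing is strictly proportional to the number of input characters, with no minimum charge or rounding to blocks, which this page does not confirm. It also counts characters with Python's len, which may differ from how StepFun counts them. estimate_cost_usd is a hypothetical helper name.

```python
from decimal import Decimal

PRICE_PER_10K_CHARS_USD = Decimal("0.85")  # price quoted on this page


def estimate_cost_usd(text):
    """Estimate synthesis cost, assuming linear per-character billing."""
    return PRICE_PER_10K_CHARS_USD * len(text) / Decimal(10_000)
```

Under this assumption, a 2,500-character script would come to roughly $0.21.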

FAQ

What API model ID do I use for StepAudio 2.5 TTS on StepFun?

Use the model ID step-audio-2.5-tts when calling StepFun's API.

Who created StepAudio 2.5 TTS?

StepAudio 2.5 TTS was created by StepFun as part of the StepAudio 2.5 model family.

Is StepAudio 2.5 TTS open source?

No. StepAudio 2.5 TTS is not open source; it is listed as a proprietary model.

Get Started

Model Card, Docs Portal

Model Specs

Released: 2026-04-16

Related Models on StepFun

StepAudio 2.5 Realtime, StepAudio 2.5 ASR

Provider

StepFun