Using Step 3.7 Flash on StepFun

Implementation guide · Step · StepFun

Serverless

Quick Start

1
Create an account at StepFun and generate an API key.
2
Use the StepFun SDK or REST API to call step-3.7-flash — see the documentation for request format.
3
You'll be billed $0.20/1M input, $1.15/1M output tokens.

API Portal Documentation Model Card

Code Examples

See StepFun documentation for integration details.

About StepFun

StepFun is a Chinese AI company providing API access to its Step series of large language and multimodal models.

View all models on StepFun →

Pricing on StepFun

Type	Price (per 1M)
Input tokens	$0.20
Output tokens	$1.15
Image input	$1.00
Video input	$1.00

Capabilities

VisionMultimodalReasoningFunction CallingTool UseStructured OutputsPrompt Caching

About Step 3.7 Flash

Step 3.7 Flash is StepFun's open-weights multimodal Mixture-of-Experts model for agentic coding, tool use, long-context reasoning, image understanding, and video understanding. It combines a 196B-parameter language backbone with a 1.8B-parameter vision encoder, activates about 11B parameters per token, supports a 256K-token context window, and exposes low, medium, and high reasoning levels for speed/depth tradeoffs. StepFun reports leading open-model results on ClawEval-1.1, SimpleVQA with Search, and SWE-bench Pro at launch. Weights are available on Hugging Face under Apache 2.0.

Full model details →

Model Specs

Released2026-05-29

Parameters198B (11B active)

Context256k

ArchitectureMixture of Experts