Step-1.5V
ProprietaryMultimodal
About
Step-1.5V is StepFun's multimodal language model with vision capabilities, building on Step-1 with image understanding.
Capabilities
MultimodalFunction CallingTool UseJSON Mode
Providers(1)
| Provider | Input (per 1M) | Output (per 1M) | Type | |
|---|---|---|---|---|
| StepFun API | — | — | Serverless |