MiMo-V2-Omni
mimo-v2-omni
Proprietary, Multimodal
About
Xiaomi MiMo-V2-Omni is a multimodal language model in the MiMo V2 series; the Omni variant adds multimodal (image) understanding. It is distinct from MiMo V2.5, which focuses on math reasoning.
MiMo-V2-Omni has a 256K-token context window.
MiMo-V2-Omni is priced at $0.4 per 1M input tokens and $2 per 1M output tokens.
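To make the per-token pricing concrete, here is a minimal sketch that estimates the list-price cost of a single request. The helper name `estimate_cost_usd` and the constants are illustrative, not part of any provider SDK. The sketch assumes the listed rates apply uniformly, with no caching or batch discounts.

```python
from decimal import Decimal

# Rates copied from the listing, in USD per 1M tokens.
INPUT_USD_PER_1M = Decimal("0.4")
OUTPUT_USD_PER_1M = Decimal("2")
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: list-price cost of one request, in USD."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_1M
        + Decimal(output_tokens) * OUTPUT_USD_PER_1M
    )
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # Example request: 100,000 input tokens and 5,000 output tokens.
    print(estimate_cost_usd(100_000, 5_000))
```

For the example request, the input side contributes $0.04 and the output side $0.01, for $0.05 in total. `Decimal` is used instead of floats so that small per-request amounts add up without binary rounding drift.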
Capabilities
Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution, Prompt Caching, Batch API, Audio, Fine-tuning
Providers (1)
| Provider | Input (per 1M tokens) | Output (per 1M tokens) | Type |
|---|---|---|---|
| OpenRouter | $0.4 | $2 | Serverless |