Xiaomi MiMo-V2-Flash
xiaomi-mimo-v2-flash
Last refreshed 2026-05-04. Next refresh: weekly.
Xiaomi MiMo-V2-Flash has model metadata, but missing tracked provider pricing keeps it from being a default production pick.
Decision context: RAG task fit, 0 tracked provider routes, and research from 2026-05-04.
Use it for
- Teams evaluating RAG, agents, and long context
- Workloads that can use a 256K-token (about 262,000 tokens) context window
Do not use it for
- Cost-sensitive launches that need sourced token pricing
- Vision or document-understanding workloads
- Teams that need a tracked hosted API route today
Cheapest output
-
No tracked output price
Provider routes
0
No provider route in seed
Quality / dollar
Unknown
No task benchmark coverage yet
Freshness
2026-05-04
Researched 17d ago
Top use-case fit
RAG
Included by capability and metadata signals in the decision map.
Agents
Included by capability and metadata signals in the decision map.
Long context
Included by capability and metadata signals in the decision map.
Provider price ladder
No tracked provider token pricing is available for this model yet.
Benchmark peer bars for RAG
No task-mapped benchmark peers are available for this model yet.
Migration checks
No linked migration route is available for this model yet.
About
MiMo-V2-Flash is Xiaomi's efficient open-source Mixture-of-Experts model, announced December 17, 2025 at Xiaomi's Human-Car-Home Ecosystem Partner Conference. It has 309B total parameters, of which 15B are active; uses hybrid attention that interleaves Sliding Window Attention and Global Attention; and extends its native 32K context to 256K. Multi-Token Prediction enables speculative decoding with a speedup of about 2.6x. The weights were released on Hugging Face, and at research time the model ranked highly on SWE-Bench Verified and multilingual benchmarks.
Xiaomi MiMo-V2-Flash has a 256K-token context window.
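The headline figures in the About paragraph can be sanity-checked with a little arithmetic. The sketch below uses only the numbers quoted there; the variable and function names are illustrative, and the conversion assumes that "K" means 1,024 tokens, which this page does not state.

```python
# Figures quoted in the About paragraph.
TOTAL_PARAMS_B = 309      # total parameters, in billions
ACTIVE_PARAMS_B = 15      # parameters active per token, in billions
NATIVE_CONTEXT_K = 32     # native context length, in K tokens
EXTENDED_CONTEXT_K = 256  # extended context length, in K tokens


def k_to_tokens(k: int) -> int:
    """Convert a 'K' context size to tokens, ASSUMING 1K = 1024 tokens."""
    return k * 1024


# Share of the model's parameters that are active for any one token.
active_fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B

# How far the extended window stretches the native one.
extension_factor = EXTENDED_CONTEXT_K // NATIVE_CONTEXT_K

# Extended window expressed in tokens under the 1K = 1024 assumption.
extended_tokens = k_to_tokens(EXTENDED_CONTEXT_K)

print(f"active fraction: {active_fraction:.1%}")
print(f"context extension: {extension_factor}x")
print(f"extended context: {extended_tokens:,} tokens")
```

Under that assumption, roughly 5% of the parameters are active per token, the extended window is 8 times the native one, and 256K corresponds to 262,144 tokens.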