Muse Spark
muse-spark
Last refreshed 2026-05-14. Next refresh: weekly.
Muse Spark has model metadata but no tracked provider pricing, which keeps it from being a default production pick.
Decision context: Coding task fit, 0 tracked provider routes, and research from 2026-05-14.
Use it for
- Teams evaluating coding, agents, and vision
Do not use it for
- Cost-sensitive launches that need sourced token pricing
- Teams that need a tracked hosted API route today
Cheapest output
-
No tracked output price
Provider routes
0
No provider route in seed
Quality / dollar
Unknown
No output-token price in the ladder
Freshness
2026-05-14
Researched 9d ago
Top use-case fit
Coding
1 relevant benchmark in the decision map.
Agents
1 relevant benchmark in the decision map.
Vision
Included by capability and metadata signals in the decision map.
Provider price ladder
No tracked provider token pricing is available for this model yet.
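The "Cheapest output" and "Quality / dollar" cards both depend on this ladder, so an empty ladder leaves them undefined. The sketch below shows one plausible way such values could be derived. It is an assumption for illustration, not this page's actual method: the function names, the `output_usd_per_mtok` field, and the "benchmark score divided by cheapest output price" formula are all hypothetical. The only figure taken from this page is the SWE-bench Verified score of 77.4.

```python
from typing import Optional

# Hypothetical route record: {"provider": str, "output_usd_per_mtok": float or None}


def cheapest_output_price(ladder: list[dict]) -> Optional[float]:
    """Return the lowest output price (USD per 1M tokens), or None if no route has one."""
    prices = [
        route["output_usd_per_mtok"]
        for route in ladder
        if route.get("output_usd_per_mtok") is not None
    ]
    return min(prices) if prices else None


def quality_per_dollar(score: float, ladder: list[dict]) -> Optional[float]:
    """Assumed definition: benchmark score divided by the cheapest output price.

    Returns None when no usable price exists, which corresponds to "Unknown".
    """
    price = cheapest_output_price(ladder)
    if price is None or price <= 0:
        return None
    return score / price


# Muse Spark has 0 tracked provider routes, so its ladder is empty.
muse_spark_ladder: list[dict] = []
SWE_BENCH_VERIFIED = 77.4  # score listed in the benchmark table on this page

print(quality_per_dollar(SWE_BENCH_VERIFIED, muse_spark_ladder))  # None
```

Returning `None` rather than `0` keeps "no data" distinct from "poor value", which matches how the cards report "Unknown" instead of a number.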
Benchmark peer bars for Coding
Migration checks
No linked migration route is available for this model yet.
About
Muse Spark is the first model in Meta's Muse family, developed by Meta Superintelligence Labs (MSL). It is a natively multimodal reasoning model that supports tool use, visual chain-of-thought reasoning, and multi-agent orchestration. It scores 58% on Humanity's Last Exam and 38% on FrontierScience Research, and is competitive with Llama 4 Maverick while using over 10x less compute. It is available through meta.ai and the Meta AI app, with API access limited to a private preview; it is not open-source.
Capabilities
Benchmark Scores (3)
| Benchmark | Score | Version | Source |
|---|---|---|---|
| Chatbot Arena | 1491.0 | Arena Elo | https://arena.ai/leaderboard |
| Google-Proof Q&A (GPQA) | 89.5 | Diamond | https://datacamp.com/blog/muse-spark-review; https://labellerr.com/blog/muse-spark-benchmarks/ |
| SWE-bench Verified | 77.4 | SWE-bench Verified | https://benchlm.ai/benchmarks/sweVerified; https://llm-stats.com/benchmarks/swe-bench-verified |