Muse Spark

Name: Muse Spark
Author: AI at Meta

Released

2026-04-08

Last refreshed

2026-05-14

Status

Researched 54d ago

Proprietary, Commercial use: unknown, Multimodal, Coding, Agents, Vision, JSON / Tool use

Muse Spark is a released model aimed at coding, agent, and vision workloads; evaluate it now, but expect provider pricing coverage to remain incomplete for the time being.

Use it for

Teams evaluating models for coding, agent, and vision workloads

Do not use it for

Cost-sensitive launches that need sourced token pricing
Teams that need a tracked hosted API route today

Specifications

Family: Muse
Released: 2026-04-08
Architecture: Decoder-only
Specialization: Reasoning
Openness: Proprietary
Weights: Not released
Code: Unknown

Created by

AI at Meta

Large-scale open-source AI for social technologies.

Menlo Park, California, United States

Founded 2013

Website

Pricing

No tracked provider token pricing is available yet.

About

Muse Spark is the first model in Meta's Muse family, developed by Meta Superintelligence Labs (MSL). It is a natively multimodal reasoning model whose capabilities include tool use, visual chain-of-thought reasoning, and multi-agent orchestration. Muse Spark scores 58% on Humanity's Last Exam and 38% on the FrontierScience Research benchmark, and it is competitive with Llama 4 Maverick while using less than a tenth of the compute. It is available through meta.ai and the Meta AI app; API access is limited to a private preview, and the model is not open source.

Muse Spark is a proprietary model in the Muse family. The structured metadata tracks multimodal input, reasoning, function calling, and tool use. Headline tracked benchmarks include MMMU Pro 80.4, Chatbot Arena 1491.0, and Google-Proof Q&A 89.5.

Top use-case fit: coding, agents, and build tasks

Coding

1 relevant benchmark in the decision map.

Agents

1 relevant benchmark in the decision map.

Vision

Included by capability and metadata signals in the decision map.

Provider price ladder

No tracked provider token pricing is available for this model yet.

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use

Benchmark peer bars for Coding

SWE-bench Verified: Rank 28 of 80

Model	Score
Claude Fable 5	96.0
Claude Mythos Preview	93.9
Claude Opus 4.8	88.6
Claude Opus 4.7	87.6
Muse Spark (current)	77.4
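
To put the rank in context, the gap between Muse Spark and each peer listed above can be computed directly. The following is a minimal Python sketch: the dictionary copies the five SWE-bench Verified scores shown here, and the helper name `gaps_to` is hypothetical, not part of any tracker API.

```python
# SWE-bench Verified scores copied from the peer listing above.
peer_scores = {
    "Claude Fable 5": 96.0,
    "Claude Mythos Preview": 93.9,
    "Claude Opus 4.8": 88.6,
    "Claude Opus 4.7": 87.6,
    "Muse Spark": 77.4,
}


def gaps_to(model: str, scores: dict[str, float]) -> dict[str, float]:
    """Return each peer's lead over `model`, in percentage points.

    Hypothetical helper for illustration only. Values are rounded to one
    decimal place to hide floating-point noise from the subtraction.
    """
    base = scores[model]
    return {
        name: round(score - base, 1)
        for name, score in scores.items()
        if name != model
    }


gaps = gaps_to("Muse Spark", peer_scores)
for name, gap in gaps.items():
    # A positive value means the peer scores higher than Muse Spark.
    print(f"{name}: {gap:+.1f} points")
```

Only the five models shown in the listing are covered; the remaining entries of the 80-model ranking are not available here, so the sketch says nothing about models ranked between Claude Opus 4.7 and Muse Spark.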

Benchmark scores (4)

Scores are benchmark-specific and direction-aware: the same numeric gap can mean very different things from one suite to the next. Use the leaderboard context and the provider route available for this model to decide whether a winning margin is meaningful for your workload.

Benchmark	Score	Version	Source
MMMU Pro	80.4	LLM-Stats aggregator	https://llm-stats.com/benchmarks/mmmu-pro
Chatbot Arena	1491.0	Arena Elo	https://arena.ai/leaderboard
Google-Proof Q&A	89.5	diamond	https://datacamp.com/blog/muse-spark-review; https://labellerr.com/blog/muse-spark-benchmarks/
SWE-bench Verified	77.4	SWE-bench Verified	https://benchlm.ai/benchmarks/sweVerified; https://llm-stats.com/benchmarks/swe-bench-verified
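
The point that the same numeric gap means different things across suites can be illustrated with a short Python sketch. It rests on assumptions that are not stated by the tracker: that the Chatbot Arena value behaves like a standard Elo rating (an approximation of how arena-style ratings are computed), that the other three scores are percentages, and that higher is better on all four. Both function names are hypothetical.

```python
def elo_expected_win_rate(gap: float) -> float:
    """Expected head-to-head win rate for the higher-rated model.

    Uses the standard Elo formula. Treating the Chatbot Arena score as a
    plain Elo rating is an assumption made for illustration.
    """
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))


def signed_margin(a: float, b: float, higher_is_better: bool = True) -> float:
    """Margin of score `a` over score `b`; positive means `a` is better.

    The `higher_is_better` flag is what makes the comparison
    direction-aware: for a lower-is-better metric the sign is flipped.
    """
    diff = a - b
    return diff if higher_is_better else -diff


# A 10-point gap on an Elo-style rating is close to a coin flip.
print(f"Elo gap of 10 -> expected win rate {elo_expected_win_rate(10):.3f}")

# A gap of about 10 points on SWE-bench Verified (Claude Opus 4.7 at 87.6
# versus Muse Spark at 77.4) is roughly ten percentage points of the
# benchmark's tasks.
print(f"SWE-bench Verified margin: {signed_margin(87.6, 77.4):.1f} points")
```

Margins should therefore be compared only within a single benchmark, and only after checking the scale and direction of that benchmark's metric.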