LLM Reference

MiniMax M2.7 vs MiniMax M3

MiniMax M3 is the June 2026 successor to M2.7 for long-context, multimodal, and computer-use agent workflows. M3 moves to MiniMax Sparse Attention, expands the working window to 1M tokens, adds native image/video input, and reports stronger coding-agent scores; M2.7 remains the cheaper text-only option when a 205K context window is enough.

Pick MiniMax M3 for million-token documents, large repositories, image/video inputs, computer-use automation, and the stronger current coding-agent benchmark set: 59.0% on SWE-Bench Pro, 66.0% on Terminal-Bench 2.1, 74.2% on MCP-Atlas, and 83.5 on BrowseComp. Pick MiniMax M2.7 for high-volume text-only work under roughly 200K tokens where the lower input price (about $0.30/M versus M3's $0.60/M) matters more than M3's added modalities and long-context efficiency.

Decision scorecard

Local evidence first
Signal | MiniMax M2.7 | MiniMax M3
Best for | reasoning-heavy apps, tool-calling agents, and provider-routed production | reasoning-heavy apps, multimodal apps, and tool-calling agents
Decision fit | Coding, RAG, and Agents | Coding, RAG, and Agents
Context window | 205K | 1M
Cheapest output | $1.20/1M tokens | $2.40/1M tokens
Provider routes | 4 tracked | 1 tracked
Shared benchmarks | 3 rows | SWE-bench Pro leader

Decision tradeoffs

Choose MiniMax M2.7 when...
  • MiniMax M2.7 has the lower cheapest tracked output price at $1.20/1M tokens.
  • MiniMax M2.7 has broader tracked provider coverage for fallback and procurement flexibility.
  • Local decision data tags MiniMax M2.7 for Coding, RAG, and Agents.
Choose MiniMax M3 when...
  • MiniMax M3 holds a shared-benchmark lead on SWE-bench Pro, ahead by 2.8 points.
  • MiniMax M3 has the larger context window for long prompts, retrieval packs, or transcript analysis.
  • MiniMax M3 uniquely exposes Vision, Multimodal, and Code execution in local model data.
  • Local decision data tags MiniMax M3 for Coding, RAG, and Agents.

Monthly cost at traffic

Estimate token spend from the cheapest tracked input and output route or tier on this page.

Lower estimate: MiniMax M2.7

MiniMax M2.7: $523 per month (cheapest tracked route/tier: OpenRouter)

MiniMax M3: $1,080 per month (cheapest tracked route/tier: MiniMax <=512K input tokens, standard)

Estimated monthly gap: $557. Batch, cache, alternate speed tiers, and negotiated pricing are excluded from this local estimate.
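The estimate above is a linear function of token volume and per-million prices. The sketch below shows that arithmetic under stated assumptions: the prices are the cheapest tracked list prices from this page, while the traffic volumes are hypothetical placeholders, because the page does not state the volume behind the $523 and $1,080 figures. The `Route` class and `monthly_cost` helper are illustrative names, not part of any provider SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """A priced route; prices are USD per 1M tokens."""
    name: str
    input_usd_per_m: float
    output_usd_per_m: float


def monthly_cost(route: Route, input_tokens: int, output_tokens: int) -> float:
    """Linear token spend: no batch, cache, or negotiated discounts."""
    return (
        input_tokens / 1_000_000 * route.input_usd_per_m
        + output_tokens / 1_000_000 * route.output_usd_per_m
    )


# Cheapest tracked list prices from the pricing table on this page.
M2_7 = Route("MiniMax M2.7", input_usd_per_m=0.28, output_usd_per_m=1.20)
M3_STANDARD = Route("MiniMax M3 (<=512K standard)", input_usd_per_m=0.60, output_usd_per_m=2.40)

# Hypothetical monthly traffic, chosen only to illustrate the formula.
MONTHLY_INPUT_TOKENS = 100_000_000
MONTHLY_OUTPUT_TOKENS = 20_000_000

m2_7_cost = monthly_cost(M2_7, MONTHLY_INPUT_TOKENS, MONTHLY_OUTPUT_TOKENS)
m3_cost = monthly_cost(M3_STANDARD, MONTHLY_INPUT_TOKENS, MONTHLY_OUTPUT_TOKENS)

print(f"{M2_7.name}: ${m2_7_cost:,.2f}")
print(f"{M3_STANDARD.name}: ${m3_cost:,.2f}")
print(f"Gap: ${m3_cost - m2_7_cost:,.2f}")
```

Swapping in your own monthly input and output token counts gives a comparable figure for your workload. Requests above 512K input tokens fall into M3's higher tier and need a separate estimate.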

Switch friction

MiniMax M2.7 -> MiniMax M3
  • No overlapping tracked provider route is sourced for MiniMax M2.7 and MiniMax M3; plan for SDK, billing, or endpoint changes.
  • MiniMax M3 is $1.20/1M tokens higher on cheapest tracked output pricing, so quality gains need to justify the spend.
  • MiniMax M3 adds Vision, Multimodal, and Code execution in local capability data.
MiniMax M3 -> MiniMax M2.7
  • No overlapping tracked provider route is sourced for MiniMax M3 and MiniMax M2.7; plan for SDK, billing, or endpoint changes.
  • MiniMax M2.7 is $1.20/1M tokens lower on cheapest tracked output pricing before cache, batch, or negotiated discounts.
  • Check replacement coverage for Vision, Multimodal, and Code execution before moving production traffic.

Specs

Specification | MiniMax M2.7 | MiniMax M3
Released | 2026-03-18 | 2026-06-01
Context window | 205K | 1M
Parameters | 10B active |
Architecture | decoder only | decoder only
License | MIT (OSI) | Proprietary
Openness | Open source | Proprietary
Commercial use | Commercial use allowed | Commercial use with conditions
Knowledge cutoff | - | -

Pricing and availability

Pricing attribute | MiniMax M2.7 | MiniMax M3
Input price | $0.28/1M tokens | $0.60/1M tokens at <=512K input tokens (standard); $1.20/1M tokens at >512K input tokens (limited)
Output price | $1.20/1M tokens | $2.40/1M tokens at <=512K input tokens (standard); $4.80/1M tokens at >512K input tokens (limited)
Providers | 4 tracked routes | 1 tracked route

The M3 standard-tier prices are the regular list prices before the temporary launch discount. MiniMax marks inputs above 512K as limited quantity for a limited time.
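Because M3's price depends on input size, a per-request estimate needs a tier lookup. The sketch below is one way to express that, with two loudly labeled assumptions: that "512K" means 512,000 tokens, and that a request above the boundary is billed entirely at the higher tier rather than marginally. Neither detail is stated on this page, so check the provider's billing documentation before relying on it. `m3_request_cost` is a hypothetical helper name.

```python
# M3 list prices from the pricing table, in USD per 1M tokens.
M3_TIERS = {
    "standard": {"input": 0.60, "output": 2.40},  # <=512K input tokens
    "limited": {"input": 1.20, "output": 4.80},   # >512K input tokens
}

# Assumption: "512K" is read as 512,000 tokens (it could also mean 524,288).
TIER_BOUNDARY_TOKENS = 512_000


def m3_tier(input_tokens: int) -> str:
    """Pick the price tier from the request's input size."""
    return "standard" if input_tokens <= TIER_BOUNDARY_TOKENS else "limited"


def m3_request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request.

    Assumption: the whole request (input and output) is billed at the
    tier selected by its input token count.
    """
    prices = M3_TIERS[m3_tier(input_tokens)]
    return (
        input_tokens * prices["input"] + output_tokens * prices["output"]
    ) / 1_000_000


if __name__ == "__main__":
    for tokens_in in (500_000, 800_000):
        cost = m3_request_cost(tokens_in, 10_000)
        print(f"{tokens_in:,} input tokens -> {m3_tier(tokens_in)} tier, ${cost:.3f}")
```

Under these assumptions, crossing the boundary doubles both rates, so workloads that hover near 512K input tokens are worth trimming or chunking where that is practical.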

Capabilities

Capability | MiniMax M2.7 | MiniMax M3
Vision | No | Yes
Multimodal | No | Yes
Reasoning | Yes | Yes
Function calling | Yes | Yes
Tool use | Yes | Yes
Structured outputs | Yes | Yes
Code execution | No | Yes
IDE integration | No | No
Computer use | No | No
Parallel agents | No | No

Benchmarks

Benchmark | MiniMax M2.7 | MiniMax M3
SWE-bench Pro | 56.2 | 59.0
Google-Proof Q&A | 87.4 | 92.9
Terminal-Bench (2.0 for M2.7, 2.1 for M3) | 57.0 | 66.0

Deep dive

The practical split is context and modality. M2.7 is a text-first 205K-context model, while M3 is positioned around 1M-token coding, agent, and long-video workflows. MiniMax's M3 release also describes image and video input plus desktop computer operation, so screenshots, papers with figures, long videos, and computer-use tasks should start with M3.
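The split described above can be written as a small routing rule. This is a minimal sketch, assuming the context windows are exactly 205,000 and 1,000,000 tokens and that the reply must fit in the same window as the prompt. The function returns display names rather than real API model identifiers, which this page does not list.

```python
# Assumption: "205K" and "1M" are treated as exact token limits.
M2_7_CONTEXT_TOKENS = 205_000
M3_CONTEXT_TOKENS = 1_000_000


def pick_model(
    prompt_tokens: int,
    has_image_or_video: bool = False,
    needs_computer_use: bool = False,
    reserved_output_tokens: int = 0,
) -> str:
    """Return the display name of the model that fits the request."""
    total_tokens = prompt_tokens + reserved_output_tokens
    if total_tokens > M3_CONTEXT_TOKENS:
        raise ValueError("request exceeds the 1M-token window; split it first")
    # Modality and computer-use needs rule out the text-only model.
    if has_image_or_video or needs_computer_use:
        return "MiniMax M3"
    # Text-only work goes to the cheaper model when it fits.
    if total_tokens <= M2_7_CONTEXT_TOKENS:
        return "MiniMax M2.7"
    return "MiniMax M3"


if __name__ == "__main__":
    print(pick_model(150_000))                           # text-only, fits M2.7
    print(pick_model(150_000, has_image_or_video=True))  # needs multimodal input
    print(pick_model(600_000))                           # too long for M2.7
```

Keeping the cheaper model as the default for text that fits mirrors the cost argument on this page; flip the order of the checks if quality matters more than spend.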

The coding benchmark direction favors M3, but read the rows by benchmark version. M3 reports 59.0% on SWE-Bench Pro versus 56.2% for M2.7, a clean same-suite gain of 2.8 points. M3's 66.0% Terminal-Bench score is on Terminal-Bench 2.1, while M2.7's 57.0% listing is on Terminal-Bench 2.0, so the direction is useful but the gap is not a strict same-version delta.
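One way to keep the version caveat attached to the numbers is to store the benchmark version with each score and label the delta accordingly. The sketch below uses the two rows discussed above; the `Row` type and `describe` helper are illustrative names, and treating "Terminal Bench 2" as version "2.0" is an assumption.

```python
from typing import NamedTuple, Optional


class Row(NamedTuple):
    benchmark: str
    m2_7: float
    m3: float
    m2_7_version: Optional[str] = None  # None means no version distinction
    m3_version: Optional[str] = None


ROWS = [
    Row("SWE-Bench Pro", 56.2, 59.0),
    Row("Terminal-Bench", 57.0, 66.0, m2_7_version="2.0", m3_version="2.1"),
]


def describe(row: Row) -> str:
    """Report the M3 lead and whether it is a same-version comparison."""
    delta = round(row.m3 - row.m2_7, 1)
    same_version = row.m2_7_version == row.m3_version
    kind = "same-suite" if same_version else "directional only"
    return f"{row.benchmark}: M3 {delta:+.1f} points ({kind})"


if __name__ == "__main__":
    for row in ROWS:
        print(describe(row))
    # SWE-Bench Pro: M3 +2.8 points (same-suite)
    # Terminal-Bench: M3 +9.0 points (directional only)
```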

M3's architectural claim is MiniMax Sparse Attention. MiniMax says MSA cuts per-token compute at 1M context to roughly one twentieth of the previous generation, with more than 9x faster prefill and more than 15x faster decoding. Treat those as vendor-reported long-context efficiency claims until your own workload confirms them.
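Checking the prefill and decoding claims on your own workload comes down to timing a streamed response. The sketch below is a generic harness, not a MiniMax API client: it accepts any iterable that yields output tokens or chunks, and it treats time to first token as a rough proxy for prefill (it also includes network and queueing latency). The `measure_stream` name and the injectable `clock` parameter are assumptions made for illustration and testing.

```python
import time
from typing import Callable, Dict, Iterable, Optional


def measure_stream(
    stream: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, Optional[float]]:
    """Time a token stream: first-token latency and decode throughput."""
    start = clock()
    first_token_at: Optional[float] = None
    last_token_at = start
    count = 0
    for _ in stream:
        last_token_at = clock()
        if first_token_at is None:
            first_token_at = last_token_at
        count += 1
    if first_token_at is None:
        raise ValueError("stream produced no tokens")

    decode_seconds = last_token_at - first_token_at
    # Throughput over the tokens that arrived after the first one.
    decode_tps = (count - 1) / decode_seconds if decode_seconds > 0 else None
    return {
        "time_to_first_token_s": first_token_at - start,
        "tokens": float(count),
        "decode_tokens_per_s": decode_tps,
    }


if __name__ == "__main__":
    def fake_stream():
        # Stand-in for a real streaming response.
        for chunk in ("a", "b", "c"):
            time.sleep(0.01)
            yield chunk

    print(measure_stream(fake_stream()))
```

Running the same prompt set through both models at several context lengths, and comparing the resulting ratios, gives a workload-specific check on the vendor-reported speedups.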

Cost is the reason to keep M2.7 in the shortlist. The tracked M2.7 routes sit around $0.30/M input and $1.20/M output, while M3's standard route is $0.60/M input and $2.40/M output, with a higher long-context tier above 512K input tokens. For plain text generation inside M2.7's window, M3's extra capability may not pay for itself.

Do not fill the gaps with unrelated academic rows. The M3 release data used here does not publish MMLU, GPQA, HumanEval, or LiveCodeBench scores, so this comparison leans on the sourced coding-agent, terminal, MCP, and browsing benchmarks plus the provider pricing table.

FAQ

Is MiniMax M3 better than MiniMax M2.7?

M3 is the stronger default for long-context and agentic work because it adds 1M context, native multimodal input, computer-use support, and higher sourced coding-agent benchmark rows. M2.7 is still the better cost pick for text-only jobs that fit inside roughly 205K tokens.

Which one is cheaper to run?

M2.7 is cheaper on the currently tracked standard token prices: about $0.30/M input and $1.20/M output versus M3 at $0.60/M input and $2.40/M output. M3 also has a higher tier above 512K input tokens, so long-context cost should be estimated separately.

Can MiniMax M2.7 handle images or video?

No sourced M2.7 row in this comparison marks image or video input. M3 is the MiniMax model in this pair positioned for native multimodal work, including image and video input, plus computer-use automation for agent workflows.

Are the M3 and M2.7 Terminal-Bench numbers directly comparable?

Only directionally. M3's sourced score is on Terminal-Bench 2.1, while M2.7's sourced OpenRouter listing refers to Terminal-Bench 2.0. Read the gap as a direction of travel, not as a clean same-harness delta.

What benchmark rows are intentionally missing for MiniMax M3?

The current M3 release material used here does not publish MMLU, GPQA, HumanEval, or LiveCodeBench rows. This page does not infer those scores from neighboring models or older MiniMax releases.


Last reviewed: 2026-06-09. Data sourced from public model cards and provider documentation.