Quick Start
You'll be billed $0.60 per 1M input tokens and $2.40 per 1M output tokens.
About MiniMax
MiniMax is a multimodal foundation model and API platform for text, speech, video, image, and music generation with agent tools.
Pricing on MiniMax
| Type | Price (per 1M) |
|---|---|
| Input tokens | $0.60 |
| Output tokens | $2.40 |
| Image input | $1.00 |
| Video input | $1.00 |
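As a rough illustration of how the token rates above turn into a per-request charge, here is a minimal Python sketch. The helper name `estimate_cost_usd` and the constants are hypothetical and not part of any MiniMax SDK. It covers only the input-token and output-token rates from the table, and it ignores image and video input pricing and any effect of prompt caching, since the page does not say how those are metered.

```python
from decimal import Decimal

# Rates in USD per 1M tokens, copied from the pricing table above.
INPUT_USD_PER_MILLION = Decimal("0.60")
OUTPUT_USD_PER_MILLION = Decimal("2.40")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the token charge in USD for one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Each side is billed pro rata against its per-million rate.
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MILLION / TOKENS_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a request with 200,000 input tokens and 10,000 output tokens.
    print(estimate_cost_usd(200_000, 10_000))
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift when summed over many calls.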
Capabilities
Vision, Multimodal, Reasoning, Function Calling, Tool Use, Prompt Caching
About MiniMax M3
MiniMax M3 is a frontier multimodal model released on May 31, 2026. It accepts text, image, and video inputs with a context window of up to 1 million tokens. Built on MiniMax Sparse Attention (MSA), it targets production coding and long-horizon agent workflows by lowering the cost of long-context attention: at 1M tokens, prefill is 9.7x faster and decode is 15.6x faster than on MiniMax M2. It also supports interleaved reasoning between tool calls, function calling, and automatic prompt caching.
Model Specs
Released: 2026-05-31
Context: 1M tokens
Architecture: MiniMax Sparse Attention (MSA)