Mistral Small 4
Mistral Small 4 is worth evaluating for rag, agents, and long context when its provider route and context window match the workload.
Use it for
- Teams evaluating rag, agents, and long context
- Workloads that can use a 256k context window
- Buyers comparing 3 tracked provider routes
Do not use it for
- Workloads where another current model has stronger sourced task evidence
- Family
- Mistral Small
- Released
- 2026-03-16
- Context
- 256k
- Parameters
- 119B (6.5B active)
- Architecture
- Mixture of Experts
- Knowledge cutoff
- 2025-06
- Specialization
- general
- Openness
- Open source
- License
- Apache 2.0OSI-approvedCommercial use: permitted
- Training
- Pretrained
Cheapest of 3 routes · Mistral AI Studio
About
Mistral Small 4 is a hybrid 119B MoE model unifying instruct, reasoning, and coding capabilities. Features configurable reasoning effort per request and native function calling with JSON output support.
Mistral Small 4 is an open-source model in the Mistral Small family. The structured metadata tracks a 256k-token context window, multimodal input, function calling, and tool use. This page tracks provider routes through OpenRouter, NVIDIA NIM, and Mistral AI Studio, with the cheapest tracked route listed at $0.1 input and $0.3 output per 1M tokens. Headline tracked benchmarks include Google-Proof Q&A 76.9, MMMU Pro 60.0, and MMLU PRO 78.0.
Top use-case fit: coding, agents, and build tasks
RAG
Included by capability and metadata signals in the decision map.
Agents
Q/$ A1 relevant benchmark in the decision map.
Long context
Included by capability and metadata signals in the decision map.
Provider price ladder
Compare all 3Compare API pricing across 3 providers for input and output tokens, batch, and cached reads when available.
| Provider | Input / 1M | Output / 1M | Route |
|---|---|---|---|
| Mistral AI Studio | $0.100 | $0.300 | Serverless |
| OpenRouter | $0.150 | $0.600 | Serverless |
| NVIDIA NIM | - | - | ServerlessPartial |
Available via routers & gateways(11)
LiteLLM
GatewayOpen-source Python SDK and proxy server that unifies 100+ LLM APIs behind a single OpenAI-compatible interface, with load balancing, cost tracking, and configurable failover.
OpenRouter
HybridUnified hybrid gateway to 400+ models from 60+ providers via a single OpenAI-compatible API, with optional auto-routing that selects the best model per prompt.
Portkey
GatewayProduction AI gateway routing to 1,600+ LLMs with failover, load balancing, semantic caching, and guardrails; Apache 2.0 core is fully self-hostable with the complete feature set.
AIRouter
RouterCommercial LLM router that analyzes incoming requests and routes to the optimal model for cost/quality/latency via a drop-in OpenAI-compatible API, with a privacy-preserving embedding mode that avoids sending prompt content.
Martian
RouterAI-powered LLM router that analyzes each prompt in real-time to select the optimal model, targeting 20–97% cost reduction while maintaining quality; San Francisco startup reportedly nearing $1.3B valuation.
Neutrino AI
RouterCommercial LLM router that dynamically routes each query to the best-suited model with load balancing and fallback handling, charging 3% of underlying AI spend.
Capabilities
Benchmark peer barsfor Agents
Benchmark scores(4)
| Benchmark | Score | Version | Source |
|---|---|---|---|
| Google-Proof Q&A | 76.9 | diamond | https://artificialanalysis.ai/leaderboards/models |
| MMMU Pro | 60.0 | LLM-Stats aggregator | https://llm-stats.com/benchmarks/mmmu-pro |
| MMLU PRO | 78.0 | Widely reported (accuracy) | https://aidailypost.com/news/mistral-small-4-matches-medium-31-large-3-mmlu-pro-cuts-inference-cost |
| τ-bench | 65.8 | TAU-bench rank 36 of 37 (pass_rate) | https://benchlm.ai/benchmarks/tauBench |
Migration checks
No linked migration route is available for this model yet.
Rankings & picks(2)
Compare Mistral Small 4 with other models
- Mistral Small 4 vs MiniCPM-V 4.626
- Mistral Small 4 vs Gemini 2.5 Flash14
- Mistral Small 4 vs Llama 4 Scout 17B-16E Instruct14
- Mistral Small 4 vs Llama 3 Taiwan 70B Instruct10
- Mistral Small 4 vs ELYZA Japanese Llama 2 7B8
- Mistral Small 4 vs Together AI - Llama 3 8B Lite7
- Mistral Small 4 vs Llama Guard 3 1B7
- Mistral Small 4 vs Claude 3.5 Haiku6
Comparison and alternatives
Browse all comparisons →Frequently asked questions
What is the context window of Mistral Small 4?
Mistral Small 4 has a context window of 256k tokens.
How much does Mistral Small 4 cost?
Mistral Small 4 pricing ranges from $0.10/1M to $0.15/1M input tokens depending on the provider.
When was Mistral Small 4 released?
Mistral Small 4 was released on 2026-03-16.
Which providers offer Mistral Small 4?
Mistral Small 4 is available from 3 providers: OpenRouter, NVIDIA NIM, Mistral AI Studio.
What benchmarks has Mistral Small 4 been tested on?
Mistral Small 4 has been evaluated on 4 benchmarks, including Google-Proof Q&A, MMMU Pro, MMLU PRO, τ-bench.
Cheapest of 3 routes · Mistral AI Studio