Phi-3 Mini 4k
phi-3-mini-4k
Last refreshed 2026-05-16. Next refresh: weekly.
Phi-3 Mini 4k is worth evaluating for coding and classification when its provider route and context window match the workload.
Decision context: Coding task fit, 4 tracked provider routes, and research from 2026-01-01.
Use it for
- Teams evaluating coding and classification
- Workloads that can use a 4K context window
- Buyers comparing 4 tracked provider routes
Do not use it for
- Vision or document-understanding workloads
- Strict JSON or tool-calling flows
Cheapest output
$0.250
Replicate API per 1M tokens
Provider routes
4
Tracked API hosts
Quality / dollar
Grade B
Ranked by benchmark score divided by cheapest output price
Freshness
2026-01-01
Researched 144d ago
Top use-case fit
Coding
Q/$ B. 1 relevant benchmark in the decision map.
Classification
Q/$ C. 3 relevant benchmarks in the decision map.
Provider price ladder
| Provider | Input / 1M | Output / 1M | Route |
|---|---|---|---|
| Replicate API | $0.050 | $0.250 | Serverless |
| Microsoft Foundry | $0.280 | $0.840 | Serverless, Provisioned |
| Baseten API | - | - | Serverless, Partial |
| NVIDIA NIM | - | - | Provisioned, Partial |
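To compare the priced routes in the ladder for a concrete workload, a minimal sketch like the following may help. The prices are copied from the table. The function names and the example token volumes are hypothetical, and routes without a published price are skipped instead of being given a guessed value.

```python
from typing import Optional

# Prices in USD per 1M tokens as (input, output), copied from the ladder above.
# None marks routes that have no published price in the table.
ROUTES: dict[str, Optional[tuple[float, float]]] = {
    "Replicate API": (0.050, 0.250),
    "Microsoft Foundry": (0.280, 0.840),
    "Baseten API": None,
    "NVIDIA NIM": None,
}


def workload_cost(input_tokens: int, output_tokens: int,
                  input_price: float, output_price: float) -> float:
    """Return the cost in USD of a workload, given per-1M-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rank_routes(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Return priced routes sorted by estimated cost, cheapest first."""
    costs = [
        (name, workload_cost(input_tokens, output_tokens, *prices))
        for name, prices in ROUTES.items()
        if prices is not None  # skip routes with no listed price
    ]
    return sorted(costs, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical monthly workload: 50M input tokens and 10M output tokens.
    for name, cost in rank_routes(50_000_000, 10_000_000):
        print(f"{name}: ${cost:.2f}")
```

For that hypothetical workload the sketch prints $5.00 for Replicate API and $22.40 for Microsoft Foundry. It covers token charges only. Provisioned routes may bill differently, which the table does not describe.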
Benchmark peer bars for Coding
Migration checks
No linked migration route is available for this model yet.
About
Phi-3 Mini-4K-Instruct is a lightweight language model from Microsoft with 3.8 billion parameters, optimized for environments with limited computational resources. It performs well on natural language processing tasks, especially reasoning, text generation, and multi-turn conversation. It was trained on a mix of synthetic and high-quality data and is tuned for instruction following. Its factual knowledge and multilingual support are limited, so it often needs external resources to improve accuracy. The model suits commercial and research applications that need efficient processing, such as mobile apps and real-time systems.
Phi-3 Mini 4k has a 4K-token context window.
Phi-3 Mini 4k input tokens cost $0.05 per 1M and output tokens cost $0.25 per 1M on the cheapest tracked route (Replicate API).
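The 4K-token context window stated above has to hold the prompt and the completion together. The sketch below shows one way to check that budget before sending a request. It assumes "4K" means 4,096 tokens and that token counts come from the provider's own tokenizer. Both function names are hypothetical.

```python
# Assumption: the "4K" context window is read as 4,096 tokens.
CONTEXT_WINDOW = 4096


def max_output_tokens(prompt_tokens: int,
                      context_window: int = CONTEXT_WINDOW) -> int:
    """Return how many tokens remain for the completion after the prompt."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_window - prompt_tokens, 0)


def fits(prompt_tokens: int, desired_output_tokens: int,
         context_window: int = CONTEXT_WINDOW) -> bool:
    """Return True when prompt plus completion fit inside the window."""
    return prompt_tokens + desired_output_tokens <= context_window


if __name__ == "__main__":
    # A 3,000-token prompt leaves 1,096 tokens for output under this assumption.
    print(max_output_tokens(3000))
    print(fits(3000, 1200))
```

A check like this is most useful for classification or coding prompts that include long few-shot examples, where the prompt alone can use most of the window.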
Capabilities
No model capability flags are currently sourced.
Benchmark Scores (6)
| Benchmark | Score | Version | Source |
|---|---|---|---|
| Google-Proof Q&A | 40.9 | diamond | https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard |
| HellaSwag | 87.1 | 10-shot | https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard |
| HumanEval | 59.8 | pass@1 | https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard |
| Massive Multitask Language Understanding | 68.2 | 5-shot | https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard |
| Instruction-Following Evaluation | 45.0 | v2 | https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard |
| MMLU PRO | 45.7 | — | https://huggingface.co/spaces/TIGER-Lab/MMLU-Pro |
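The quality-per-dollar ranking is described on this page as benchmark score divided by cheapest output price. The sketch below computes that raw ratio for each score in the table. It is an illustration only: the page does not state how ratios map to letter grades or which benchmarks feed each use case, so neither is modeled here, and the function name is hypothetical.

```python
# Cheapest tracked output price in USD per 1M tokens (Replicate API).
CHEAPEST_OUTPUT_PRICE = 0.250

# Scores copied from the benchmark table above.
SCORES: dict[str, float] = {
    "Google-Proof Q&A": 40.9,
    "HellaSwag": 87.1,
    "HumanEval": 59.8,
    "Massive Multitask Language Understanding": 68.2,
    "Instruction-Following Evaluation": 45.0,
    "MMLU PRO": 45.7,
}


def quality_per_dollar(score: float, output_price: float) -> float:
    """Return benchmark score divided by output price per 1M tokens."""
    if output_price <= 0:
        raise ValueError("output_price must be positive")
    return score / output_price


if __name__ == "__main__":
    # Print the raw ratio for each benchmark, highest first.
    ratios = {name: quality_per_dollar(score, CHEAPEST_OUTPUT_PRICE)
              for name, score in SCORES.items()}
    for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name}: {ratio:.1f}")
```

Because every score is divided by the same price, the ratio only becomes informative when compared against other models priced on their own cheapest routes.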