Phi-3 Mini 128K
phi-3-mini-128k
Last refreshed 2026-05-16. Next refresh: weekly.
Phi-3 Mini 128K is worth evaluating for coding, long context, and classification when its provider route and context window match the workload.
Decision context: Coding task fit, 5 tracked provider routes, and research from 2026-01-01.
Use it for
- Teams evaluating coding, long context, and classification
- Workloads that can use a 128K context window
- Buyers comparing 5 tracked provider routes
Do not use it for
- Vision or document-understanding workloads
- Strict JSON or tool-calling flows
Cheapest output
$0.100
Fireworks AI per 1M tokens
Provider routes
5
Tracked API hosts
Quality / dollar
Grade A
Ranked by benchmark score divided by cheapest output price
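The ranking rule above is a simple ratio, which can be sketched as follows. This is a minimal illustration, not the site's actual grading code. The page does not say which benchmark feeds the grade, so the HumanEval score from the benchmark table is used here as an assumption, and the function name `quality_per_dollar` is hypothetical. No grade cutoffs are shown because the page does not publish them.

```python
def quality_per_dollar(benchmark_score: float, cheapest_output_price: float) -> float:
    """Benchmark score divided by the cheapest output price (USD per 1M tokens)."""
    if cheapest_output_price <= 0:
        raise ValueError("cheapest_output_price must be positive")
    return benchmark_score / cheapest_output_price


# Assumed inputs: HumanEval pass@1 (75.9) and the cheapest output price
# listed on this page ($0.100 per 1M tokens, Fireworks AI).
humaneval_ratio = quality_per_dollar(75.9, 0.100)
print(round(humaneval_ratio, 1))
```

Because the denominator is the output price only, a route with cheap output but expensive input can still rank well under this rule.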
Freshness
2026-01-01
Researched 144d ago
Top use-case fit
Coding
Q/$ A. 1 relevant benchmark in the decision map.
Long context
Included by capability and metadata signals in the decision map.
Classification
Q/$ B. 3 relevant benchmarks in the decision map.
Provider price ladder
| Provider | Input / 1M | Output / 1M | Route |
|---|---|---|---|
| Fireworks AI | $0.100 | $0.100 | Provisioned |
| Replicate API | $0.050 | $0.250 | Serverless |
| Microsoft Foundry | $0.300 | $0.900 | Serverless, Provisioned |
| Baseten API | - | - | Serverless, Partial |
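The price ladder can be turned into a per-request cost estimate. The sketch below copies the listed per-1M-token prices and compares routes for a given token mix. The function names and the two example workloads are hypothetical. Baseten API is left out because no prices are listed for it, and the sketch assumes simple per-token billing, which may not hold for provisioned routes.

```python
# Prices in USD per 1M tokens as (input, output), copied from the price ladder.
# Baseten API is omitted because the table lists no prices for it.
PRICES = {
    "Fireworks AI": (0.100, 0.100),
    "Replicate API": (0.050, 0.250),
    "Microsoft Foundry": (0.300, 0.900),
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request, assuming plain per-token billing."""
    input_price, output_price = PRICES[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest_route(input_tokens: int, output_tokens: int) -> str:
    """Provider with the lowest estimated cost for this token mix."""
    return min(PRICES, key=lambda p: request_cost(p, input_tokens, output_tokens))


# Hypothetical workloads: one input-heavy (long context), one output-heavy.
print(cheapest_route(100_000, 1_000))
print(cheapest_route(1_000, 100_000))
```

The route with the cheapest output price is not always the cheapest overall: for input-heavy, long-context requests the lower input price can matter more.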
Benchmark peer bars for Coding
Migration checks
No linked migration route is available for this model yet.
About
Phi-3 Mini-128K-Instruct, developed by Microsoft, is a lightweight, open-source large language model with 3.8 billion parameters. Despite its modest size, it performs well on reasoning tasks, particularly math and logic, and on code generation. It accepts up to 128,000 tokens of context, which lets it process long documents and codebases. It has limited factual knowledge and focuses primarily on English, but it balances performance and efficiency, which suits resource-constrained environments. The model is available on platforms such as Azure AI Studio and Hugging Face. It was trained on high-quality synthetic and publicly available data, then fine-tuned to improve instruction adherence and safety.
Phi-3 Mini 128K has a 128K-token context window.
Phi-3 Mini 128K input tokens cost $0.05/1M and output tokens cost $0.25/1M on the Replicate API route.
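The 128K-token context window stated above can be turned into a simple pre-flight check before sending a request. This is a minimal sketch under two assumptions: the window is the 128,000 tokens given in the About paragraph, and the reserved output tokens share that window with the prompt. Some provider routes may enforce different limits. The function name is hypothetical, and token counts should come from the model's own tokenizer.

```python
CONTEXT_WINDOW = 128_000  # tokens, per the About paragraph (assumed to cover prompt + output)


def fits_context(prompt_tokens: int, max_output_tokens: int,
                 window: int = CONTEXT_WINDOW) -> bool:
    """True if the prompt plus the reserved output budget stays within the window."""
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_output_tokens <= window


print(fits_context(120_000, 4_000))  # 124,000 tokens requested
print(fits_context(126_000, 4_000))  # 130,000 tokens requested
```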
Capabilities
No model capability flags are currently sourced.
Benchmark Scores (5)
| Benchmark | Score | Version | Source |
|---|---|---|---|
| Google-Proof Q&A | 50.8 | diamond | research |
| HellaSwag | 90.2 | 10-shot | research |
| HumanEval | 75.9 | pass@1 | research |
| Massive Multitask Language Understanding | 76.5 | 5-shot | research |
| MMLU-Pro | 43.9 | - | https://huggingface.co/spaces/TIGER-Lab/MMLU-Pro |