Phi-3 Small 128K
phi-3-small-128k
Last refreshed 2026-05-19. Next refresh: weekly.
Phi-3 Small 128K is worth evaluating for coding, long context, and classification when its provider route and context window match the workload.
Decision context: Coding task fit, 2 tracked provider routes, and research from 2026-05-19.
Use it for
- Teams evaluating coding, long context, and classification
- Workloads that can use a 128K context window
- Buyers comparing 2 tracked provider routes
Do not use it for
- Vision or document-understanding workloads
- Strict JSON or tool-calling flows
Cheapest output
$1.05
Microsoft Foundry per 1M tokens
Provider routes
2
Tracked API hosts
Quality / dollar
Grade C
Ranked by benchmark score divided by cheapest output price
Freshness
2026-05-19
Researched 6d ago
Top use-case fit
Coding
Q/$ grade C. 1 relevant benchmark in the decision map.
Long context
Included by capability and metadata signals in the decision map.
Classification
Q/$ grade C. 2 relevant benchmarks in the decision map.
Provider price ladder
| Provider | Input / 1M | Output / 1M | Route |
|---|---|---|---|
| Microsoft Foundry | $0.35 | $1.05 | Serverless, Provisioned |
| NVIDIA NIM | - | - | Provisioned, Partial |
Benchmark peer bars for Coding
Migration checks
No linked migration route is available for this model yet.
About
Phi-3 Small 128K is a model in Microsoft Research's Phi-3 family. It offers a 128K-token context window, has openly available weights for self-hosting, and scores 51.9 on GPQA (diamond).
On the cheapest tracked route (Microsoft Foundry), Phi-3 Small 128K input tokens cost $0.35 per 1M and output tokens cost $1.05 per 1M.
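To make the per-token prices concrete, here is a minimal sketch that estimates the cost of a single request from the listed Microsoft Foundry rates. The function name `estimate_cost` and the example token counts are illustrative assumptions, not part of any provider API, and real bills may differ with provisioned pricing or rounding rules.

```python
# Listed Microsoft Foundry prices in USD per 1M tokens (from this page).
INPUT_PRICE_PER_M = 0.35
OUTPUT_PRICE_PER_M = 1.05


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one request (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Assumed example: a long-context prompt of 100,000 tokens with a 2,000-token reply.
cost = estimate_cost(100_000, 2_000)
print(f"Estimated cost: ${cost:.4f}")
```

With these assumed token counts the input side dominates the bill, which is typical for long-context workloads even though output tokens are priced three times higher.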
Capabilities
No model capability flags are currently sourced.
Benchmark Scores (4)
| Benchmark | Score | Version | Source |
|---|---|---|---|
| Google-Proof Q&A | 51.9 | diamond | research |
| HellaSwag | 90.8 | 10-shot | research |
| HumanEval | 73.8 | pass@1 | research |
| Massive Multitask Language Understanding | 78.1 | 5-shot | research |
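The quality / dollar ranking is described on this page as benchmark score divided by cheapest output price. The sketch below applies that ratio to the four scores in the table using the $1.05 output price. It is only an illustration: the page does not say which benchmarks feed the grade or how ratios map to letter grades, so the code computes raw ratios and assigns no grade. The name `score_per_dollar` is hypothetical.

```python
# Cheapest tracked output price in USD per 1M tokens (Microsoft Foundry).
CHEAPEST_OUTPUT_PRICE = 1.05

# Benchmark scores as listed in the table above.
BENCHMARKS = {
    "GPQA (diamond)": 51.9,
    "HellaSwag (10-shot)": 90.8,
    "HumanEval (pass@1)": 73.8,
    "MMLU (5-shot)": 78.1,
}


def score_per_dollar(score: float, price: float) -> float:
    """Benchmark score divided by output price (hypothetical helper)."""
    if price <= 0:
        raise ValueError("price must be positive")
    return score / price


ratios = {
    name: score_per_dollar(score, CHEAPEST_OUTPUT_PRICE)
    for name, score in BENCHMARKS.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f} points per dollar")
```

Ratios like these are only comparable across models when the same benchmark and version are used, which is presumably why the decision map counts relevant benchmarks per use case.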