LLM Reference

Phi-3 Medium 4K

phi-3-medium-4k

Researched 36d ago

Last refreshed 2026-05-16. Next refresh: weekly.

Open Source · Coding · Classification · JSON / Tool use

Phi-3 Medium 4K is worth evaluating for coding, classification, and JSON / tool use when a provider route and the 4K context window match the workload.

Decision context: Coding task fit, 3 tracked provider routes, and research from 2026-04-19.

Use it for

  • Teams evaluating coding, classification, and JSON / tool use
  • Workloads that can use a 4K context window
  • Buyers comparing 3 tracked provider routes

Do not use it for

  • Vision or document-understanding workloads

Cheapest output

$0.410

DeepInfra per 1M tokens

Provider routes

3

Tracked API hosts

Quality / dollar

Grade C

Ranked by benchmark score divided by cheapest output price
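As a rough illustration of the ratio described above, the sketch below divides the HumanEval score listed on this page (52.7) by the cheapest output price (\$0.410 per 1M tokens). The function name `quality_per_dollar` is hypothetical, and the page does not say how the ratio maps to a letter grade, so the code stops at the raw number.

```python
def quality_per_dollar(score: float, cheapest_output_price: float) -> float:
    """Benchmark score divided by the cheapest output price per 1M tokens."""
    if cheapest_output_price <= 0:
        raise ValueError("price must be positive")
    return score / cheapest_output_price


# Values copied from this page: HumanEval 52.7, DeepInfra output at $0.410 / 1M.
ratio = quality_per_dollar(52.7, 0.410)
print(f"HumanEval points per output dollar: {ratio:.1f}")
```

The ratio is only comparable between models scored on the same benchmark. The grade thresholds are not published here and are not reproduced.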

Freshness

2026-04-19

Researched 36d ago

aging

Top use-case fit

Coding

Q/$ C

1 relevant benchmark in the decision map.

Classification

Q/$ C

1 relevant benchmark in the decision map.

JSON / Tool use

Included by capability and metadata signals in the decision map.

Provider price ladder

Provider           Input / 1M   Output / 1M   Route
DeepInfra          $0.140       $0.410        Serverless
Microsoft Foundry  $0.450       $1.35         Serverless, Provisioned
NVIDIA NIM         -            -             Provisioned, Partial
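To compare routes for a concrete workload, the listed per-1M prices can be applied to expected token volumes. The sketch below assumes a hypothetical workload of 3M input tokens and 1M output tokens. The `PRICES` table and `estimate_cost` helper are illustrative names, and NVIDIA NIM is left unpriced because no price is listed.

```python
from typing import Optional

# (input $/1M tokens, output $/1M tokens) copied from the price ladder.
# None means the page lists no price for that route.
PRICES = {
    "DeepInfra": (0.140, 0.410),
    "Microsoft Foundry": (0.450, 1.35),
    "NVIDIA NIM": None,
}


def estimate_cost(provider: str, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Return the estimated cost in dollars, or None if the route has no listed price."""
    rates = PRICES[provider]
    if rates is None:
        return None
    input_rate, output_rate = rates
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical workload: 3M input tokens and 1M output tokens.
for name in PRICES:
    cost = estimate_cost(name, 3_000_000, 1_000_000)
    print(name, "no listed price" if cost is None else f"${cost:.2f}")
```

For this assumed workload the estimate is about \$0.83 on DeepInfra and \$2.70 on Microsoft Foundry. Provisioned routes are usually billed differently from per-token serverless pricing, so the per-token figure may not apply to them.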

Benchmark peer bars for Coding

Migration checks

No linked migration route is available for this model yet.

About

Phi-3 Medium 4K, developed by Microsoft, is a 14-billion-parameter large language model. It is designed for efficiency across a range of tasks, with an emphasis on reasoning. Its context length is 4,096 tokens, so the prompt and the generated output together must fit in that window. It uses a dense, decoder-only Transformer architecture and is aligned with human preferences and safety guidelines through supervised fine-tuning and direct preference optimization. The training data is primarily English, with some multilingual data. Its relatively small size allows deployment on a range of hardware for both commercial and research use. Safety measures are built in, but additional precautions are advised for higher-risk applications.

Phi-3 Medium 4K has a 4K-token context window.
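A quick budget check can show whether a request fits the window. The sketch below assumes that prompt and completion share the 4,096-token window and that token counts come from the model's own tokenizer; the helper names are hypothetical.

```python
CONTEXT_WINDOW = 4096  # tokens, as stated on this page


def max_output_tokens(prompt_tokens: int, context_window: int = CONTEXT_WINDOW) -> int:
    """Tokens left for the completion once the prompt is counted (never negative)."""
    return max(context_window - prompt_tokens, 0)


def fits(prompt_tokens: int, requested_output_tokens: int,
         context_window: int = CONTEXT_WINDOW) -> bool:
    """True if prompt plus requested output stays within the context window."""
    return prompt_tokens + requested_output_tokens <= context_window


print(fits(3000, 1000))         # 4000 tokens total, within the window
print(fits(3500, 1000))         # 4500 tokens total, over the window
print(max_output_tokens(3000))  # room left after a 3000-token prompt
```

Providers may reserve part of the window for chat templates or system prompts, so the usable budget may be somewhat smaller than this estimate.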

Phi-3 Medium 4K input tokens cost $0.14 per 1M and output tokens cost $0.41 per 1M on the cheapest tracked route (DeepInfra).

Capabilities

Structured Outputs
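This page does not say how each provider exposes structured outputs, so it is prudent to validate the model's reply on the client side regardless of route. The sketch below uses only the Python standard library. The reply string, the `parse_tool_call` helper, and the `tool` / `arguments` keys are all hypothetical and not part of any provider's API.

```python
import json


def parse_tool_call(reply: str, required_keys: set) -> dict:
    """Parse a model reply as a JSON object and check that required keys exist."""
    data = json.loads(reply)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = required_keys - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data


# Hypothetical model reply for a classification tool call.
reply = '{"tool": "classify", "arguments": {"label": "bug"}}'
call = parse_tool_call(reply, {"tool", "arguments"})
print(call["tool"], call["arguments"])
```

A failed parse can be handled by retrying the request or falling back to a default, depending on the workload.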

Benchmark Scores (2)

Scores are benchmark-specific and are direction-aware: the same numeric gap can mean very different outcomes across suites. Use the leaderboard context and this model's provider route to decide whether the winning margin is meaningful for your workload.
Benchmark                                  Score  Version  Source
HumanEval                                  52.7   pass@1   https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard
Massive Multitask Language Understanding   68.9   5-shot   https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard

Rankings

Specifications

Family: Phi-3
Released: 2024-05-21
Parameters: 14B
Context: 4K
Architecture: Decoder-only
Knowledge cutoff: 2023-10
Specialization: General
Training: Fine-tuned

Created by

Advancing the state-of-the-art in AI and computing.

Redmond, Washington, United States
Founded 1991