LLM Reference

Best LLMs for Classification (2026)

Compare models for routing, moderation, extraction, safety labels, and structured classification by sourced benchmark coverage and pricing.

| # | Model | Tags | MMLU-Pro | Input $/1M | Output $/1M | GPU·hr |
|---|---|---|---|---|---|---|
| 1 | Llama 3.1 405B | | | | | |
| 2 | DeepSeek V3 | Tools | 75.87% | $0.10 | $0.28 | |
| 3 | Qwen2.5-72B | | | $0.20 | $0.60 | |
| 4 | Llama 3.1 70B Instruct | | | $0.40 | $0.40 | |
| 5 | Mistral Large 2 | Vision, Tools | 69.7% | $0.48 | $1.50 | |
| 6 | Mixtral 8x22B v0.1 | | | | | $1.00 |
| 7 | Falcon 180B | | | | | |
| 8 | Gemma 2 27B | | 56.54% | $0.08 | $0.24 | |
| 9 | Llama 3 70B | | | $0.65 | $2.75 | |
| 10 | Qwen2-7B | | | $0.05 | $0.15 | |
| 11 | Mistral NeMo Instruct (2407) | | | | | $1.00 |
| 12 | DeepSeek Coder V2 Lite | | | $0.50 | $0.50 | |
| 13 | Llama 3 8B Instruct | | 40.5% | $0.03 | $0.04 | |
| 14 | Mixtral 8x7B | | | $0.15 | $0.20 | |
| 15 | Phi-3 Small 128K | | | $0.35 | $1.05 | |
| 16 | Mistral 7B Instruct v0.3 | Tools | | $0.20 | $0.20 | |
| 17 | Phi-3 Mini 128K | | 43.86% | | | $1.00 |
| 18 | DeepSeek V4 Pro | Reasoning, Tools | 87.5% | $0.43 | $0.87 | |
| 19 | Gemini 3 Pro | Vision, Tools | 90.1% | $1.25 | $5.00 | |
| 20 | DeepSeek Math 7B Instruct | | | | | |
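
The per-1M-token prices in the listing above can be turned into a rough cost estimate for a classification workload. The sketch below is illustrative only: the `Pricing` class, the `workload_cost` helper, and the workload figures (1,000,000 requests, 500 input tokens and 5 output tokens per request) are assumptions, not values from this page. Only the three price pairs are copied from the listing.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Per-token pricing in USD per 1M tokens (hypothetical helper type)."""
    input_per_million: float
    output_per_million: float


# Input/output prices copied from the listing above.
PRICES = {
    "DeepSeek V3": Pricing(0.10, 0.28),
    "Gemma 2 27B": Pricing(0.08, 0.24),
    "Llama 3 8B Instruct": Pricing(0.03, 0.04),
}


def workload_cost(pricing: Pricing, requests: int,
                  input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of `requests` calls.

    Each call is assumed to consume `input_tokens` prompt tokens and
    produce `output_tokens` completion tokens.
    """
    total_in = requests * input_tokens
    total_out = requests * output_tokens
    return (total_in * pricing.input_per_million
            + total_out * pricing.output_per_million) / 1_000_000


# Assumed workload: short label outputs, moderately long inputs.
REQUESTS = 1_000_000
INPUT_TOKENS = 500
OUTPUT_TOKENS = 5

if __name__ == "__main__":
    for name, pricing in PRICES.items():
        cost = workload_cost(pricing, REQUESTS, INPUT_TOKENS, OUTPUT_TOKENS)
        print(f"{name}: ${cost:.2f}")
```

Because classification outputs are usually a short label, the input price tends to dominate the total under these assumptions. Models listed with GPU·hr pricing cannot be compared this way without a throughput figure, which this page does not give.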