training_technique

Distillation


Definition

Distillation transfers knowledge from a large, complex teacher model to a smaller student model by training the student to mimic the teacher's outputs or intermediate representations. The resulting student is smaller and cheaper to run at inference time while retaining much of the teacher's performance.
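
To make "mimic the teacher's outputs" concrete, here is a minimal sketch of one common way to score a student against a teacher: both sets of logits are softened with a temperature, and the student is penalized by the KL divergence from the teacher's distribution, scaled by the squared temperature. The definition above does not prescribe a particular loss, so the function names (softmax, distillation_loss), the temperature of 2.0, and the example logits are all illustrative assumptions, not a reference implementation.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: this is one widely used output-matching loss; the
    temperature default is an arbitrary illustrative value.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        p * math.log(p / q)
        for p, q in zip(p_teacher, p_student)
        if p > 0
    )
    return temperature ** 2 * kl


# Hypothetical logits for a three-class problem.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]  # ranks classes like the teacher
far_student = [0.5, 4.0, 1.0]    # disagrees with the teacher

loss_close = distillation_loss(close_student, teacher)
loss_far = distillation_loss(far_student, teacher)
print(loss_close < loss_far)
```

In practice this term is usually minimized by gradient descent on the student's parameters, often combined with an ordinary supervised loss on ground-truth labels; matching intermediate representations would need an additional term that is not shown here.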

Models Mentioning Distillation (12)