LLM Reference
Concepts & capability filters

alignment


Definition

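Alignment is the practice of steering LLM outputs toward human values, preferences, and safety constraints, using techniques such as reinforcement learning from human feedback (RLHF), direct preference optimization (DPO), or constitutional AI. It addresses the gap between raw predictive power and deployable utility by iteratively refining model behavior with feedback, with the aim of reducing harms such as biased outputs. Alignment lowers the likelihood of such harms but does not guarantee their absence.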
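As an illustration of how preference feedback can become a training signal, here is a minimal sketch of the DPO loss for a single preference pair. It assumes the summed log-probabilities of the preferred ("chosen") and dispreferred ("rejected") responses are already available from the model being trained and from a frozen reference model. The function name, argument names, and example numbers are hypothetical, and a real training setup would compute this over batches with an autodiff framework.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) response pair.

    All inputs are sequence log-probabilities (hypothetical values here).
    beta controls how strongly the policy is pushed away from the reference.
    """
    # How much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)

    # loss = -log(sigmoid(logits)) = log(1 + exp(-logits)), computed stably.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# Illustrative log-probabilities, not real model outputs.
neutral = dpo_loss(-5.0, -5.0, -5.0, -5.0)   # policy matches the reference
improved = dpo_loss(-4.0, -6.0, -5.0, -5.0)  # policy favors the chosen response
print(neutral, improved)
```

The loss falls as the policy raises the chosen response's probability relative to the rejected one, measured against the reference model. When the policy and reference agree exactly, the loss equals log 2.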

Models mentioning alignment (12)