HHH
Status: active
Helpfulness, Honesty, Harmlessness
Metric: Human Preference Rate (higher is better)
Introduced: 2022
About
Anthropic's alignment evaluation, which measures three core dimensions (helpfulness, honesty, and harmlessness) through human preference comparisons between model responses.
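
As a rough illustration of how a human preference rate could be computed from pairwise comparisons, here is a minimal sketch. It assumes each rater judgment is recorded as a label naming the preferred response and that ties are not recorded; the function name `preference_rate` and the labels `"model"` and `"baseline"` are hypothetical and not part of the benchmark itself.

```python
from typing import Iterable


def preference_rate(judgments: Iterable[str], model: str = "model") -> float:
    """Return the fraction of pairwise comparisons in which raters preferred `model`.

    Assumption: each judgment is a label naming the preferred response,
    and ties are not recorded.
    """
    judgments = list(judgments)
    if not judgments:
        raise ValueError("no judgments to score")
    wins = sum(1 for label in judgments if label == model)
    return wins / len(judgments)


# Hypothetical rater labels: each entry names the response the rater preferred.
example = ["model", "baseline", "model", "model"]
print(preference_rate(example))  # 0.75
```

A higher value means raters chose the evaluated model's response more often, which matches the "higher is better" reading of the metric.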