LLM ReferenceLLM Reference
activeAgent

ADRS (UCB-ADRS)

Metric: Task Completion RateIntroduced: 2024

About

UC Berkeley agentic decision-making reliability and safety benchmark testing coherence, persistence, and safety in long-horizon tasks. NOTE: slug contains parentheses — recommend renaming to 'adrs'.