active
Agent
ADRS (UCB-ADRS)
Metric: Task Completion Rate
Introduced: 2024
About
A UC Berkeley benchmark for the reliability and safety of agentic decision-making. It tests coherence, persistence, and safety in long-horizon tasks. NOTE: the slug contains parentheses; renaming it to 'adrs' is recommended.