MultiChallenge
Metric: % Score (higher is better)
About
A Scale AI benchmark for multi-turn instruction following. It covers four challenge categories: instruction retention, inference memory, versioned editing, and self-coherence.