MT-Bench
active · Arena
Metric: MT-Bench Score (1-10, higher is better)
Introduced: 2023
About
80 multi-turn conversation questions across 8 categories, with responses evaluated by GPT-4 as the judge. Each turn is scored from 1 to 10.
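As a rough illustration of how per-turn judge scores could be rolled up into a single 1-10 number, here is a minimal Python sketch. It assumes the overall score is the plain mean of all per-turn scores, which is a common convention but is not stated on this page. The function name `mt_bench_score` and the input layout (question id mapped to a list of per-turn scores) are hypothetical, chosen only for this example.

```python
def mt_bench_score(turn_scores):
    """Average per-turn judge scores into one overall score.

    turn_scores: dict mapping a question id to a list of per-turn
    scores, each expected to lie in the 1-10 range.
    Assumption: the overall score is the unweighted mean over all turns.
    """
    # Flatten every turn of every question into one list.
    all_scores = [s for scores in turn_scores.values() for s in scores]
    if not all_scores:
        raise ValueError("no scores to aggregate")
    # Reject values outside the 1-10 judging scale.
    for s in all_scores:
        if not 1 <= s <= 10:
            raise ValueError(f"score out of range: {s}")
    return sum(all_scores) / len(all_scores)


# Hypothetical judge output for two questions with two turns each.
example = {"q1": [8, 6], "q2": [10, 4]}
overall = mt_bench_score(example)
```

With the full benchmark, the same function would receive scores for all 80 questions. Averaging per category or per turn instead would only require filtering the input before the call.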