About
The Tülu 3 family consists of advanced, open-source language models developed by the Allen Institute for AI (Ai2), built to close the performance gap between open and closed fine-tuning recipes. Unlike many closed models, Tülu 3 publicly releases all of its data, data mixes, recipes, code, and evaluation frameworks. Available at 8 billion and 70 billion parameters, the models are built on Llama 3.1 base models and outperform the instruct versions of Llama 3.1, Qwen 2.5, and Mistral, as well as closed models such as GPT-4o-mini and Claude 3.5 Haiku, on a range of benchmarks.

The Tülu 3 training process combines supervised fine-tuning, Direct Preference Optimization (DPO), and a novel method called Reinforcement Learning with Verifiable Rewards (RLVR). The release also includes new synthetic instruction datasets, along with comprehensive guidance on evaluation and recipe design that lets users adapt the models to diverse use cases.
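To illustrate the idea behind Reinforcement Learning with Verifiable Rewards: rather than scoring completions with a learned reward model, the policy is rewarded by a deterministic verifier that checks the answer against ground truth. The sketch below is a minimal illustration of that concept only; the function name, normalization, and exact-match check are assumptions for this example, not the actual Tülu 3 implementation.

```python
def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Binary reward: 1.0 if the model's answer matches the known-correct
    answer after simple normalization, else 0.0.

    Illustrative sketch of the RLVR concept; real verifiers (e.g. for math
    or constrained instruction-following) are task-specific.
    """
    def normalize(s: str) -> str:
        # Strip surrounding whitespace and lowercase before comparing.
        return s.strip().lower()

    return 1.0 if normalize(completion) == normalize(ground_truth) else 0.0


# Example: a math problem whose final answer can be checked mechanically.
print(verifiable_reward("  42 ", "42"))  # 1.0
print(verifiable_reward("41", "42"))     # 0.0
```

Because the reward is verifiable rather than learned, it cannot be gamed the way a reward model can, which is what makes it attractive for tasks like math with checkable answers.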
