
Tulu V2
About
The Tulu V2 family of large language models (LLMs) marks a notable advancement in open-source AI, building on the foundations established by the Allen Institute for AI's earlier Tulu models. The models are engineered to serve as effective assistants and achieved near state-of-the-art performance on a range of benchmarks at release. A major enhancement in Tulu V2 is a refined dataset mixture that integrates high-quality instruction datasets, reducing the mixture's overall size while improving downstream performance. Tulu V2 models also use Direct Preference Optimization (DPO) training for better alignment with user preferences, yielding significant improvements on open-ended generation tasks.

Available in sizes from 7 billion to 70 billion parameters, the models cater to varying computational budgets. The Allen Institute for AI has made the models, datasets, and training code publicly available, encouraging further advances in AI research. The Tulu V2 series includes models trained with supervised fine-tuning (SFT) alone and with SFT followed by DPO; the largest DPO model, Tulu V2 + DPO 70B, performs comparably to GPT-3.5-turbo-0314 on the ChatArena benchmark. Additional explorations cover QLoRA and CodeLlama training, pointing to avenues for further optimization and specialized applications.
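To make the DPO training objective mentioned above concrete, here is a minimal sketch of the standard DPO loss for a single preference pair. This is an illustrative implementation of the general DPO formulation, not the Allen Institute's actual training code; the function name and scalar-input interface are assumptions for the example.

```python
import math

def dpo_loss(policy_chosen, policy_rejected,
             ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair (illustrative sketch).

    Inputs are the summed log-probabilities of the chosen and rejected
    responses under the trained policy and under the frozen reference
    model. beta controls how far the policy may drift from the reference.
    """
    # Implicit reward margin: how much more the policy prefers the
    # chosen response over the rejected one, relative to the reference.
    logits = beta * ((policy_chosen - ref_chosen)
                     - (policy_rejected - ref_rejected))
    # -log(sigmoid(logits)), written in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

When the policy and reference agree exactly, the margin is zero and the loss equals log(2); the loss drops below that as the policy learns to favor the chosen response more strongly than the reference does.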