LLM Reference
Starling

About

The Starling family of Large Language Models (LLMs) was developed by the Berkeley-Nest AI research group. Its flagship model, Starling-LM-7B-alpha, is a 7-billion-parameter language model fine-tuned from Openchat 3.5 using Reinforcement Learning from AI Feedback (RLAIF). Fine-tuning used the Nectar dataset, a collection of chat prompts with responses ranked by GPT-4, with the goal of making the model more helpful and less harmful. The project also released Starling-RM-7B-alpha, the reward model that drives the RLAIF process. To support research on RLHF and AI safety, the dataset, reward model, and language model are all openly accessible, and a refined successor, Starling-LM-7B-beta, has since been released.
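Because the model is openly available, it can be prompted directly once its chat template is applied. The sketch below builds the OpenChat-style prompt string that Starling-LM-7B-alpha inherits from Openchat 3.5; the exact speaker labels and `<|end_of_turn|>` token are an assumption based on the upstream model, not stated on this page.

```python
# Sketch: build an OpenChat-style prompt for Starling-LM-7B-alpha.
# Assumption: the template uses "GPT4 Correct User"/"GPT4 Correct Assistant"
# speakers separated by the <|end_of_turn|> token, as in Openchat 3.5.

def build_prompt(turns):
    """Format (role, text) turns into a single prompt string.

    turns: list of ("user" | "assistant", text) pairs, in order.
    """
    parts = []
    for role, text in turns:
        speaker = "GPT4 Correct User" if role == "user" else "GPT4 Correct Assistant"
        parts.append(f"{speaker}: {text}<|end_of_turn|>")
    # Trailing assistant tag cues the model to generate its reply.
    parts.append("GPT4 Correct Assistant:")
    return "".join(parts)

prompt = build_prompt([("user", "Hello, who are you?")])
```

In practice this string would be tokenized and passed to the model (e.g. via a standard text-generation pipeline); the helper here only illustrates the turn-formatting convention.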

Details

Researcher: Nexusflow
Models: 1