
OpenChat 2
About
The OpenChat v2 family of large language models leverages offline reinforcement learning techniques, with an emphasis on conditioned and weighted behavior cloning during training. Built on the LLaMA-13B architecture, the models are fine-tuned on roughly 80,000 carefully cleaned ShareGPT conversations. OpenChat-v2 applies a conditioning strategy during behavior cloning, while OpenChat-v2-w augments this with a weighted loss for improved results. Despite limitations on complex reasoning, mathematical, and coding tasks, the models show strong text generation and conversational ability, at times outperforming ChatGPT and text-davinci-003 on benchmarks. Nonetheless, they can occasionally produce inaccurate or fabricated content, a phenomenon known as "hallucination".
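
To make the distinction between the two variants concrete, here is a minimal, illustrative sketch of a weighted behavior cloning objective in PyTorch. It is not the project's actual training code: the function name, the label masking convention (-100 for ignored positions), and the idea of a per-conversation quality weight are assumptions made for the example; plain conditioned behavior cloning corresponds to setting all weights to 1.

```python
import torch
import torch.nn.functional as F

def weighted_bc_loss(logits, labels, sample_weights):
    """Token-level cross-entropy, scaled per conversation by a data-quality weight.

    logits:         (batch, seq_len, vocab) causal LM outputs
    labels:         (batch, seq_len) target token ids, -100 on prompt/pad positions
    sample_weights: (batch,) e.g. a higher weight for higher-quality sources
    """
    # Standard next-token shift for causal language modeling.
    shift_logits = logits[:, :-1, :]
    shift_labels = labels[:, 1:]

    # Per-token negative log-likelihood; ignored positions contribute zero.
    nll = F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
        ignore_index=-100,
        reduction="none",
    ).reshape(shift_labels.shape)

    mask = (shift_labels != -100).float()
    per_example = (nll * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1)

    # Weighted behavior cloning (OpenChat-v2-w style): scale each conversation's
    # loss by its weight. With all weights equal to 1, this reduces to the
    # conditioned behavior cloning objective used by OpenChat-v2, where
    # conditioning comes only from the tag prepended to the conversation.
    return (per_example * sample_weights).mean()
```

The design choice this illustrates is simple: conditioning tells the model which kind of data it is imitating, while the weights let higher-quality conversations contribute more to the gradient instead of being averaged in equally.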