Yi-1.5 9B
About
Yi-1.5 9B, developed by 01.AI, is a large language model that builds on the original Yi series. Designed for interactive applications, it has 9 billion parameters and supports a context length of up to 16,000 tokens, allowing it to handle long, complex conversations. Its transformer architecture incorporates Grouped-Query Attention (GQA), SwiGLU activation, and Rotary Position Embedding (RoPE), which together make long-context processing efficient. The model is pre-trained on 500 billion tokens and fine-tuned on 3 million diverse samples, achieving strong performance in coding, mathematics, reasoning, and instruction following. Like other LLMs, it is prone to hallucinations and non-deterministic output, but these behaviors can be mitigated by tuning generation parameters such as temperature and top-p.
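To illustrate what "adjusting generation parameters" means in practice, here is a minimal, self-contained sketch of temperature scaling and nucleus (top-p) sampling over a vector of logits. This is a generic illustration of the sampling technique, not 01.AI's implementation; the function name `sample_token` and the toy logits are made up for the example. Lowering the temperature (or shrinking top-p) concentrates probability mass on the most likely tokens, making output more deterministic.

```python
import math
import random

def sample_token(logits, temperature=1.0, top_p=1.0, rng=None):
    """Sample a token index from raw logits using temperature scaling
    and nucleus (top-p) filtering. Hypothetical helper for illustration."""
    rng = rng or random.Random()
    # Temperature scaling: values < 1 sharpen the distribution
    # (more deterministic), values > 1 flatten it (more diverse).
    scaled = [l / max(temperature, 1e-6) for l in logits]
    # Numerically stable softmax.
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Nucleus (top-p) filtering: keep the smallest set of highest-probability
    # tokens whose cumulative probability reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= top_p:
            break
    # Renormalize over the kept tokens and draw one.
    kept_probs = [probs[i] for i in kept]
    r = rng.random() * sum(kept_probs)
    acc = 0.0
    for i, p in zip(kept, kept_probs):
        acc += p
        if r <= acc:
            return i
    return kept[-1]
```

With a very low temperature (e.g. 0.01) the same prompt yields the same token every time, which is the usual knob for taming non-determinism; raising temperature or top-p restores diversity at the cost of repeatability.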