Yi-1.5 6B
About
The Yi-1.5 6B model, developed by 01.AI, is an upgraded version of the original Yi model with 6 billion parameters. Bilingual in English and Chinese, it was continually pre-trained on a high-quality corpus of 500 billion tokens on top of Yi and fine-tuned on 3 million diverse samples. Built on the Transformer architecture, it combines Grouped-Query Attention (GQA), the SwiGLU activation, and Rotary Position Embedding (RoPE) with Adjusted Base Frequency (ABF) for efficient long-context processing, supporting a default context window of 4,096 tokens.

The model excels at coding, math, reasoning, and instruction following, though noted limitations include a tendency to hallucinate and weaker knowledge of fiction than its predecessor. Released under the Apache 2.0 license, Yi-1.5 6B supports both academic and commercial use, and quantized versions allow deployment on consumer-grade hardware, including resource-constrained devices such as smartphones.
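To illustrate the Grouped-Query Attention mentioned above, here is a minimal numpy sketch: several query heads share a single key/value head, which shrinks the KV cache during inference. The head counts and dimensions below are illustrative placeholders, not Yi-1.5 6B's actual configuration.

```python
import numpy as np

# Illustrative sizes (NOT Yi-1.5 6B's real config): 8 query heads
# share 2 KV heads, so each KV head serves a group of 4 query heads.
n_q_heads, n_kv_heads, d_head, seq = 8, 2, 16, 5
group = n_q_heads // n_kv_heads

rng = np.random.default_rng(0)
q = rng.standard_normal((n_q_heads, seq, d_head))
k = rng.standard_normal((n_kv_heads, seq, d_head))  # fewer KV heads -> smaller KV cache
v = rng.standard_normal((n_kv_heads, seq, d_head))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

outs = []
for h in range(n_q_heads):
    kv = h // group  # map each query head to its shared KV head
    scores = q[h] @ k[kv].T / np.sqrt(d_head)
    outs.append(softmax(scores) @ v[kv])
out = np.stack(outs)
print(out.shape)  # same output shape as full multi-head attention
```

With `n_kv_heads == n_q_heads` this reduces to standard multi-head attention, and with `n_kv_heads == 1` to multi-query attention; GQA sits between the two, trading a small quality cost for a much smaller KV cache.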