Baichuan 13B
About
Baichuan 13B, developed by Baichuan Intelligent Technology, is an open-source large language model (LLM) notable for its strong performance and availability for commercial use. With 13 billion parameters, it surpasses its predecessor, Baichuan 7B, on both Chinese and English benchmarks.

The model uses a Transformer architecture with ALiBi (Attention with Linear Biases) positional encoding, which speeds up inference, and supports a context window of 4,096 tokens. It was trained on 1.4 trillion tokens, roughly 40% more than LLaMA-13B. Baichuan 13B is released in both a pre-trained version (Baichuan-13B-Base) and an aligned version (Baichuan-13B-Chat), along with quantized INT8 and INT4 variants for efficient deployment on consumer-grade GPUs.

Despite its capabilities, users should be aware of potential biases inherited from the training data and of limited domain-specific knowledge. Commercial use also requires obtaining permission from the developer.
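To give a sense of what ALiBi does, the sketch below computes the per-head slopes and the linear bias matrix added to attention scores, following the general ALiBi scheme (a minimal pure-Python illustration assuming a power-of-two head count; it is not Baichuan's actual implementation).

```python
def alibi_slopes(num_heads):
    # Geometric sequence of per-head slopes: 2^(-8/n), 2^(-16/n), ...
    # (assumes num_heads is a power of two, as in the original ALiBi scheme)
    return [2 ** (-8 * (i + 1) / num_heads) for i in range(num_heads)]

def alibi_bias(num_heads, seq_len):
    # bias[h][i][j] = -slope_h * (i - j): a penalty that grows linearly
    # with query-key distance, added to attention scores in place of
    # learned positional embeddings
    slopes = alibi_slopes(num_heads)
    return [
        [[-s * (i - j) for j in range(seq_len)] for i in range(seq_len)]
        for s in slopes
    ]
```

Because the bias depends only on relative distance, no positional embedding lookup is needed at inference time, which is the source of the speed advantage mentioned above.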