LLM Reference

Baichuan 13B Chat

About

Baichuan-13B-Chat is a 13-billion-parameter large language model from Baichuan Intelligent Technology, building on their earlier Baichuan-7B. It performs well on natural language processing tasks in both Chinese and English. Notable features include a training dataset of 1.4 trillion tokens (40% more than LLaMA-13B) and strong dialogue capabilities. Thanks to efficient inference, the model runs on consumer-grade GPUs such as the Nvidia RTX 3090. It uses ALiBi positional encoding with a context window of 4,096 tokens. Baichuan-13B-Chat is open source and commercially usable under the appropriate license. It is based on the transformer architecture, with hidden size, layer count, and attention-head details provided in its documentation. Its performance on various benchmarks is strong, surpassing other models of similar size.
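The ALiBi positional encoding mentioned above replaces learned position embeddings with a fixed linear penalty added to attention scores. The sketch below follows the scheme from the ALiBi paper for a power-of-two head count; it is an illustrative implementation, not Baichuan's exact code, and the helper names are hypothetical.

```python
import math

def alibi_slopes(num_heads: int) -> list[float]:
    # For a power-of-two head count, slopes form the geometric
    # sequence 2^(-8/n), 2^(-16/n), ..., one slope per head.
    start = 2.0 ** (-8.0 / num_heads)
    return [start ** (i + 1) for i in range(num_heads)]

def alibi_bias(slope: float, seq_len: int) -> list[list[float]]:
    # Causal bias for one head: query position i is penalized by
    # slope * (i - j) for each earlier (or equal) key position j,
    # so more distant tokens receive a larger negative bias.
    return [[-slope * (i - j) for j in range(i + 1)]
            for i in range(seq_len)]

# Example: 8 heads, 4-token sequence.
slopes = alibi_slopes(8)          # [0.5, 0.25, 0.125, ...]
bias = alibi_bias(slopes[0], 4)   # per-head lower-triangular bias
```

Because the bias depends only on token distance, a model trained this way can extrapolate to sequences longer than those seen during training, which is the usual motivation for choosing ALiBi.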

Capabilities

Multimodal
Function Calling
Tool Use
JSON Mode

Specifications

Family: Baichuan
Parameters: 13B
Architecture: Decoder Only
Specialization: General
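The claim that a 13B-parameter model fits on a consumer GPU like the 24 GB RTX 3090 can be checked with back-of-the-envelope arithmetic. The sketch below counts model weights only (the function name is illustrative); activations and the KV cache add further overhead on top of these figures.

```python
def model_memory_gb(num_params: float, bytes_per_param: float) -> float:
    # Weight-only footprint in GiB; ignores activations and KV cache.
    return num_params * bytes_per_param / 1024**3

params = 13e9
fp16_gb = model_memory_gb(params, 2)  # ~24 GiB: borderline on a 3090
int8_gb = model_memory_gb(params, 1)  # ~12 GiB: comfortable fit
```

This is why quantized (e.g. int8) inference is the typical route for running a 13B model on a single consumer card, while fp16 weights alone nearly exhaust a 24 GB GPU.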