LLM Reference

Baichuan 13B Chat

About

Baichuan-13B-Chat is a 13-billion-parameter large language model from Baichuan Intelligent Technology, building on their earlier Baichuan-7B model. It targets natural language processing tasks in both Chinese and English. Notable features include a training corpus of 1.4 trillion tokens, 40% more than LLaMA-13B, and strong dialogue capabilities. Thanks to efficient inference, the model can run on consumer-grade GPUs such as the Nvidia RTX 3090. It employs ALiBi positional encoding with a context window of 4,096 tokens. Baichuan-13B-Chat is open source and commercially usable under the appropriate license, and is based on the transformer architecture, with hidden size, layer count, and attention-head details provided in its documentation. It performs well on a range of benchmarks, reportedly surpassing other models of similar size.
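The ALiBi positional encoding mentioned above replaces learned position embeddings with a fixed, head-specific linear penalty on attention scores that grows with query–key distance. A minimal NumPy sketch of how such a bias matrix is typically constructed (the function name and the assumption that `n_heads` is a power of two are illustrative, not from Baichuan's own code):

```python
import numpy as np

def alibi_bias(n_heads: int, seq_len: int) -> np.ndarray:
    """Build per-head ALiBi attention biases of shape (n_heads, seq_len, seq_len).

    Each head h gets a slope m_h from a geometric sequence; the bias added to
    the attention score for query position i attending to key position j is
    m_h * (j - i), i.e. a linear penalty proportional to how far back j lies.
    """
    # Head slopes: 2^(-8*1/n), 2^(-8*2/n), ..., 2^(-8) (standard ALiBi recipe
    # when n_heads is a power of two).
    slopes = 2.0 ** (-8.0 * np.arange(1, n_heads + 1) / n_heads)

    pos = np.arange(seq_len)
    rel = pos[None, :] - pos[:, None]   # rel[i, j] = j - i (negative behind the query)
    rel = np.minimum(rel, 0)            # causal attention only looks backward

    # Broadcast slopes over the (seq_len, seq_len) relative-distance grid.
    return slopes[:, None, None] * rel.astype(np.float64)

bias = alibi_bias(n_heads=8, seq_len=4)
print(bias.shape)        # (8, 4, 4)
print(bias[0, 1, 0])     # -0.5: head 0 has slope 1/2, key is 1 step back
```

Because the penalty is a function of distance alone, the same formula extrapolates to sequence lengths longer than those seen in training, which is the usual motivation for ALiBi.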

Capabilities

Vision · Multimodal · Reasoning · Function Calling · Tool Use · Structured Outputs · Code Execution

Specifications

Family: Baichuan
Released: 2023-06-15
Parameters: 13B
Architecture: Decoder-only
Specialization: General
Training: Fine-tuning

Created by

Baichuan Intelligent Technology
Open-source AI with massive context

Beijing, China
Founded 2023