Yi-1.5 6B
About
The Yi-1.5 6B model, developed by 01.AI, is an upgraded version of the original Yi model with 6 billion parameters. Bilingual in English and Chinese, it was continually pre-trained on a high-quality corpus of 500 billion tokens on top of Yi and fine-tuned on 3 million diverse samples. Built on the Transformer architecture, it combines Grouped-Query Attention (GQA), the SwiGLU activation, and Rotary Position Embedding (RoPE) with Adjusted Base Frequency (ABF) for efficient long-context processing, supporting a default context window of 4,096 tokens.

The model excels at coding, math, reasoning, and instruction following, though noted limitations include a tendency to hallucinate and weaker knowledge of fiction than its predecessor. Released under the Apache 2.0 license, Yi-1.5 6B supports both academic and commercial use, and quantized versions allow deployment on consumer-grade hardware, including resource-constrained devices such as smartphones.
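To illustrate the Grouped-Query Attention mentioned above, here is a minimal numpy sketch: several query heads share a single key/value head, which shrinks the KV cache during inference. The head counts and dimensions below are illustrative placeholders, not Yi-1.5 6B's actual configuration.

```python
import numpy as np

# Illustrative sizes (NOT Yi-1.5 6B's real config): 8 query heads
# share 2 KV heads, so each KV head serves a group of 4 query heads.
n_q_heads, n_kv_heads, d_head, seq = 8, 2, 16, 5
group = n_q_heads // n_kv_heads

rng = np.random.default_rng(0)
q = rng.standard_normal((n_q_heads, seq, d_head))
k = rng.standard_normal((n_kv_heads, seq, d_head))  # fewer KV heads -> smaller KV cache
v = rng.standard_normal((n_kv_heads, seq, d_head))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

outs = []
for h in range(n_q_heads):
    kv = h // group  # map each query head to its shared KV head
    scores = q[h] @ k[kv].T / np.sqrt(d_head)
    outs.append(softmax(scores) @ v[kv])
out = np.stack(outs)
print(out.shape)  # same output shape as full multi-head attention
```

With `n_kv_heads == n_q_heads` this reduces to standard multi-head attention, and with `n_kv_heads == 1` to multi-query attention; GQA sits between the two, trading a small quality cost for a much smaller KV cache.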