XuanYuan 6B
About
XuanYuan 6B, released by Duxiaoman-DI, is a 6-billion-parameter large language model focused on financial applications alongside general chat. Its design follows the Llama architecture, incorporating RoPE positional embeddings and SwiGLU activation, with a hidden size of 4096 and 32 attention heads across 30 layers. Trained using Self-QA and RLHF, the model performs well across languages, excelling particularly in Chinese. Its capabilities span financial predictive modeling and general chat, with performance rivalling larger 70B-parameter models on various language tasks. It requires 12.8 GB of VRAM at full precision, but a 4-bit quantized variant needing only about 3 GB is available on platforms such as Hugging Face.
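The two VRAM figures above are consistent with a simple bytes-per-parameter estimate. A minimal sketch, assuming a parameter count of roughly 6.4 billion (inferred here from the stated 12.8 GB fp16 footprint at 2 bytes per weight; the "6B" label is rounded):

```python
def weight_vram_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight memory in decimal gigabytes.

    Counts only the model weights; KV cache and activations add more
    on top of this at inference time.
    """
    return num_params * bits_per_param / 8 / 1e9

# Assumed parameter count (~6.4e9), back-derived from the 12.8 GB figure.
params = 6.4e9

print(weight_vram_gb(params, 16))  # fp16: 12.8 GB, matching the figure above
print(weight_vram_gb(params, 4))   # 4-bit: 3.2 GB, close to the stated ~3 GB
```

This also explains why 4-bit quantization cuts the requirement to roughly a quarter: memory scales linearly with bits per parameter.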
Capabilities
Multimodal
Function Calling
Tool Use
JSON Mode
Specifications
Family: XuanYuan
Released: 2024-02-03
Parameters: 6B
Architecture: Decoder Only
Specialization: General