LLM Reference

SEA-LION 3B

About

SEA-LION 3B is a large language model developed by AI Singapore, aimed at improving natural language processing for Southeast Asian languages. It uses the MPT architecture together with a custom SEABPETokenizer built around a 256K-token vocabulary tuned for SEA languages. Trained on 980 billion tokens drawn from English and Southeast Asian sources, it targets text generation tasks such as translation and summarization. While notable for its SEA language coverage, it is released as an open base model: safety tuning is left to downstream users, and performance can vary outside its training scope. Further details are available on the Hugging Face model card.
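A minimal sketch of loading the model with Hugging Face `transformers` is shown below. The model ID `aisingapore/sea-lion-3b` and the example prompt are assumptions taken from the public Hugging Face listing; verify them against the model card. `trust_remote_code=True` is needed because the MPT-based architecture ships custom modeling code.

```python
# Sketch: text generation with SEA-LION 3B via Hugging Face transformers.
# MODEL_ID is an assumption from the public listing; confirm on the model card.

MODEL_ID = "aisingapore/sea-lion-3b"

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Generate a greedy continuation of `prompt`.

    Note: the first call downloads the ~3B-parameter weights.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code is required for the custom MPT modeling files.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)

    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

As a base (non-instruction-tuned) model, it is best prompted with plain text to continue, e.g. `generate("Terjemahkan ke Bahasa Inggris: Selamat pagi.")`, rather than chat-style instructions.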

Capabilities

Multimodal, Function Calling, Tool Use, JSON Mode

Specifications

Family: SEA-LION
Parameters: 3B
Architecture: Decoder Only
Specialization: general