LLM Reference

BLOOMZ 3B

About

BLOOMZ 3B is a multilingual large language model created by the BigScience workshop. With roughly 3 billion parameters and a decoder-only Transformer architecture, it can perform diverse tasks such as translation, summarization, and question answering across many languages. Fine-tuning on the xP3 dataset of multilingual instruction prompts gives it effective cross-lingual, zero-shot performance, though results depend noticeably on how prompts are phrased. This fine-tuning required significant computational resources: 128 A100 GPUs over 2,000 finetuning steps. The model offers efficient inference and zero-shot capabilities, but output quality can vary with prompt wording, and multilingual tasks may need task-specific adaptation.
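A minimal sketch of how the zero-shot, prompt-sensitive behavior described above is typically exercised, assuming the Hugging Face transformers library and the public `bigscience/bloomz-3b` checkpoint on the Hub; the prompt phrasing here is a hypothetical example, not an official template:

```python
# Sketch of zero-shot inference with BLOOMZ 3B via Hugging Face transformers.
# "bigscience/bloomz-3b" is the published Hub checkpoint ID; the prompt
# wording and generation settings below are illustrative assumptions.

def build_prompt(instruction: str, text: str) -> str:
    """Compose a plain natural-language instruction prompt.

    xP3-style prompts are ordinary sentences; BLOOMZ's output quality is
    sensitive to exactly how the instruction is phrased.
    """
    return f"{instruction}: {text}"

def generate(prompt: str, checkpoint: str = "bigscience/bloomz-3b") -> str:
    """Load the checkpoint and greedily decode a short completion."""
    # Imported here so the prompt helper can be used without the heavy
    # dependency (and without downloading several GB of weights).
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=40)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Usage would look like `generate(build_prompt("Translate to French", "I love reading."))`; trying several phrasings of the same instruction is a common way to probe the prompt sensitivity noted above.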

Capabilities

Multimodal · Function Calling · Tool Use · JSON Mode

Specifications

Family: BLOOMZ
Parameters: 3B
Architecture: Decoder-only
Specialization: General