LLM Reference

BLOOMZ 560M

About

BLOOMZ 560M is a multilingual large language model from the BigScience research workshop. It follows instructions across many languages zero-shot, without task-specific training. It was created by fine-tuning BLOOM on the cross-lingual xP3 task mixture (the same recipe, applied to mT5, produced the sibling mT0 models), and it shows strong cross-lingual generalization as a result. The model is a decoder-only transformer that generates text continuations, producing coherent output in dozens of natural languages and several programming languages. It handles tasks such as translation, creative writing, and question answering, with performance depending heavily on clear, explicit prompts. At 560 million parameters, BLOOMZ 560M has modest VRAM requirements and is released under the bigscience-bloom-rail-1.0 license. Prompting in English is recommended, though a fine-tuned variant optimized for chatbot use in French and English may perform better in those languages.
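As a minimal sketch of zero-shot prompting, assuming the `transformers` library is installed and using the public `bigscience/bloomz-560m` checkpoint from the Hugging Face Hub:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "bigscience/bloomz-560m"  # public checkpoint, ~1 GB download
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Zero-shot instruction following: the decoder-only model simply
# continues the prompt, so the instruction and the answer share one text.
prompt = "Translate to French: I love you."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because the model treats the instruction as text to continue, ending the prompt unambiguously (as above, with the full sentence to translate) matters more than with chat-tuned models.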

Capabilities

Multimodal
Function Calling
Tool Use
JSON Mode

Specifications

Family: BLOOMZ
Parameters: 560M
Architecture: Decoder Only
Specialization: general