LLM Reference

MT0 Base

About

MT0 Base is a multilingual text-to-text transformer from the BLOOMZ and mT0 family of models, which follow human instructions in dozens of languages zero-shot, without task-specific training. It was created by fine-tuning the pretrained mt5-base model on BigScience's xP3 dataset, a crosslingual mixture of tasks. This training recipe lets the model generalize to tasks and languages it has not seen before. Prompting in English is recommended, but the model also performs well in many other languages. Output quality improves markedly with careful prompt engineering: clear, well-structured prompts lead to better results.
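For orientation, here is a minimal usage sketch with the Hugging Face transformers library, assuming the bigscience/mt0-base checkpoint and a CPU-only setup; it follows the standard seq2seq generation flow rather than any MT0-specific API.

```python
# Minimal zero-shot prompt with mt0-base via Hugging Face transformers.
# Assumes the bigscience/mt0-base checkpoint; runs on CPU.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "bigscience/mt0-base"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# A clear, structured instruction; English prompts are recommended.
prompt = "Translate to English: Je t'aime."
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```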

Capabilities

Multimodal: Not supported
Function Calling: Not supported
Tool Use: Not supported
JSON Mode: Not supported

Specifications

Family: MT0
Parameters: 580M
Architecture: Encoder-Decoder
Specialization: General
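To confirm the 580M parameter figure locally, a short check against the loaded model works; this sketch assumes the same bigscience/mt0-base checkpoint as above.

```python
# Count the parameters of the loaded model (~580M for mt0-base).
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("bigscience/mt0-base")
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M parameters")
```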