
BLOOMZ
About
The BLOOMZ model family, developed by the BigScience workshop, is a set of multilingual models that can follow instructions across dozens of languages without task-specific training. These models are fine-tuned versions of the BLOOM and mT5 architectures, trained on the crosslingual task mixture (xP3) dataset to strengthen their multilingual capabilities. BLOOMZ spans a range of sizes, from 300 million to 176 billion parameters, to accommodate different computational budgets. The bloomz-mt variants are additionally fine-tuned on xP3mt, a machine-translated version of xP3, to better handle non-English prompts, further improving cross-lingual generalization across tasks such as machine translation, question answering, and text generation.