BLOOMZ Models by BigScience
About
The BLOOMZ model family from the BigScience workshop is a set of multilingual models that follow instructions in many languages without additional task-specific training [1][2][3]. The models are fine-tuned versions of the BLOOM and mT5 pretrained models, trained on the cross-lingual task mixture (xP3) dataset [1][2][3]. The family spans sizes from 300 million to 176 billion parameters to fit different compute budgets [1][2][4]; the BLOOMZ variants listed below start at 560 million. The bloomz-mt variants are additionally fine-tuned on the xP3mt dataset to handle non-English prompts, and the family generalizes across languages and tasks, including machine translation, question answering, and text generation [1][4].
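As a rough illustration of the instruction-following use described above, the sketch below prompts the smallest listed variant through the Hugging Face `transformers` library. It assumes the checkpoint is published on the Hugging Face Hub as `bigscience/bloomz-560m` and that `transformers` and PyTorch are installed; `build_prompt` and `generate` are hypothetical helper names introduced here, not part of any BLOOMZ API.

```python
# Assumption: the 560M variant is available on the Hugging Face Hub under this ID.
MODEL_ID = "bigscience/bloomz-560m"


def build_prompt(instruction: str, text: str = "") -> str:
    """Join a natural-language instruction and optional input text into one prompt."""
    return f"{instruction.strip()} {text.strip()}".strip()


def generate(prompt: str, model_id: str = MODEL_ID, max_new_tokens: int = 32) -> str:
    """Run one zero-shot generation; downloads the weights on first use."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # The decoded string contains the prompt followed by the model's continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (not executed here because it downloads the model):
# print(generate(build_prompt("Translate to English:", "Je t'aime.")))
```

The prompt is plain text with no special template; stating the task explicitly and ending the prompt where the answer should begin tends to work better than an open-ended fragment.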
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| BLOOMZ 176B | Use when the workload needs 2k context and 176B parameters. | 2022-07 | 2k context, 176B parameters | Current |
| BLOOMZ 7.1B | Use when the workload needs 2k context and 7.1B parameters. | 2022-07 | 2k context, 7.1B parameters | Current |
| BLOOMZ 3B | Use when the workload needs 2k context and 3B parameters. | 2022-07 | 2k context, 3B parameters | Current |
| BLOOMZ 1.7B | Use when the workload needs 2k context and 1.7B parameters. | 2022-07 | 2k context, 1.7B parameters | Current |
| BLOOMZ 1.1B | Use when the workload needs 2k context and 1.1B parameters. | 2022-07 | 2k context, 1.1B parameters | Current |
| BLOOMZ 560M | Use when the workload needs 2k context and 560M parameters. | 2022-07 | 2k context, 560M parameters | Current |
Release Timeline
1 release group
Specifications (6 models)
| Model | Released | Context | Parameters |
|---|---|---|---|
| BLOOMZ 176B | 2022-07 | 2k | 176B |
| BLOOMZ 7.1B | 2022-07 | 2k | 7.1B |
| BLOOMZ 3B | 2022-07 | 2k | 3B |
| BLOOMZ 1.7B | 2022-07 | 2k | 1.7B |
| BLOOMZ 1.1B | 2022-07 | 2k | 1.1B |
| BLOOMZ 560M | 2022-07 | 2k | 560M |
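The parameter counts in the table translate into a rough lower bound on memory. The sketch below multiplies each count by the bytes needed per parameter at a given precision and picks the largest variant whose weights fit a budget. It covers weights only: activations, the attention cache, and framework overhead are not included, so real requirements are higher. The names `weight_gb` and `largest_fitting` are hypothetical helpers introduced for illustration.

```python
# Parameter counts copied from the specifications table above.
PARAMS = {
    "BLOOMZ 560M": 560e6,
    "BLOOMZ 1.1B": 1.1e9,
    "BLOOMZ 1.7B": 1.7e9,
    "BLOOMZ 3B": 3e9,
    "BLOOMZ 7.1B": 7.1e9,
    "BLOOMZ 176B": 176e9,
}

# Storage size of one parameter at common precisions.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_gb(n_params: float, dtype: str = "float16") -> float:
    """Approximate size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


def largest_fitting(budget_gb: float, dtype: str = "float16"):
    """Return the largest listed variant whose weights fit the budget, or None."""
    fitting = [
        (n_params, name)
        for name, n_params in PARAMS.items()
        if weight_gb(n_params, dtype) <= budget_gb
    ]
    return max(fitting)[1] if fitting else None


if __name__ == "__main__":
    for name, n_params in PARAMS.items():
        print(f"{name}: ~{weight_gb(n_params):.1f} GB of weights in float16")
```

For example, `largest_fitting(16)` selects BLOOMZ 7.1B, whose float16 weights come to about 14.2 GB, while the 176B variant needs roughly 352 GB for weights alone at the same precision.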
Frequently Asked Questions
- What is BLOOMZ used for?
- BLOOMZ is used for coding, chatbot, and role-playing use cases. The family description and the listed model capabilities point to those workloads as the best fit.
- How does BLOOMZ compare to MT0?
- BLOOMZ by BigScience is listed as strongest for coding, and MT0 by BigScience is the closest related family to compare for the same workload. BLOOMZ has 6 listed variants and reaches up to 2k context, while MT0 reaches up to 1k context, so compare the specs and pricing tables before choosing a production model.
- Which BLOOMZ model should I use?
- If price is the main constraint, check the pricing table first, keeping in mind that BLOOMZ does not have complete provider pricing in the local data. For the most capable choice in the local data, evaluate BLOOMZ 176B with 2k context.

