LLM Reference

Zamba 2 Models by Zyphra

6 models · 2024 · Up to 16k ctx

About

The Zamba 2 family of large language models (LLMs) by Zyphra combines state-space Mamba blocks with transformer blocks, targeting strong performance in low-resource settings. The architecture uses a Mamba backbone interleaved with transformer blocks, which reduces parameter count and memory use. Enhancements over the previous version include new Mamba2 blocks, dual interleaved attention layers, and LoRA projectors that adapt the MLPs. The family includes 1.2B, 2.7B, and 7B models, with the 7B version reported to perform well against peers of similar scale. Zamba2-7B-Instruct, a fine-tuned variant, extends the context length to 16,000 tokens for instruction-following tasks. All models are open-source under the Apache 2.0 license.
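To make the low-resource claim more concrete, the following minimal sketch estimates weight-only memory from the nominal parameter counts listed on this page. The helper name `weight_memory_gb` and the precision table are illustrative assumptions, not part of any Zyphra tooling, and the estimate ignores activations, recurrent state, and attention caches.

```python
# Nominal parameter counts (in billions) as listed for the Zamba 2 family.
NOMINAL_PARAMS_B = {
    "Zamba2 1.2B": 1.2,
    "Zamba2 2.7B": 2.7,
    "Zamba2 7B": 7.0,
}

# Bytes per parameter for common storage precisions.
BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "int8": 1}


def weight_memory_gb(params_billion: float, precision: str = "bf16") -> float:
    """Approximate weight-only memory in GB (1 GB = 1e9 bytes).

    Hypothetical helper: billions of parameters times bytes per parameter
    gives gigabytes directly. Real usage will be higher at runtime.
    """
    return params_billion * BYTES_PER_PARAM[precision]


if __name__ == "__main__":
    for name, params in NOMINAL_PARAMS_B.items():
        print(f"{name}: ~{weight_memory_gb(params):.1f} GB of weights in bf16")
```

Because the listed sizes are nominal, treat the results as rough lower bounds when sizing hardware.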

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

6 in view

Zamba2 7B (Current)
Use when the workload needs 4k context and 7B parameters.
2024-06 · 4k context · 7B parameters

Zamba2 7B Instruct (Current)
Use when the workload needs 16k context and 7B parameters.
2024-06 · 16k context · 7B parameters

Zamba2 2.7B (Current)
Use when the workload needs 4k context and 2.7B parameters.
2024-06 · 4k context · 2.7B parameters

Zamba2 2.7B Instruct (Current)
Use when the workload needs 4k context and 2.7B parameters.
2024-06 · 4k context · 2.7B parameters

Zamba2 1.2B (Current)
Use when the workload needs 4k context and 1.2B parameters.
2024-06 · 4k context · 1.2B parameters

Zamba2 1.2B Instruct (Current)
Use when the workload needs 4k context and 1.2B parameters.
2024-06 · 4k context · 1.2B parameters

Release Timeline

1 release group
2024-06 (6 current)
Zamba2 1.2B · 4k context · 1.2B parameters · Current
Zamba2 1.2B Instruct · 4k context · 1.2B parameters · Current
Zamba2 2.7B · 4k context · 2.7B parameters · Current
Zamba2 2.7B Instruct · 4k context · 2.7B parameters · Current
Zamba2 7B · 4k context · 7B parameters · Current
Zamba2 7B Instruct · 16k context · 7B parameters · Current

Specifications (6 models)

Zamba 2 model specifications comparison
Model                   Released   Context   Parameters
Zamba2 7B               2024-06    4k        7B
Zamba2 7B Instruct      2024-06    16k       7B
Zamba2 2.7B             2024-06    4k        2.7B
Zamba2 2.7B Instruct    2024-06    4k        2.7B
Zamba2 1.2B             2024-06    4k        1.2B
Zamba2 1.2B Instruct    2024-06    4k        1.2B

Frequently Asked Questions

What is Zamba 2 used for?
Zamba 2 is aimed at language-model workloads in low-resource settings, where its hybrid of state-space Mamba blocks and transformer blocks keeps parameter count and memory use down. The Instruct variants are fine-tuned for instruction-following tasks.
How does Zamba 2 compare to ZAYA1?
Both families are from Zyphra. Zamba 2 suits the use cases listed on this page, while ZAYA1 is the closest related family to consider for reasoning workloads. Zamba 2 has 6 listed variants and reaches up to 16k context; ZAYA1 reaches up to 33k context. Compare the specs and pricing tables before choosing a production model.
Which Zamba 2 model should I use?
Provider pricing for Zamba 2 is incomplete in the local data, so if price is the main constraint, check the pricing table before deciding. For the most capable local choice, evaluate Zamba2 7B Instruct, which offers 16k context.
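The use-when guidance above can be sketched as a small selection helper. This is a minimal illustration built only from the context and parameter figures listed on this page; the `Variant` class and `pick_variant` function are hypothetical names, and the rule of preferring the largest model that fits is an assumption rather than Zyphra's recommendation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    name: str
    context_k: int    # context window in thousands of tokens, as listed
    params_b: float   # nominal parameters in billions, as listed
    instruct: bool    # True for the fine-tuned Instruct variants


# Values copied from the specifications table on this page.
VARIANTS = [
    Variant("Zamba2 7B", 4, 7.0, False),
    Variant("Zamba2 7B Instruct", 16, 7.0, True),
    Variant("Zamba2 2.7B", 4, 2.7, False),
    Variant("Zamba2 2.7B Instruct", 4, 2.7, True),
    Variant("Zamba2 1.2B", 4, 1.2, False),
    Variant("Zamba2 1.2B Instruct", 4, 1.2, True),
]


def pick_variant(
    min_context_k: int, max_params_b: float, instruct: bool
) -> Optional[Variant]:
    """Return the largest listed variant meeting the constraints, or None.

    Assumption: among variants that fit, the one with the most
    parameters is preferred.
    """
    candidates = [
        v
        for v in VARIANTS
        if v.context_k >= min_context_k
        and v.params_b <= max_params_b
        and v.instruct == instruct
    ]
    return max(candidates, key=lambda v: v.params_b, default=None)


if __name__ == "__main__":
    choice = pick_variant(min_context_k=16, max_params_b=7.0, instruct=True)
    print(choice.name if choice else "no listed variant fits")
```

A request for 16k context under a 3B parameter budget returns `None`, since only the 7B Instruct variant is listed with 16k context.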

Models (6)