LLM Reference
Zamba 2

About

The Zamba 2 family of large language models (LLMs) by Zyphra combines state-space Mamba blocks with transformer blocks to achieve strong performance, especially in low-resource settings. The architecture is built on a Mamba backbone that is interleaved with shared transformer blocks to reduce parameter count and conserve memory. Improvements over the first Zamba generation include new Mamba2 blocks, dual interleaved shared attention blocks, and LoRA projectors that tailor the shared MLPs to individual layers. The family includes models such as Zamba2-2.7B and Zamba2-7B, with the larger 7B version performing exceptionally well against peers of similar scale. Zamba2-7B-Instruct, an instruction-tuned variant, extends the context length to 16,000 tokens and strengthens performance on instruction-following tasks. All models are open source under the Apache 2.0 license, further promoting accessibility and innovation.
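The interleaving idea described above can be illustrated with a toy layer schedule. This is only a sketch of the general hybrid pattern, not Zyphra's actual layer layout: the function name `hybrid_schedule` and the `attn_every` parameter are assumptions made for illustration, and the real Zamba2 block-sharing scheme differs in its details.

```python
def hybrid_schedule(n_layers: int, attn_every: int = 6) -> list[str]:
    """Toy schedule for a Mamba/transformer hybrid stack.

    Most layers are Mamba2 state-space blocks; every `attn_every`-th
    position reuses a single shared attention block. Sharing one
    attention block across positions is what saves parameters.
    (Illustrative only -- not Zamba2's exact schedule.)
    """
    layers = []
    for i in range(n_layers):
        if i % attn_every == attn_every - 1:
            layers.append("shared_attention")  # reused transformer block
        else:
            layers.append("mamba2")            # state-space backbone block
    return layers


# A 12-layer toy stack would place shared attention at positions 5 and 11:
schedule = hybrid_schedule(12)
```

Because the attention block is shared rather than duplicated, its parameter cost is paid once no matter how many times it appears in the schedule; per-layer LoRA projectors then let each occurrence specialize cheaply.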

Details

Researcher: Zyphra
Models: 6