Zamba

About

The Zamba family of large language models (LLMs), developed by Zyphra, combines state-space model (SSM) blocks such as Mamba with transformer attention blocks [1, 2, 4]. This hybrid design balances performance against efficiency, which lets the models run on a range of hardware, including consumer-grade GPUs [4, 8, 12]. The initial model, Zamba-7B-v1, trained on a large dataset, laid the groundwork for later iterations such as Zamba2-7B and Zamba2-2.7B, which introduced enhancements such as Mamba2 blocks and shared attention mechanisms to boost performance [8, 13]. The models are built for general-purpose use: they are not tailored for chat-specific applications and do not include moderation features [2, 8].
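To illustrate why sharing an attention block across a mostly-SSM stack helps efficiency, the following is a minimal parameter-accounting sketch. It is not Zyphra's implementation: the block sizes, the `attn_every` spacing, and the names `BlockSizes`, `layer_plan`, and `param_count` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BlockSizes:
    """Hypothetical per-block parameter counts (not Zamba's real sizes)."""
    ssm_params: int = 1_000_000
    attn_params: int = 4_000_000


def layer_plan(n_ssm_blocks: int, attn_every: int) -> List[str]:
    """Order of blocks: SSM blocks, with an attention call after every
    `attn_every` SSM blocks (assumed spacing, for illustration only)."""
    plan = []
    for i in range(1, n_ssm_blocks + 1):
        plan.append("ssm")
        if i % attn_every == 0:
            plan.append("shared_attn")
    return plan


def param_count(plan: List[str], sizes: BlockSizes, share_attention: bool) -> int:
    """Count stored parameters for a plan, with or without weight sharing."""
    n_ssm = plan.count("ssm")
    n_attn_calls = plan.count("shared_attn")
    # With sharing, one set of attention weights is stored no matter how
    # many times the block is invoked; without it, each call has its own copy.
    n_attn_copies = min(1, n_attn_calls) if share_attention else n_attn_calls
    return n_ssm * sizes.ssm_params + n_attn_copies * sizes.attn_params


plan = layer_plan(n_ssm_blocks=12, attn_every=6)
sizes = BlockSizes()
shared = param_count(plan, sizes, share_attention=True)
unshared = param_count(plan, sizes, share_attention=False)
print(plan.count("shared_attn"), shared, unshared)  # 2 16000000 20000000
```

In this toy setup the attention block is invoked twice but stored once, so the shared variant holds fewer parameters than the unshared one while performing the same number of attention calls. The real savings depend on the actual model dimensions, which are not given here.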

Details

Researcher: Zyphra
Models: 1