Jamba v0.1
About
Jamba v0.1, developed by AI21 Labs, is a large language model built on a hybrid architecture that interleaves Transformer and Mamba layers. Its mixture-of-experts (MoE) layers give it 52 billion parameters in total while activating only 12 billion per token, keeping inference cost closer to that of a much smaller dense model. The model supports a context length of 256K tokens, and the hybrid design yields higher throughput on long sequences than pure Transformer models of comparable size, since the Mamba layers avoid attention's growing key-value cache. Jamba v0.1 is a base model: it has no instruction tuning or safety moderation, so it can generate inappropriate outputs and requires fine-tuning for specific tasks. Even so, its ability to fit a 140K-token context on a single 80GB GPU makes it notable for long-input workloads.
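As a quick illustration, here is a minimal sketch of running the base model for plain text completion. It assumes the weights are available through the Hugging Face `transformers` library under the identifier `ai21labs/Jamba-v0.1` (an assumed Hub ID, not stated in this text) and that the installed `transformers` version supports the Jamba architecture; since the model is not instruction-tuned, it is prompted as a completion model rather than with a chat template.

```python
# Minimal sketch: loading Jamba v0.1 for text completion.
# Assumptions: the Hub identifier "ai21labs/Jamba-v0.1" and Jamba support
# in the installed transformers version; `accelerate` is needed for
# device_map="auto".
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ai21labs/Jamba-v0.1"  # assumed Hugging Face Hub identifier

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision helps fit long contexts on one 80GB GPU
    device_map="auto",           # spread layers across available devices
)

# Base model, no instruction tuning: prompt it as a plain completion model.
prompt = "The hybrid Transformer-Mamba architecture is useful because"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Because only 12B of the 52B parameters are active per token, the compute per generated token resembles that of a 12B dense model, though all 52B parameters must still fit in memory.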