LLM Reference

About

Mamba-2 is a state-space model (SSM) architecture that improves on Mamba-1 in both performance and efficiency, positioning it as a strong competitor to transformer-based LLMs. At its core is the Structured State Space Duality (SSD) framework, which establishes a theoretical bridge between SSMs and attention mechanisms. This duality enables two computational modes: an SSM (recurrent) mode suited to fast autoregressive inference, and an attention mode that exploits the optimized matrix multiplications of modern hardware for efficient training. The SSD layer makes Mamba-2 faster to train than Mamba-1 while matching or exceeding it on benchmarks, particularly those involving long sequences and associative recall. Pre-trained Mamba-2 models range from 130 million to 2.7 billion parameters, trained on datasets such as the Pile and SlimPajama, underscoring the architecture's versatility and scalability.
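The two modes compute the same sequence transformation. As a rough illustration only, the NumPy sketch below is a hypothetical scalar-state simplification (not Mamba-2's multi-head, blocked implementation): it evaluates a time-varying linear recurrence once as a sequential scan and once as a single matrix multiply against the equivalent lower-triangular matrix.

import numpy as np

def ssm_recurrent(a, b, c, x):
    # Recurrent (SSM) mode: h_t = a_t * h_{t-1} + b_t * x_t, y_t = c_t * h_t.
    # Linear time with constant state, which is what makes
    # autoregressive inference cheap.
    h = 0.0
    y = np.empty_like(x)
    for t in range(len(x)):
        h = a[t] * h + b[t] * x[t]
        y[t] = c[t] * h
    return y

def ssm_attention(a, b, c, x):
    # Dual "attention" mode: materialize the lower-triangular matrix
    # M[i, j] = c_i * (a_{j+1} * ... * a_i) * b_j and apply it as one
    # matmul, the form that maps onto hardware-efficient training.
    T = len(x)
    M = np.zeros((T, T))
    for i in range(T):
        for j in range(i + 1):
            decay = np.prod(a[j + 1 : i + 1])  # empty product is 1.0 when j == i
            M[i, j] = c[i] * decay * b[j]
    return M @ x

# Both modes agree on random inputs (values chosen arbitrarily for the demo).
rng = np.random.default_rng(0)
a, b, c, x = (rng.uniform(0.1, 0.9, size=8) for _ in range(4))
assert np.allclose(ssm_recurrent(a, b, c, x), ssm_attention(a, b, c, x))

In the real architecture the recurrence runs over matrix-valued states per head, and the attention form is computed blockwise rather than by materializing the full matrix, but the equivalence shown here is the core of the duality.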

Models (5)

Details

Researcher: State Spaces
Models: 5