Mamba 2 370M
Mamba 2 370M has tracked model metadata but no tracked provider pricing, which keeps it from being a default production pick.
Use it for
- Teams evaluating general LLM work
- Workloads that can use a 2k context window
Do not use it for
- Cost-sensitive launches that need sourced token pricing
- Vision or document-understanding workloads
- Strict JSON or tool-calling flows
- Family: Mamba 2
- Released: 2023-12-08
- Context: 2k
- Parameters: 370M
- Architecture: Decoder-only
- Knowledge cutoff: 2020
- Specialization: general
- Training: finetuned
No tracked provider token pricing is available yet.
About
Mamba 2 370M is a 370-million-parameter language model built on Structured State Space Models (SSMs), a recurrent alternative to attention for processing long sequences. Its inference cost grows linearly with sequence length, whereas attention in a standard transformer grows quadratically, so it is faster on long inputs. It is reported to handle contexts of up to 256,000 tokens (the tracked metadata lists 2k), an advance for RNN-based long-context modeling, but it can suffer state collapse on overly long sequences and needs extensive training data. For its compact size it is described as performing well across a range of NLP tasks. Its architecture rests on state space duality, which relates the recurrent form of an SSM to an equivalent matrix form and contributes to its efficiency.
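The linear-cost claim and the duality idea can be illustrated with a toy example. The following is a minimal sketch, assuming a scalar hidden state and per-step coefficients; it is not the Mamba 2 implementation, and the function names `ssm_scan` and `ssm_matrix` are hypothetical. The first function walks the sequence once with a fixed-size state (linear time), and the second computes the same outputs as an explicit lower-triangular sum (quadratic or worse), which is the kind of equivalence that state space duality refers to.

```python
from typing import List


def ssm_scan(a: List[float], b: List[float], c: List[float], x: List[float]) -> List[float]:
    """Recurrent form: h_t = a_t * h_{t-1} + b_t * x_t, y_t = c_t * h_t.

    One pass over the input with a single fixed-size state value.
    """
    h = 0.0  # state size does not depend on sequence length
    y = []
    for a_t, b_t, c_t, x_t in zip(a, b, c, x):
        h = a_t * h + b_t * x_t
        y.append(c_t * h)
    return y


def ssm_matrix(a: List[float], b: List[float], c: List[float], x: List[float]) -> List[float]:
    """Matrix form: y_i = sum over j <= i of c_i * (a_{j+1} * ... * a_i) * b_j * x_j.

    Deliberately naive; it exists only to show it matches the recurrence.
    """
    y = []
    for i in range(len(x)):
        total = 0.0
        for j in range(i + 1):
            decay = 1.0
            for k in range(j + 1, i + 1):
                decay *= a[k]  # accumulated decay between step j and step i
            total += c[i] * decay * b[j] * x[j]
        y.append(total)
    return y


if __name__ == "__main__":
    a = [0.5, 0.5, 0.5]  # illustrative decay coefficients (assumed values)
    b = [1.0, 1.0, 1.0]
    c = [1.0, 1.0, 1.0]
    x = [1.0, 2.0, 3.0]
    print(ssm_scan(a, b, c, x))    # [1.0, 2.5, 4.25]
    print(ssm_matrix(a, b, c, x))  # [1.0, 2.5, 4.25]
```

The recurrent form is what makes per-token inference cheap, since each new token only updates the state. The matrix form is the view that lends itself to parallel training on hardware tuned for matrix multiplication.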
Mamba 2 370M is a proprietary model in the Mamba 2 family. The structured metadata tracks a 2k-token context window. No headline benchmark score is tracked for Mamba 2 370M yet.
Top use-case fit
No primary decision-task fit is mapped for this model yet.
Provider price ladder
No tracked provider token pricing is available for this model yet.
Capabilities
No model capability flags are currently sourced.
Benchmark peer bars for Coding
No task-mapped benchmark peers are available for this model yet.
Migration checks
No linked migration route is available for this model yet.