
Mamba 1.4B on Replicate API

Mamba · State Spaces

Serverless

Capabilities

Vision · Multimodal · Reasoning · Function Calling · Tool Use · JSON Mode · Code Execution

About Mamba 1.4B

Mamba 1.4B is a large language model built on a state-space model (SSM) architecture designed for efficient processing of long sequences. Unlike Transformer models, whose attention cost grows quadratically, Mamba scales linearly with sequence length and can handle sequences of up to a million elements, thanks to a selective SSM layer that filters which token information is carried forward in the state. By forgoing the attention mechanism entirely, it achieves roughly 5x higher inference throughput than similarly sized Transformers. The model is optimized for NVIDIA GPUs and performs on par with larger models on many benchmarks, though it can trail larger, fine-tuned models on some downstream tasks. It was trained on the Pile dataset; reported training details vary between sources.
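The linear scaling described above comes from the SSM recurrence: the model carries a fixed-size hidden state forward one token at a time, so cost grows with sequence length rather than with its square. The toy scan below is a minimal sketch of that idea with scalar parameters; real Mamba uses input-dependent ("selective") parameters and a fused GPU scan, neither of which is shown here.

```python
# Toy diagonal state-space recurrence (illustration only; real Mamba
# uses input-dependent "selective" parameters and a hardware-aware scan).
def ssm_scan(x, a, b, c):
    """Sequentially update a hidden state h and emit one output per input.

    x: list of input scalars (length L)
    a, b, c: scalar SSM parameters (state decay, input gate, readout)
    Runs in O(L) time with O(1) state, unlike attention's O(L^2) cost.
    """
    h = 0.0
    ys = []
    for x_t in x:
        h = a * h + b * x_t   # state update: exponentially decaying memory
        ys.append(c * h)      # readout from the current state
    return ys
```

For example, `ssm_scan([1, 0, 0, 0], 0.5, 1.0, 1.0)` returns `[1.0, 0.5, 0.25, 0.125]`: the impulse at the first position decays geometrically through the state, showing how past tokens influence later outputs without any pairwise attention.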

Get Started
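A minimal sketch of calling a model through Replicate's HTTP predictions API is shown below. The request body shape (`version` plus an `input` object) follows Replicate's predictions endpoint; the version hash placeholder and the `max_tokens` input name are assumptions and should be taken from the model's page on Replicate. Authentication uses the `REPLICATE_API_TOKEN` environment variable.

```python
import json
import os

# Replicate's predictions endpoint; authenticate with your API token.
API_URL = "https://api.replicate.com/v1/predictions"

def build_prediction_request(version, prompt, max_tokens=256):
    """Build the JSON body for a Replicate prediction request.

    version: the model version hash from the model's Replicate page
    (the placeholder below is NOT a real hash). "max_tokens" is an
    assumed input name; check the model's input schema.
    """
    return {
        "version": version,
        "input": {"prompt": prompt, "max_tokens": max_tokens},
    }

body = build_prediction_request("<mamba-1.4b-version-hash>", "Hello, Mamba!")
payload = json.dumps(body)
headers = {
    "Authorization": f"Bearer {os.environ.get('REPLICATE_API_TOKEN', '')}",
    "Content-Type": "application/json",
}
# Send with your HTTP client of choice, e.g.:
# requests.post(API_URL, data=payload, headers=headers)
```

Replicate also ships an official `replicate` Python client that wraps this endpoint and polls the prediction until it completes, which is usually more convenient than raw HTTP.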

Model Specs

Released: 2023-12-01
Parameters: 1.4B
Architecture: Decoder-only
