LLM Reference
Replicate API

Mamba 370M on Replicate API

Mamba · State Spaces

Serverless

Last refreshed 2026-04-19. Next refresh: weekly.

Why use Mamba 370M on Replicate API?

Replicate API offers Mamba 370M with competitive pricing. Replicate is a cloud-based platform that enables users to run machine learning models easily and efficiently.

Input / 1M: -
Output / 1M: -
Cache: Not sourced
Batch: Not sourced

Setup recipe

Python + curl
Install
pip install replicate
Auth
export REPLICATE_API_TOKEN=...
Call
import replicate
output = replicate.run(
    "mamba-370m",
    input={"prompt": "Hello"}
)
Model ID
mamba-370m

Request example

import replicate

# reads REPLICATE_API_TOKEN from env
# Model ID format: "owner/model-name" (latest version) or "owner/model-name:version-hash"
output = replicate.run(
    "mamba-370m",
    input={"prompt": "Hello"}
)
# Output is a list or generator depending on the model
print("".join(output))
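
Because the output may be a list or a generator depending on the model, a small helper can normalize it before printing. The following is a minimal sketch, not part of the Replicate client: output_to_text is a hypothetical helper name, and it assumes each chunk converts cleanly to a string.

```python
from collections.abc import Iterable


def output_to_text(output):
    """Collapse a model output into a single string.

    Hypothetical helper: accepts None, a plain string, bytes,
    or any iterable of chunks (list, generator, iterator).
    """
    if output is None:
        return ""
    if isinstance(output, bytes):
        return output.decode("utf-8")
    if isinstance(output, str):
        # A plain string is already the final text; do not iterate over it.
        return output
    if isinstance(output, Iterable):
        # Lists and generators are joined chunk by chunk.
        return "".join(str(chunk) for chunk in output)
    return str(output)
```

With this helper, print(output_to_text(output)) behaves the same whether the call returned a list of chunks or a streaming generator.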

Gotchas

  • Replicate uses "owner/model-name" format (e.g. "meta/meta-llama-3-8b-instruct") for the latest version, or "owner/model-name:version-sha" to pin to a specific version. The REST endpoint splits owner and model-name into the path: /v1/models/{owner}/{model-name}/predictions.
  • The examples read the API token from REPLICATE_API_TOKEN. If you store the token under a different variable name, map that variable to REPLICATE_API_TOKEN in your application config.
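
The identifier format described above can be sketched as a small parser that builds the REST request without sending it. This is an illustration under stated assumptions: parse_model_ref and build_prediction_request are hypothetical helper names, the base URL https://api.replicate.com/v1 and the Bearer authorization header are assumed from Replicate's HTTP API, and routing a pinned version to /v1/predictions with a "version" field is likewise an assumption to check against the official reference.

```python
import json
import urllib.request

API_ROOT = "https://api.replicate.com/v1"  # assumed base URL


def parse_model_ref(ref):
    """Split 'owner/name' or 'owner/name:version' into (owner, name, version)."""
    path, _, version = ref.partition(":")
    owner, sep, name = path.partition("/")
    if not sep or not owner or not name or "/" in name:
        raise ValueError(f"expected 'owner/model-name[:version]', got {ref!r}")
    return owner, name, version or None


def build_prediction_request(ref, model_input, token):
    """Build (but do not send) a POST request for a prediction."""
    owner, name, version = parse_model_ref(ref)
    if version is None:
        # Latest version: owner and model name go into the URL path.
        url = f"{API_ROOT}/models/{owner}/{name}/predictions"
        body = {"input": model_input}
    else:
        # Pinned version (assumption): version is sent in the JSON body.
        url = f"{API_ROOT}/predictions"
        body = {"version": version, "input": model_input}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

A bare name such as "mamba-370m" fails the parser with a ValueError, which is a cheap way to catch a missing owner prefix before any network call is made.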

Capabilities

No model capability flags are currently sourced.

About Mamba 370M

Mamba 370M is a 370-million-parameter large language model built on a state-space model (SSM) architecture. Unlike traditional transformer models, it uses no attention or MLP blocks, and its compute scales linearly with sequence length [6][9]. This design makes long sequences efficient to process and is optimized for parallel GPU processing [6]. Mamba 370M is notable for its text generation capabilities and is also used for Japanese language processing [10]. Sources differ on the details of its training data, with some mentioning the Pile dataset [1]. A known limitation is "state collapse," in which performance declines on longer sequences; it has been addressed with mitigation techniques [7]. With the right training, some studies have shown that Mamba models can handle sequences of up to 256K tokens accurately [7].
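
To illustrate why a state-space model scales linearly with sequence length, the sketch below runs a toy diagonal linear recurrence: each token updates a fixed-size state once, so the cost grows with the number of tokens rather than with pairs of tokens. This is a simplified illustration, not Mamba's actual implementation: the parameters a, b, and c are fixed, hypothetical values, whereas Mamba makes its parameters depend on the input and uses a hardware-aware parallel scan.

```python
def ssm_scan(xs, a, b, c):
    """Run a toy diagonal linear SSM over a scalar input sequence.

    Recurrence (elementwise over the state dimension):
        h_t = a * h_{t-1} + b * x_t
        y_t = sum(c * h_t)
    One pass over xs, constant work per token: O(len(xs) * len(a)).
    """
    state = [0.0] * len(a)
    ys = []
    for x in xs:
        # Update every state channel from the previous state and the new input.
        state = [ai * hi + bi * x for ai, hi, bi in zip(a, state, b)]
        # Read out a scalar output as a weighted sum of the state.
        ys.append(sum(ci * hi for ci, hi in zip(c, state)))
    return ys


# An impulse input decays geometrically through a single state channel.
print(ssm_scan([1.0, 0.0, 0.0], a=[0.5], b=[1.0], c=[1.0]))  # [1.0, 0.5, 0.25]
```

The state has the same size at every step, which is also why very long inputs can degrade it if the model was not trained for them.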

FAQ

Who created Mamba 370M?

Mamba 370M was created by State Spaces as part of the Mamba model family.

Is Mamba 370M open source?

Mamba 370M's open-source status is not sourced.

Model Specs

Released: 2023-12-01
Architecture: Decoder Only

Related Models on Replicate API