LLM Reference

Cerebras GPT 256M

Released: 2023-03-13
Last refreshed: 2026-05-22
Status: Researched 13d ago
License: Open weights (Apache 2.0)

Cerebras GPT 256M has tracked model metadata but no tracked provider pricing, so it is not a default production pick.

Use it for

  • Research teams evaluating small general-purpose language models
  • Workloads that fit within a 2k (2,048-token) context window

Do not use it for

  • Cost-sensitive launches that need sourced token pricing
  • Vision or document-understanding workloads
  • Strict JSON or tool-calling flows

Specifications

Released: 2023-03-13
Context: 2k tokens (2,048)
Parameters: 256M
Architecture: Decoder-only
Knowledge cutoff: 2020
Specialization: General
Training: Pretrained base model (not instruction-tuned)
Created by

Cerebras Systems
World's largest AI chip innovation

Sunnyvale, California, United States
Founded 2016

Pricing

No tracked provider token pricing is available yet.

About

Cerebras GPT 256M is a transformer-based language model developed by Cerebras Systems. It uses a GPT-3 style decoder-only architecture with 256 million parameters and belongs to a wider model family trained to be compute-optimal under the Chinchilla scaling laws. It has a vocabulary of 50,257 tokens and handles sequences of up to 2,048 tokens. The model was trained on the Pile dataset on the Cerebras Andromeda AI supercomputer, using Cerebras's weight streaming technique. It is built for research: it can generate text and handle basic language-understanding tasks, and it can be fine-tuned for conversational dialogue, but it is not instruction-tuned out of the box. It is not intended for production deployment without additional safety measures. It is released under the Apache 2.0 license to encourage open research. It is English-only and unsuitable for machine translation.
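To make the "compute-optimal" claim concrete, here is a minimal back-of-the-envelope sketch. It assumes the commonly cited Chinchilla rule of thumb of roughly 20 training tokens per parameter and the usual estimate of about 6 FLOPs per parameter per training token. Both constants and the function names are illustrative assumptions, not figures taken from this page.

```python
# Rule-of-thumb constants (assumptions for illustration, not sourced from this page).
TOKENS_PER_PARAM = 20        # approximate Chinchilla compute-optimal ratio
FLOPS_PER_PARAM_TOKEN = 6    # common estimate of training FLOPs per parameter per token

N_PARAMS = 256_000_000       # parameter count stated for Cerebras GPT 256M


def compute_optimal_tokens(n_params: int, tokens_per_param: int = TOKENS_PER_PARAM) -> int:
    """Estimate the training-token budget for a compute-optimal run."""
    return n_params * tokens_per_param


def training_flops(n_params: int, n_tokens: int) -> int:
    """Estimate total training compute with the ~6 * N * D approximation."""
    return FLOPS_PER_PARAM_TOKEN * n_params * n_tokens


tokens = compute_optimal_tokens(N_PARAMS)
flops = training_flops(N_PARAMS, tokens)

print(f"{tokens:,} training tokens")   # 5,120,000,000 training tokens
print(f"{flops:.2e} training FLOPs")   # 7.86e+18 training FLOPs
```

Under these assumptions a 256M-parameter model would see on the order of 5 billion tokens, which is a small budget by current standards and consistent with the model's positioning as a research artifact.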

Cerebras GPT 256M is an openly licensed (Apache 2.0) model in the Cerebras GPT family. Its structured metadata tracks a 2k-token context window. No headline benchmark score is tracked for it yet.
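Because the context window is small, the prompt and the generated tokens have to share the same 2,048-token budget. The sketch below shows one way to budget for that on plain lists of token ids. The helper names `max_new_tokens` and `truncate_left` are hypothetical and not part of any library; tokenization itself is assumed to happen elsewhere.

```python
from typing import List

MAX_CONTEXT = 2048  # context window listed for Cerebras GPT 256M


def max_new_tokens(prompt_tokens: int, requested: int, context: int = MAX_CONTEXT) -> int:
    """Return how many tokens can be generated without overflowing the window."""
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    remaining = context - prompt_tokens
    return max(0, min(requested, remaining))


def truncate_left(token_ids: List[int], reserve: int, context: int = MAX_CONTEXT) -> List[int]:
    """Keep the most recent tokens so that prompt + reserve fits in the window."""
    budget = context - reserve
    if budget <= 0:
        raise ValueError("reserve must be smaller than the context window")
    return token_ids[-budget:]


# A 2,000-token prompt leaves room for only 48 new tokens.
print(max_new_tokens(prompt_tokens=2000, requested=100))  # 48

# A 3,000-token history is cut to its last 1,792 tokens when 256 are reserved for output.
history = list(range(3000))
print(len(truncate_left(history, reserve=256)))  # 1792
```

Truncating from the left keeps the most recent text, which is usually the right trade-off for continuation-style prompts. Other strategies, such as summarizing older text, may suit other workloads better.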

Top use-case fit

No primary decision-task fit is mapped for this model yet.

Provider price ladder

No tracked provider token pricing is available for this model yet.

Capabilities

No model capability flags are currently sourced.

Benchmark peer bars for Coding

No task-mapped benchmark peers are available for this model yet.

Migration checks

No linked migration route is available for this model yet.
