Cerebras-GPT 256M
About
Cerebras-GPT 256M is a transformer-based language model developed by Cerebras Systems, featuring a GPT-3 style architecture with 256 million parameters. It belongs to the Cerebras-GPT family, a set of models trained to be compute-optimal according to the Chinchilla scaling laws. The model uses a vocabulary of 50,257 tokens (the GPT-2 BPE vocabulary) and handles sequences up to 2,048 tokens long.

Built for research, the model demonstrates basic text generation and language understanding, and it can be fine-tuned for downstream tasks such as conversational dialogue; out of the box, however, it is not instruction-tuned. It was trained on the Pile dataset using Cerebras's weight-streaming technique on the Cerebras Andromeda AI supercomputer. It is not intended for production deployment without additional safety evaluation and mitigations. Released under the Apache 2.0 license, it is freely available for open research, but it is English-only and unsuitable for machine translation.
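The Chinchilla scaling laws mentioned above prescribe a training-token budget that grows with model size; the commonly cited heuristic is roughly 20 training tokens per parameter. The sketch below applies that heuristic to the 256M-parameter configuration (the 20:1 ratio is the published rule of thumb, not a figure taken from this model card):

```python
def chinchilla_optimal_tokens(n_params: int, tokens_per_param: int = 20) -> int:
    """Compute-optimal training-token budget under the Chinchilla heuristic.

    The default 20 tokens/parameter is the widely quoted approximation
    from the Chinchilla scaling-law results; it is an assumption here,
    not a value stated in the model card above.
    """
    return n_params * tokens_per_param

if __name__ == "__main__":
    n_params = 256_000_000  # Cerebras-GPT 256M parameter count
    budget = chinchilla_optimal_tokens(n_params)
    print(f"Compute-optimal budget: ~{budget / 1e9:.2f}B tokens")
```

For the 256M model this works out to roughly 5.1 billion training tokens, which is why compute-optimal models of this size are trained on far less data than larger siblings in the same family.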