Cerebras GPT 13B
Cerebras GPT 13B has model metadata, but missing tracked provider pricing keeps it from being a default production pick.
Use it for
- Teams evaluating general LLM work
- Workloads that can use a 2k context window
Do not use it for
- Cost-sensitive launches that need sourced token pricing
- Vision or document-understanding workloads
- Strict JSON or tool-calling flows
- Family
- Cerebras GPT
- Released
- 2023-03-13
- Context
- 2k
- Parameters
- 13B
- Architecture
- Decoder Only
- Knowledge cutoff
- 2020
- Specialization
- general
- Training
- finetuned
About
The Cerebras GPT 13B is a transformer-based large language model developed by Cerebras Systems, notable for its full attention mechanism across all layers, diverging from the sparse banded attention of GPT-3 models. Part of a series ranging from 111 million to 13 billion parameters, it utilizes a GPT-3 style architecture and byte pair encoding for tokenization, following Chinchilla scaling laws for compute optimality. The model was trained with Cerebras' weight streaming technology, enabling efficient data parallelism across nodes. It excels in text prediction and generation, although it is primarily meant for research in scaling laws and NLP, not for tasks like machine translation or dialogue systems. Released under the Apache 2.0 license, it supports free commercial use, with recommendations for additional safety testing before production deployment.
Cerebras GPT 13B is a model in the Cerebras GPT family. The structured metadata tracks a 2k-token context window. No headline benchmark score is tracked for Cerebras GPT 13B yet.
Top use-case fit
No primary decision-task fit is mapped for this model yet.
Provider price ladder
No tracked provider token pricing is available for this model yet.
Capabilities
No model capability flags are currently sourced.
Benchmark peer barsfor Coding
No task-mapped benchmark peers are available for this model yet.
Migration checks
No linked migration route is available for this model yet.