LLM Reference

Cerebras GPT Models by Cerebras

7 models, 2023, up to 2k context

About

The Cerebras GPT family includes seven open-source large language models, ranging from 111 million to 13 billion parameters. Cerebras Systems trained them following the Chinchilla scaling recipe, using roughly 20 training tokens per parameter to get the highest accuracy out of a fixed compute budget. The models are available on Hugging Face under the Apache 2.0 license, which permits both research and commercial use. Training took place on the Andromeda AI supercomputer, using Cerebras' weight streaming technology to distribute computation across multiple nodes. This setup is reported to shorten training time and to lower cost and energy consumption compared with other available models [1][2].
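As a rough illustration of the 20-tokens-per-parameter rule described above, the sketch below computes the implied training-token budget for each listed model size. The parameter counts are taken at face value from this page. The `6 * N * D` compute estimate is a widely used rule of thumb and an assumption here, not a figure from Cerebras; the function names are hypothetical.

```python
# Assumption: the "20 tokens per parameter" ratio quoted in the description above.
TOKENS_PER_PARAM = 20

# Parameter counts as listed on this page (nominal sizes, taken at face value).
PARAM_COUNTS = {
    "111M": 111_000_000,
    "256M": 256_000_000,
    "590M": 590_000_000,
    "1.3B": 1_300_000_000,
    "2.7B": 2_700_000_000,
    "7B": 7_000_000_000,
    "13B": 13_000_000_000,
}


def training_tokens(params: int, tokens_per_param: int = TOKENS_PER_PARAM) -> int:
    """Training tokens implied by a fixed tokens-per-parameter ratio."""
    return params * tokens_per_param


def approx_train_flops(params: int, tokens: int) -> int:
    """Rough training compute using the common 6 * N * D approximation.

    Assumption: this rule of thumb is not stated in the source page.
    """
    return 6 * params * tokens


if __name__ == "__main__":
    for name, n in PARAM_COUNTS.items():
        d = training_tokens(n)
        flops = approx_train_flops(n, d)
        print(f"{name:>5}: {d / 1e9:7.2f}B tokens, ~{flops:.2e} FLOPs")
```

Under these assumptions, the 13B model would see about 260 billion training tokens and the 111M model about 2.2 billion.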

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

Cerebras GPT 13B
2023-03, 2k context, 13B parameters
Use when the workload needs 2k context and 13B parameters.

Cerebras GPT 7B
2023-03, 2k context, 7B parameters
Use when the workload needs 2k context and 7B parameters.

Cerebras GPT 2.7B
2023-03, 2k context, 2.7B parameters
Use when the workload needs 2k context and 2.7B parameters.

Cerebras GPT 1.3B
2023-03, 2k context, 1.3B parameters
Use when the workload needs 2k context and 1.3B parameters.

Cerebras GPT 590M
2023-03, 2k context, 590M parameters, reasoning
Use when the workload needs 2k context, 590M parameters, and reasoning.

Cerebras GPT 256M
2023-03, 2k context, 256M parameters
Use when the workload needs 2k context and 256M parameters.

Cerebras GPT 111M
2023-03, 2k context, 111M parameters
Use when the workload needs 2k context and 111M parameters.

Release Timeline

1 release group: 2023-03 (7 current)

Cerebras GPT 1.3B: 2k context, 1.3B parameters (Current)
Cerebras GPT 111M: 2k context, 111M parameters (Current)
Cerebras GPT 13B: 2k context, 13B parameters (Current)
Cerebras GPT 2.7B: 2k context, 2.7B parameters (Current)
Cerebras GPT 256M: 2k context, 256M parameters (Current)
Cerebras GPT 590M: 2k context, 590M parameters, reasoning (Current)
Cerebras GPT 7B: 2k context, 7B parameters (Current)

Specifications (7 models)

Cerebras GPT model specifications comparison

Model              Released  Context  Parameters  Reasoning  Code Exec
Cerebras GPT 13B   2023-03   2k       13B         No         No
Cerebras GPT 7B    2023-03   2k       7B          No         No
Cerebras GPT 2.7B  2023-03   2k       2.7B        No         No
Cerebras GPT 1.3B  2023-03   2k       1.3B        No         No
Cerebras GPT 590M  2023-03   2k       590M        Yes        Yes
Cerebras GPT 256M  2023-03   2k       256M        No         No
Cerebras GPT 111M  2023-03   2k       111M        No         No

Frequently Asked Questions

What is Cerebras GPT used for?
Based on the listed model capabilities, Cerebras GPT is suited to reasoning, code execution, and coding workloads. Note that the specifications flag only the 590M variant for reasoning and code execution.
How does Cerebras GPT compare to Cerebras LLaVA?
Both families come from Cerebras. The listed capabilities favor Cerebras GPT for reasoning, while Cerebras LLaVA is the closest related family to check for coding. Cerebras GPT has 7 listed variants and reaches up to 2k context; Cerebras LLaVA reaches up to 4k context. Compare the specifications and pricing tables before choosing a production model.
Which Cerebras GPT model should I use?
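Cerebras GPT does not have complete provider pricing in the local data, so if price is the main constraint, check the pricing table first. All seven variants share the same 2023-03 release and 2k context, so the choice comes down to parameter count and capabilities: Cerebras GPT 590M is the only variant listed with reasoning, and Cerebras GPT 13B is the largest.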
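Since the variants differ mainly in parameter count, one practical way to narrow the choice is by memory footprint. The sketch below estimates weight memory at 2 bytes per parameter (an assumption of 16-bit weights) and picks the largest listed variant that fits a given budget. The 1.2x headroom factor is an arbitrary illustrative allowance for activations and runtime overhead, and the function names are hypothetical; none of these figures come from the source.

```python
from typing import Optional

# Nominal parameter counts as listed in the specifications on this page.
VARIANTS = {
    "111M": 111_000_000,
    "256M": 256_000_000,
    "590M": 590_000_000,
    "1.3B": 1_300_000_000,
    "2.7B": 2_700_000_000,
    "7B": 7_000_000_000,
    "13B": 13_000_000_000,
}

BYTES_PER_PARAM = 2  # Assumption: weights stored in 16-bit precision.
HEADROOM = 1.2       # Assumption: arbitrary margin for activations and overhead.


def weight_gib(params: int, bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return params * bytes_per_param / 2**30


def largest_fitting(budget_gib: float, headroom: float = HEADROOM) -> Optional[str]:
    """Return the largest variant whose padded weight size fits the budget."""
    fitting = [
        (params, name)
        for name, params in VARIANTS.items()
        if weight_gib(params) * headroom <= budget_gib
    ]
    # Tuples sort by parameter count first, so max() yields the largest model.
    return max(fitting)[1] if fitting else None


if __name__ == "__main__":
    for budget in (0.1, 8, 16, 40):
        print(f"{budget:>5} GiB -> {largest_fitting(budget)}")
```

This is only a first filter: actual memory use depends on the runtime, batch size, and sequence length, so measure on the target hardware before committing to a variant.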

Models (7)