
Cerebras-GPT
About
The Cerebras-GPT family consists of seven open-source large language models, ranging from 111 million to 13 billion parameters. Cerebras Systems trained them following the Chinchilla scaling law, using roughly 20 training tokens per model parameter to maximize accuracy within a fixed compute budget. The models are available on Hugging Face under the Apache 2.0 license, permitting both research and commercial use. Training took place on the Andromeda AI supercomputer, using Cerebras' weight streaming technology to distribute computation efficiently across multiple nodes. This setup speeds up training, reduces costs, and lowers energy consumption, making the models notably efficient to train compared to similarly sized alternatives.
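The "20 tokens per parameter" rule can be made concrete with a short sketch. The token counts below are derived purely from that rule of thumb applied to the published model sizes; they are illustrative, not Cerebras' official training logs.

```python
# Illustrative sketch: Chinchilla-style "20 tokens per parameter" budgets
# applied to the Cerebras-GPT model sizes. Token counts here follow the
# rule of thumb, not Cerebras' published training records.

CHINCHILLA_RATIO = 20  # training tokens per model parameter

# Approximate parameter counts for the seven Cerebras-GPT models.
MODEL_PARAMS = {
    "Cerebras-GPT-111M": 111e6,
    "Cerebras-GPT-256M": 256e6,
    "Cerebras-GPT-590M": 590e6,
    "Cerebras-GPT-1.3B": 1.3e9,
    "Cerebras-GPT-2.7B": 2.7e9,
    "Cerebras-GPT-6.7B": 6.7e9,
    "Cerebras-GPT-13B": 13e9,
}

def chinchilla_tokens(n_params: float, ratio: int = CHINCHILLA_RATIO) -> float:
    """Return the compute-optimal training-token count for a model size."""
    return ratio * n_params

for name, params in MODEL_PARAMS.items():
    print(f"{name}: ~{chinchilla_tokens(params) / 1e9:.0f}B tokens")
```

By this rule, the 13B-parameter model's compute-optimal budget is about 260 billion training tokens, while the 111M model needs only about 2.2 billion.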