Pythia 2.8B
About
The Pythia 2.8B model by EleutherAI is designed for research into large language models (LLMs) and their interpretability. With 2.8 billion parameters, it uses a decoder-only autoregressive transformer architecture and leverages flash attention for computational efficiency. The model has 32 layers and 32 attention heads and was trained on The Pile, a large and diverse text dataset. It performs well on tasks such as text generation and language understanding, but it is not fine-tuned for end-user applications and, like many LLMs, may reflect biases present in its training data.
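The decoder-only autoregressive setup described above can be sketched as a simple generation loop: the model maps the token sequence so far to next-token logits, the most likely token is appended, and the extended sequence is fed back in. The toy bigram "model" below is hypothetical, used only to make the loop runnable; Pythia itself computes these logits with its 32 transformer layers.

```python
from typing import Callable, List

def generate(next_logits: Callable[[List[int]], List[float]],
             prompt: List[int], max_new_tokens: int) -> List[int]:
    """Greedy autoregressive decoding: repeatedly feed the growing
    sequence back into the model and append the most likely token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = next_logits(tokens)
        tokens.append(max(range(len(logits)), key=logits.__getitem__))
    return tokens

# Hypothetical stand-in for the model: a fixed bigram table over a
# 3-token vocabulary, purely for illustration.
BIGRAM = {0: [0.1, 0.9, 0.0], 1: [0.0, 0.1, 0.9], 2: [0.9, 0.1, 0.0]}

def toy_model(tokens: List[int]) -> List[float]:
    return BIGRAM[tokens[-1]]

print(generate(toy_model, [0], 4))  # → [0, 1, 2, 0, 1]
```

In the real model, `next_logits` would be a forward pass over the full sequence, and sampling strategies such as temperature or top-k would typically replace the greedy argmax.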
Capabilities
Multimodal, Function Calling, Tool Use, JSON Mode
Specifications
Family: Pythia
Released: 2023-05-31
Parameters: 2.8B
Architecture: Decoder Only
Specialization: General