LLM Reference

Pythia 160M

About

Pythia 160M is a 160-million-parameter language model from EleutherAI and part of the Pythia Scaling Suite, which spans models from 70 million to 12 billion parameters. Built on the GPT-NeoX architecture, it has 12 layers, a hidden size of 768, and 12 attention heads, and supports a context length of 2048 tokens. It was trained on The Pile for 299,892,736,000 tokens, with 154 intermediate training checkpoints made publicly available. It is well suited to text generation experiments and interpretability research, but its limitations include the potential to generate biased or harmful content, English-only training data, and the lack of fine-tuning for deployment in specific applications. Despite these limitations, it is a useful research tool for studying how large language models learn and behave.
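For reference, a minimal sketch of loading the model with the Hugging Face transformers library is shown below; it assumes the transformers and torch packages are installed, uses the public "EleutherAI/pythia-160m" checkpoint, and picks the prompt and sampling parameters purely for illustration.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Public Hugging Face checkpoint for Pythia 160M (default revision assumed).
model_id = "EleutherAI/pythia-160m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Encode a prompt and sample a short continuation.
inputs = tokenizer("The Pythia suite was designed to", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Per the Pythia model card, the intermediate training checkpoints mentioned above are exposed as repository revisions, so a specific training step can be selected with the revision argument of from_pretrained (for example, revision="step3000").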

Capabilities

Multimodal: No
Function Calling: No
Tool Use: No
JSON Mode: No

Specifications

Family: Pythia
Released: 2023-05-31
Parameters: 160M
Architecture: Decoder Only
Specialization: General
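The architecture figures quoted in the About section can be cross-checked against the published model configuration; the short sketch below assumes the transformers package and the "EleutherAI/pythia-160m" checkpoint.

from transformers import AutoConfig

config = AutoConfig.from_pretrained("EleutherAI/pythia-160m")
print("layers:", config.num_hidden_layers)                # expected: 12
print("hidden size:", config.hidden_size)                 # expected: 768
print("attention heads:", config.num_attention_heads)     # expected: 12
print("context length:", config.max_position_embeddings)  # expected: 2048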