LLM Reference

Pythia Models by EleutherAI

10 models | 2023 | Up to 2k ctx | From $0.2/1M input

About

The Pythia large language model (LLM) family from EleutherAI comprises 16 models built for research into LLM behavior and training dynamics. The models range from 70 million to 12 billion parameters. Each size is trained on the Pile in two versions, with and without deduplication, and every model sees its training data in the same order. This consistency supports controlled studies of how parameter count affects model performance. Although not optimized for downstream tasks, the Pythia models perform comparably to other LLMs of similar size and serve primarily educational and research purposes. They are publicly available with extensive training checkpoints, are not fine-tuned for specific applications, and mainly handle English.
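
The checkpoints mentioned above are, as far as we know, published as branches of each model repository on the Hugging Face Hub. The following is a minimal sketch of enumerating those branch names, assuming the published scheme of step 0, ten log-spaced steps from 1 to 512, and every 1,000 steps from 1,000 to 143,000. The helper names `checkpoint_revisions` and `repo_id` are hypothetical, and not every size is necessarily published in both variants.

```python
def checkpoint_revisions():
    """Return the assumed Pythia checkpoint branch names, in training order."""
    log_spaced = [2 ** i for i in range(10)]         # 1, 2, 4, ..., 512
    evenly_spaced = list(range(1000, 143001, 1000))  # 1000, 2000, ..., 143000
    return [f"step{s}" for s in [0] + log_spaced + evenly_spaced]


def repo_id(size, deduped=False):
    """Build an assumed Hub repository id, e.g. 'EleutherAI/pythia-70m-deduped'."""
    suffix = "-deduped" if deduped else ""
    return f"EleutherAI/pythia-{size.lower()}{suffix}"


revisions = checkpoint_revisions()
print(len(revisions), revisions[:3], revisions[-1])
print(repo_id("70M", deduped=True))
```

A repository id and a branch name from this list can then be passed to the `from_pretrained` method in the `transformers` library through its `revision` argument to load an intermediate checkpoint.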

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

Pythia 12B (Current)

Use when the workload needs 2k context and 12B parameters.

2023-05 | 2k context | 12B parameters

Pythia 6.9B (Current)

Use when the workload needs 2k context and 6.9B parameters.

2023-05 | 2k context | 6.9B parameters

Pythia 2.8B (Current)

Use when the workload needs 2k context and 2.8B parameters.

2023-05 | 2k context | 2.8B parameters

Pythia 1.4B (Current)

Use when the workload needs 2k context and 1.4B parameters.

2023-05 | 2k context | 1.4B parameters

Pythia 1B (Current)

Use when the workload needs 2k context and 1B parameters.

2023-05 | 2k context | 1B parameters

Pythia 410M (Current)

Use when the workload needs 2k context and 410M parameters.

2023-05 | 2k context | 410M parameters

Pythia 160M (Current)

Use when the workload needs 2k context and 160M parameters.

2023-05 | 2k context | 160M parameters

Pythia 70M (Current)

Use when the workload needs 2k context and 70M parameters.

2023-05 | 2k context | 70M parameters

Pythia 31M (Current)

Use when the workload needs 2k context and 31M parameters.

2023-05 | 2k context | 31M parameters

Pythia 14M (Current)

Use when the workload needs 2k context and 14M parameters.

2023-05 | 2k context | 14M parameters

Release Timeline

1 release group

2023-05 (10 current)
Pythia 1.4B | 2k context | 1.4B parameters | Current
Pythia 12B | 2k context | 12B parameters | Current
Pythia 14M | 2k context | 14M parameters | Current
Pythia 160M | 2k context | 160M parameters | Current
Pythia 1B | 2k context | 1B parameters | Current
Pythia 2.8B | 2k context | 2.8B parameters | Current
Pythia 31M | 2k context | 31M parameters | Current
Pythia 410M | 2k context | 410M parameters | Current
Pythia 6.9B | 2k context | 6.9B parameters | Current
Pythia 70M | 2k context | 70M parameters | Current

Specifications (10 models)

Pythia model specifications comparison
Model | Released | Context | Parameters
Pythia 12B | 2023-05 | 2k | 12B
Pythia 6.9B | 2023-05 | 2k | 6.9B
Pythia 2.8B | 2023-05 | 2k | 2.8B
Pythia 1.4B | 2023-05 | 2k | 1.4B
Pythia 1B | 2023-05 | 2k | 1B
Pythia 410M | 2023-05 | 2k | 410M
Pythia 160M | 2023-05 | 2k | 160M
Pythia 70M | 2023-05 | 2k | 70M
Pythia 31M | 2023-05 | 2k | 31M
Pythia 14M | 2023-05 | 2k | 14M
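
The parameter counts in the table above give a rough idea of how much memory each model's weights need. The sketch below multiplies the nominal parameter count by the bytes per parameter for a given precision. It is only an estimate: the counts are the rounded figures from the table, the byte sizes are the usual ones for each data type, and activations, the attention cache, and framework overhead are ignored. The names `PARAMS`, `BYTES_PER_PARAM`, and `weight_memory_gib` are hypothetical.

```python
# Nominal (rounded) parameter counts taken from the specifications table.
PARAMS = {
    "Pythia 12B": 12e9,
    "Pythia 6.9B": 6.9e9,
    "Pythia 2.8B": 2.8e9,
    "Pythia 1.4B": 1.4e9,
    "Pythia 1B": 1e9,
    "Pythia 410M": 410e6,
    "Pythia 160M": 160e6,
    "Pythia 70M": 70e6,
    "Pythia 31M": 31e6,
    "Pythia 14M": 14e6,
}

# Storage size of one parameter in common precisions.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def weight_memory_gib(params, dtype="fp16"):
    """Estimate weight-only memory in GiB for a given parameter count."""
    return params * BYTES_PER_PARAM[dtype] / 2 ** 30


for name, count in PARAMS.items():
    print(f"{name}: about {weight_memory_gib(count):.2f} GiB of weights in fp16")
```

Under these assumptions the 12B model needs a little over 22 GiB for fp16 weights alone, so actual hardware requirements will be higher.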

Available From (1 provider)

Pricing

Pythia model pricing by provider
Model | Provider | Input / 1M | Output / 1M | Type
Pythia 12B | Fireworks AI | $0.2 | $0.2 | Provisioned
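
To turn the listed per-million-token prices into a budget figure, the arithmetic can be sketched as follows. This assumes billing is linear in token count at the listed $0.2 per 1M input and output tokens; the "Provisioned" type may be billed differently in practice, so check the provider's terms. The function name `estimate_cost` and the example workload are hypothetical.

```python
from decimal import Decimal

# Listed prices for Pythia 12B, in US dollars per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.2")
OUTPUT_PRICE_PER_M = Decimal("0.2")


def estimate_cost(input_tokens, output_tokens):
    """Estimate cost in US dollars, assuming purely per-token billing."""
    million = Decimal(1_000_000)
    total = (Decimal(input_tokens) * INPUT_PRICE_PER_M
             + Decimal(output_tokens) * OUTPUT_PRICE_PER_M)
    return total / million


# Hypothetical workload: 1,000 requests, each with 1,500 prompt tokens
# and 500 generated tokens.
cost = estimate_cost(1000 * 1500, 1000 * 500)
print(f"Estimated cost: ${cost:.2f}")
```

`Decimal` is used instead of floating point so that small per-token prices add up without rounding drift.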

Frequently Asked Questions

What is Pythia used for?
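Pythia is used primarily for research into LLM behavior and training dynamics, as the family description states. The listed model capabilities also point to coding, chatbot, and role-playing workloads, but the models are not fine-tuned for specific applications, so expect base-model behavior in those cases.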
How does Pythia compare to Llemma?
Pythia by EleutherAI is listed for coding-oriented workloads, while Llemma by EleutherAI is the closest related family to consider for mathematics. Pythia has 10 listed variants with up to 2k context, while Llemma reaches up to 4k context, so compare the specifications and pricing tables before choosing a production model.
Which Pythia model should I use?
For the lowest listed input price, start with Pythia 12B through Fireworks AI at $0.2/1M input tokens; it is the only priced option listed. For the most capable model to run locally, evaluate Pythia 12B with 2k context.
