What is Pythia used for?

The listed model capabilities point to coding, chatbot, and role-playing workloads. The family description adds that the models are built primarily for research and education and are not fine-tuned for specific applications.

How does Pythia compare to Llemma?

Both families come from EleutherAI. Pythia is listed as strongest for coding, while Llemma is the closest related family to check for mathematics. Pythia has 10 listed variants with up to 2k context; Llemma reaches up to 4k context. Compare the specs and pricing tables before choosing a production model.

Which Pythia model should I use?

Start with Pythia 12B. It is the only variant with a listed price ($0.2/1M input tokens through Fireworks AI) and, as the largest listed variant, the strongest starting point for local use with 2k context. Consult the provider table if latency, deployment type, or output-token pricing matters more than input price.

Pythia Models by EleutherAI

EleutherAI

10 models, 2023, up to 2k ctx, from $0.2/1M input

About

The Pythia large language model (LLM) family from EleutherAI comprises 16 models built for research into LLM behavior and training dynamics. The models range from 70 million to 12 billion parameters, and each size is trained on the Pile both with and without deduplication, with every model seeing the training data in the same order. This consistency makes it possible to study how parameter count affects model performance in a controlled setting. Although they were not designed to maximize downstream task performance, the Pythia models perform comparably to other LLMs of similar size and serve primarily educational and research purposes. The models are publicly accessible and ship with extensive intermediate checkpoints that expose the training process. They are not fine-tuned for specific applications and mainly handle English.
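
As a rough illustration of how the size, deduplication, and checkpoint dimensions described above combine, the sketch below builds model identifiers and loads one checkpoint. It assumes the Hugging Face Hub naming scheme `EleutherAI/pythia-<size>` with a `-deduped` suffix, checkpoints published as `step<N>` revisions, and the `transformers` library. None of these names appear on this page, so verify them before relying on the code. `pythia_repo_id` and `load_checkpoint` are hypothetical helper names.

```python
from typing import Optional

# Assumed sizes of the 16-model suite: eight sizes, each with and without deduplication.
SUITE_SIZES = ["70m", "160m", "410m", "1b", "1.4b", "2.8b", "6.9b", "12b"]


def pythia_repo_id(size: str, deduped: bool = False) -> str:
    """Build a Hub repo id, assuming the `EleutherAI/pythia-<size>[-deduped]` scheme."""
    size = size.lower()
    if size not in SUITE_SIZES:
        raise ValueError(f"unknown Pythia size: {size!r}")
    return f"EleutherAI/pythia-{size}" + ("-deduped" if deduped else "")


def load_checkpoint(size: str, deduped: bool = False, step: Optional[int] = None):
    """Download a model and tokenizer; `step` selects an intermediate checkpoint.

    Assumes checkpoints are published as `step<N>` revisions. Needs the
    `transformers` package and network access, so the import is deferred.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo = pythia_repo_id(size, deduped)
    revision = f"step{step}" if step is not None else "main"
    model = AutoModelForCausalLM.from_pretrained(repo, revision=revision)
    tokenizer = AutoTokenizer.from_pretrained(repo, revision=revision)
    return model, tokenizer


# Eight sizes x two data variants gives the 16 models mentioned above.
all_repos = [pythia_repo_id(s, d) for s in SUITE_SIZES for d in (False, True)]
```

Calling `load_checkpoint("70m", deduped=True, step=3000)` would, under these assumptions, fetch an early training checkpoint of the smallest suite model, which is the kind of comparison across training steps the family is meant to support.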

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

Current Pythia variants with use-when guidance and lifecycle status
Model	Use when	Released	Signals	Status
Pythia 12B	Use when the workload needs 2k context and 12B parameters.	2023-05	2k context, 12B parameters	Current
Pythia 6.9B	Use when the workload needs 2k context and 6.9B parameters.	2023-05	2k context, 6.9B parameters	Current
Pythia 2.8B	Use when the workload needs 2k context and 2.8B parameters.	2023-05	2k context, 2.8B parameters	Current
Pythia 1.4B	Use when the workload needs 2k context and 1.4B parameters.	2023-05	2k context, 1.4B parameters	Current
Pythia 1B	Use when the workload needs 2k context and 1B parameters.	2023-05	2k context, 1B parameters	Current
Pythia 410M	Use when the workload needs 2k context and 410M parameters.	2023-05	2k context, 410M parameters	Current
Pythia 160M	Use when the workload needs 2k context and 160M parameters.	2023-05	2k context, 160M parameters	Current
Pythia 70M	Use when the workload needs 2k context and 70M parameters.	2023-05	2k context, 70M parameters	Current
Pythia 31M	Use when the workload needs 2k context and 31M parameters.	2023-05	2k context, 31M parameters	Current
Pythia 14M	Use when the workload needs 2k context and 14M parameters.	2023-05	2k context, 14M parameters	Current

Release Timeline

1 release group: 2023-05 (10 current)

Model	Signals	Status
Pythia 1.4B	2k context, 1.4B parameters	Current
Pythia 12B	2k context, 12B parameters	Current
Pythia 14M	2k context, 14M parameters	Current
Pythia 160M	2k context, 160M parameters	Current
Pythia 1B	2k context, 1B parameters	Current
Pythia 2.8B	2k context, 2.8B parameters	Current
Pythia 31M	2k context, 31M parameters	Current
Pythia 410M	2k context, 410M parameters	Current
Pythia 6.9B	2k context, 6.9B parameters	Current
Pythia 70M	2k context, 70M parameters	Current

Specifications (10 models)

Pythia model specifications comparison
Model	Released	Context	Parameters
Pythia 12B	2023-05	2k	12B
Pythia 6.9B	2023-05	2k	6.9B
Pythia 2.8B	2023-05	2k	2.8B
Pythia 1.4B	2023-05	2k	1.4B
Pythia 1B	2023-05	2k	1B
Pythia 410M	2023-05	2k	410M
Pythia 160M	2023-05	2k	160M
Pythia 70M	2023-05	2k	70M
Pythia 31M	2023-05	2k	31M
Pythia 14M	2023-05	2k	14M
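
To make the parameter column easier to act on, the sketch below converts the listed labels into parameter counts and estimates the memory needed for the weights alone. It assumes 2 bytes per parameter (16-bit weights) and treats the labels as exact counts. Both are simplifying assumptions, and the estimate ignores activations and the attention cache. The helper names are hypothetical.

```python
from typing import Optional

# Parameter labels from the specifications table, largest first.
LISTED = ["12B", "6.9B", "2.8B", "1.4B", "1B", "410M", "160M", "70M", "31M", "14M"]

UNITS = {"M": 1_000_000, "B": 1_000_000_000}


def parse_params(label: str) -> int:
    """Convert a label such as '6.9B' or '410M' into a parameter count."""
    return round(float(label[:-1]) * UNITS[label[-1].upper()])


def weight_gib(label: str, bytes_per_param: int = 2) -> float:
    """Approximate weight memory in GiB (weights only, assumed 16-bit by default)."""
    return parse_params(label) * bytes_per_param / 2**30


def largest_fitting(budget_gib: float, bytes_per_param: int = 2) -> Optional[str]:
    """Return the largest listed variant whose weights fit the budget, or None."""
    for label in LISTED:
        if weight_gib(label, bytes_per_param) <= budget_gib:
            return label
    return None


for label in LISTED:
    print(f"Pythia {label}: ~{weight_gib(label):.2f} GiB of weights")
```

Under these assumptions a 16 GiB budget selects Pythia 6.9B, since the 12B weights alone come to roughly 22 GiB. Real deployments need headroom beyond this figure.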

Available From (1 provider)

Fireworks AI

Pricing

Pythia model pricing by provider
Model	Provider	Input / 1M	Output / 1M	Type
Pythia 12B	Fireworks AI	$0.2	$0.2	Provisioned
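
The per-million rates in the table translate into a request or batch cost with simple arithmetic. The sketch below assumes the listed $0.2 input and $0.2 output rates are charged per token exactly as shown. The "Provisioned" deployment type may be billed differently in practice, so treat the result as an estimate. The function name is hypothetical.

```python
# Rates copied from the pricing table above (USD per 1M tokens, Pythia 12B).
INPUT_USD_PER_MILLION = 0.2
OUTPUT_USD_PER_MILLION = 0.2


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in USD, assuming simple per-token billing at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    ) / 1_000_000


# Example: 3M input tokens and 500k output tokens.
print(f"${estimate_cost_usd(3_000_000, 500_000):.2f}")  # prints $0.70
```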
