What is GPT-JT used for?

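GPT-JT is best suited to safety (moderation) workloads and structured outputs. Both capabilities are listed only for GPT-JT Moderation 6B; the two base variants, GPT-JT 6B V0 and GPT-JT 6B V1, are listed with 6B parameters and no structured-output support.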

How does GPT-JT compare to Together General?

GPT-JT by Together.ai is strongest where you need safety, while Together General, also by Together.ai, is the closest related family to check when evaluating adjacent models. GPT-JT lists 3 variants and Together General reaches up to 4k context, so compare the specification and pricing tables before choosing a production model.

Which GPT-JT model should I use?

GPT-JT Moderation 6B is the lowest listed input-price option at $0.2/1M input tokens through Together AI, and it is the only variant listed with structured outputs, which makes it the strongest starting point. Use the provider table if latency, deployment type, or output-token pricing matters more than input price.

GPT-JT Models by Together.ai

Together.ai

3 models · 2023 · From $0.2/1M input

About

GPT-JT is a series of large language models fine-tuned from EleutherAI's GPT-J 6B. The models were trained with a decentralized training algorithm that runs efficiently over a network with relatively slow interconnect speeds, which makes it possible to use diverse hardware resources. Training combines several open-source techniques and datasets, including Google Research's UL2 training objective, Chain-of-Thought prompting, BigScience's Public Pool of Prompts (P3), and AllenAI's Natural Instructions (NI). As a result, GPT-JT models perform strongly on classification benchmarks and are reported to outperform models with significantly more parameters. The models are open source, and community contributions are welcome [1][4][5].

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

GPT-JT 6B V0 (Current)

Use when the workload needs 6B parameters.

2023-03 · 6B parameters

GPT-JT 6B V1 (Current)

Use when the workload needs 6B parameters.

2023-03 · 6B parameters

GPT-JT Moderation 6B (Current)

Use when the workload needs safety, 6B parameters, and structured outputs.

2023-03 · safety · 6B parameters · structured outputs

Current GPT-JT variants with use-when guidance and lifecycle status
Model	Use when	Released	Signals	Status
GPT-JT 6B V0	Use when the workload needs 6B parameters.	2023-03	6B parameters	Current
GPT-JT 6B V1	Use when the workload needs 6B parameters.	2023-03	6B parameters	Current
GPT-JT Moderation 6B	Use when the workload needs safety, 6B parameters, and structured outputs.	2023-03	safety, 6B parameters, structured outputs	Current

Release Timeline

1 release group

2023-03

3 current

GPT-JT 6B V0

6B parameters

Current

GPT-JT 6B V1

6B parameters

Current

GPT-JT Moderation 6B

safety, 6B parameters, structured outputs

Current

Specifications (3 models)

GPT-JT model specifications comparison
Model	Released	Parameters	Structured Outputs
GPT-JT 6B V0	2023-03	6B	No
GPT-JT 6B V1	2023-03	6B	No
GPT-JT Moderation 6B	2023-03	6B	Yes
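
The specification table can be treated as a small lookup when shortlisting a variant. The sketch below is illustrative only: the `Variant` class and the `pick_variants` helper are hypothetical names, not part of any Together AI or Replicate SDK, and the data is copied by hand from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One row of the specification table (hypothetical helper type)."""
    name: str
    released: str
    parameters: str
    structured_outputs: bool


# Rows copied from the specification table above.
VARIANTS = [
    Variant("GPT-JT 6B V0", "2023-03", "6B", False),
    Variant("GPT-JT 6B V1", "2023-03", "6B", False),
    Variant("GPT-JT Moderation 6B", "2023-03", "6B", True),
]


def pick_variants(need_structured_outputs: bool) -> list[str]:
    """Return the names of variants that meet the structured-output requirement."""
    return [
        v.name
        for v in VARIANTS
        if v.structured_outputs or not need_structured_outputs
    ]


# Only one listed variant supports structured outputs.
print(pick_variants(need_structured_outputs=True))
# Without that requirement, all three variants remain candidates.
print(pick_variants(need_structured_outputs=False))
```

With the listed data, requiring structured outputs narrows the shortlist to GPT-JT Moderation 6B, which matches the guidance in the variants table.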

Available From (2 providers)

Replicate API

Together AI

Pricing

GPT-JT model pricing by provider
Model	Provider	Input / 1M	Output / 1M	Type
GPT-JT Moderation 6B	Together AI	$0.2	$0.2	Serverless
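
To turn the listed per-million-token rates into a budget figure, a rough estimate can be computed as follows. This is a minimal sketch that assumes billing is linear in token count at the listed rates; the `estimate_cost` function and the example workload are hypothetical, and actual provider billing may differ.

```python
from decimal import Decimal

# Listed prices for GPT-JT Moderation 6B on Together AI (USD per 1M tokens),
# copied from the pricing table above.
INPUT_PRICE_PER_M = Decimal("0.2")
OUTPUT_PRICE_PER_M = Decimal("0.2")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate USD cost, assuming linear per-token billing at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_M * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_M * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# Hypothetical workload: 10,000 moderation requests with roughly
# 500 input tokens and 20 output tokens each.
requests = 10_000
cost = estimate_cost(requests * 500, requests * 20)
print(f"Estimated cost: ${cost:.2f}")  # prints "Estimated cost: $1.04"
```

`Decimal` is used instead of floats so that small per-token prices do not accumulate rounding error across large token counts.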
