LLM Reference

GPT-JT Models by Together.ai

3 models, 2023, from $0.2/1M input tokens

About

GPT-JT is a series of large language models fine-tuned from EleutherAI's GPT-J 6B model. The models were trained with a decentralized training algorithm that runs efficiently over a network with relatively slow interconnects, which makes it practical to use diverse hardware resources. The training process combines several open-source methods and datasets, including Google Research's UL2 training objective, Chain-of-Thought prompting, and datasets such as BigScience's Public Pool of Prompts (P3) and AllenAI's Natural Instructions (NI). As a result, GPT-JT models perform strongly on classification benchmarks and are reported to outperform models with significantly more parameters. The models are open source, so the community can contribute further improvements.

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.


Use when the workload needs 6B parameters.

2023-03, 6B parameters

Use when the workload needs 6B parameters.

2023-03, 6B parameters

Use when the workload needs safety, 6B parameters, and structured outputs.

2023-03, safety, 6B parameters, structured outputs

Release Timeline

2023-03 (1 release group, 3 current)

GPT-JT 6B V0: 6B parameters (Current)
GPT-JT 6B V1: 6B parameters (Current)
GPT-JT Moderation 6B: safety, 6B parameters, structured outputs (Current)

Specifications (3 models)

GPT-JT model specifications comparison
Model                 Released  Parameters  Structured Outputs
GPT-JT 6B V0          2023-03   6B          No
GPT-JT 6B V1          2023-03   6B          No
GPT-JT Moderation 6B  2023-03   6B          Yes

Available From (2 providers)

Pricing

GPT-JT model pricing by provider
Model                 Provider     Input / 1M  Output / 1M  Type
GPT-JT Moderation 6B  Together AI  $0.2        $0.2         Serverless
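
To make the per-token pricing concrete, here is a minimal Python sketch that estimates spend from the listed rates of $0.2 per 1M input tokens and $0.2 per 1M output tokens. The function name `estimate_cost` and the workload figures are hypothetical, chosen only for illustration; actual billing may differ from the listed rates.

```python
from decimal import Decimal

# Listed serverless rates for GPT-JT Moderation 6B on Together AI (USD per 1M tokens).
INPUT_RATE_PER_M = Decimal("0.2")
OUTPUT_RATE_PER_M = Decimal("0.2")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated USD cost for the given token counts.

    Hypothetical helper: it only applies the listed per-1M rates and
    ignores any discounts, minimums, or rounding a provider may apply.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_RATE_PER_M
        + Decimal(output_tokens) * OUTPUT_RATE_PER_M
    )
    return total / TOKENS_PER_M


# Assumed workload: 50,000 moderation checks with roughly
# 400 input tokens and 20 output tokens each.
requests = 50_000
cost = estimate_cost(requests * 400, requests * 20)
print(f"Estimated cost: ${cost}")
```

Under these assumed token counts, the workload totals 21M tokens, so the estimate comes to a little over four US dollars. `Decimal` is used instead of floats so that small per-token rates do not accumulate rounding error.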

Frequently Asked Questions

What is GPT-JT used for?
GPT-JT is best suited to classification, safety, and structured-output workloads. The family description highlights classification performance, and GPT-JT Moderation 6B is the variant that lists safety and structured outputs as capabilities.
How does GPT-JT compare to Together General?
GPT-JT by Together.ai is strongest where you need safety. Together General, also by Together.ai, is the closest related family to check when selecting among adjacent models. GPT-JT lists 3 variants, and Together General is listed with up to 4k context, so compare the specs and pricing tables before choosing a production model.
Which GPT-JT model should I use?
For the lowest listed input price, start with GPT-JT Moderation 6B through Together AI at $0.2/1M input tokens. For the most capable option, also evaluate GPT-JT Moderation 6B, the only variant listed with structured outputs.

Models (3)