LLM Reference

T5 Models by Google DeepMind

Google DeepMind · Apache 2.0
This model family is considered obsolete. Consider newer alternatives in Related Model Families below.
5 models · 2020 · Up to 512-token context

About

The T5 (Text-to-Text Transfer Transformer) family of language models, developed by Google AI, recasts every natural language processing (NLP) problem as a text-to-text task [5, 9, 10]. Rather than building a separate model for each task, T5 takes text as input and generates text as output, so the same model, loss function, and hyperparameters apply across machine translation, summarization, question answering, and classification [5, 9, 10]. The models are pre-trained on the Colossal Clean Crawled Corpus (C4), a large, heuristically cleaned collection of English web text [5, 9, 10]. The family comes in several sizes to fit different compute budgets, and the larger models achieved state-of-the-art results on many NLP benchmarks at the time of release [5, 9, 10]. Later models such as the Flan-T5 family build on the T5 architecture by adding instruction tuning [10].
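To make the text-to-text framing concrete, here is a minimal sketch of how different tasks can be reduced to plain input and target strings. The task prefixes follow the style used in the T5 paper; the helper name `to_text_to_text`, the task keys, and the sample sentences are illustrative assumptions, not part of any library.

```python
# Sketch of T5-style text-to-text formatting (no model is loaded here).
# TASK_PREFIXES keys and to_text_to_text are hypothetical names for illustration.

TASK_PREFIXES = {
    "translate_en_de": "translate English to German: ",
    "summarize": "summarize: ",
    "cola": "cola sentence: ",  # grammatical acceptability, answered as text
}


def to_text_to_text(task: str, text: str) -> str:
    """Prepend the task prefix so one model can tell the tasks apart."""
    if task not in TASK_PREFIXES:
        raise ValueError(f"unknown task: {task}")
    return TASK_PREFIXES[task] + text.strip()


# (task, source text, target text): even the classification label is a string.
examples = [
    ("translate_en_de", "That is good.", "Das ist gut."),
    ("cola", "The course is jumping well.", "not acceptable"),
    ("summarize", "Severe storms hit the region on Tuesday.", "Storms hit region."),
]

# Every task becomes the same kind of (input string, target string) pair,
# which is why a single loss function and training setup can cover them all.
pairs = [(to_text_to_text(task, src), tgt) for task, src, tgt in examples]

for model_input, target in pairs:
    print(f"{model_input!r} -> {target!r}")
```

Because inputs and targets are both plain strings, adding a new task in this scheme only requires choosing a prefix and a textual form for the answer.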

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

T5 11B (Current)

Use when the workload fits within a 512-token context and calls for 11B parameters.

2020-01 · 512 context · 11B parameters

T5 3B (Current)

Use when the workload fits within a 512-token context and calls for 3B parameters.

2020-01 · 512 context · 3B parameters

T5 Large (Current)

Use when the workload fits within a 512-token context and calls for 770M parameters.

2020-01 · 512 context · 770M parameters

T5 Base (Current)

Use when the workload fits within a 512-token context and calls for 220M parameters.

2020-01 · 512 context · 220M parameters

T5 Small (Current)

Use when the workload fits within a 512-token context and calls for 60M parameters.

2020-01 · 512 context · 60M parameters

Release Timeline

1 release group

2020-01 (5 current)
T5 11B · 512 context · 11B parameters · Current
T5 3B · 512 context · 3B parameters · Current
T5 Base · 512 context · 220M parameters · Current
T5 Large · 512 context · 770M parameters · Current
T5 Small · 512 context · 60M parameters · Current

Specifications (5 models)

T5 model specifications comparison

Model      Released   Context   Parameters
T5 11B     2020-01    512       11B
T5 3B      2020-01    512       3B
T5 Large   2020-01    512       770M
T5 Base    2020-01    512       220M
T5 Small   2020-01    512       60M
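The parameter counts listed above give a rough lower bound on the memory needed just to hold each model's weights. The sketch below works through that arithmetic, assuming 4 bytes per parameter for float32 and 2 bytes for float16; it ignores activations, optimizer state, and framework overhead, so real usage will be higher. The function names and the budget-based selection rule are illustrative assumptions.

```python
# Back-of-the-envelope weight memory for the listed T5 sizes.
# Assumption: memory ~= parameter count x bytes per parameter (weights only).

PARAM_COUNTS = {
    "T5 Small": 60e6,
    "T5 Base": 220e6,
    "T5 Large": 770e6,
    "T5 3B": 3e9,
    "T5 11B": 11e9,
}

BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_gb(num_params: float, dtype: str = "float32") -> float:
    """Approximate weight size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def largest_model_within(budget_gb: float, dtype: str = "float32"):
    """Hypothetical selection rule: biggest variant whose weights fit the budget."""
    fitting = [
        (count, name)
        for name, count in PARAM_COUNTS.items()
        if weight_memory_gb(count, dtype) <= budget_gb
    ]
    return max(fitting)[1] if fitting else None


for name, count in PARAM_COUNTS.items():
    fp32 = weight_memory_gb(count, "float32")
    fp16 = weight_memory_gb(count, "float16")
    print(f"{name:<9} float32 ~ {fp32:6.2f} GB   float16 ~ {fp16:6.2f} GB")
```

Under these assumptions, T5 11B needs about 44 GB for float32 weights alone, while T5 Small needs about 0.24 GB, which is one way to read the gap between the largest and smallest variants.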

Frequently Asked Questions

What is T5 used for?
T5 is used for text-to-text NLP workloads such as machine translation, summarization, question answering, and classification. The family description points to those workloads as the best fit.
How does T5 compare to Gemma 4?
T5 by Google DeepMind targets text-to-text NLP tasks such as translation and summarization, while Gemma 4 by Google DeepMind is the closest related family to check for multimodal workloads. T5 has 5 listed variants and reaches up to 512 context, while Gemma 4 reaches up to 256k context, so compare the specs and pricing tables before choosing a production model.
Which T5 model should I use?
T5 does not have complete provider pricing in the local data, so if price is the main constraint, check the pricing table to see which variants have pricing listed before choosing. For the most capable local choice, evaluate T5 11B with 512 context.

Models (5)