T5 Models by Google DeepMind

Google DeepMind, Apache 2.0, Open source

This model family is considered obsolete. Consider newer alternatives in Related Model Families below.

5 models, released 2020, up to 512 context

Details

Researcher: Google DeepMind

License: Apache 2.0 (OSI-approved)

Commercial use: permitted

Models: 5

Released: 2020

Max context: 512

About

The T5 (Text-to-Text Transfer Transformer) family of language models, developed by Google AI, marked a significant advance in natural language processing (NLP) [5, 9, 10]. Instead of building a separate model for each task, T5 casts every NLP problem as a text-to-text task: the input is a string and the output is a string [5, 9, 10]. This framing lets the same model, loss function, and hyperparameters be used across tasks such as machine translation, summarization, question answering, and classification [5, 9, 10]. The models are pre-trained on the Colossal Clean Crawled Corpus (C4), a large collection of cleaned English web text [5, 9, 10]. The family spans several sizes to suit different compute budgets, and the larger models achieved state-of-the-art results on many NLP benchmarks at release [5, 9, 10]. Later models such as the Flan-T5 family add instruction tuning on top of the T5 architecture to extend its capabilities [10].
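To make the text-to-text framing concrete, the sketch below shows how different tasks can be reduced to plain (input string, output string) pairs. The task prefixes follow the examples given in the T5 paper; the `TextToTextExample` class and the helper function names are hypothetical and exist only for illustration. No model is loaded here, so this shows the data format only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextToTextExample:
    """One training or inference pair; both sides are plain strings."""
    source: str
    target: str


def translation(english: str, german: str) -> TextToTextExample:
    # The task is named by a plain-text prefix on the input string.
    return TextToTextExample(f"translate English to German: {english}", german)


def summarization(article: str, summary: str) -> TextToTextExample:
    return TextToTextExample(f"summarize: {article}", summary)


def classification(sentence: str, label: str) -> TextToTextExample:
    # Even a class label is emitted as text rather than as a class index.
    return TextToTextExample(f"cola sentence: {sentence}", label)


examples = [
    translation("That is good.", "Das ist gut."),
    summarization("<article text>", "<short summary>"),
    classification("The course is jumping well.", "not acceptable"),
]

for ex in examples:
    print(f"{ex.source!r} -> {ex.target!r}")
```

Because every task shares this shape, a single sequence-to-sequence training loop can consume all of the pairs without task-specific output heads.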

Current Variants

Use-when guidance is based on each model's tracked capabilities, context window, release date, and replacement status.

Current T5 variants with use-when guidance and lifecycle status
Model	Use when	Released	Signals	Status
T5 11B	Use when the workload fits a 512-token context and calls for an 11B-parameter model.	2020-01	512 context, 11B parameters	Current
T5 3B	Use when the workload fits a 512-token context and calls for a 3B-parameter model.	2020-01	512 context, 3B parameters	Current
T5 Large	Use when the workload fits a 512-token context and calls for a 770M-parameter model.	2020-01	512 context, 770M parameters	Current
T5 Base	Use when the workload fits a 512-token context and calls for a 220M-parameter model.	2020-01	512 context, 220M parameters	Current
T5 Small	Use when the workload fits a 512-token context and calls for a 60M-parameter model.	2020-01	512 context, 60M parameters	Current
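Every variant lists the same 512 context, so longer inputs have to be split before they reach the model. The following is a minimal sketch of fixed-size windowing. It assumes whitespace splitting as a stand-in for the model's real tokenizer (which typically produces more tokens than words), and the `chunk_tokens` helper is a hypothetical name.

```python
from typing import List

MAX_CONTEXT = 512  # maximum context listed for every T5 variant


def chunk_tokens(tokens: List[str], limit: int = MAX_CONTEXT) -> List[List[str]]:
    """Split a token list into consecutive windows of at most `limit` tokens."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [tokens[i:i + limit] for i in range(0, len(tokens), limit)]


# Assumption: whitespace tokens approximate real tokenizer output.
document = "word " * 1200
tokens = document.split()
chunks = chunk_tokens(tokens)
print([len(c) for c in chunks])  # [512, 512, 176]
```

In practice, any task prefix added to the input also consumes tokens, so a real pipeline would likely use a limit somewhat below 512 per window.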

Release Timeline

1 release group

2020-01 (5 current)

T5 11B: 512 context, 11B parameters (Current)

T5 3B: 512 context, 3B parameters (Current)

T5 Base: 512 context, 220M parameters (Current)

T5 Large: 512 context, 770M parameters (Current)

T5 Small: 512 context, 60M parameters (Current)

Specifications (5 models)

T5 model specifications comparison
Model	Released	Context (tokens)	Parameters
T5 11B	2020-01	512	11B
T5 3B	2020-01	512	3B
T5 Large	2020-01	512	770M
T5 Base	2020-01	512	220M
T5 Small	2020-01	512	60M
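The parameter counts in the table translate directly into a rough lower bound on memory. The sketch below estimates weight storage only, assuming 2 bytes per parameter (16-bit weights); it ignores activations, optimizer state, and framework overhead, so real requirements will be higher. The `weight_gb` and `largest_that_fits` helpers are hypothetical names used for illustration.

```python
from typing import Optional

# Parameter counts taken from the specifications table.
PARAMETERS = {
    "T5 Small": 60_000_000,
    "T5 Base": 220_000_000,
    "T5 Large": 770_000_000,
    "T5 3B": 3_000_000_000,
    "T5 11B": 11_000_000_000,
}


def weight_gb(params: int, bytes_per_param: int = 2) -> float:
    """Approximate weight storage in decimal gigabytes (weights only)."""
    return params * bytes_per_param / 1e9


def largest_that_fits(budget_gb: float, bytes_per_param: int = 2) -> Optional[str]:
    """Return the largest variant whose weights fit in the budget, if any."""
    fitting = [
        (params, name)
        for name, params in PARAMETERS.items()
        if weight_gb(params, bytes_per_param) <= budget_gb
    ]
    return max(fitting)[1] if fitting else None


print(weight_gb(PARAMETERS["T5 11B"]))  # 22.0
print(largest_that_fits(16))            # T5 3B
```

Under these assumptions, T5 11B needs about 22 GB for weights alone, which is why a 16 GB budget falls back to T5 3B.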

Frequently Asked Questions

What is T5 used for?

T5 is used for text-to-text NLP tasks such as machine translation, summarization, question answering, and classification. The family description points to those workloads as the best fit.

How does T5 compare to T5Gemma?

T5 is strongest on the text-to-text tasks above, while T5Gemma, also by Google DeepMind, is the closest related family to check for agent workflows and tool use. T5 has 5 listed variants and a maximum context of 512 tokens, so compare the specs and pricing tables before choosing a production model.

Which T5 model should I use?

If price is the main constraint, check the pricing table first, but note that provider pricing for T5 is incomplete in the local data. All five variants were released together and share the same 512-token context, so the choice comes down to size: for the most capable local option, evaluate T5 11B.
