FLAN-UL2 Models by Google DeepMind
About
The FLAN-UL2 family of large language models builds on the original UL2 model and uses the T5 architecture. Its most notable change is the expansion of the receptive field from 512 to 2048 tokens, which makes it considerably more effective for few-shot in-context learning. Unlike UL2, FLAN-UL2 does not require mode switch tokens during inference or fine-tuning. The model is fine-tuned using the "Flan" prompt tuning method and a specially curated dataset, which improves its few-shot learning abilities. This page lists the 20B-parameter model, and FLAN-UL2 can be accessed through its GitHub repository for further exploration and use.
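To make the link between the 2048-token window and few-shot prompting concrete, here is a minimal sketch that packs as many worked examples as fit into a prompt. It is an illustration under stated assumptions, not the model's own tooling: `rough_token_count` and `build_few_shot_prompt` are hypothetical helpers, the `Q:`/`A:` layout is an arbitrary choice, and the whitespace word count only approximates what a real tokenizer would report.

```python
CONTEXT_TOKENS = 2048  # context length listed for Flan-UL2 on this page


def rough_token_count(text: str) -> int:
    # Assumption: whitespace-separated words stand in for tokens.
    # A real tokenizer will usually report a different (often larger) count.
    return len(text.split())


def build_few_shot_prompt(examples, query, budget=CONTEXT_TOKENS):
    """Prepend (question, answer) examples until the rough budget is reached."""
    tail = f"Q: {query}\nA:"
    used = rough_token_count(tail)
    shots = []
    for question, answer in examples:
        shot = f"Q: {question}\nA: {answer}\n\n"
        cost = rough_token_count(shot)
        if used + cost > budget:
            break  # stop before the prompt would exceed the budget
        shots.append(shot)
        used += cost
    return "".join(shots) + tail


prompt = build_few_shot_prompt([("2+2?", "4"), ("3+3?", "6")], "5+5?")
```

A larger window simply lets the loop admit more examples before it stops, which is why the move from 512 to 2048 tokens matters for in-context learning.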
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| Flan-UL2 | Use when the workload needs 2k context and 20B parameters. | 2022-10 | 2k context, 20B parameters | Current |
| Flan-UL2 on IBM Watsonx | Use when the workload needs 2k context and 20B parameters. | 2022-10 | 2k context, 20B parameters | Current |
Release Timeline
1 release group
Specifications (2 models)
| Model | Released | Context | Parameters |
|---|---|---|---|
| Flan-UL2 | 2022-10 | 2k | 20B |
| Flan-UL2 on IBM Watsonx | 2022-10 | 2k | 20B |
Available From (1 provider)
Pricing
| Model | Provider | Input / 1M | Output / 1M | Type |
|---|---|---|---|---|
| Flan-UL2 on IBM Watsonx | IBM watsonx | $0.185 | $0.185 | Serverless |
| Flan-UL2 | IBM watsonx | $5 | $5 | Serverless |
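As a rough illustration of how the per-1M-token prices translate into a bill, the sketch below multiplies token counts by the listed rates. The `PRICES` mapping copies the table above; `estimate_cost` is a hypothetical helper, and actual provider billing may round or meter differently.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing table above.
PRICES = {
    "Flan-UL2 on IBM Watsonx": {"input": Decimal("0.185"), "output": Decimal("0.185")},
    "Flan-UL2": {"input": Decimal("5"), "output": Decimal("5")},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of a workload from its token counts."""
    price = PRICES[model]
    total = price["input"] * input_tokens + price["output"] * output_tokens
    return total / Decimal(1_000_000)


# Example: 2,000 input tokens and 500 output tokens on each listing.
for name in PRICES:
    print(name, estimate_cost(name, 2_000, 500))
```

`Decimal` is used instead of floats so that small per-token amounts add up without binary rounding noise.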
Frequently Asked Questions
- What is FLAN-UL2 used for?
- FLAN-UL2 is used mainly for few-shot in-context learning. It builds on the original UL2 model and the T5 architecture, and its 2048-token receptive field leaves room for more examples in a prompt.
- How does FLAN-UL2 compare to Gemma 4?
- FLAN-UL2 by Google DeepMind fits its listed use cases, while Gemma 4, also by Google DeepMind, is the closest related family to check for multimodal workloads. FLAN-UL2 has 2 listed variants and reaches up to 2k context, while Gemma 4 reaches up to 256k context, so compare the specs and pricing tables before choosing a production model.
- Which FLAN-UL2 model should I use?
- For the lowest listed input price, start with Flan-UL2 on IBM Watsonx through IBM watsonx at $0.185/1M input tokens. Both listed variants share the same 2k context and 20B parameters, so the choice comes down mainly to price and availability.






