What is Gemma used for?

Gemma is used for structured outputs, coding, and math-heavy prompts. The family description and listed model capabilities point to those workloads as the best fit.

How does Gemma compare to T5Gemma?

Gemma by Google DeepMind is strongest where you need structured outputs, while T5Gemma by Google DeepMind is the closest related family to check for agent workflows and tool use. Gemma has 12 listed variants and reaches up to 8k context, so compare the specs and pricing tables before choosing a production model.

Which Gemma model should I use?

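The lowest listed input price is Gemma 2B Instruct through GCP Vertex AI at $0.04/1M input tokens, but that model is marked Archived in the release timeline and does not appear in the pricing table. Among the current listings, the lowest input price is $0.05/1M (DeepInfra and Replicate API). For a hosted 7B option, evaluate Together AI Gemma-7B-it with 8k context and structured outputs.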

Gemma Models by Google DeepMind

Google DeepMind · Gemma · Open weights · Open Source

12 models · 2024 · Up to 8k ctx · From $0.04/1M input

Details

Researcher: Google DeepMind

License: Gemma

Commercial use: conditional

Models: 12

Released: 2024

Max context: 8k

Capabilities

Structured Outputs: 8 of 12 models

Links

Website HuggingFace

About

The Gemma family is a series of open large language models (LLMs) developed by Google. These lightweight models draw on the research and technology behind the Gemini models and target a range of natural language processing tasks. Gemma comes in two sizes: a 2 billion parameter model suited to CPU and on-device environments, and a 7 billion parameter model intended for GPU and TPU platforms. Both sizes are available in pre-trained and instruction-tuned forms. The models support major AI frameworks and hardware platforms, and they ship with integrated safety measures and risk mitigation tools in line with Google's stated commitment to responsible AI development [1, 5, 6].

Current Variants

Use-when guidance is based on each model's tracked capabilities, context window, release date, and replacement status.

11 in view · 1 retired

Current Gemma variants with use-when guidance and lifecycle status
Model	Use when	Released	Signals	Status
Gemma 7B Instruct	Use when the workload needs 8k context, 7B parameters, and structured outputs.	2024-02	8k context, 7B parameters, structured outputs	Current
Gemma 1.1 7B Instruct	Use when the workload needs 8k context, 7B parameters, and structured outputs.	2024-02	8k context, 7B parameters, structured outputs	Current
Gemma 1.1 2B Instruct	Use when the workload needs 2k context and 2B parameters.	2024-02	2k context, 2B parameters	Current
Gemma 7B	Use when the workload needs 8k context, 7B parameters, and structured outputs.	2024-02	8k context, 7B parameters, structured outputs	Current
Gemma 2B	Use when the workload needs 2k context and 2B parameters.	2024-02	2k context, 2B parameters	Current
Together AI Gemma-7B-it	Use when the workload needs 8k context, 7B parameters, and structured outputs.	2024-02	8k context, 7B parameters, structured outputs	Current
OctoML Gemma-7B-it	Use when the workload needs 8k context and 7B parameters.	2024-02	8k context, 7B parameters	Current
OctoML Gemma-2B-it	Use when the workload needs 8k context and 2B parameters.	2024-02	8k context, 2B parameters	Current
Gemma 7B on Google Vertex AI	Use when the workload needs 8k context, 7B parameters, and structured outputs.	2024-02	8k context, 7B parameters, structured outputs	Current
DeepInfra Google Gemma 7B	Use when the workload needs 8k context, 7B parameters, and structured outputs.	2024-02	8k context, 7B parameters, structured outputs	Current
DeepInfra Google Gemma 2B	Use when the workload needs 8k context, 2B parameters, and structured outputs.	2024-02	8k context, 2B parameters, structured outputs	Current

Release Timeline

1 release group

2024-02

11 current · 1 retired

Model	Signals	Status
DeepInfra Google Gemma 2B	8k context, 2B parameters, structured outputs	Current
DeepInfra Google Gemma 7B	8k context, 7B parameters, structured outputs	Current
Gemma 1.1 2B Instruct	2k context, 2B parameters	Current
Gemma 1.1 7B Instruct	8k context, 7B parameters, structured outputs	Current
Gemma 2B	2k context, 2B parameters	Current
Gemma 2B Instruct	2k context, 2B parameters, structured outputs	Archived
Gemma 7B	8k context, 7B parameters, structured outputs	Current
Gemma 7B Instruct	8k context, 7B parameters, structured outputs	Current
Gemma 7B on Google Vertex AI	8k context, 7B parameters, structured outputs	Current
OctoML Gemma-2B-it	8k context, 2B parameters	Current
OctoML Gemma-7B-it	8k context, 7B parameters	Current
Together AI Gemma-7B-it	8k context, 7B parameters, structured outputs	Current

Specifications (12 models)

Gemma model specifications comparison
Model	Released	Context	Parameters	Structured Outputs
Gemma 7B Instruct	2024-02	8k	7B	Yes
Gemma 1.1 7B Instruct	2024-02	8k	7B	Yes
Gemma 1.1 2B Instruct	2024-02	2k	2B	No
Gemma 7B	2024-02	8k	7B	Yes
Gemma 2B	2024-02	2k	2B	No
Together AI Gemma-7B-it	2024-02	8k	7B	Yes
OctoML Gemma-7B-it	2024-02	8k	7B	No
OctoML Gemma-2B-it	2024-02	8k	2B	No
Gemma 7B on Google Vertex AI	2024-02	8k	7B	Yes
DeepInfra Google Gemma 7B	2024-02	8k	7B	Yes
DeepInfra Google Gemma 2B	2024-02	8k	2B	Yes
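
Reading the specifications table usually means filtering it by a few hard requirements. The following is a minimal sketch of that filter in Python, with rows copied from the table above. The `Spec` class and the `shortlist` function are hypothetical names introduced for illustration; they are not part of any Gemma or provider API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    name: str
    context_k: int            # context window in thousands of tokens, as listed
    params_b: int             # parameter count in billions, as listed
    structured_outputs: bool  # "Yes"/"No" column from the table


# Rows copied from the specifications table; nothing here is measured.
SPECS = [
    Spec("Gemma 7B Instruct", 8, 7, True),
    Spec("Gemma 1.1 7B Instruct", 8, 7, True),
    Spec("Gemma 1.1 2B Instruct", 2, 2, False),
    Spec("Gemma 7B", 8, 7, True),
    Spec("Gemma 2B", 2, 2, False),
    Spec("Together AI Gemma-7B-it", 8, 7, True),
    Spec("OctoML Gemma-7B-it", 8, 7, False),
    Spec("OctoML Gemma-2B-it", 8, 2, False),
    Spec("Gemma 7B on Google Vertex AI", 8, 7, True),
    Spec("DeepInfra Google Gemma 7B", 8, 7, True),
    Spec("DeepInfra Google Gemma 2B", 8, 2, True),
]


def shortlist(min_context_k, need_structured_outputs=False, max_params_b=None):
    """Return names of listed models that meet the given requirements."""
    names = []
    for spec in SPECS:
        if spec.context_k < min_context_k:
            continue
        if need_structured_outputs and not spec.structured_outputs:
            continue
        if max_params_b is not None and spec.params_b > max_params_b:
            continue
        names.append(spec.name)
    return names


if __name__ == "__main__":
    # Smallest listed option with 8k context and structured outputs.
    print(shortlist(min_context_k=8, need_structured_outputs=True, max_params_b=2))
```

With these rows, asking for 8k context, structured outputs, and at most 2B parameters leaves only DeepInfra Google Gemma 2B. Dropping the parameter cap returns the seven listed entries that have both 8k context and structured outputs.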

Available From (10 providers)

Alibaba Cloud PAI-EAS

Cloudflare Workers AI

OctoML (Deprecated)

+2 more

Pricing

Gemma model pricing by provider
Model	Provider	Input / 1M tokens	Output / 1M tokens	Type
Gemma 1.1 7B Instruct	DeepInfra	$0.05	$0.15	Serverless
DeepInfra Google Gemma 7B	DeepInfra	$0.05	$0.15	Serverless
DeepInfra Google Gemma 2B	DeepInfra	$0.05	$0.15	Serverless
Gemma 7B Instruct	Replicate API	$0.05	$0.25	Serverless
Gemma 7B Instruct	Lepton AI API	$0.07	$0.07	Serverless
Gemma 7B Instruct	GCP Vertex AI	$0.10	$0.30	Serverless
OctoML Gemma-2B-it	OctoML (Deprecated)	$0.10	$0.15	Serverless
Gemma 7B	GCP Vertex AI	$0.10	$0.30	Serverless
Gemma 7B on Google Vertex AI	GCP Vertex AI	$0.125	$0.375	Serverless
Together AI Gemma-7B-it	Together AI	$0.15	$0.15	Serverless
OctoML Gemma-7B-it	OctoML (Deprecated)	$0.15	$0.20	Serverless
Gemma 7B Instruct	Fireworks AI	$0.20	$0.20	Provisioned
Gemma 7B Instruct	Together AI	$0.20	$0.20	Serverless
Gemma 7B	Fireworks AI	$0.20	$0.20	Serverless
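
Because input and output tokens are priced separately, the cheapest provider depends on the input/output mix of the workload and not only on the input price. The sketch below estimates cost for the Gemma 7B Instruct rows of the pricing table. The function names `estimate_cost` and `cheapest_provider` and the token volumes are illustrative assumptions, and the prices are the listed ones, which may have changed since.

```python
# (provider, input $ per 1M tokens, output $ per 1M tokens)
# Copied from the Gemma 7B Instruct rows of the pricing table.
GEMMA_7B_INSTRUCT_PRICES = [
    ("Replicate API", 0.05, 0.25),
    ("Lepton AI API", 0.07, 0.07),
    ("GCP Vertex AI", 0.10, 0.30),
    ("Fireworks AI", 0.20, 0.20),
    ("Together AI", 0.20, 0.20),
]


def estimate_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost in dollars, given per-1M-token input and output prices."""
    return (input_tokens / 1_000_000) * input_price + (
        output_tokens / 1_000_000
    ) * output_price


def cheapest_provider(input_tokens, output_tokens, prices=GEMMA_7B_INSTRUCT_PRICES):
    """Return (provider, cost) with the lowest estimated cost for this mix."""
    costs = [
        (provider, estimate_cost(input_tokens, output_tokens, p_in, p_out))
        for provider, p_in, p_out in prices
    ]
    return min(costs, key=lambda item: item[1])


if __name__ == "__main__":
    # Balanced workload: 1M input tokens, 1M output tokens.
    print(cheapest_provider(1_000_000, 1_000_000))
    # Input-heavy workload: 10M input tokens, 1M output tokens.
    print(cheapest_provider(10_000_000, 1_000_000))
```

For the balanced workload, the flat $0.07 rate from Lepton AI API comes out lowest at about $0.14. For the input-heavy workload, Replicate API's $0.05 input price wins at about $0.75, despite its higher output price.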
