Gemma Models by Google DeepMind
Details
Capabilities
About
The Gemma family of large language models (LLMs) represents a series of advanced open models developed by Google. These lightweight models harness the cutting-edge research and technologies utilized in the Gemini models and are tailored for diverse natural language processing tasks. Gemma offers two model sizes: a 2 billion parameter version compatible with CPU and on-device environments, and a 7 billion parameter model primed for GPU and TPU platforms. Both sizes come in pre-trained and instruction-tuned forms, ensuring flexibility in their deployment. Designed for accessibility, the models support major AI frameworks and hardware platforms, embodying Google's commitment to responsible AI development with integrated safety measures and risk mitigation tools 1 5 6.
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
Use when the workload needs 8k context, 7B parameters, and structured outputs.
Use when the workload needs 8k context, 7B parameters, and structured outputs.
Use when the workload needs 2k context and 2B parameters.
Use when the workload needs 8k context, 7B parameters, and structured outputs.
Use when the workload needs 2k context and 2B parameters.
Use when the workload needs 8k context, 7B parameters, and structured outputs.
Use when the workload needs 8k context and 7B parameters.
Use when the workload needs 8k context and 2B parameters.
Use when the workload needs 8k context, 7B parameters, and structured outputs.
Use when the workload needs 8k context, 7B parameters, and structured outputs.
Use when the workload needs 8k context, 2B parameters, and structured outputs.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| Gemma 7B Instruct | Use when the workload needs 8k context, 7B parameters, and structured outputs. | 2024-02 | 8k context7B parametersstructured outputs | Current |
| Gemma 1.1 7B Instruct | Use when the workload needs 8k context, 7B parameters, and structured outputs. | 2024-02 | 8k context7B parametersstructured outputs | Current |
| Gemma 1.1 2B Instruct | Use when the workload needs 2k context and 2B parameters. | 2024-02 | 2k context2B parameters | Current |
| Gemma 7B | Use when the workload needs 8k context, 7B parameters, and structured outputs. | 2024-02 | 8k context7B parametersstructured outputs | Current |
| Gemma 2B | Use when the workload needs 2k context and 2B parameters. | 2024-02 | 2k context2B parameters | Current |
| Together AI Gemma-7B-it | Use when the workload needs 8k context, 7B parameters, and structured outputs. | 2024-02 | 8k context7B parametersstructured outputs | Current |
| OctoML Gemma-7B-it | Use when the workload needs 8k context and 7B parameters. | 2024-02 | 8k context7B parameters | Current |
| OctoML Gemma-2B-it | Use when the workload needs 8k context and 2B parameters. | 2024-02 | 8k context2B parameters | Current |
| Gemma 7B on Google Vertex AI | Use when the workload needs 8k context, 7B parameters, and structured outputs. | 2024-02 | 8k context7B parametersstructured outputs | Current |
| DeepInfra Google Gemma 7B | Use when the workload needs 8k context, 7B parameters, and structured outputs. | 2024-02 | 8k context7B parametersstructured outputs | Current |
| DeepInfra Google Gemma 2B | Use when the workload needs 8k context, 2B parameters, and structured outputs. | 2024-02 | 8k context2B parametersstructured outputs | Current |
Release Timeline
1 release groupSpecifications(12 models)
| Model | Released | Context | Parameters | Structured Outputs |
|---|---|---|---|---|
| Gemma 7B Instruct | 2024-02 | 8k | 7B | Yes |
| Gemma 1.1 7B Instruct | 2024-02 | 8k | 7B | Yes |
| Gemma 1.1 2B Instruct | 2024-02 | 2k | 2B | No |
| Gemma 7B | 2024-02 | 8k | 7B | Yes |
| Gemma 2B | 2024-02 | 2k | 2B | No |
| Together AI Gemma-7B-it | 2024-02 | 8k | 7B | Yes |
| OctoML Gemma-7B-it | 2024-02 | 8k | 7B | No |
| OctoML Gemma-2B-it | 2024-02 | 8k | 2B | No |
| Gemma 7B on Google Vertex AI | 2024-02 | 8k | 7B | Yes |
| DeepInfra Google Gemma 7B | 2024-02 | 8k | 7B | Yes |
| DeepInfra Google Gemma 2B | 2024-02 | 8k | 2B | Yes |
Available From(10 providers)
Pricing
| Model | Provider | Input / 1M | Output / 1M | Type |
|---|---|---|---|---|
| Gemma 1.1 7B Instruct | DeepInfra | $0.05 | $0.15 | Serverless |
| DeepInfra Google Gemma 7B | DeepInfra | $0.05 | $0.15 | Serverless |
| DeepInfra Google Gemma 2B | DeepInfra | $0.05 | $0.15 | Serverless |
| Gemma 7B Instruct | Replicate API | $0.05 | $0.25 | Serverless |
| Gemma 7B Instruct | Lepton AI API | $0.07 | $0.07 | Serverless |
| Gemma 7B Instruct | GCP Vertex AI | $0.1 | $0.3 | Serverless |
| OctoML Gemma-2B-it | OctoML (Deprecated) | $0.1 | $0.15 | Serverless |
| Gemma 7B | GCP Vertex AI | $0.1 | $0.3 | Serverless |
| Gemma 7B on Google Vertex AI | GCP Vertex AI | $0.125 | $0.375 | Serverless |
| Together AI Gemma-7B-it | Together AI | $0.15 | $0.15 | Serverless |
| OctoML Gemma-7B-it | OctoML (Deprecated) | $0.15 | $0.2 | Serverless |
| Gemma 7B Instruct | Fireworks AI | $0.2 | $0.2 | Provisioned |
| Gemma 7B Instruct | Together AI | $0.2 | $0.2 | Serverless |
| Gemma 7B | Fireworks AI | $0.2 | $0.2 | Serverless |
Frequently Asked Questions
- What is Gemma used for?
- Gemma is used for structured outputs, coding, and math-heavy prompts. The family description and listed model capabilities point to those workloads as the best fit.
- How does Gemma compare to Gemma 4?
- Gemma by Google DeepMind is strongest where you need structured outputs, while Gemma 4 by Google DeepMind is the closest related family to check for multimodal. Gemma has 12 listed variants and reaches up to 8k context, while Gemma 4 reaches up to 256k context, so compare the specs and pricing tables before choosing a production model.
- Which Gemma model should I use?
- For the lowest listed input price, start with Gemma 2B Instruct through GCP Vertex AI at $0.04/1M input tokens. For the most capable/latest local choice, evaluate Together AI Gemma-7B-it with 8k context and structured outputs.
Models(12)
Gemma 7B Instruct
Gemma 1.1 7B Instruct
Gemma 1.1 2B Instruct
Gemma 7B
Gemma 2B
Together AI Gemma-7B-it
OctoML Gemma-7B-it
OctoML Gemma-2B-it
Gemma 7B on Google Vertex AI
DeepInfra Google Gemma 7B
DeepInfra Google Gemma 2B

