Gemma 7B Instruct

Name: Gemma 7B Instruct
Author: Google DeepMind

Released

2024-02-21

Last refreshed

2026-06-15

Status

Researched 99d ago

Open weightsCommercial use: conditionalCodingClassificationJSON / Tool use

Gemma 7B Instruct is worth evaluating for coding, classification, and json / tool use when its provider route and context window match the workload.

Use it for

Teams evaluating coding, classification, and json / tool use
Workloads that can use a 8k context window
Buyers comparing 4 tracked provider routes

Do not use it for

Vision or document-understanding workloads

Specifications

Family: Gemma
Released: 2024-02-21
Context: 8k
Parameters: 7B
Architecture: Decoder Only
Knowledge cutoff: 2023-04
Specialization: general
Openness: Open weights
License: GemmaCommercial use: conditional
Weights: Unknown
Code: Unknown
Training: Fine-tuned

Created by

Google DeepMind

Pioneering artificial intelligence research.

London, United Kingdom

Founded 2014

Website

Pricing

Output / 1M

$0.070

Input / 1M

$0.070

Cheapest of 8 routes · Lepton AI API

Providers(8)

NVIDIA NIM Fireworks AI Together AI GCP Vertex AI Cloudflare Workers AI Alibaba Cloud PAI-EAS Lepton AI API Replicate API

View 8 provider routes

About

Gemma 7B Instruct is a cutting-edge large language model developed by Google DeepMind, boasting 7 billion parameters. As part of the Gemma family, it benefits from the advanced research underpinning Google's Gemini models. This model is optimized for text generation tasks, excelling in areas like question answering and summarization, and it is finely tuned to follow instructions effectively. Despite its compact size, Gemma 7B Instruct performs impressively on benchmarks, making it versatile for deployment across various hardware platforms, from laptops to cloud infrastructure. Moreover, it is open-source, with accessible weights and incorporates responsible AI practices, such as data filtering and human feedback, to ensure safe and ethical use.

Gemma 7B Instruct is an open-weight model in the Gemma family. The structured metadata tracks a 8k-token context window and structured outputs. This page tracks provider routes through NVIDIA NIM, Fireworks AI, Together AI, and 5 more, with the cheapest tracked route listed at $0.05 input and $0.25 output per 1M tokens. Headline tracked benchmarks include Google-Proof Q&A 50.8, HellaSwag 89.2, and HumanEval 70.1.

Top use-case fit: coding, agents, and build tasks

Coding

Q/$ A

1 relevant benchmark in the decision map.

Classification

Q/$ A

2 relevant benchmarks in the decision map.

JSON / Tool use

Included by capability and metadata signals in the decision map.

Provider price ladder

Compare all 8

Compare API pricing across 4 providers for input and output tokens, batch, and cached reads when available.

Provider	Input / 1M	Output / 1M	Route
Lepton AI API	$0.070	$0.070	Serverless
Fireworks AI	$0.200	$0.200	Provisioned
Together AI	$0.200	$0.200	Serverless
Replicate API	$0.050	$0.250	Serverless

Available via routers & gateways(14)

LiteLLM

Gateway

Open-source Python SDK and proxy server that unifies 100+ LLM APIs behind a single OpenAI-compatible interface, with load balancing, cost tracking, and configurable failover.

Free OSSGCP Vertex AI

OpenRouter

Hybrid

Unified hybrid gateway to 400+ models from 60+ providers via a single OpenAI-compatible API, with optional auto-routing that selects the best model per prompt.

PassthroughGCP Vertex AITogether AIFireworks AI

Portkey

Gateway

Production AI gateway routing to 1,600+ LLMs with failover, load balancing, semantic caching, and guardrails; Apache 2.0 core is fully self-hostable with the complete feature set.

SubscriptionGCP Vertex AI

AIRouter

Router

Commercial LLM router that analyzes incoming requests and routes to the optimal model for cost/quality/latency via a drop-in OpenAI-compatible API, with a privacy-preserving embedding mode that avoids sending prompt content.

Passthrough + feeGCP Vertex AI

Helicone

Gateway

Observability-first AI gateway with routing, caching, rate limiting, and request tracing; Apache 2.0 open-source core with a managed hosted tier for logging and analytics.

SubscriptionGCP Vertex AI

Kong AI Gateway

Gateway

Multi-LLM AI gateway built on Kong Gateway 3.x, adding semantic routing, load balancing, guardrails, and MCP traffic analytics as plugins over Kong's existing API management platform.

SubscriptionGCP Vertex AI

Capabilities

Structured Outputs

Benchmark peer barsfor Coding

HumanEvalRank 59 of 97

Claude Sonnet 4.6

98.0

96.7

Claude Opus 4.6

95.0

Grok-3

94.5

Gemma 7B Instructcurrent

70.1

Benchmark scores(5)

Scores are benchmark-specific and are direction-aware: the same numeric gap can mean very different outcomes across suites. Use the leaderboard context and this model's provider route to decide whether the winning margin is meaningful for your workload.

Benchmark	Score	Version	Evaluation	Source
Google-Proof Q&A	50.8	diamondObserved 2026-03-06	—	research
HellaSwag	89.2	10-shotObserved 2026-03-06	—	Source
HumanEval	70.1	pass@1Observed 2026-03-06	—	Source
Massive Multitask Language Understanding	75.3	5-shotObserved 2026-03-06	—	Source
Instruction-Following Evaluation	42.6	v2Observed 2026-04-14	—	Source

Migration checks

No linked migration route is available for this model yet.

Rankings & picks(1)

Best Small Language Models (SLMs)Listed

Compare Gemma 7B Instruct with other models

Comparison and alternatives

Browse all comparisons →

Show all 32 popular comparisonssorted by 7-day search impressions

Frequently asked questions

What is the context window of Gemma 7B Instruct?

Gemma 7B Instruct has a context window of 8k tokens.

How much does Gemma 7B Instruct cost?

Gemma 7B Instruct pricing ranges from $0.05/1M to $0.2/1M input tokens depending on the provider.

When was Gemma 7B Instruct released?

Gemma 7B Instruct was released on 2024-02-21.

Which providers offer Gemma 7B Instruct?

Gemma 7B Instruct is available from 8 providers: NVIDIA NIM, Fireworks AI, Together AI, GCP Vertex AI, Cloudflare Workers AI, Alibaba Cloud PAI-EAS, Lepton AI API, Replicate API.

What benchmarks has Gemma 7B Instruct been tested on?

Gemma 7B Instruct has been evaluated on 5 benchmarks, including Google-Proof Q&A, HellaSwag, HumanEval, Massive Multitask Language Understanding, Instruction-Following Evaluation.

Created by

Google DeepMind

Pioneering artificial intelligence research.

London, United Kingdom

Founded 2014

Website

Pricing

Output / 1M

$0.070

Input / 1M

$0.070

Cheapest of 8 routes · Lepton AI API

Providers(8)

NVIDIA NIM Fireworks AI Together AI GCP Vertex AI Cloudflare Workers AI Alibaba Cloud PAI-EAS Lepton AI API Replicate API

View 8 provider routes