Gemma 7B

Name: Gemma 7B
Author: Google DeepMind

Released

2024-02-21

Last refreshed

2026-06-15

Status

Researched 69d ago

Open weights · Commercial use: conditional · Classification · JSON / Tool use

Gemma 7B is worth evaluating for classification and JSON / tool use when one of its provider routes and its 8k context window match the workload.

Use it for

Teams evaluating classification and JSON / tool use
Workloads that can use an 8k context window
Buyers comparing 2 tracked provider routes
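
One rough way to check whether a workload can use the 8k window is to budget prompt and output tokens before sending a request. The sketch below is illustrative only: it assumes "8k" means 8,192 tokens and uses a crude four-characters-per-token heuristic rather than the model's real tokenizer, and the names `estimate_tokens` and `fits_context` are hypothetical.

```python
CONTEXT_WINDOW = 8192   # assumption: "8k" on this page means 8,192 tokens
CHARS_PER_TOKEN = 4     # assumption: rough heuristic for English text


def estimate_tokens(text: str) -> int:
    """Approximate the token count of text using ceiling division by 4."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_context(prompt: str, max_output_tokens: int,
                 window: int = CONTEXT_WINDOW) -> bool:
    """Return True if the prompt plus the reserved output budget fits the window."""
    return estimate_tokens(prompt) + max_output_tokens <= window


if __name__ == "__main__":
    prompt = "Classify the following support ticket: ..." * 50
    print(fits_context(prompt, max_output_tokens=256))
```

For a production decision, counting tokens with the model's own tokenizer would be more reliable than the heuristic used here.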

Do not use it for

Vision or document-understanding workloads

Specifications

Family: Gemma
Released: 2024-02-21
Context: 8k
Parameters: 7B
Architecture: Decoder Only
Knowledge cutoff: 2023-04
Specialization: general
Openness: Open weights
License: Gemma
Commercial use: conditional
Weights: Unknown
Code: Unknown
Training: Fine-tuned

Created by

Google DeepMind

Pioneering artificial intelligence research.

London, United Kingdom

Founded 2014

Pricing

Output / 1M

$0.200

Input / 1M

$0.200

Cheapest of 2 routes · Fireworks AI

Providers (2)

Fireworks AI · GCP Vertex AI

About

Gemma 7B is Google DeepMind's Gemma model. It offers an 8K-token context window with weights openly available for self-hosting.

Gemma 7B is an open-weight model in the Gemma family. The structured metadata tracks an 8k-token context window and structured outputs. This page tracks provider routes through Fireworks AI and GCP Vertex AI. Fireworks AI is listed at $0.200 input and $0.200 output per 1M tokens; GCP Vertex AI is listed at $0.100 input and $0.300 output per 1M tokens, so it is cheaper on input and more expensive on output. No headline benchmark score is tracked for Gemma 7B yet.

Top use-case fit

Classification

Included by capability and metadata signals in the decision map.

JSON / Tool use

Included by capability and metadata signals in the decision map.
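
When a model is used for classification with JSON output, it usually helps to validate the response before acting on it. The following is a minimal sketch under stated assumptions: the label set, the `label` key, and the function name `parse_classification` are hypothetical, and nothing here reflects a specific provider's response format.

```python
import json
from typing import Optional

# Hypothetical label set for illustration; replace with your own classes.
ALLOWED_LABELS = {"positive", "negative", "neutral"}


def parse_classification(raw: str) -> Optional[str]:
    """Return the label from a JSON reply, or None if the reply is unusable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None  # the model did not return valid JSON
    if not isinstance(data, dict):
        return None  # valid JSON, but not an object
    label = data.get("label")
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        return None  # missing, wrongly typed, or unknown label
    return label


if __name__ == "__main__":
    print(parse_classification('{"label": "positive"}'))
    print(parse_classification("Sure! The label is positive."))
```

Returning `None` instead of raising keeps the caller's retry or fallback logic simple; a stricter pipeline might raise an exception instead.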

Provider price ladder

Compare API pricing across 2 providers for input and output tokens, batch, and cached reads when available.

Provider	Input / 1M	Output / 1M	Route
Fireworks AI	$0.200	$0.200	Serverless
GCP Vertex AI	$0.100	$0.300	Serverless
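
Because the two routes trade off input and output prices, which one is cheaper depends on the token mix. The sketch below copies the per-1M-token prices from the table above and compares routes for a given volume; the function names are hypothetical, and batch or cached-read discounts are not modeled.

```python
# USD per 1M tokens, copied from the provider table above.
ROUTES = {
    "Fireworks AI": {"input": 0.200, "output": 0.200},
    "GCP Vertex AI": {"input": 0.100, "output": 0.300},
}


def route_cost(route: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for the given token volumes on one route."""
    price = ROUTES[route]
    return (input_tokens * price["input"]
            + output_tokens * price["output"]) / 1_000_000


def cheapest_route(input_tokens: int, output_tokens: int) -> str:
    """Name of the route with the lowest cost for this token mix."""
    return min(ROUTES, key=lambda r: route_cost(r, input_tokens, output_tokens))


if __name__ == "__main__":
    # Input-heavy workload, e.g. classification with short answers.
    print(cheapest_route(10_000_000, 1_000_000))
    # Output-heavy workload, e.g. long generations from short prompts.
    print(cheapest_route(1_000_000, 10_000_000))
```

At these prices the difference between the routes is $0.10 per 1M tokens in each direction, so input-heavy workloads favor GCP Vertex AI, output-heavy workloads favor Fireworks AI, and the two cost the same when input and output volumes are equal.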

Available via routers & gateways (13)

LiteLLM

Gateway

Open-source Python SDK and proxy server that unifies 100+ LLM APIs behind a single OpenAI-compatible interface, with load balancing, cost tracking, and configurable failover.

Free OSS · GCP Vertex AI

OpenRouter

Hybrid

Unified hybrid gateway to 400+ models from 60+ providers via a single OpenAI-compatible API, with optional auto-routing that selects the best model per prompt.

Passthrough · GCP Vertex AI · Fireworks AI

Portkey

Gateway

Production AI gateway routing to 1,600+ LLMs with failover, load balancing, semantic caching, and guardrails; Apache 2.0 core is fully self-hostable with the complete feature set.

Subscription · GCP Vertex AI

AIRouter

Router

Commercial LLM router that analyzes incoming requests and routes to the optimal model for cost/quality/latency via a drop-in OpenAI-compatible API, with a privacy-preserving embedding mode that avoids sending prompt content.

Passthrough + fee · GCP Vertex AI

Helicone

Gateway

Observability-first AI gateway with routing, caching, rate limiting, and request tracing; Apache 2.0 open-source core with a managed hosted tier for logging and analytics.

Subscription · GCP Vertex AI

Kong AI Gateway

Gateway

Multi-LLM AI gateway built on Kong Gateway 3.x, adding semantic routing, load balancing, guardrails, and MCP traffic analytics as plugins over Kong's existing API management platform.

Subscription · GCP Vertex AI