Using Gemma 7B Instruct on GCP Vertex AI

Implementation guide · Gemma · Google DeepMind

Serverless · Open Weights

Quick Start

1. Create an account at GCP Vertex AI and generate an API key.
2. Use the GCP Vertex AI SDK or REST API to call gemma-7b-it; see the documentation for the request format.
3. You'll be billed $0.10 per 1M input tokens and $0.30 per 1M output tokens.


Code Examples

Install

pip install google-cloud-aiplatform

Project ID (environment variable)

GOOGLE_CLOUD_PROJECT

Model ID

gemma-7b-it

For Google-published models use the model name directly, e.g. "gemini-2.0-flash-001". For third-party publishers (Anthropic, Meta, etc.) use the full publisher path, e.g. "publishers/anthropic/models/claude-3-5-sonnet-v2@20241022".

import os
import vertexai
from vertexai.generative_models import GenerativeModel

# Reads GOOGLE_CLOUD_PROJECT from env; authenticates via Application Default Credentials
vertexai.init(project=os.environ["GOOGLE_CLOUD_PROJECT"], location="us-central1")
model = GenerativeModel("gemma-7b-it")
response = model.generate_content("Hello")
print(response.text)
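
The model ID note above distinguishes Google-published models (bare name) from third-party publishers (full publisher path). A minimal sketch of that rule follows. The helper name `resolve_model_id` and its `publisher` parameter are hypothetical, introduced only for illustration; they are not part of the Vertex AI SDK.

```python
from typing import Optional


def resolve_model_id(model: str, publisher: Optional[str] = None) -> str:
    """Hypothetical helper: build the model identifier described in the note above.

    Google-published models use the bare model name; third-party models use
    the "publishers/<publisher>/models/<model>" path.
    """
    if not model:
        raise ValueError("model must be a non-empty string")
    # Assumption: a missing publisher, or "google", means a Google-published model.
    if publisher is None or publisher == "google":
        return model
    return f"publishers/{publisher}/models/{model}"


print(resolve_model_id("gemma-7b-it"))
print(resolve_model_id("claude-3-5-sonnet-v2@20241022", publisher="anthropic"))
```

The returned string can be passed wherever the example above passes "gemma-7b-it".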

About GCP Vertex AI

Google Cloud Vertex AI is a managed machine learning platform for developing, deploying, and managing AI models through a single interface that covers the full machine learning lifecycle. Key features include AutoML for building custom models with minimal coding, a managed notebook environment for prototyping, and MLOps tools for model monitoring and versioning. Vertex AI supports both pre-trained models and custom training, for applications such as natural language processing, image recognition, and predictive analytics. Consolidating these tools into one ecosystem reduces manual effort and makes collaboration between data scientists and engineers easier. The platform scales to large datasets and complex models, uses pay-as-you-go pricing, and integrates with open-source frameworks such as TensorFlow and PyTorch, so existing models and tooling can be reused.

Vertex AI is Google Cloud's managed AI platform, offering access to Gemini models and hundreds of partner models alongside tools for fine-tuning, grounding, vector search, and end-to-end MLOps pipelines.


Pricing on GCP Vertex AI

Type	Price (USD per 1M tokens)
Input tokens	$0.10
Output tokens	$0.30
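
To turn the rates above into a per-workload estimate, the arithmetic can be sketched as follows. The helper name `estimate_cost_usd` is hypothetical, and the sketch assumes billing is strictly linear in token count with no minimums, rounding, or other charges, which the table does not state.

```python
from decimal import Decimal

# Rates copied from the pricing table above, in USD per 1M tokens.
INPUT_USD_PER_M = Decimal("0.10")
OUTPUT_USD_PER_M = Decimal("0.30")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Hypothetical helper: estimated charge in USD for the given token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (Decimal(input_tokens) * INPUT_USD_PER_M
             + Decimal(output_tokens) * OUTPUT_USD_PER_M)
    return total / TOKENS_PER_M


# Example workload: 2M input tokens and 500k output tokens.
print(estimate_cost_usd(2_000_000, 500_000))
```

For that example workload, the input side contributes $0.20 and the output side $0.15, for $0.35 in total. `Decimal` is used instead of floats so small per-token amounts do not pick up binary rounding error.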

Capabilities

Structured Outputs

About Gemma 7B Instruct

Gemma 7B Instruct is a 7-billion-parameter large language model developed by Google DeepMind. As part of the Gemma family, it builds on the research behind Google's Gemini models. It is optimized for text generation tasks such as question answering and summarization, and is tuned to follow instructions. It performs well on benchmarks for its size, and its compact footprint makes it deployable on hardware ranging from laptops to cloud infrastructure. The model is released with open weights and incorporates responsible AI practices, such as data filtering and human feedback, to support safe use.


Model Specs

Released: 2024-02-21

Parameters: 7B

Context: 8k tokens

Architecture: Decoder Only

Knowledge cutoff: 2023-04

Google Cloud Platform (GCP)

Mountain View, California, United States