Gemini 1.5 Flash 8B on GCP Vertex AI

Name: Gemini 1.5 Flash 8B on GCP Vertex AI
Brand: Google DeepMind
SKU: gemini-1.5-flash-8b-gcp-vertex-ai
Price: 0.0375 USD

Gemini 1.5 · Google DeepMind

Serverless

Last refreshed 2026-06-15. Refresh cadence: weekly.

Why use Gemini 1.5 Flash 8B on GCP Vertex AI?

GCP Vertex AI serves Gemini 1.5 Flash 8B with pay-as-you-go pricing at $0.0375 per 1M input tokens and $0.15 per 1M output tokens. Vertex AI is Google Cloud's managed AI platform; it provides access to Gemini models and hundreds of partner models, alongside tools for fine-tuning, grounding, vector search, and end-to-end MLOps pipelines.

Input / 1M

$0.0375

Output / 1M

$0.15

Cache

Not sourced

Batch

Not sourced

Setup recipe

Python + curl

Install

pip install google-cloud-aiplatform

Auth

gcloud auth application-default login  # local development: sets up Application Default Credentials
export GOOGLE_CLOUD_PROJECT=...
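Because the snippets on this page read the project ID with os.environ["GOOGLE_CLOUD_PROJECT"], a missing variable surfaces as a bare KeyError. A minimal sketch of a fail-fast check follows; require_project is a hypothetical helper name, not part of the Vertex AI SDK.

```python
import os


def require_project(env=os.environ) -> str:
    """Return the project ID from GOOGLE_CLOUD_PROJECT or raise a clear error.

    Hypothetical helper for illustration; `env` is injectable so the
    check can be exercised without touching the real environment.
    """
    project = env.get("GOOGLE_CLOUD_PROJECT", "").strip()
    if not project:
        raise RuntimeError(
            "GOOGLE_CLOUD_PROJECT is not set; export it before calling vertexai.init()."
        )
    return project
```

The returned value can then be passed as the project argument to vertexai.init().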

Call

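import os
import vertexai
from vertexai.generative_models import GenerativeModel

# Initialize the SDK with the project from the environment, then create a model handle
vertexai.init(project=os.environ["GOOGLE_CLOUD_PROJECT"], location="us-central1")
model = GenerativeModel("gemini-1.5-flash-8b")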

Model ID

gemini-1.5-flash-8b

Request example

import os
import vertexai
from vertexai.generative_models import GenerativeModel

# Reads GOOGLE_CLOUD_PROJECT from env; authenticates via Application Default Credentials
vertexai.init(project=os.environ["GOOGLE_CLOUD_PROJECT"], location="us-central1")
model = GenerativeModel("gemini-1.5-flash-8b")
response = model.generate_content("Hello")
print(response.text)

Gotchas

For Google-published models use the model name directly, e.g. "gemini-2.0-flash-001". For third-party publishers (Anthropic, Meta, etc.) use the full publisher path, e.g. "publishers/anthropic/models/claude-3-5-sonnet-v2@20241022".
The examples read the project ID from GOOGLE_CLOUD_PROJECT; if you use a different variable name, update the os.environ lookup in your code to match.
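The naming rule in the first gotcha can be sketched as a small function. This is an illustration only: model_resource_name and its publisher parameter are hypothetical names, and the path format is taken from the example above rather than from the SDK.

```python
def model_resource_name(model: str, publisher: str = "google") -> str:
    """Build the model identifier following the rule described above.

    Hypothetical helper: Google-published models use the bare name,
    third-party models use the full publisher path.
    """
    if publisher == "google":
        return model
    return f"publishers/{publisher}/models/{model}"


# Google-published model: name is used directly
print(model_resource_name("gemini-1.5-flash-8b"))
# Third-party model: full publisher path
print(model_resource_name("claude-3-5-sonnet-v2@20241022", publisher="anthropic"))
```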

Pricing

Type	Price (USD per 1M tokens)
Input tokens	$0.0375
Output tokens	$0.15
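To turn the per-million rates listed on this page ($0.0375 input, $0.15 output) into a per-request figure, a rough estimate can be computed as below. This is a sketch that assumes plain pay-as-you-go billing with no cache or batch discounts, since those rates are not sourced here; estimate_cost_usd is a hypothetical helper name.

```python
# Per-million-token prices in USD, as listed on this page.
INPUT_PRICE_PER_1M = 0.0375
OUTPUT_PRICE_PER_1M = 0.15


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the pay-as-you-go cost of a request in USD."""
    total = input_tokens * INPUT_PRICE_PER_1M + output_tokens * OUTPUT_PRICE_PER_1M
    return total / 1_000_000


# Example: 2M input tokens and 500k output tokens
print(f"${estimate_cost_usd(2_000_000, 500_000):.4f}")
```

Token counts for a real request can be read from the response's usage metadata rather than estimated by hand.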

Capabilities

No model capability flags are currently sourced.

About Gemini 1.5 Flash 8B

Lightweight 8B-parameter variant of Gemini 1.5 Flash, optimized for speed and cost efficiency. Supports a 1M-token context window with fast inference for real-time applications.