How many GCP Vertex AI models does LLMReference track?

LLMReference currently tracks 127 models available through GCP Vertex AI's API. GCP Vertex AI's full catalog may be larger.

What are GCP Vertex AI's most popular models?

GCP Vertex AI's top models include Claude 3.5 Sonnet, Llama 3 70B Instruct, Gemini 1.5 Pro, Gemini 1.0 Pro, and Gemini 1.0 Pro Vision.

Does GCP Vertex AI offer free models?

Yes. GCP Vertex AI offers 13 free models, including Gemma 3, Gemma 3n, ShieldGemma 2, PaliGemma, and TxGemma.

What is GCP Vertex AI's pricing?

GCP Vertex AI input pricing ranges from $0.035 to $15 per 1M tokens, depending on the model.

GCP Vertex AI

Researched 4d ago · Hyperscaler · Tier 1

Google Cloud Platform (GCP)

Coding · RAG · Agents · Long context · Vision · Classification · JSON / Tool use · Highlight · Hyperscaler

GCP Vertex AI offers 127 tracked models (97 with output token pricing). This catalog covers coding, RAG, and agents; open any model detail page for benchmarks, batch tiers, and migration prompts.
Covers 7 workload areas across 127 tracked models; last verified 2026-07-09.

Use it for

Teams comparing token and batch pricing across this provider's models
Operators routing coding, RAG, and agents workloads through this API
Batch buyers auditing discount coverage model-by-model

Do not use it for

Final benchmark picks without opening the relevant model detail page

Tracked models

127

Models available through this provider

Priced output routes

97

Models with output token pricing tracked

Cheapest output

$0.080

Gemma 3 4B IT on this route

Batch-ready models

Models with discounted batch pricing

Latest model release

2026-06-30

13d since newest release

Freshness

2026-07-09

Researched 4d ago

fresh

Routes available via routers & gateways

These routers list GCP Vertex AI as a target provider, so they can sit in front of this catalog for fallback, routing, or unified API access.


LiteLLM

Gateway

Open-source Python SDK and proxy server that unifies 100+ LLM APIs behind a single OpenAI-compatible interface, with load balancing, cost tracking, and configurable failover.

Free OSS · Self-host

OpenRouter

Hybrid

Unified hybrid gateway to 400+ models from 60+ providers via a single OpenAI-compatible API, with optional auto-routing that selects the best model per prompt.

Passthrough

Portkey

Gateway

Production AI gateway routing to 1,600+ LLMs with failover, load balancing, semantic caching, and guardrails; Apache 2.0 core is fully self-hostable with the complete feature set.

Subscription · Self-host

AIRouter

Router

Commercial LLM router that analyzes incoming requests and routes to the optimal model for cost/quality/latency via a drop-in OpenAI-compatible API, with a privacy-preserving embedding mode that avoids sending prompt content.

Passthrough + fee

Helicone

Gateway

Observability-first AI gateway with routing, caching, rate limiting, and request tracing; Apache 2.0 open-source core with a managed hosted tier for logging and analytics.

Subscription · Self-host

Kong AI Gateway

Gateway

Multi-LLM AI gateway built on Kong Gateway 3.x, adding semantic routing, load balancing, guardrails, and MCP traffic analytics as plugins over Kong's existing API management platform.

Subscription · Self-host
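
The fallback behavior these routers advertise can be illustrated with a minimal sketch. This is not the API of LiteLLM, OpenRouter, or any gateway listed here; `complete_with_fallback`, `vertex_route`, and `backup_route` are hypothetical names, and the two route functions are stand-ins that simulate a failing primary route and a working secondary one.

```python
from typing import Callable, Sequence


class AllRoutesFailed(Exception):
    """Raised when every route in the chain has failed."""


def complete_with_fallback(
    prompt: str,
    routes: Sequence[tuple[str, Callable[[str], str]]],
) -> tuple[str, str]:
    """Try each (name, call) route in order; return the first success.

    Returns a (route_name, completion_text) pair so the caller can see
    which route actually served the request.
    """
    errors = []
    for name, call in routes:
        try:
            return name, call(prompt)
        except Exception as exc:  # a real gateway would filter retryable errors
            errors.append(f"{name}: {exc}")
    raise AllRoutesFailed("; ".join(errors))


# Hypothetical stand-ins for real provider calls (assumptions, not real APIs).
def vertex_route(prompt: str) -> str:
    raise TimeoutError("simulated timeout")


def backup_route(prompt: str) -> str:
    return f"echo: {prompt}"


used, text = complete_with_fallback(
    "hello",
    [("vertex-ai", vertex_route), ("backup", backup_route)],
)
# `used` is "backup" because the first route raised before returning.
```

Real gateways typically add retry budgets, per-route timeouts, and cost-aware ordering on top of this basic loop; check each product's documentation for what it actually supports.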

Information

Type: Hyperscaler

Tier: Tier 1

Models: 127

Company: Google Cloud Platform (GCP)

Founded: 2008

Mountain View, California, United States

Vertex AI is Google Cloud's managed AI platform, offering access to Gemini models and hundreds of partner models alongside tools for fine-tuning, grounding, vector search, and end-to-end MLOps pipelines.

Links

Website · X / Twitter · LinkedIn · Crunchbase

Catalog freshness

The newest model tracked on this provider was released 2026-06-30 (13d ago).
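
The freshness figures on this page are plain date arithmetic. The sketch below reproduces them under one assumption: the viewing date is 2026-07-13, inferred from "last verified 2026-07-09" plus "Researched 4d ago". The helper name `days_since` is illustrative and not part of any LLMReference tooling.

```python
from datetime import date


def days_since(earlier: date, today: date) -> int:
    """Whole days elapsed between two calendar dates."""
    return (today - earlier).days


newest_release = date(2026, 6, 30)  # from the page
last_verified = date(2026, 7, 9)    # from the page
viewed_on = date(2026, 7, 13)       # assumption: 4 days after last_verified

release_age = days_since(newest_release, viewed_on)  # 13, matches "13d ago"
research_age = days_since(last_verified, viewed_on)  # 4, matches "Researched 4d ago"
```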

Where this host wins

Coding: 41 tracked models with SWE-bench / HumanEval-style scores.
RAG: 52 tracked models with RULER / needle-in-a-haystack retrieval benchmarks.
Agentic: 48 tracked models with BFCL, tau-bench, and SWE-bench tool-use coverage.
Long-context: 55 tracked models with context-token or InfiniteBench-class signal.

Getting started

Official product, docs, and pricing links — confirm quotas and regions in the vendor docs.

Product Docs Portal Pricing

Compliance notes

No verified compliance claims (SOC 2, ISO, HIPAA) tracked for this provider yet — check the vendor's trust center for current certifications.

Platform Overview

Google Cloud Vertex AI is a comprehensive machine learning platform that provides end-to-end solutions for developing, deploying, and managing AI models. The platform offers a unified interface that integrates various tools and services, enabling users to efficiently handle the entire machine learning lifecycle. Key features include AutoML capabilities for building custom models with minimal coding, a managed notebook environment for prototyping, and robust MLOps tools for model monitoring and versioning. Vertex AI supports both pre-trained models and custom training, making it versatile for a wide range of applications such as natural language processing, image recognition, and predictive analytics.

The platform's design focuses on increasing productivity and accelerating time-to-market for AI solutions. By consolidating multiple AI tools into a single ecosystem, Vertex AI reduces manual effort and enhances collaboration among data scientists and engineers. Its scalable architecture allows organizations to efficiently manage large datasets and complex models, while the pay-as-you-go pricing model makes it accessible for businesses of all sizes. Additionally, Vertex AI's integration with popular open-source frameworks like TensorFlow and PyTorch enables users to leverage existing models and tools, fostering innovation and facilitating the development of customized AI applications tailored to specific business needs.

Compare per-model pricing, input and output token costs, batch availability, and benchmark coverage.

Available Models(127)


All models available as Serverless

Model	Input (per 1M)	Output (per 1M)	Batch input (per 1M)	Batch output (per 1M)
Claude Sonnet 5	$3.00	$15.00	$1.50 (-50%)	$7.50 (-50%)
Claude Fable 5	$10.00	$50.00	—	—
Claude Mythos 5	—	—	—	—
Claude Opus 4.8	—	—	—	—
Gemini 3.5 Flash	$1.50	$9.00	$0.75 (-50%)	$4.50 (-50%)
Claude Opus 4.7	$5.00	$25.00	—	—
Gemma 4 26B A4B IT	$0.15	$0.60	—	—
Gemma 4 31B IT	$0.15	$0.60	—	—
Gemma 4 E2B	$0.00	$0.00	—	—
Gemma 4 E2B IT	$0.00	$0.00	—	—
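
To turn the per-1M prices in the table above into a per-request estimate, multiply each token count by its rate and divide by one million. The sketch below does this for two rows copied from the table. `PRICES` and `estimate_cost` are illustrative names, and the token counts in the usage lines are made-up inputs, not measurements.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing table on this page.
PRICES = {
    "Claude Sonnet 5": {
        "input": Decimal("3.00"), "output": Decimal("15.00"),
        "batch_input": Decimal("1.50"), "batch_output": Decimal("7.50"),
    },
    "Gemini 3.5 Flash": {
        "input": Decimal("1.50"), "output": Decimal("9.00"),
        "batch_input": Decimal("0.75"), "batch_output": Decimal("4.50"),
    },
}

MILLION = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  batch: bool = False) -> Decimal:
    """Estimated USD cost of one workload at the listed per-1M rates."""
    rates = PRICES[model]
    prefix = "batch_" if batch else ""
    total = (Decimal(input_tokens) * rates[prefix + "input"]
             + Decimal(output_tokens) * rates[prefix + "output"])
    return total / MILLION


# Hypothetical workload: 200k input tokens and 50k output tokens.
standard = estimate_cost("Claude Sonnet 5", 200_000, 50_000)             # 1.35 USD
batched = estimate_cost("Claude Sonnet 5", 200_000, 50_000, batch=True)  # 0.675 USD
```

`Decimal` is used instead of floats so that the totals stay exact. The batch figure is half the standard one, which matches the -50% discount shown in the table.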


Where else to run this

Llama 2 7B Chat on GCP Vertex AI

Provider setup and pricing

Llama 2 13B Chat on GCP Vertex AI

Provider setup and pricing

Llama 2 70B Chat on GCP Vertex AI

Provider setup and pricing

Llama 2 7B Chat on Alibaba Cloud PAI-EAS

Alternative host

Llama 2 13B Chat on Alibaba Cloud PAI-EAS

Alternative host

Llama 2 70B Chat on Databricks Foundation Model Serving

Alternative host