GCP Vertex AI

Researched 1d ago | Hyperscaler | Tier 1

Google Cloud Platform (GCP)

Coding | RAG | Agents | Long context | Vision | Classification | JSON / Tool use | Highlight | Hyperscaler

GCP Vertex AI exposes 124 tracked models, 96 of which have output token pricing in seed data. Task coverage across this catalog includes coding, RAG, and agents; open any model detail page for benchmarks, batch tiers, and migration prompts.

Portfolio context: 7 decision-task tags, 124 catalog rows, latest research stamp 2026-06-01.

Use this portfolio page for

  • Teams comparing token and batch economics on this surface
  • Operators routing coding, RAG, and agents workloads through this API
  • Batch buyers auditing discount coverage model-by-model

Do not stop here for

  • Final benchmark picks without opening the relevant model detail page

  • Catalog rows: 124 (models linked to this provider in seed data)
  • Priced output routes: 96 (rows with token_out in seed data)
  • Cheapest output: $0.080 (Gemma 3 4B IT on this route)
  • Batch-ready SKUs: 1 (models with batch columns populated)
  • Latest catalog ship: 2026-05-28 (5d since dated release field)
  • Freshness: 2026-06-01 (researched 1d ago; fresh)

Catalog release signal

Latest ISO-dated model.release in this catalog is 2026-05-28 (5d ago).
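The "5d ago" and "1d ago" figures are plain date differences. A minimal sketch of that arithmetic follows. It assumes the page was viewed on 2026-06-02, one day after the 2026-06-01 research stamp; that as-of date is inferred from "Researched 1d ago", not stated in the catalog, and `days_since` is a hypothetical helper name.

```python
from datetime import date


def days_since(release_iso: str, as_of_iso: str) -> int:
    """Return whole days between two ISO-8601 calendar dates."""
    return (date.fromisoformat(as_of_iso) - date.fromisoformat(release_iso)).days


# Assumption: viewing date is 2026-06-02 (research stamp 2026-06-01 plus one day).
AS_OF = "2026-06-02"

print(days_since("2026-05-28", AS_OF))  # latest model.release -> 5
print(days_since("2026-06-01", AS_OF))  # research stamp -> 1
```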

Where this host wins

  • Coding: 32 tracked models with SWE-bench / HumanEval-style scores.
  • RAG: 49 tracked models with RULER / needle retrieval benchmarks.
  • Agentic: 45 tracked models with BFCL, tau-bench, and SWE-bench tool-use coverage.
  • Long-context: 52 tracked models with context-token or InfiniteBench-class signal.
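To put the counts above in proportion, here is a small sketch that expresses each one as a share of the 124 catalog rows. The counts come from this page; the helper name `coverage_share` and the one-decimal rounding are illustrative choices.

```python
CATALOG_ROWS = 124  # catalog rows reported on this page

# Tracked-model counts per task, copied from the list above.
TASK_COUNTS = {"Coding": 32, "RAG": 49, "Agentic": 45, "Long-context": 52}


def coverage_share(count: int, total: int = CATALOG_ROWS) -> float:
    """Percentage of catalog rows carrying a task signal, to one decimal."""
    return round(100 * count / total, 1)


shares = {task: coverage_share(n) for task, n in TASK_COUNTS.items()}
for task, pct in shares.items():
    print(f"{task}: {pct}% of {CATALOG_ROWS} rows")
```

The four counts sum to 178, more than the 124 rows, so at least some models appear under more than one task and the shares are not expected to add up to 100%.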

Getting started

Official entry points from seed metadata — confirm quotas and regions in vendor docs.

Compliance notes (verbatim seed excerpts)

Not yet verified from seed copy — no SOC/ISO/HIPAA-class sentences detected to quote verbatim.

Platform Overview

Google Cloud Vertex AI is a comprehensive machine learning platform that provides end-to-end solutions for developing, deploying, and managing AI models. The platform offers a unified interface that integrates various tools and services, enabling users to efficiently handle the entire machine learning lifecycle. Key features include AutoML capabilities for building custom models with minimal coding, a managed notebook environment for prototyping, and robust MLOps tools for model monitoring and versioning. Vertex AI supports both pre-trained models and custom training, making it versatile for a wide range of applications such as natural language processing, image recognition, and predictive analytics.

The platform's design focuses on increasing productivity and accelerating time-to-market for AI solutions. By consolidating multiple AI tools into a single ecosystem, Vertex AI reduces manual effort and enhances collaboration among data scientists and engineers. Its scalable architecture allows organizations to efficiently manage large datasets and complex models, while the pay-as-you-go pricing model makes it accessible for businesses of all sizes. Additionally, Vertex AI's integration with popular open-source frameworks like TensorFlow and PyTorch enables users to leverage existing models and tools, fostering innovation and facilitating the development of customized AI applications tailored to specific business needs.

Available Models (124)

All models available as Serverless

Model                          Input (per 1M)  Output (per 1M)  Batch input (per 1M)  Batch output (per 1M)
Claude Opus 4.8                $5              $25              -                     -
Gemini 3.5 Flash               $1.5            $9               $0.75 (-50%)          $4.5 (-50%)
Claude Opus 4.7                $5              $25              -                     -
Gemma 4 26B A4B IT             $0.15           $0.60            -                     -
Gemma 4 31B IT                 $0.15           $0.60            -                     -
Gemma 4 E2B                    $0              $0               -                     -
Gemma 4 E2B IT                 $0              $0               -                     -
Gemma 4 E4B                    $0              $0               -                     -
Gemma 4 E4B IT                 $0              $0               -                     -
Gemini 3.1 Flash Lite Preview  $0.25           $1.5             -                     -
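Because every price in the table is quoted per 1M tokens, a workload's cost is token count times price, divided by one million. The sketch below works that out for two rows copied from the table. The 2M-input / 0.5M-output workload and the helper name `estimate_cost` are hypothetical, and the prices are this page's seed data rather than confirmed vendor pricing.

```python
# USD per 1M tokens, copied from the pricing table on this page.
# Batch keys are present only where the table lists batch prices.
PRICES = {
    "Gemini 3.5 Flash": {
        "input": 1.5, "output": 9.0,
        "batch_input": 0.75, "batch_output": 4.5,
    },
    "Claude Opus 4.8": {"input": 5.0, "output": 25.0},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  batch: bool = False) -> float:
    """Estimate USD cost for one workload at the listed per-1M prices."""
    row = PRICES[model]
    in_key, out_key = ("batch_input", "batch_output") if batch else ("input", "output")
    if in_key not in row or out_key not in row:
        raise ValueError(f"no {'batch' if batch else 'standard'} pricing listed for {model}")
    return (input_tokens * row[in_key] + output_tokens * row[out_key]) / 1_000_000


# Hypothetical workload: 2M input tokens, 0.5M output tokens.
standard = estimate_cost("Gemini 3.5 Flash", 2_000_000, 500_000)
batched = estimate_cost("Gemini 3.5 Flash", 2_000_000, 500_000, batch=True)
print(standard, batched)  # 7.5 3.75
```

The batch figure is half the standard figure, matching the -50% shown in the table. Rows without batch columns raise an error instead of silently falling back to standard pricing.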

Platform Details

Type: Hyperscaler
Tier: Tier 1
Models: 124

Organization

Google Cloud Platform (GCP)
Founded: 2008
Mountain View, California, United States

Vertex AI is Google Cloud's managed AI platform, offering access to Gemini models and hundreds of partner models alongside tools for fine-tuning, grounding, vector search, and end-to-end MLOps pipelines.