Using Gemini 3.5 Flash on Vercel AI Gateway

Implementation guide · Gemini 3.5 · Google DeepMind

Serverless

Quick Start

1. Create an account at Vercel AI Gateway and generate an API key.
2. Call google/gemini-3.5-flash through the Vercel AI Gateway SDK or REST API; see the documentation for the request format.
3. You'll be billed $1.50 per 1M input tokens and $9.00 per 1M output tokens.


Code Examples

Install

pip install openai

API key

AI_GATEWAY_API_KEY

Model ID

google/gemini-3.5-flash

Model IDs follow the creator/model-name format, e.g. kwaipilot/kat-coder-pro-v2

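import os
from openai import OpenAI

# Point the OpenAI client at the gateway's OpenAI-compatible endpoint.
client = OpenAI(
    api_key=os.environ["AI_GATEWAY_API_KEY"],  # raises KeyError if the variable is unset
    base_url="https://ai-gateway.vercel.sh/v1",
)

# Send a single-turn chat request to Gemini 3.5 Flash.
response = client.chat.completions.create(
    model="google/gemini-3.5-flash",
    messages=[{"role": "user", "content": "Hello"}],
)

# Print the text of the first (and here only) completion.
print(response.choices[0].message.content)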
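The gateway description on this page lists streaming among its features. A minimal sketch of a streamed request follows, assuming the OpenAI-compatible endpoint accepts `stream=True` the same way the OpenAI API does. The helper names `collect_stream` and `stream_reply` are hypothetical and exist only for this illustration.

```python
import os


def collect_stream(chunks):
    """Join the text deltas from an iterable of streamed chat-completion chunks."""
    parts = []
    for chunk in chunks:
        # Some chunks (for example a trailing usage chunk) may carry no choices.
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta.content
        if delta:  # skip None or empty deltas
            parts.append(delta)
    return "".join(parts)


def stream_reply(prompt, model="google/gemini-3.5-flash"):
    """Send one user message with streaming enabled and return the full text."""
    # Imported lazily so collect_stream can be used without the SDK installed.
    from openai import OpenAI

    client = OpenAI(
        api_key=os.environ["AI_GATEWAY_API_KEY"],
        base_url="https://ai-gateway.vercel.sh/v1",
    )
    # Assumption: the gateway honors stream=True on this endpoint.
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,
    )
    return collect_stream(stream)
```

Calling `stream_reply("Hello")` would make a billed request. To show tokens as they arrive, print each delta inside the loop instead of collecting them.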

About Vercel AI Gateway

Vercel AI Gateway is a unified AI proxy that exposes a single OpenAI-compatible API endpoint to 275+ models from 25+ providers, including Anthropic, OpenAI, Google, Meta, Mistral, DeepSeek, xAI, Alibaba, Amazon, ByteDance, Cohere, MiniMax, MoonshotAI, KwaiPilot, Black Forest Labs, Recraft, Voyage AI, NVIDIA, and more.

Pricing: pass-through at provider list rates with zero markup. The free tier includes $5 of credit per month; the paid tier is pay-as-you-go. BYOK (bring-your-own-key) mode is also supported with no markup or fee.

Features: automatic provider fallbacks, unified observability, streaming, tool use, vision, embeddings, image/video generation, and BYOK mode.

Integration: via the @ai-sdk/gateway package or plain model ID strings in the Vercel AI SDK.

API details: authenticate with the header Authorization: Bearer <AI_GATEWAY_API_KEY>, using a key generated in the Vercel dashboard. Model IDs use the form {provider-owner}/{model-name}, e.g., anthropic/claude-opus-4.6, openai/gpt-5.
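The API details above mention a Bearer token and the {provider-owner}/{model-name} ID format. The sketch below shows how a raw REST request could be assembled with only the Python standard library. The base URL is taken from the Python example earlier on this page. The `/chat/completions` path is an assumption based on the endpoint being OpenAI-compatible, and `split_model_id` and `build_chat_request` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "https://ai-gateway.vercel.sh/v1"  # from the Python example on this page
CHAT_PATH = "/chat/completions"  # assumption: standard OpenAI-compatible path


def split_model_id(model_id):
    """Split '{provider-owner}/{model-name}' into its two parts."""
    owner, sep, name = model_id.partition("/")
    if not sep or not owner or not name:
        raise ValueError(f"expected 'owner/model-name', got {model_id!r}")
    return owner, name


def build_chat_request(api_key, model_id, prompt):
    """Build (but do not send) a chat-completions POST request."""
    split_model_id(model_id)  # fail early on a malformed model ID
    body = json.dumps(
        {"model": model_id, "messages": [{"role": "user", "content": prompt}]}
    ).encode("utf-8")
    return urllib.request.Request(
        BASE_URL + CHAT_PATH,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request. Building it separately makes it easy to inspect the URL, headers, and body before anything is billed.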


Pricing on Vercel AI Gateway

Type	Price (per 1M)
Input tokens	$1.50
Output tokens	$9.00
Query	$14.00
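As a worked example of the token prices in the table, the sketch below estimates the cost of a request from its token counts. The function name `estimate_cost` is hypothetical. The Query price is left out because this page does not say what counts as a query.

```python
from decimal import Decimal

# Prices from the table above, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("1.50")
OUTPUT_PRICE_PER_M = Decimal("9.00")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens, output_tokens):
    """Return the estimated USD cost for the given token counts."""
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


# 200,000 input tokens cost $0.30 and 50,000 output tokens cost $0.45.
print(estimate_cost(200_000, 50_000))  # 0.75
```

`Decimal` is used instead of floats so that small per-request amounts add up without rounding drift.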

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution, Prompt Caching, Batch API, Audio

About Gemini 3.5 Flash

Gemini 3.5 Flash is Google DeepMind's generally available Flash model for sustained frontier-level performance on agentic and coding tasks. It supports multimodal inputs, native thinking, tool and function calling, structured outputs, code execution, search grounding, batch processing, and long contexts up to 1M tokens.


Model Specs

Released: 2026-05-19

Context: 1.05M tokens

Architecture: Decoder Only

Knowledge cutoff: 2025-01

Also available on (3)

Google AI Studio: $1.50/1M
GCP Vertex AI: $1.50/1M
OpenRouter: $1.50/1M


Provider

Vercel AI Gateway

Vercel