Using Gemini 3.1 Flash-Lite on Google AI Studio

Implementation guide · Gemini 3.1 · Google DeepMind

Serverless

Quick Start

1. Create an account at Google AI Studio and generate an API key.
2. Use the Google Gen AI SDK (google-genai) or the REST API to call gemini-3.1-flash-lite; see the documentation for the request format.
3. You are billed $0.25 per 1M input tokens and $1.50 per 1M output tokens.
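Step 2 mentions the REST API without showing a request. The sketch below builds one with the Python standard library only. The endpoint path (v1beta, models/<id>:generateContent), the x-goog-api-key header, and the response layout follow the public Gemini API as commonly documented; they are assumptions here, not taken from this page. build_request and call_model are hypothetical helper names.

```python
import json
import urllib.request

# Assumed base URL of the Gemini REST API; verify against the documentation.
API_ROOT = "https://generativelanguage.googleapis.com/v1beta"
MODEL_ID = "gemini-3.1-flash-lite"  # model ID given on this page


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a generateContent POST request."""
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url=f"{API_ROOT}/models/{MODEL_ID}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "x-goog-api-key": api_key,
        },
        method="POST",
    )


def call_model(prompt: str, api_key: str) -> str:
    """Send the request and return the first text part of the reply.

    The response path used below is an assumption about the JSON layout.
    """
    with urllib.request.urlopen(build_request(prompt, api_key)) as resp:
        payload = json.load(resp)
    return payload["candidates"][0]["content"]["parts"][0]["text"]
```

To try it, read the key from the GOOGLE_API_KEY environment variable and call call_model("Hello", key). Nothing is sent until call_model is invoked.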


Code Examples

Install

pip install google-genai

API key

GOOGLE_API_KEY

Model ID

gemini-3.1-flash-lite

Pass the model ID directly as the model argument, as in the example below. Other Gemini models are addressed the same way, e.g. "gemini-2.0-flash", "gemini-1.5-pro", or "gemini-2.5-pro-preview-05-06".

import os
from google import genai

client = genai.Client(api_key=os.environ["GOOGLE_API_KEY"])
response = client.models.generate_content(
    model="gemini-3.1-flash-lite",
    contents="Hello"
)
print(response.text)

About Google AI Studio

Google AI Studio is a model prototyping environment and API access point for Gemini models, offering an inference playground for developers to test and build AI applications.


Pricing on Google AI Studio

Type	Price (per 1M)
Input tokens	$0.25
Output tokens	$1.50
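As a worked example of the arithmetic behind these rates, the sketch below estimates the cost of a request from its token counts. The two rates come from the table above; estimate_cost_usd is a hypothetical helper, and it ignores any discounts, free tiers, or other billing rules that may apply.

```python
# Rates from the pricing table above, in USD per 1M tokens.
INPUT_USD_PER_MILLION = 0.25
OUTPUT_USD_PER_MILLION = 1.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of one request from its token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000


# 2,000 input tokens and 500 output tokens:
# (2000 * 0.25 + 500 * 1.50) / 1,000,000 = 0.00125 USD
print(f"{estimate_cost_usd(2_000, 500):.5f}")
```

Output tokens cost six times as much as input tokens at these rates, so long generations dominate the bill even when prompts are large.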

Capabilities

Vision, Multimodal, Function Calling, Tool Use, Structured Outputs, Code Execution

About Gemini 3.1 Flash-Lite

Gemini 3.1 Flash-Lite is Google's generally available low-latency Gemini 3.1 model, launched May 7, 2026. It is optimized for high-volume, cost-sensitive workloads with text, image, and video inputs, a 1M token context window, and a 66K token maximum output. The GA model uses the stable API ID gemini-3.1-flash-lite and replaces gemini-3.1-flash-lite-preview, which is scheduled to shut down on May 25, 2026. Pricing is $0.25 per 1M input tokens and $1.50 per 1M output tokens.
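Because the preview ID is scheduled to shut down, code that still references it needs to move to the stable ID. One possible approach, sketched below, is to normalize model IDs in a single place before each call. The mapping contains only the replacement stated above; resolve_model_id and MODEL_ID_REPLACEMENTS are hypothetical names.

```python
# Preview ID -> stable GA ID, as stated in the model description above.
MODEL_ID_REPLACEMENTS = {
    "gemini-3.1-flash-lite-preview": "gemini-3.1-flash-lite",
}


def resolve_model_id(model_id: str) -> str:
    """Return the stable ID for a retired preview ID, else the ID unchanged."""
    return MODEL_ID_REPLACEMENTS.get(model_id, model_id)


print(resolve_model_id("gemini-3.1-flash-lite-preview"))
```

The result can be passed as the model argument to generate_content, so existing configuration files keep working while they are updated.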


Model Specs

Released: 2026-05-07

Context: 1.05M tokens

Architecture: Decoder Only

Knowledge cutoff: 2025-01