PaliGemma 3B 896

Name: PaliGemma 3B 896
Author: Google DeepMind

Released

2024-05-14

Last refreshed

2026-05-01

Status

Researched 207d ago

Open weights · Commercial use: conditional · Multimodal · Vision

PaliGemma 3B 896 is worth evaluating for vision when its provider route and context window match the workload.

Use it for

Teams evaluating vision
Workloads that can use a 512 context window
Buyers comparing 1 tracked provider route

Do not use it for

Strict JSON or tool-calling flows

Specifications

Family: PaliGemma
Released: 2024-05-14
Context: 512
Parameters: 3B
Architecture: Decoder Only
Specialization: general
Openness: Open weights
License: Gemma (commercial use: conditional)
Weights: Unknown
Code: Unknown
Training: Fine-tuned

Created by

Google DeepMind

Pioneering artificial intelligence research.

London, United Kingdom

Founded 2014


Pricing

Output / 1M

Input / 1M

Cheapest of 1 route · NVIDIA NIM

Providers(1)

NVIDIA NIM


About

PaliGemma 3B 896 is a lightweight vision-language model developed by Google that takes both images and text as input. Inspired by the PaLI-3 model, it combines the SigLIP vision model with the Gemma-2B language model and uses a linear projection layer to map visual features into the language model's input space. It handles image captioning, visual question answering, object detection, and segmentation, and it supports multilingual text. It generally needs task-specific fine-tuning for best performance, and it may struggle with contextual understanding, exhibit biases, and carry notable computational demands.
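The linear projection step described above can be illustrated with a toy sketch. This is not the model's actual code: the widths, the random weights, the helper names (`linear_project`, `build_sequence`), and the image-first ordering are all assumptions chosen for illustration. It only shows the general idea of projecting vision features to the language model's width and then joining them with text embeddings into one sequence.

```python
import random

def linear_project(patches, weight, bias):
    """Map each patch embedding (length D_VIS) to the language-model width (length D_LM)."""
    return [
        [sum(p * w for p, w in zip(patch, row)) + b for row, b in zip(weight, bias)]
        for patch in patches
    ]

def build_sequence(image_patches, text_embeddings, weight, bias):
    """Projected image tokens first, then text tokens, as a single input sequence."""
    return linear_project(image_patches, weight, bias) + text_embeddings

# Toy sizes: assumptions for illustration only, not the real model widths.
D_VIS, D_LM = 4, 3
rng = random.Random(0)

# Projection weight has shape (D_LM, D_VIS); bias has length D_LM.
weight = [[rng.uniform(-1, 1) for _ in range(D_VIS)] for _ in range(D_LM)]
bias = [0.0] * D_LM

# Six fake image-patch embeddings and two fake text-token embeddings.
image_patches = [[rng.uniform(-1, 1) for _ in range(D_VIS)] for _ in range(6)]
text_embeddings = [[rng.uniform(-1, 1) for _ in range(D_LM)] for _ in range(2)]

sequence = build_sequence(image_patches, text_embeddings, weight, bias)
print(len(sequence), len(sequence[0]))  # 8 3
```

After projection every element of the sequence has the same width, which is what lets a decoder-only language model consume image and text tokens together.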

PaliGemma 3B 896 is an open-weight model in the PaliGemma family. The structured metadata tracks a 512-token context window and multimodal input. This page tracks one provider route, through NVIDIA NIM. No headline benchmark score is tracked for PaliGemma 3B 896 yet.
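A quick way to check a workload against the 512-token context window is a simple budget test. The sketch below assumes that "896" in the model name is the square input resolution in pixels and that the vision encoder uses 14-pixel patches; neither is stated on this page. It also assumes the 512 tokens cover text only (prompt plus generated output), which should be confirmed against the provider's documentation. The function names are hypothetical.

```python
CONTEXT_TOKENS = 512  # context window listed in this page's metadata
IMAGE_SIZE = 896      # assumption: "896" is the square input resolution in pixels
PATCH_SIZE = 14       # assumption: vision-encoder patch size, not stated on this page

def image_token_count(image_size: int = IMAGE_SIZE, patch_size: int = PATCH_SIZE) -> int:
    """Number of image tokens if each patch becomes one token."""
    if image_size % patch_size:
        raise ValueError("image size must be a multiple of the patch size")
    per_side = image_size // patch_size
    return per_side * per_side

def fits_text_budget(prompt_tokens: int, max_new_tokens: int,
                     context_tokens: int = CONTEXT_TOKENS) -> bool:
    """True when prompt plus requested output stays within the context window."""
    if prompt_tokens < 0 or max_new_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + max_new_tokens <= context_tokens

print(image_token_count())         # 4096
print(fits_text_budget(40, 128))   # True
print(fits_text_budget(400, 128))  # False
```

Under these assumptions a short question with a caption-length answer fits comfortably, while long document prompts do not.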

Top use-case fit

Vision

Included by capability and metadata signals in the decision map.

Provider price ladder

Compare API pricing across 1 provider for input and output tokens, batch, and cached reads when available.

Provider	Input / 1M	Output / 1M	Route
NVIDIA NIM	-	-	Provisioned · Partial
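When reading a price ladder like the one above, per-request cost follows from the per-1M-token rates, and a "-" cell means no price is tracked. The sketch below is a minimal illustration: `parse_price` and `request_cost` are hypothetical helpers, and the prices in the final example are made up to show the arithmetic, not real rates for any provider.

```python
from typing import Optional

def parse_price(cell: str) -> Optional[float]:
    """Convert a table cell to a price per 1M tokens; '-' or blank means not tracked."""
    cell = cell.strip()
    if cell in {"", "-"}:
        return None
    return float(cell.lstrip("$"))

def request_cost(input_tokens: int, output_tokens: int,
                 input_per_m: Optional[float],
                 output_per_m: Optional[float]) -> Optional[float]:
    """Cost of one request, or None when either price is unknown."""
    if input_per_m is None or output_per_m is None:
        return None
    return (input_tokens * input_per_m + output_tokens * output_per_m) / 1_000_000

# First three cells of the NVIDIA NIM row: provider, input price, output price.
provider, in_cell, out_cell = "NVIDIA NIM\t-\t-".split("\t")
cost = request_cost(400, 100, parse_price(in_cell), parse_price(out_cell))
print(provider, cost)  # NVIDIA NIM None

# Hypothetical prices, purely to show the arithmetic.
print(request_cost(1_000_000, 500_000, 2.0, 4.0))  # 4.0
```

Returning `None` instead of zero keeps an untracked price from being mistaken for a free route.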

Available via routers & gateways(1)

NVIDIA LLM Router Blueprint

Router

NVIDIA's open-source AI blueprint for LLM routing, which selects the optimal model per prompt via intent classification or neural auto-routing. It is scheduled for deprecation on 2026-06-20.

Free OSS · NVIDIA NIM