LLM Reference

Granite Vision 3.3 2B

granite-vision-3.3-2b

Open Source, Multimodal

About

IBM Granite Vision 3.3 2B improves on Granite Vision 3.2 with a new SigLIP2 vision encoder and higher-quality training data. It adds three experimental capabilities: image segmentation, doctags generation, and multi-page support (up to 8 pages). Safety is also improved relative to 3.2. Architecture: a SigLIP2 vision encoder, a 2-layer MLP with GELU activation, and the Granite 3.1 2B Instruct language model. Reported benchmark scores: DocVQA 0.91, TextVQA 0.80, OCRBench 0.79, InfoVQA 0.68. License: Apache 2.0.
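To make the architecture line concrete, here is a minimal sketch of how a 2-layer MLP with GELU can map vision-encoder patch embeddings into a language model's embedding space. It is an illustration under assumptions, not IBM's implementation: the dimensions, the random weights, and the helper names (`gelu`, `linear`, `make_layer`, `project`) are all hypothetical, and the real model's sizes differ.

```python
import math
import random


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def linear(vec, weights, bias):
    """Dense layer; `weights` holds one row of input weights per output unit."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def make_layer(n_in, n_out, rng):
    """Random toy weights and zero bias (stand-ins for trained parameters)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def project(patch_embedding, layer1, layer2):
    """2-layer MLP: linear -> GELU -> linear."""
    hidden = [gelu(h) for h in linear(patch_embedding, *layer1)]
    return linear(hidden, *layer2)


# Toy sizes chosen for illustration only (assumption, not the real dimensions).
VISION_DIM, HIDDEN_DIM, LLM_DIM = 8, 16, 12

rng = random.Random(0)
layer1 = make_layer(VISION_DIM, HIDDEN_DIM, rng)
layer2 = make_layer(HIDDEN_DIM, LLM_DIM, rng)

# Four fake patch embeddings standing in for vision-encoder output.
patches = [[rng.gauss(0.0, 1.0) for _ in range(VISION_DIM)] for _ in range(4)]

# Each patch becomes one vector in the language model's embedding space.
image_tokens = [project(p, layer1, layer2) for p in patches]
print(len(image_tokens), len(image_tokens[0]))  # prints "4 12"
```

In this sketch the projected vectors play the role of image tokens that a language model would consume alongside text embeddings; only the shape transformation is meaningful here, since the weights are random.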

Granite Vision 3.3 2B has a 128K-token context window.
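As a rough illustration of how the context window and the 8-page limit might be checked before sending a request, the sketch below validates a request budget. Several things are assumptions: that "128K" means 131,072 tokens (128 × 1024), the function name `fits_request`, and the per-page token cost, which this page does not state and which the caller must supply.

```python
CONTEXT_WINDOW = 128 * 1024  # assumption: "128K" read as 131,072 tokens
MAX_PAGES = 8                # multi-page limit from the model description


def fits_request(num_pages, tokens_per_page, prompt_tokens, max_new_tokens):
    """Return (ok, reason) for a hypothetical multi-page request.

    `tokens_per_page` is a caller-supplied estimate; the real cost depends
    on image resolution and preprocessing and is not given in this listing.
    """
    if num_pages < 1:
        return False, "at least one page is required"
    if num_pages > MAX_PAGES:
        return False, f"{num_pages} pages exceeds the {MAX_PAGES}-page limit"
    total = num_pages * tokens_per_page + prompt_tokens + max_new_tokens
    if total > CONTEXT_WINDOW:
        return False, f"{total} tokens exceeds the {CONTEXT_WINDOW}-token window"
    return True, f"{total} of {CONTEXT_WINDOW} tokens used"


# Example with a made-up estimate of 3,000 tokens per page.
print(fits_request(8, 3000, 200, 1024))
print(fits_request(9, 3000, 200, 1024))
```

The first call passes because 8 × 3000 + 200 + 1024 = 25,224 tokens fits in the window; the second is rejected on page count before any token arithmetic is done.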

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution

Specifications

Released: 2025-06-11
Parameters: 2B
Context: 128K
Architecture: SigLIP2 vision encoder + 2-layer MLP (GELU) + Granite 3.1 2B Instruct LLM

Created by

Creating reliable and adaptable AI solutions

Armonk, New York, United States
Founded 1945