LLM Reference

Granite Vision 3.2 2B

granite-vision-3.2-2b

Open Source, Multimodal

About

IBM Granite Vision 3.2 2B is a compact vision-language model for visual document understanding. Its architecture combines a SigLIP vision encoder, a two-layer MLP connector, and the Granite 3.1 2B Instruct LLM. It excels at tables, charts, OCR, infographics, and document QA. Reported benchmark scores: DocVQA 0.89, ChartQA 0.87, TextVQA 0.78, OCRBench 0.77. It is released under the Apache 2.0 license.

Granite Vision 3.2 2B has a 128K-token context window.
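
The connector stage named in the architecture (vision encoder output, then a two-layer MLP with GELU, then the LLM) can be sketched as follows. This is a minimal illustration in pure Python, not IBM's implementation: the dimensions, the random initialisation, the hidden width of the MLP, and the function names `gelu`, `linear`, `init_linear`, and `connector` are all assumptions chosen for readability. It shows only the shape change, where each patch feature from the vision encoder is projected into a vector of the LLM's embedding size.

```python
import math
import random

def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def linear(x, weight, bias):
    """Apply y = W x + b to one vector; weight has shape (out_dim, in_dim)."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]

def init_linear(in_dim, out_dim, rng):
    """Hypothetical random initialisation, for illustration only."""
    scale = 1.0 / math.sqrt(in_dim)
    weight = [[rng.uniform(-scale, scale) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias

def connector(patch_features, layer1, layer2):
    """Two-layer MLP: linear -> GELU -> linear, applied to each patch feature."""
    projected = []
    for feat in patch_features:
        hidden = [gelu(h) for h in linear(feat, *layer1)]
        projected.append(linear(hidden, *layer2))
    return projected

# Toy sizes (assumptions, far smaller than any real model).
VISION_DIM, LLM_DIM, NUM_PATCHES = 8, 16, 4

rng = random.Random(0)
layer1 = init_linear(VISION_DIM, LLM_DIM, rng)   # vision width -> LLM width
layer2 = init_linear(LLM_DIM, LLM_DIM, rng)      # LLM width -> LLM width
patches = [[rng.gauss(0.0, 1.0) for _ in range(VISION_DIM)] for _ in range(NUM_PATCHES)]

image_tokens = connector(patches, layer1, layer2)
print(len(image_tokens), len(image_tokens[0]))  # prints "4 16"
```

In a model of this kind, the projected vectors would be placed alongside the text token embeddings before being passed to the language model, so image content also counts against the context window.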

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution

Specifications

Released: 2025-02-26
Parameters: 2B
Context: 128K
Architecture: SigLIP vision encoder + 2-layer MLP (GELU) + Granite 3.1 2B Instruct LLM

Created by

Creating reliable and adaptable AI solutions

Armonk, New York, United States
Founded 1945