Granite 4.0 Micro
granite-4.0-micro
Open Source
About
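IBM Granite 4.0 Micro is a 3B-parameter dense (non-hybrid) instruct model built on a conventional transformer architecture: 40 attention layers with grouped-query attention (GQA; 32 query heads, 8 KV heads) and rotary position embeddings (RoPE). It is distinct from Granite 4.0 H-Micro, which uses a hybrid Mamba2/attention architecture. The model supports multilingual dialog (12 languages), code tasks including fill-in-the-middle (FIM), tool calling, and retrieval-augmented generation (RAG). It is released under the Apache 2.0 license.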
Granite 4.0 Micro has a 128K-token context window.
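To make the GQA figures in the description concrete, the following is a minimal sketch of how 32 heads can share 8 KV heads, and how that shrinks the per-token key/value cache relative to full multi-head attention. The layer and head counts come from the description above; the function names are hypothetical, and the contiguous grouping (heads 0 to 3 share KV head 0, and so on) is an assumption based on the common GQA convention, not something this page confirms for Granite.

```python
# Figures taken from the model description; everything else is illustrative.
NUM_LAYERS = 40
NUM_QUERY_HEADS = 32
NUM_KV_HEADS = 8

# Number of query heads that share one KV head.
GROUP_SIZE = NUM_QUERY_HEADS // NUM_KV_HEADS


def kv_head_for(query_head: int) -> int:
    """Return the KV head index used by a query head.

    Assumes contiguous grouping, which is the usual GQA layout but is
    not confirmed by the source for this model.
    """
    if not 0 <= query_head < NUM_QUERY_HEADS:
        raise ValueError(f"query_head must be in [0, {NUM_QUERY_HEADS})")
    return query_head // GROUP_SIZE


def cached_vectors_per_token(num_kv_heads: int) -> int:
    """Count key and value vectors cached per token across all layers."""
    return 2 * NUM_LAYERS * num_kv_heads  # 2 = one key + one value


gqa_vectors = cached_vectors_per_token(NUM_KV_HEADS)
mha_vectors = cached_vectors_per_token(NUM_QUERY_HEADS)

if __name__ == "__main__":
    print(f"query heads per KV head: {GROUP_SIZE}")
    print(f"cached K/V vectors per token (GQA): {gqa_vectors}")
    print(f"cached K/V vectors per token (full MHA): {mha_vectors}")
```

Under these assumptions the cache holds a quarter as many key/value vectors per token as full multi-head attention would, a saving that grows in absolute terms as the context approaches the 128K-token window.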
Capabilities
Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution