LLM Reference

Granite 4.0 Micro

granite-4.0-micro

Open Source

About

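IBM Granite 4.0 Micro is a 3B-parameter dense (non-hybrid) instruct model built on a conventional decoder-only transformer architecture. It has 40 attention layers with grouped-query attention (GQA; 32 query heads, 8 KV heads) and rotary position embeddings (RoPE). It is distinct from Granite 4.0 H-Micro, which uses a hybrid Mamba2/attention architecture. The model supports multilingual dialog (12 languages), code including fill-in-the-middle (FIM), tool calling, and retrieval-augmented generation (RAG). It is released under the Apache 2.0 license.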

Granite 4.0 Micro has a 128K-token context window.

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution

Rankings

Specifications

Family: Granite 4
Released: 2025-10-02
Parameters: 3B
Context: 128K
Architecture: Dense decoder-only transformer: 40 layers, GQA with RoPE, SwiGLU

Created by

Creating reliable and adaptable AI solutions

Armonk, New York, United States
Founded 1945