Granite 4.1 8B
granite-4.1-8b
Open Source
About
IBM Granite 4.1 8B is a dense decoder-only transformer instruct model with 40 layers, a 4096 embedding size, and grouped-query attention (GQA; 32 attention heads, 8 KV heads). It supports multilingual dialog in 12 languages, code completion with fill-in-the-middle (FIM), tool/function calling, retrieval-augmented generation (RAG), and summarization. The model was trained on an NVIDIA GB200 NVL72 cluster and is released under the Apache 2.0 license. Benchmarks: MMLU 73.84, HumanEval 85.37, GSM8K 92.49, BFCL v3 68.27.
Granite 4.1 8B has a 131K-token context window.
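As a sketch of the FIM support mentioned above: earlier Granite Code models followed the StarCoder-style convention of `<fim_prefix>`, `<fim_suffix>`, and `<fim_middle>` sentinel tokens, assembled in prefix-suffix-middle (PSM) order. Whether Granite 4.1 uses these exact tokens is an assumption; check the model's tokenizer config before relying on them.

```python
# FIM sentinels in the StarCoder-style convention; that Granite 4.1
# uses these exact tokens is an assumption -- verify in the tokenizer.
FIM_PREFIX, FIM_SUFFIX, FIM_MIDDLE = "<fim_prefix>", "<fim_suffix>", "<fim_middle>"

def fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a prefix-suffix-middle (PSM) prompt; the model then
    generates the missing middle after the <fim_middle> sentinel."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"

prompt = fim_prompt("def add(a, b):\n    return ", "\n\nprint(add(2, 3))")
print(prompt)
```

The completion the model emits after `<fim_middle>` is the code that belongs between the prefix and suffix.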
Capabilities
Vision · Multimodal · Reasoning · Function Calling · Tool Use · Structured Outputs · Code Execution
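A minimal sketch of the function-calling flow: tools are typically described in the OpenAI-style JSON schema format that Hugging Face chat templates and most serving stacks accept, and the model replies with a JSON tool call that the application parses and dispatches. The schema and the reply shape below are illustrative assumptions, not Granite's documented wire format.

```python
import json

# Hypothetical tool definition in the common OpenAI-style schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def dispatch(tool_call_json: str, registry: dict) -> str:
    """Parse a model-emitted tool call and run the matching function."""
    call = json.loads(tool_call_json)
    fn = registry[call["name"]]
    return fn(**call["arguments"])

registry = {"get_weather": lambda city: f"Sunny in {city}"}

# A model reply of this shape is assumed for illustration.
reply = '{"name": "get_weather", "arguments": {"city": "Armonk"}}'
print(dispatch(reply, registry))  # -> Sunny in Armonk
```

In a real loop, the dispatch result is appended to the conversation as a tool message and the model is called again to produce the final answer.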
Specifications
Family: Granite 4.1
Released: 2026-04-29
Parameters: 8B
Context: 131K
Architecture: Dense decoder-only transformer (40 layers, 4096 embed, 32 attn heads, 8 KV heads, SwiGLU, RoPE, RMSNorm)
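The listed GQA configuration has a direct memory payoff at the full 131K context. A back-of-the-envelope KV-cache estimate from these numbers (assuming fp16 storage, 2 bytes per element, and head_dim = embed / attention heads = 128):

```python
# KV-cache size from the listed architecture; fp16 (2 bytes/element)
# is an assumption, as is head_dim = embed // attn_heads.
layers, embed, attn_heads, kv_heads = 40, 4096, 32, 8
context, bytes_per_elem = 131_072, 2
head_dim = embed // attn_heads  # 128

def kv_cache_bytes(n_kv_heads: int) -> int:
    # K and V tensors per layer: 2 * n_kv_heads * head_dim per token
    return 2 * layers * n_kv_heads * head_dim * context * bytes_per_elem

gqa = kv_cache_bytes(kv_heads)    # 8 KV heads, as specified
mha = kv_cache_bytes(attn_heads)  # hypothetical full-MHA baseline
print(f"GQA: {gqa / 2**30:.0f} GiB, MHA: {mha / 2**30:.0f} GiB")
# -> GQA: 20 GiB, MHA: 80 GiB
```

Sharing 8 KV heads across 32 query heads cuts the full-context cache to a quarter of what standard multi-head attention would need.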
Created by
IBM
Creating reliable and adaptable AI solutions
Armonk, New York, United States
Founded 1911