LLM Reference

Granite 4.1 8B

granite-4.1-8b

Open Source

About

IBM Granite 4.1 8B is a dense, decoder-only transformer instruct model with 40 layers, an embedding size of 4096, and grouped-query attention (GQA) using 32 attention heads and 8 KV heads. It supports multilingual dialog in 12 languages, code tasks with fill-in-the-middle (FIM), tool calling and function calling, retrieval-augmented generation (RAG), and summarization. The model was trained on an NVIDIA GB200 NVL72 cluster and is released under the Apache 2.0 license. Reported benchmarks: MMLU 73.84, HumanEval 85.37, GSM8K 92.49, BFCL v3 68.27.

Granite 4.1 8B has a 131K-token context window.
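To make the GQA and context figures concrete, the following back-of-the-envelope sketch estimates the key/value cache for one sequence at full context. It uses only the layer, head, and embedding counts listed on this page. Three things are assumptions rather than stated facts: that the per-head dimension equals embedding size divided by attention heads, that "131K" means 131,072 tokens, and that cache entries are stored as 2-byte values (bf16 or fp16). The class and function names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttentionConfig:
    """Figures taken from the model description above."""
    layers: int = 40
    embed_dim: int = 4096
    attn_heads: int = 32
    kv_heads: int = 8

    @property
    def head_dim(self) -> int:
        # Assumption: head_dim = embed_dim / attn_heads (not stated in the source).
        return self.embed_dim // self.attn_heads


def kv_cache_bytes(cfg: AttentionConfig, tokens: int,
                   bytes_per_value: int = 2, kv_heads=None) -> int:
    """Bytes needed to cache keys and values for `tokens` positions.

    The factor 2 covers one key tensor plus one value tensor per layer.
    Pass `kv_heads` to compare against a different KV-head count.
    """
    heads = cfg.kv_heads if kv_heads is None else kv_heads
    return 2 * cfg.layers * heads * cfg.head_dim * bytes_per_value * tokens


cfg = AttentionConfig()
CONTEXT_TOKENS = 131_072  # Assumption: "131K" read as 2**17 tokens.

gqa = kv_cache_bytes(cfg, CONTEXT_TOKENS)
# Hypothetical baseline: full multi-head attention with one KV head per query head.
mha = kv_cache_bytes(cfg, CONTEXT_TOKENS, kv_heads=cfg.attn_heads)

GIB = 2 ** 30
print(f"head_dim:            {cfg.head_dim}")
print(f"KV cache per token:  {kv_cache_bytes(cfg, 1) // 1024} KiB")
print(f"GQA at full context: {gqa / GIB:.0f} GiB")
print(f"MHA at full context: {mha / GIB:.0f} GiB")
```

Under these assumptions, using 8 KV heads instead of 32 cuts the cache to a quarter of the multi-head baseline. Quantized caches or a different head dimension would change the absolute numbers.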

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution

Specifications

Released: 2026-04-29
Parameters: 8B
Context: 131K
Architecture: Dense decoder-only transformer with 40 layers, 4096 embedding size, 32 attention heads, 8 KV heads, SwiGLU, RoPE, RMSNorm

Created by

Creating reliable and adaptable AI solutions

Armonk, New York, United States
Founded 1945