LLM ReferenceLLM Reference
OpenRouter

Granite 4.1 8B on OpenRouter

Granite 4.1 · IBM Research

ServerlessOpen Source

Pricing

TypePrice (per 1M)
Input tokens$0.05
Output tokens$0.10

Capabilities

VisionMultimodalReasoningFunction CallingTool UseStructured OutputsCode Execution

About Granite 4.1 8B

IBM Granite 4.1 8B is a dense decoder-only transformer instruct model with 40 layers, 4096 embedding size, GQA (32 attention heads, 8 KV heads). Supports multilingual dialog (12 languages), code with FIM, tool-calling/function-calling, RAG, and summarization. Trained on NVIDIA GB200 NVL72 cluster. Apache 2.0. Benchmarks: MMLU 73.84, HumanEval 85.37, GSM8K 92.49, BFCL v3 68.27.

Get Started

Model Specs

Released2026-04-29
Parameters8B
Context131K
ArchitectureDense decoder-only transformer: 40 layers, 4096 embed, 32 attn heads, 8 KV heads, SwiGLU, RoPE, RMSNorm