LLM Reference

Granite 4.0 H Tiny

granite-4.0-h-tiny

Open Source

About

IBM Granite 4.0 H Tiny is a hybrid Mixture-of-Experts (MoE) model with 7B total parameters, of which about 1B are active per token. The architecture combines 4 attention layers with 36 Mamba-2 layers and uses 64 experts (6 active per token) with an embedding size of 1536; sparse activation keeps inference efficient. The model supports multilingual dialog across 12 languages, code completion including fill-in-the-middle (FIM), tool calling, and retrieval-augmented generation (RAG). Reported benchmarks: MMLU 67.43, HumanEval 81%, GSM8K 81.35, SALAD-Bench 96.28. Released under the Apache 2.0 license.
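The tool-calling support mentioned above is typically exercised through an OpenAI-style chat request. A minimal sketch of assembling such a request; the model id string and the `get_weather` tool are illustrative assumptions, not taken from the model card:

```python
import json

# Hypothetical tool definition for illustration (not from the model card)
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

def build_request(user_message: str) -> dict:
    """Assemble an OpenAI-style chat completion payload for the model."""
    return {
        "model": "ibm-granite/granite-4.0-h-tiny",  # assumed model id
        "messages": [{"role": "user", "content": user_message}],
        "tools": [weather_tool],
        "tool_choice": "auto",  # let the model decide whether to call the tool
    }

payload = build_request("What's the weather in Armonk?")
print(json.dumps(payload, indent=2))
```

A server hosting the model would respond with either a plain message or a `tool_calls` entry naming the function and its JSON arguments, which the client executes and feeds back.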

Granite 4.0 H Tiny has a 128K-token context window.

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution

Specifications

Family: Granite 4
Released: 2025-10-02
Parameters: 7B total / 1B active
Context: 128K tokens
Architecture: Hybrid MoE, 4 attention + 36 Mamba-2 layers, 64 experts (6 active), 1536 embedding
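The "64 experts / 6 active" line means the router selects 6 of the 64 experts per token, so only a fraction of the expert weights run at inference. A toy sketch of top-k gating (illustrative only, not IBM's implementation):

```python
import math

def route_token(router_logits, k=6):
    """Select the top-k experts for one token and renormalize their
    gate weights with a softmax over the chosen logits."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    exps = [math.exp(router_logits[i]) for i in chosen]
    total = sum(exps)
    # expert index -> gate weight; all other experts get zero
    return {i: e / total for i, e in zip(chosen, exps)}

# 64 deterministic pseudo-logits, one per expert
logits = [((i * 37) % 64) / 64 for i in range(64)]
gates = route_token(logits)
print(len(gates))                              # 6
print(abs(sum(gates.values()) - 1.0) < 1e-9)   # True
```

Only the 6 selected experts' feed-forward blocks execute for that token, which is how the model keeps ~1B of its 7B parameters active.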

Created by

IBM ("Creating reliable and adaptable AI solutions")
Armonk, New York, United States
Founded 1945