Granite 4.0 H Tiny
granite-4.0-h-tiny
Open Source
About
IBM Granite 4.0 H Tiny is a hybrid Mixture-of-Experts (MoE) model with 7B total parameters and 1B active parameters per token. The architecture combines 4 attention layers with 36 Mamba-2 layers and routes across 64 experts with 6 active per token over a 1536-dimensional embedding, so sparse activation keeps inference efficient. It supports multilingual dialog in 12 languages, code completion including fill-in-the-middle (FIM), tool-calling, and retrieval-augmented generation (RAG). Reported benchmarks: MMLU 67.43, HumanEval 81%, GSM8K 81.35, SALAD-Bench 96.28. Released under the Apache 2.0 license.
Granite 4.0 H Tiny has a 128K-token context window.
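A minimal chat-inference sketch with Hugging Face Transformers is shown below. The checkpoint id ibm-granite/granite-4.0-h-tiny is an assumption and should be checked against the model's published repository; generation settings are illustrative only.

```python
# Minimal chat inference sketch (checkpoint id is an assumption, not confirmed here).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.0-h-tiny"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Build a chat prompt with the model's own chat template.
messages = [{"role": "user", "content": "Summarize what a hybrid MoE model is."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate and decode only the newly produced tokens.
outputs = model.generate(inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```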
Capabilities
Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution
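Since tool-calling is listed above, the sketch below shows one common way to pass a tool schema through the tokenizer's chat template. The get_weather tool and its OpenAI-style JSON schema are hypothetical placeholders, and the exact schema and tool-call output format are defined by the model's own chat template, so this may need adjusting.

```python
# Hedged tool-calling sketch; the tool name and schema are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ibm-granite/granite-4.0-h-tiny"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Hypothetical tool definition in the common OpenAI-style schema.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    },
}

messages = [{"role": "user", "content": "What is the weather in Paris?"}]
prompt = tokenizer.apply_chat_template(
    messages,
    tools=[weather_tool],        # tool schemas are injected into the chat template
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# The model is expected to emit a tool call in its template-defined format.
out = model.generate(prompt, max_new_tokens=150)
print(tokenizer.decode(out[0][prompt.shape[-1]:], skip_special_tokens=True))
```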