LLM Reference

Granite Speech 4.1 2B

granite-speech-4.1-2b

Open Source, Multimodal

About

IBM Granite Speech 4.1 2B is a multilingual ASR (Automatic Speech Recognition) and AST (Automatic Speech Translation) model trained on 174,000 hours of audio.

ASR: English, French, German, Spanish, Portuguese, Japanese.
Translation: X→English (French, German, Spanish, Portuguese, Japanese) and English→X (French, German, Spanish, Italian, Japanese, Mandarin Chinese).
Features: punctuation/truecasing, keyword biasing, dual-head CTC encoder.
Architecture: 16 conformer blocks + 2-layer window Q-former + Granite 4.0 1B LLM base (128K context).
Variants: granite-speech-4.1-2b-plus (adds speaker-attributed ASR, word timestamps), granite-speech-4.1-2b-nar (non-autoregressive, higher throughput).
License: Apache 2.0.
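
The language coverage listed above is asymmetric (for example, Italian appears only as an English→X target, and Portuguese only as an X→English source), so it can help to encode it as a small lookup before routing audio to the model. The following is a minimal sketch in Python. The names ASR_LANGUAGES, TO_ENGLISH_SOURCES, FROM_ENGLISH_TARGETS, and supports are hypothetical helpers for illustration and are not part of any Granite or IBM API; the language sets are copied from the description above.

```python
# Language coverage copied from the model description.
# All names here are hypothetical helpers, not part of any Granite API.
ASR_LANGUAGES = {"English", "French", "German", "Spanish", "Portuguese", "Japanese"}
TO_ENGLISH_SOURCES = {"French", "German", "Spanish", "Portuguese", "Japanese"}
FROM_ENGLISH_TARGETS = {
    "French", "German", "Spanish", "Italian", "Japanese", "Mandarin Chinese",
}


def supports(source: str, target: str) -> bool:
    """Return True if the description lists this language pair.

    source == target is treated as plain transcription (ASR);
    any other pair is treated as speech translation (AST).
    """
    if source == target:
        return source in ASR_LANGUAGES
    if target == "English":
        return source in TO_ENGLISH_SOURCES
    if source == "English":
        return target in FROM_ENGLISH_TARGETS
    # Pairs that do not involve English are not listed in the description.
    return False
```

A caller could check supports("English", "Italian") before requesting a translation and fall back to another system when the pair is not listed.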

Granite Speech 4.1 2B has a 128K-token context window.

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution

Specifications

Released: 2026-04-29
Parameters: 2B
Context: 128K

Created by

Creating reliable and adaptable AI solutions

Armonk, New York, United States
Founded 1945