Granite Speech 4.1 2B
granite-speech-4.1-2b
About
IBM Granite Speech 4.1 2B is a multilingual automatic speech recognition (ASR) and automatic speech translation (AST) model trained on 174,000 hours of audio.
ASR languages: English, French, German, Spanish, Portuguese, Japanese.
Translation: X→English (French, German, Spanish, Portuguese, Japanese) and English→X (French, German, Spanish, Italian, Japanese, Mandarin Chinese).
Features: punctuation and truecasing, keyword biasing, dual-head CTC encoder.
Architecture: 16 conformer blocks, a 2-layer window Q-former, and a Granite 4.0 1B LLM base (128K context).
Variants: granite-speech-4.1-2b-plus (adds speaker-attributed ASR and word timestamps) and granite-speech-4.1-2b-nar (non-autoregressive, higher throughput).
License: Apache 2.0.
Granite Speech 4.1 2B has a 128K-token context window.
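The language coverage above is asymmetric: Portuguese can be transcribed and translated into English but is not listed as an English→X target, while Italian and Mandarin Chinese appear only as English→X targets. A minimal sketch of that coverage as a lookup is shown here. The names `ASR_LANGUAGES`, `TO_ENGLISH_SOURCES`, `FROM_ENGLISH_TARGETS`, and `supports` are hypothetical helpers invented for illustration; they are not part of any Granite or IBM API, and the sets simply transcribe the lists in the description.

```python
# Language coverage transcribed from the model description.
# All names here are illustrative assumptions, not a real API.
ASR_LANGUAGES = {"English", "French", "German", "Spanish", "Portuguese", "Japanese"}
TO_ENGLISH_SOURCES = {"French", "German", "Spanish", "Portuguese", "Japanese"}
FROM_ENGLISH_TARGETS = {
    "French", "German", "Spanish", "Italian", "Japanese", "Mandarin Chinese",
}


def supports(task, source, target=None):
    """Return True if the description lists this task/language combination.

    task   -- "asr" (transcription) or "translate" (speech translation)
    source -- language spoken in the audio
    target -- output language; only used when task == "translate"
    """
    if task == "asr":
        return source in ASR_LANGUAGES
    if task == "translate":
        if target == "English":
            return source in TO_ENGLISH_SOURCES
        if source == "English":
            return target in FROM_ENGLISH_TARGETS
        # Pairs that do not involve English are not listed in the description.
        return False
    raise ValueError(f"unknown task: {task!r}")


if __name__ == "__main__":
    print(supports("asr", "Japanese"))
    print(supports("translate", "Portuguese", "English"))
    print(supports("translate", "English", "Portuguese"))
```

A check like this can be run before sending audio to the model, so that unsupported pairs are rejected early instead of producing low-quality output.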
Capabilities
Created by
Creating reliable and adaptable AI solutions