Granite Speech 4.1 2B NAR
Granite Speech 4.1 2B NAR has tracked model metadata but no tracked provider pricing, which keeps it from being a default production pick.
Use it for
- Teams evaluating vision
Do not use it for
- Cost-sensitive launches that need sourced token pricing
- Strict JSON or tool-calling flows
- Teams that need a tracked hosted API route today
- Family: Granite 4.1
- Released: 2026-04-29
- Parameters: 2B
- Specialization: speech-recognition
- Openness: Open source
- License: Apache 2.0 (OSI), commercial use allowed
No tracked provider token pricing is available yet.
About
IBM Granite Speech 4.1 2B NAR (Non-AutoRegressive) is a high-throughput speech recognition model that generates transcriptions in a single forward pass rather than token by token. Its architecture combines a 440M CTC speech encoder (16-layer Conformer), a 160M Q-Former projector, and a 1B bidirectional LLM editor (LoRA-adapted Granite-4.0-1b-base). It achieves a real-time factor of about 1820x on a single H100 GPU at batch size 128 and a 1.29% WER on LibriSpeech Clean. The model is optimized for latency-sensitive production deployments and is released under the Apache 2.0 license.
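To put the throughput figure in practical terms, the sketch below converts a real-time factor into an estimated transcription time. It assumes that "1820x" means 1820 seconds of audio processed per second of wall-clock time, and that the rate holds for the whole workload (single H100, batch size 128). The function name and its parameters are illustrative and are not part of any Granite tooling.

```python
def transcription_seconds(audio_hours: float, speedup: float = 1820.0) -> float:
    """Estimate wall-clock seconds needed to transcribe `audio_hours` of audio.

    `speedup` is assumed to be audio seconds processed per wall-clock second.
    The default of 1820 is the figure quoted above for one H100 at batch size 128.
    """
    if audio_hours < 0:
        raise ValueError("audio_hours must be non-negative")
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    audio_seconds = audio_hours * 3600.0
    return audio_seconds / speedup


if __name__ == "__main__":
    # One hour of audio takes roughly two seconds at the quoted rate.
    print(f"1 h of audio: {transcription_seconds(1.0):.2f} s")
    # 1820 hours of audio takes about one hour of GPU time.
    print(f"1820 h of audio: {transcription_seconds(1820.0) / 3600.0:.2f} h")
```

Under these assumptions, one GPU-hour covers about 1820 hours of audio. Real workloads with smaller batches, short clips, or I/O overhead would likely see a lower effective rate.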
Granite Speech 4.1 2B NAR is an open-source model in the Granite 4.1 family. The structured metadata tracks multimodal input and audio. No headline benchmark score is tracked for Granite Speech 4.1 2B NAR yet.
Top use-case fit
Vision
Included by capability and metadata signals in the decision map.
Provider price ladder
No tracked provider token pricing is available for this model yet.
Capabilities
Benchmark peer bars for Vision
No task-mapped benchmark peers are available for this model yet.
Migration checks
No linked migration route is available for this model yet.