MOSS-Audio 8B Instruct
MOSS-Audio 8B Instruct is worth evaluating for vision when its provider route and context window match the workload.
Use it for
- Teams evaluating vision
- Buyers assessing the one tracked provider route
Do not use it for
- Strict JSON or tool-calling flows

- Family: MOSS-Audio
- Released: 2026-04-13
- Parameters: 8.6B
- Architecture: audio-language-transformer
- Specialization: audio-understanding
- License: Apache 2.0
- Training: pretrained
Only tracked route · Hugging Face Inference Endpoints
About
MOSS-Audio 8B Instruct is the instruction-following 8.6B variant of the open-weight audio understanding model from MOSI Intelligence and the OpenMOSS Team. It pairs the MOSS-Audio encoder with a Qwen3-8B language backbone and is positioned for open-source speech, sound, and music workloads, including audio captioning, ASR, timestamp, and QA tasks.
MOSS-Audio 8B Instruct belongs to the MOSS-Audio family. Its structured metadata lists multimodal input and audio. This page tracks one provider route, through Hugging Face Inference Endpoints. No headline benchmark score is tracked for the model yet.
Top use-case fit
Vision
Included by capability and metadata signals in the decision map.
Provider price ladder
API pricing for the 1 tracked provider, covering input and output tokens, batch, and cached reads when available.
| Provider | Input / 1M | Output / 1M | Route |
|---|---|---|---|
| Hugging Face Inference Endpoints | - | - | Partial |
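
Once a route publishes per-token prices, a per-1M price ladder like the one above can be turned into a request cost with simple arithmetic. The sketch below is illustrative only: the function name `estimate_cost`, the `routes` dictionary, and the token counts are hypothetical. Both prices for the tracked route are set to `None` because the table lists no input or output price for it.

```python
from typing import Optional

TOKENS_PER_MILLION = 1_000_000


def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: Optional[float],
    output_price_per_m: Optional[float],
) -> Optional[float]:
    """Estimate a request's cost from per-1M-token prices.

    Returns None when either price is untracked, so a missing price
    is never silently treated as zero.
    """
    if input_price_per_m is None or output_price_per_m is None:
        return None
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / TOKENS_PER_MILLION


# Hypothetical ladder mirroring the table: no prices are tracked for this route.
routes = {"Hugging Face Inference Endpoints": (None, None)}

for name, (price_in, price_out) in routes.items():
    cost = estimate_cost(50_000, 2_000, price_in, price_out)
    print(name, "untracked" if cost is None else f"{cost:.4f}")
```

Returning `None` instead of `0.0` keeps an unpriced route from looking like the cheapest option when several routes are compared.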
Capabilities
Benchmark peer bars for Vision
No task-mapped benchmark peers are available for this model yet.
Migration checks
No linked migration route is available for this model yet.