MOSS-Audio 8B Thinking

Name: MOSS-Audio 8B Thinking
Author: MOSI AI

Released: 2026-04-13
Last refreshed: 2026-06-29
Status: Researched 44d ago

Open source · Commercial use: permitted · Multimodal · Vision

MOSS-Audio 8B Thinking is worth evaluating for audio understanding when its provider route and context window match the workload.

Use it for

Teams evaluating audio understanding
Buyers assessing the 1 tracked provider route

Do not use it for

Strict JSON or tool-calling flows

Specifications

Family: MOSS-Audio
Released: 2026-04-13
Parameters: 8.6B
Architecture: Audio / Speech
Specialization: audio-understanding
Openness: Open source
License: Apache 2.0 (OSI-approved; commercial use: permitted)
Weights: Available
Code: Unknown
Training: Pretrained

Created by

MOSI AI

OpenMOSS speech, audio, and video foundation-model research.

Shanghai, China

Pricing

Output / 1M

Input / 1M

Cheapest of 1 route · Hugging Face Inference Endpoints

Providers (1)

Hugging Face Inference Endpoints

About

MOSS-Audio 8B Thinking is the reasoning-tuned 8.6B variant of the open-weight audio understanding model from MOSI AI and the OpenMOSS Team. It pairs the MOSS-Audio encoder with a Qwen3-8B backbone and adds Thinking post-training for complex audio reasoning over speech, environmental sound, and music, covering timestamps, captions, and question answering.

MOSS-Audio 8B Thinking is an open-source model in the MOSS-Audio family. Its structured metadata lists multimodal input, audio, and reasoning. This page tracks one provider route, Hugging Face Inference Endpoints. No headline benchmark score is tracked for the model yet.