LLM Reference

MOSI AI

Researched 6d ago

7 models across 3 families · Latest: MOSS-TTS-v1.5 (2026-05)

OpenMOSS speech, audio, and video foundation-model research.

Vision · China · Open Source

MOSI AI's portfolio covers 7 active models across 3 non-obsolete families, all carrying the vision task label. Open a model detail page to compare provider routes and sourced benchmarks.

Portfolio context: 1 decision-task tag, 7 active tracked models, latest research stamp 2026-06-04.

Use it for

  • Teams evaluating vision across this lab's releases
  • Readers comparing families before locking a flagship SKU
  • 7 tracked SKUs for migration and pricing follow-ups

Do not use it for

  • Choosing a hosting provider without opening a model page for price ladders

Active models

7

Non-deprecated SKUs linked to this researcher

Active families

3

Non-obsolete families in coverage

Open catalog

7 open

7 OSI source / 0 open weights (0 text-match)

Decision task tags

1

Mapped to the site-wide task taxonomy

Latest dated release

2026-05-26

MOSS-TTS-v1.5

Freshness

2026-06-04

Researched 6d ago

fresh

Information

Shanghai, China

Release cadence

Showing 5 recent dated ships (full timeline below). Latest spotlight: MOSS-TTS-v1.5 (2026-05-26).

Where this lab wins

  • Vision: 6 tracked models with multimodal benchmark coverage.

Flagship quality / price signal

Anchor SKU: MOSS-Audio 4B Instruct (best sourced coding Q/$ in this portfolio).

Quality / dollar is unavailable for this anchor because benchmark coverage and/or the output token price on the cheapest ladder route is missing (open the model detail page once pricing lands).
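The availability rule above can be sketched in a few lines of Python. This is an illustration under an assumed definition, not the site's actual formula: it treats Q/$ as a benchmark quality score divided by the output price in dollars per 1M tokens, and reports the signal as unavailable when either input is missing. The function name `quality_per_dollar` and the sample numbers are hypothetical.

```python
from typing import Optional


def quality_per_dollar(
    quality: Optional[float],
    output_price_per_m: Optional[float],
) -> Optional[float]:
    """Return quality points per dollar of output tokens, or None if undefined.

    Assumed definition (not confirmed by the source): benchmark quality score
    divided by the output price in dollars per 1M tokens.
    """
    # Missing benchmark coverage or missing price: the signal is unavailable.
    if quality is None or output_price_per_m is None:
        return None
    # A zero or negative price would make the ratio meaningless.
    if output_price_per_m <= 0:
        return None
    return quality / output_price_per_m


# Hypothetical inputs for illustration only; none of these are sourced values.
print(quality_per_dollar(None, None))  # no benchmark, no price -> None
print(quality_per_dollar(60.0, 2.0))   # 30.0
```

Returning `None` instead of zero keeps "unavailable" distinct from "poor value", which matches how the anchor SKU is reported while its price and benchmark fields are still empty.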

MOSI AI is a Chinese AI research organization focused on OpenMOSS speech, audio, and video foundation-model research. It ships 3 model families totaling 7 models; the most recent release is MOSS-TTS-v1.5 (2026-05). Notable families include MOSS-TTS, MOSS-Audio, and MOVA. Use this page as a stable reference for lab background, release coverage, and follow-up model pages as they are added. Researchers and evaluators can view official API endpoints, benchmark performance, and coding/agent fit for every MOSI AI model.

About

MOSI AI is the organization behind the OpenMOSS Team's open-weight speech, audio, and video foundation models, including MOSS-TTS for text-to-speech, MOSS-Audio for real-world audio understanding, and MOVA for synchronized video-audio generation. Its OpenMOSS presence publishes research code, model cards, and weights through GitHub and Hugging Face, and should be tracked separately from Kyutai's Moshi voice model family.

Featured models

Model                   Released    Context  Input price ($/1M)  Output price ($/1M)  License     Openness
MOSS-TTS-v1.5           2026-05-26  -        -                   -                    Apache 2.0  Open source
MOSS-Audio 4B Instruct  2026-04-13  -        -                   -                    Apache 2.0  Open source
MOSS-Audio 4B Thinking  2026-04-13  -        -                   -                    Apache 2.0  Open source

Model families

Recent releases

  1. MOSS-TTS-v1.5 - 2026-05-26
  2. MOSS-Audio 4B Instruct - 2026-04-13
  3. MOSS-Audio 4B Thinking - 2026-04-13
  4. MOSS-Audio 8B Instruct - 2026-04-13
  5. MOSS-Audio 8B Thinking - 2026-04-13

FAQ

What models has MOSI AI released?

MOSI AI ships 7 models across 3 families: MOSS-TTS, MOSS-Audio, and MOVA.

Is MOSI AI's technology open source?

All tracked models are released under Apache 2.0.

Where is MOSI AI headquartered?

MOSI AI is headquartered in Shanghai, China.

What is MOSI AI known for?

MOSI AI is known for OpenMOSS speech, audio, and video foundation-model research. Its most prominent tracked family is MOSS-TTS.

How can I access MOSI AI's models?

MOSI AI's models are available via Hugging Face Inference Endpoints.

Last reviewed: 2026-06-04. Data sourced from public lab announcements and provider documentation.