LLM Reference

Higgs Audio Models by Boson AI

Boson AI · Noncommercial · Open weights
1 model · 2026 · Up to 8k ctx

Details

Researcher: Boson AI
Commercial use: Non-commercial only
Models: 1
Released: 2026
Max context: 8k

About

Higgs Audio is Boson AI's family of text-audio foundation models, spanning text-to-speech (v3 TTS) and speech-to-text (v3 STT) variants. The models are designed for voice agents, with zero-shot voice cloning, support for 100+ languages, and inline control over emotion, style, prosody, and sound effects.

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

Use when the workload needs text to speech, 8k context, and 4B parameters.

2026-06 · text to speech · 8k context · 4B parameters
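
As a rough illustration of how a use-when sentence like the one above could be derived from such fields, here is a minimal Python sketch. The `ModelSeed` record, its field names, and the `use_when` and `join_phrases` helpers are all hypothetical; the actual derivation logic is not described on this page. The sketch assumes that the replacement field only decides whether a variant still gets a recommendation, and that the release field is stored but not used in the sentence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelSeed:
    # Hypothetical record; field names are illustrative, not a real schema.
    name: str
    capabilities: list[str]
    context: str
    parameters: str
    released: str                      # e.g. "2026-06"
    replaced_by: Optional[str] = None  # None means the variant is still current


def join_phrases(parts: list[str]) -> str:
    """Join phrases with commas and a final 'and' (Oxford comma)."""
    if len(parts) <= 1:
        return "".join(parts)
    if len(parts) == 2:
        return f"{parts[0]} and {parts[1]}"
    return ", ".join(parts[:-1]) + f", and {parts[-1]}"


def use_when(seed: ModelSeed) -> Optional[str]:
    """Return a use-when sentence for a current variant, else None."""
    if seed.replaced_by is not None:
        return None  # assumption: replaced variants get no recommendation
    needs = seed.capabilities + [
        f"{seed.context} context",
        f"{seed.parameters} parameters",
    ]
    return f"Use when the workload needs {join_phrases(needs)}."


# Values taken from the listing for the one variant on this page.
higgs_tts = ModelSeed(
    name="Higgs Audio v3 TTS",
    capabilities=["text to speech"],
    context="8k",
    parameters="4B",
    released="2026-06",
)
print(use_when(higgs_tts))
```

With the values listed for Higgs Audio v3 TTS, the sketch reproduces the guidance sentence shown above.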

Release Timeline

1 release group
2026-06 (1 current)
Higgs Audio v3 TTS (Current)
text to speech · 8k context · 4B parameters

Specifications (1 model)

Higgs Audio model specifications comparison
Model               Released  Context  Parameters
Higgs Audio v3 TTS  2026-06   8k       4B

Available From (1 provider)

Frequently Asked Questions

What is Higgs Audio used for?
Higgs Audio is used for text-to-speech and voice-agent workflows. The family description and the listed model capabilities both point to those workloads as the best fit.
How does Higgs Audio compare to Claude 3?
Higgs Audio by Boson AI is strongest where you need text-to-speech, while Claude 3 by Anthropic is the closest related family to check for vision and multimodal work. Higgs Audio has one listed variant with up to 8k context; Claude 3 reaches up to 200k context. Compare the specs and pricing tables before choosing a production model.
Which Higgs Audio model should I use?
If price is the main constraint, check the pricing table first, and note that the local data does not include complete provider pricing for Higgs Audio. For the most capable and most recent listed option, evaluate Higgs Audio v3 TTS (8k context).