LLM Reference

Qwen3 Embedding Models by Alibaba

Alibaba | Apache 2.0
2 models | 2025 | Up to 33K context | From $0.07/1M input

About

Qwen3 Embedding is Alibaba's multilingual text embedding model series from the Qwen3 generation, supporting 119 languages. It is available in 0.6B, 4B, and 8B sizes; this page lists the 0.6B and 8B variants. The models are open-sourced under Apache 2.0 and report state-of-the-art results on MTEB multilingual benchmarks.
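As a rough illustration of what an embedding model's output is used for, the sketch below ranks two documents against a query by cosine similarity. The vectors are made-up three-dimensional stand-ins, not real Qwen3 Embedding output, and no model or provider API is called; the function and variable names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for embeddings (assumption: real embeddings
# have far more dimensions and come from the model, not from hand).
query = [1.0, 0.0, 1.0]
documents = {
    "doc_a": [2.0, 0.0, 2.0],  # same direction as the query
    "doc_b": [0.0, 1.0, 0.0],  # orthogonal to the query
}

# Rank documents from most to least similar to the query.
ranked = sorted(
    documents,
    key=lambda name: cosine_similarity(query, documents[name]),
    reverse=True,
)
print(ranked)
```

In a real pipeline the vectors would come from one of the listed models, and the same ranking step would apply unchanged.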

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.


Qwen3 Embedding 0.6B: use when the workload needs embedding with 33K context at 600M parameters.

2025-06 | embedding | 33K context | 600M parameters

Qwen3 Embedding 8B: use when the workload needs embedding with 33K context at 8B parameters.

2025-06 | embedding | 33K context | 8B parameters

Release Timeline

1 release group

2025-06 (2 current)
Qwen3 Embedding 0.6B | embedding | 33K context | 600M parameters | Current
Qwen3 Embedding 8B | embedding | 33K context | 8B parameters | Current

Specifications (2 models)

Qwen3 Embedding model specifications comparison
Model                  Released   Context   Parameters
Qwen3 Embedding 0.6B   2025-06    33K       0.6B
Qwen3 Embedding 8B     2025-06    33K       8B

Available From (1 provider)

Pricing

Qwen3 Embedding model pricing by provider
Model                  Provider    Input / 1M   Output / 1M   Type
Qwen3 Embedding 0.6B   Novita AI   $0.07        -             Serverless
Qwen3 Embedding 8B     Novita AI   $0.07        -             Serverless
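
To make the per-token pricing concrete, here is a minimal cost estimate using the listed $0.07 per 1M input tokens. The workload figures (document count and average length) and the function name are hypothetical, and the sketch ignores any provider-side minimums or rounding, which this page does not describe.

```python
# Listed Novita AI serverless input price for both variants,
# in USD per 1M input tokens.
PRICE_PER_MILLION_INPUT_USD = 0.07


def embedding_cost_usd(total_tokens, price_per_million=PRICE_PER_MILLION_INPUT_USD):
    """Estimate the input cost in USD for embedding `total_tokens` tokens."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return total_tokens / 1_000_000 * price_per_million


# Hypothetical workload: 200,000 documents averaging 500 tokens each.
document_count = 200_000
average_tokens_per_document = 500
total_tokens = document_count * average_tokens_per_document

print(f"{total_tokens:,} tokens -> ${embedding_cost_usd(total_tokens):.2f}")
```

Because both listed variants share the same input price, the estimate is the same for either model under these assumptions.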

Frequently Asked Questions

What is Qwen3 Embedding used for?
Qwen3 Embedding is used to generate text embeddings, that is, vector representations of text. The family description and listed model capabilities point to embedding workloads as the best fit.
How does Qwen3 Embedding compare to Tongyi DeepResearch?
Qwen3 Embedding by Alibaba is the stronger fit where you need embedding; Tongyi DeepResearch by Alibaba is the closest related family to check when selecting among adjacent models. Qwen3 Embedding has 2 listed variants and reaches up to 33K context, while Tongyi DeepResearch reaches up to 131K context. Compare the specs and pricing tables before choosing a production model.
Which Qwen3 Embedding model should I use?
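Both listed variants are priced at $0.07/1M input tokens through Novita AI, so the listed price does not separate them. Both were released in 2025-06 with 33K context, so the choice comes down to size: Qwen3 Embedding 0.6B is the lighter option to run locally, and Qwen3 Embedding 8B is the larger model.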

Models (2)