BGE M3
BGE M3 is worth evaluating for multilingual embedding and retrieval work when its provider route and context window match the workload.
Use it for
- Teams evaluating multilingual embedding and retrieval work
- Workloads that can use an 8k context window
- Buyers comparing 2 tracked provider routes
Do not use it for
- Vision or document-understanding workloads
- Strict JSON or tool-calling flows
Cheapest of 2 routes · Novita AI
About
BGE-M3 is BAAI's flagship multilingual embedding model that simultaneously performs dense retrieval, sparse (lexical) retrieval, and multi-vector (ColBERT-style) retrieval. It covers 100+ languages with an 8,192-token context window — far longer than most embedding models — making it effective for both short queries and long documents. Built on an extended XLM-RoBERTa architecture, it achieves state-of-the-art results on the MKQA and MLDR multilingual retrieval benchmarks and is available via NVIDIA NIM.
BGE M3 is an open-source model in the BGE family. The structured metadata tracks an 8k-token context window. This page tracks provider routes through Cloudflare Workers AI and Novita AI. No headline benchmark score is tracked for BGE M3 yet.
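The three retrieval modes described above differ mainly in how a query is scored against a document. The following is a minimal sketch of those three scoring styles in plain Python, using tiny hand-written vectors rather than real model output. The function names, the toy data, and the choice to average the multi-vector score over query tokens are all assumptions made for illustration; they are not taken from BGE M3's implementation or from any provider's API.

```python
import math


def dense_score(query_vec, doc_vec):
    """Dense retrieval: cosine similarity between two single-vector embeddings."""
    dot = sum(a * b for a, b in zip(query_vec, doc_vec))
    norm = math.sqrt(sum(a * a for a in query_vec)) * math.sqrt(sum(b * b for b in doc_vec))
    return dot / norm if norm else 0.0


def sparse_score(query_weights, doc_weights):
    """Sparse (lexical) retrieval: sum of weight products over shared tokens."""
    return sum(w * doc_weights[tok] for tok, w in query_weights.items() if tok in doc_weights)


def multi_vector_score(query_vecs, doc_vecs):
    """ColBERT-style late interaction: for each query token vector, keep the best
    similarity against any document token vector, then average over query tokens.
    (Averaging is an assumption of this sketch; summing is also common.)"""
    if not query_vecs or not doc_vecs:
        return 0.0
    best = [max(dense_score(q, d) for d in doc_vecs) for q in query_vecs]
    return sum(best) / len(best)


# Hypothetical toy inputs, not real embeddings.
print(dense_score([1.0, 0.0], [1.0, 0.0]))
print(sparse_score({"bge": 0.5, "m3": 0.5}, {"bge": 0.4, "model": 0.2}))
print(multi_vector_score([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0]]))
```

In practice a hybrid ranker might combine the three scores with a weighted sum; the weights would need tuning per workload.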
Top use-case fit
No primary decision-task fit is mapped for this model yet.
Provider price ladder
Compare API pricing across 2 providers for input and output tokens, batch, and cached reads when available.
| Provider | Input / 1M tokens | Output / 1M tokens | Route |
|---|---|---|---|
| Novita AI | $0.010 | - | Serverless, Partial |
| Cloudflare Workers AI | - | - | Serverless, Partial |
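To make the per-million-token price concrete, here is a rough cost estimate sketched in Python. It assumes the Novita AI input price of $0.010 per 1M tokens listed above and a hypothetical corpus of 1,000,000 documents averaging 500 tokens each; the function name and the corpus figures are illustrative, and actual billing may differ by provider.

```python
def embedding_cost_usd(total_tokens, price_per_million_usd=0.010):
    """Estimate input cost in USD from a token count and a per-1M-token price."""
    return total_tokens / 1_000_000 * price_per_million_usd


# Hypothetical corpus: 1,000,000 documents at roughly 500 tokens each.
corpus_tokens = 1_000_000 * 500
print(f"${embedding_cost_usd(corpus_tokens):.2f}")  # prints "$5.00"
```

Embedding models return vectors rather than generated text, which fits the table listing no output-token price.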
Capabilities
No model capability flags are currently sourced.
Benchmark peer bars for Coding
No task-mapped benchmark peers are available for this model yet.
Migration checks
No linked migration route is available for this model yet.