BGE M3
bge-m3
About
BGE-M3 is BAAI's flagship multilingual embedding model that simultaneously performs dense retrieval, sparse (lexical) retrieval, and multi-vector (ColBERT-style) retrieval. It covers 100+ languages with an 8,192-token context window — far longer than most embedding models — making it effective for both short queries and long documents. Built on an extended XLM-RoBERTa architecture, it achieves state-of-the-art results on the MKQA and MLDR multilingual retrieval benchmarks and is available via NVIDIA NIM.
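The three retrieval modes can be sketched with toy vectors. This is a minimal illustration of how dense, sparse, and ColBERT-style scores are computed and combined — the helper names, toy numbers, and fusion weights are assumptions for illustration, not the model's actual API (real usage goes through the FlagEmbedding library's `BGEM3FlagModel`).

```python
# Illustrative sketch of BGE-M3's three scoring modes on toy vectors.
# Function names and the fusion weights are hypothetical, chosen for clarity.

def dense_score(q, d):
    # Dense retrieval: dot product of single sentence-level embeddings.
    return sum(a * b for a, b in zip(q, d))

def sparse_score(q_weights, d_weights):
    # Sparse (lexical) retrieval: sum over terms shared by query and
    # document of the product of their learned term weights.
    return sum(w * d_weights[t] for t, w in q_weights.items() if t in d_weights)

def colbert_score(q_vecs, d_vecs):
    # Multi-vector (ColBERT-style) retrieval: each query token vector takes
    # its max similarity over all document token vectors; scores are averaged.
    sims = [max(dense_score(q, d) for d in d_vecs) for q in q_vecs]
    return sum(sims) / len(sims)

def hybrid_score(q, d, w=(0.4, 0.2, 0.4)):
    # Weighted fusion of the three signals (weights here are an assumption).
    return (w[0] * dense_score(q["dense"], d["dense"])
            + w[1] * sparse_score(q["sparse"], d["sparse"])
            + w[2] * colbert_score(q["colbert"], d["colbert"]))
```

In practice the dense signal handles semantic matching, the sparse signal catches exact keyword overlap, and the multi-vector signal adds fine-grained token-level interaction; fusing them typically beats any single mode.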
BGE M3 has an 8K-token context window.
Capabilities
Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution