LLM Reference

MiniCPM-V 4.6

minicpm-v-4.6

Open Source · Multimodal · Vision-Language

About

OpenBMB's compact 1.3B vision-language model released May 11, 2026, designed for on-device deployment on smartphones (iOS, Android, HarmonyOS) and edge devices. Pairs a SigLIP2-400M vision encoder with a Qwen3.5-0.8B language backbone using the LLaVA-UHD v4 approach. Supports single-image, multi-image, and video input (up to 128 frames), with text output. Context window: 262,144 tokens. Achieves 13 on the Artificial Analysis Intelligence Index — highest for any open-weights model under 2B parameters, with 19x lower token cost than Qwen3.5-0.8B. Available via vLLM, SGLang, llama.cpp, and Ollama. Apache 2.0 license.
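Because video input is capped at 128 frames, longer clips need to be downsampled to at most that many frames before encoding. A minimal sketch of uniform frame sampling; the helper name and sampling strategy are illustrative assumptions, not part of any official MiniCPM-V API:

```python
def sample_frame_indices(total_frames: int, max_frames: int = 128) -> list[int]:
    """Uniformly sample at most `max_frames` frame indices from a video.

    The 128-frame cap matches the model's documented video-input limit;
    the sampling scheme itself is an illustrative assumption.
    """
    if total_frames <= max_frames:
        # Short clip: keep every frame.
        return list(range(total_frames))
    # Evenly spaced indices across the full clip.
    step = total_frames / max_frames
    return [int(i * step) for i in range(max_frames)]
```

The resulting indices can then be used to extract frames with any decoder (e.g. OpenCV or decord) before passing them to the model.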

MiniCPM-V 4.6 has a 256K-token context window.
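Since the model is served by vLLM and SGLang, it can be queried through an OpenAI-compatible chat-completions request. A sketch of the request body for a single image plus a text prompt; the model name matches this page's slug, while the image URL is a placeholder:

```python
# OpenAI-style multimodal chat payload, as accepted by vLLM's
# OpenAI-compatible server. URLs here are placeholders, not real endpoints.
payload = {
    "model": "minicpm-v-4.6",
    "messages": [
        {
            "role": "user",
            "content": [
                # Image part: referenced by URL (base64 data URLs also work).
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/chart.png"}},
                # Text part: the actual question about the image.
                {"type": "text", "text": "Summarize this chart."},
            ],
        }
    ],
    "max_tokens": 256,
}
```

POST this JSON to the server's `/v1/chat/completions` route once the model is being served locally.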

Capabilities

Vision · Multimodal · Reasoning · Function Calling · Tool Use · Structured Outputs · Code Execution · Prompt Caching · Batch API · Audio · Fine-tuning

Specifications

Family: MiniCPM
Released: 2026-05-11
Parameters: 1.3B
Context: 262K
Architecture: transformer
License: Apache 2.0

Created by

OpenBMB

Efficient open-source language models for edge devices.

Beijing, China
Founded 2022