Qwen3.5-VL-7B
Multimodal
About
Compact vision-language model for image and text understanding, with a 128K-token context window.
Capabilities
Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution