Qwen3.5 VL 7B
Multimodal
About
Compact vision-language model that understands both images and text, with a 128K-token context window.
Capabilities
Vision, Multimodal, Reasoning, Function Calling, Tool Use, JSON Mode, Code Execution