LLM Reference

UI-TARS 1.5 7B

ui-tars-1.5-7b

Open Source, Multimodal

About

UI-TARS-1.5 is ByteDance's multimodal vision-language agent model optimized for GUI-based environments including desktop interfaces, web browsers, and mobile apps. It supports grounding, planning, and action execution for computer-use tasks.

UI-TARS 1.5 7B has a 128K-token context window.
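As a rough illustration of what the context limit means in practice, the sketch below checks whether a prompt plus a reserved output budget fits in the window. It assumes "128K" means 128,000 tokens (it may instead mean 131,072), and the helper names `fits_in_context` and `remaining_tokens` are hypothetical, not part of any UI-TARS API. Token counts must come from the tokenizer you actually use.

```python
# Assumption: "128K" is read as 128,000 tokens; some model cards mean 131,072.
CONTEXT_WINDOW_TOKENS = 128_000


def remaining_tokens(prompt_tokens: int, reserved_output_tokens: int,
                     context_window: int = CONTEXT_WINDOW_TOKENS) -> int:
    """Tokens left after the prompt and the reserved output budget (may be negative)."""
    if prompt_tokens < 0 or reserved_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return context_window - prompt_tokens - reserved_output_tokens


def fits_in_context(prompt_tokens: int, reserved_output_tokens: int,
                    context_window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """True if the prompt plus the reserved output budget fits in the window."""
    return remaining_tokens(prompt_tokens, reserved_output_tokens, context_window) >= 0
```

For a long GUI session, a caller could use `remaining_tokens` to decide when to drop older screenshots or steps from the history before the next request.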

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution

Specifications

Family: UI-TARS
Released: 2026-02-01
Parameters: 7B
Context: 128K
Architecture: Decoder Only
Specialization: Agents
License: Apache 2.0
Training: Fine-tuned

Created by

ByteDance

TikTok data enhances AI realism

Beijing, China
Founded 2012