LLM Reference

Fuyu-Heavy

About

Fuyu-Heavy is Adept's multimodal AI model, designed to serve as a digital agent. At release, Adept described it as the world's third most capable multimodal model, behind only GPT-4V and Gemini Ultra, even though it is 10 to 20 times smaller than those models. Fuyu-Heavy is strongest at multimodal reasoning, particularly understanding user interfaces (UIs), and it outperforms Gemini Pro on the MMMU benchmark. On text-based benchmarks it matches or surpasses models in its compute class. The model handles both text and image input, powers Adept's enterprise product, and supports complex calculations and long-form conversations.

Capabilities

Multimodal, Function Calling, Tool Use, JSON Mode

Specifications

Family: Fuyu
Released: 2024-01-24
Architecture: Decoder Only
Specialization: general