LLM Reference

Chameleon 7B

About

Chameleon 7B is a mixed-modal foundation model from Meta AI's FAIR team, notable for its early-fusion architecture, which represents images and text as a single sequence of discrete tokens. This unified representation allows seamless integration of multimodal inputs and improves the coherence and contextual relevance of outputs. Built on a decoder-only transformer and incorporating training-stability techniques such as query-key normalization, Chameleon 7B trains stably at scale. It performs well on tasks such as visual question answering, image captioning, and text generation, often outperforming similarly sized models. Although the model was originally capable of generating images, the publicly released version is restricted to text-only outputs for safety reasons, making it a useful tool for research in multimodal AI.
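The early-fusion idea can be illustrated with a minimal sketch: image patches are quantized into discrete tokens (by a vector-quantized image tokenizer in the real model) and spliced into the text token stream, so the transformer consumes one unified sequence. All token IDs, sentinel values, and tokenizer functions below are hypothetical stand-ins, not Chameleon's actual vocabulary.

```python
# Illustrative early-fusion sketch. All IDs and tokenizers here are
# hypothetical; the real model uses a BPE text tokenizer and a learned
# VQ image tokenizer sharing one vocabulary.

IMAGE_START, IMAGE_END = 9001, 9002  # hypothetical sentinel token IDs

def tokenize_text(text: str) -> list[int]:
    # Stand-in for a real BPE tokenizer: map each word to a fake ID.
    return [hash(word) % 8000 for word in text.split()]

def tokenize_image(image_patches: list[int]) -> list[int]:
    # Stand-in for a VQ image tokenizer: each patch becomes one discrete
    # token from a separate codebook range, bracketed by sentinels.
    return [IMAGE_START] + [8000 + p for p in image_patches] + [IMAGE_END]

def fuse(segments: list[tuple[str, object]]) -> list[int]:
    # Early fusion: text and image tokens are concatenated into a single
    # sequence, with no separate vision encoder branch.
    tokens: list[int] = []
    for kind, payload in segments:
        tokens += tokenize_text(payload) if kind == "text" else tokenize_image(payload)
    return tokens

sequence = fuse([
    ("text", "Describe this picture:"),
    ("image", [3, 14, 15]),  # pretend VQ codes for three patches
    ("text", "in one sentence."),
])
print(len(sequence))  # 3 text + 5 image + 3 text tokens
```

Because both modalities live in one token sequence, the same autoregressive decoder attends across text and image positions without any modality-specific fusion layers.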

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution

Specifications

Family: Chameleon
Released: 2024-06-18
Parameters: 7B
Context: 4K
Architecture: Decoder Only
Specialization: General
Training: Fine-tuning

Created by

Meta AI (FAIR): large-scale open-source AI for social technologies.

Menlo Park, California, United States
Founded 2013