LLM Reference

Cerebras LLaVA 13B

About

Cerebras LLaVA 13B is a multimodal large language model from Cerebras Systems that pairs a vision encoder with a language model. It combines a CLIP-VisionModel-Large vision encoder with a language model initialized from Vicuna-13B checkpoints and further instruction-tuned on diverse datasets; a projector module maps visual features into the language model's embedding space so the two modalities can be processed as one sequence. Aimed at research on multimodal systems, the model supports tasks such as visual question answering over combined image and text inputs. Researchers should exercise caution, as the training data may contain offensive content. The model can be run through the LLaVA source code.
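The fusion step described above can be sketched in a few lines of NumPy. This is an illustration only: the feature widths (1024-dim CLIP ViT-Large patch features, 5120-dim Vicuna-13B embeddings), the 576-patch count, and the two-layer MLP projector are assumptions based on common LLaVA configurations, not confirmed details of this checkpoint.

```python
import numpy as np

VISION_DIM = 1024   # assumed CLIP ViT-Large patch-feature width
LM_DIM = 5120       # assumed Vicuna-13B hidden size
NUM_PATCHES = 576   # assumed 24x24 patches for a 336px image, patch size 14

rng = np.random.default_rng(0)

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

# Projector: a small MLP mapping vision-encoder features into the
# language model's embedding space (random weights for illustration).
W1 = rng.standard_normal((VISION_DIM, LM_DIM)) * 0.02
W2 = rng.standard_normal((LM_DIM, LM_DIM)) * 0.02

def project(image_features):
    return gelu(image_features @ W1) @ W2

# Fake vision-encoder output for one image, plus fake text embeddings.
image_features = rng.standard_normal((NUM_PATCHES, VISION_DIM))
text_embeddings = rng.standard_normal((12, LM_DIM))  # 12 prompt tokens

# Projected image tokens are concatenated with the text tokens, forming
# the single sequence the decoder-only language model generates from.
image_tokens = project(image_features)
lm_input = np.concatenate([image_tokens, text_embeddings], axis=0)
print(lm_input.shape)  # (588, 5120): 576 image tokens + 12 text tokens
```

In the real model the projector weights are learned during instruction tuning, but the data flow is the same: image patches become ordinary token embeddings from the language model's point of view.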

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution

Specifications

Released: 2024-08-01
Parameters: 13B
Architecture: Decoder Only
Specialization: general
Training: finetuning

Created by

Cerebras Systems, maker of the world's largest AI chips
Sunnyvale, California, United States
Founded 2016