LLM Reference
Cerebras LLaVA

About

Cerebras Systems' large language model family based on the LLaVA architecture comprises cutting-edge multimodal models capable of processing both text and images. Freely available on Hugging Face, the models come in 7B and 13B parameter versions. They are trained on Cerebras's unique Wafer-Scale Engine, ensuring efficient and robust training. Each model integrates a vision encoder, such as CLIP-VisionModel-Large, which provides visual instruction-following and chat capabilities in a multimodal context. The models leverage Vicuna checkpoints for pretraining and are extensively fine-tuned on varied datasets, optimizing their performance for research on large multimodal models and chatbot applications. Notably, the vision encoder checkpoints are provided separately, broadening their applicability in diverse projects.
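The integration described above, a CLIP-style vision encoder whose patch embeddings are projected into the language model's embedding space and prepended to the text tokens, can be sketched with a toy numpy example. All dimensions, the random "encoder", and the embedding table below are illustrative stand-ins, not the actual LLaVA or Cerebras code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions; real LLaVA pairs ~1024-dim CLIP features with a ~4096-dim LLM.
VIS_DIM, TXT_DIM, N_PATCHES, N_TOKENS = 8, 16, 4, 5

def vision_encoder(image):
    """Stand-in for the CLIP vision encoder: image -> per-patch embeddings."""
    # A fixed random projection plays the role of the frozen encoder.
    W = rng.normal(size=(image.size, VIS_DIM * N_PATCHES))
    return (image.reshape(-1) @ W).reshape(N_PATCHES, VIS_DIM)

# Learned projection from vision space into the LLM's embedding space.
projector = rng.normal(size=(VIS_DIM, TXT_DIM))

def embed_tokens(token_ids):
    """Stand-in for the LLM's token-embedding table."""
    table = rng.normal(size=(1000, TXT_DIM))
    return table[token_ids]

image = rng.normal(size=(3, 8, 8))       # dummy RGB image
token_ids = np.array([1, 42, 7, 99, 3])  # dummy prompt tokens

vis_emb = vision_encoder(image) @ projector  # (N_PATCHES, TXT_DIM)
txt_emb = embed_tokens(token_ids)            # (N_TOKENS, TXT_DIM)

# Image patch embeddings are prepended to the text sequence; the combined
# sequence is then processed by the language model as usual.
llm_input = np.concatenate([vis_emb, txt_emb], axis=0)
print(llm_input.shape)  # (9, 16)
```

The key design point this illustrates is that only the projector (and, during fine-tuning, the LLM) needs to learn anything new; the vision encoder can stay frozen and be shipped as a separate checkpoint, as noted above.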

Models (2)

Details

Researcher: Cerebras
Models: 2