LLM Reference

Cerebras LLaVA Models by Cerebras

2 models · 2024 · Up to 4k ctx

About

Cerebras Systems' LLaVA family is a set of multimodal models, based on the LLaVA architecture, that process both text and images. The models are freely available on Hugging Face in 7B and 13B parameter versions [4][8]. They are trained on Cerebras's Wafer-Scale Engine. A vision encoder, such as CLIP-VisionModel-Large, is integrated to give the models visual instruction following and chat capabilities in a multimodal context. The models use Vicuna checkpoints as the starting point for pretraining and are fine-tuned on varied datasets, targeting research on large multimodal models and chatbot applications. The vision encoder checkpoints are provided separately, so they can be reused in other projects [4].

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.


Cerebras LLaVA 13B
Use when the workload needs 4k context and 13B parameters.
2024-08 · 4k context · 13B parameters

Cerebras LLaVA 7B
Use when the workload needs 4k context and 7B parameters.
2024-08 · 4k context · 7B parameters

Release Timeline

1 release group
2024-08 (2 current)
Cerebras LLaVA 13B: 4k context, 13B parameters (Current)
Cerebras LLaVA 7B: 4k context, 7B parameters (Current)

Specifications (2 models)

Cerebras LLaVA model specifications comparison
Model               Released  Context  Parameters
Cerebras LLaVA 13B  2024-08   4k       13B
Cerebras LLaVA 7B   2024-08   4k       7B
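
The "use when" guidance and the specifications above can be turned into a small selection helper. The following is a minimal sketch, not part of any Cerebras tooling: the `Variant` class and `pick_variant` function are hypothetical names introduced for illustration, and "4k" is assumed to mean 4096 tokens.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variant:
    name: str
    released: str         # year-month, as listed on this page
    context_tokens: int   # "4k" interpreted as 4096 tokens (assumption)
    params_billions: int


# Values copied from the specifications listed on this page.
VARIANTS = [
    Variant("Cerebras LLaVA 13B", "2024-08", 4096, 13),
    Variant("Cerebras LLaVA 7B", "2024-08", 4096, 7),
]


def pick_variant(
    required_context: int,
    max_params_billions: Optional[int] = None,
) -> Optional[Variant]:
    """Return the largest listed variant that satisfies the constraints, or None."""
    candidates = [
        v
        for v in VARIANTS
        if v.context_tokens >= required_context
        and (max_params_billions is None or v.params_billions <= max_params_billions)
    ]
    # Prefer the variant with the most parameters among those that fit.
    return max(candidates, key=lambda v: v.params_billions, default=None)
```

For example, `pick_variant(2048)` selects the 13B variant, `pick_variant(4096, max_params_billions=8)` falls back to the 7B variant, and a request for more than 4096 tokens of context returns `None` because neither listed model covers it.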

Frequently Asked Questions

What is Cerebras LLaVA used for?
Cerebras LLaVA is used for visual instruction following and multimodal chat, where prompts combine text and images, and for research on large multimodal models. The family description points to those workloads as the best fit.
How does Cerebras LLaVA compare to Cerebras GPT?
Cerebras LLaVA by Cerebras is the better fit where prompts include images as well as text, while Cerebras GPT by Cerebras is the closest related family to check for text-only workloads. Cerebras LLaVA has 2 listed variants and reaches up to 4k context, while Cerebras GPT reaches up to 2k context, so compare the specs and pricing tables before choosing a production model.
Which Cerebras LLaVA model should I use?
Cerebras LLaVA does not have complete provider pricing in the local data, so if price is the main constraint, confirm current provider pricing before committing to a model. For the most capable local choice, evaluate Cerebras LLaVA 13B with 4k context; both variants were released in 2024-08, so neither is newer than the other.

Models (2)