LLM Reference
4 models · 2023 · Up to 4K ctx

About

LLaVA (Large Language and Vision Assistant) is a family of open-source large multimodal models (LMMs) developed by researchers from the University of Wisconsin-Madison, Microsoft Research, and Columbia University. The models connect a vision encoder, such as CLIP ViT-L/14, to large language models like Vicuna, Mistral, and Nous-Hermes to support combined visual and language understanding. A key contribution is end-to-end training on GPT-4-generated multimodal instruction-following data. Later releases include LLaVA-1.5, which introduced an MLP vision-language connector and academic task-oriented training data, and LLaVA-NeXT (1.6), which increased input image resolution and broadened LLM support. The models emphasize data efficiency and are widely used for research.
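The architecture above can be sketched numerically: CLIP ViT-L/14 emits one feature vector per image patch, and a small projector maps those features into the LLM's embedding space so they can be prepended to the text tokens. The sketch below uses the two-layer MLP connector introduced in LLaVA-1.5; the random weights are purely illustrative, while the dimensions (1024-wide CLIP features, a 24x24 patch grid at 336px, a 5120-wide Vicuna-13B hidden size) match the commonly published values but should be treated as assumptions here.

```python
import numpy as np

rng = np.random.default_rng(0)

VISION_DIM = 1024   # CLIP ViT-L/14 patch-feature width (assumed)
LLM_DIM = 5120      # Vicuna-13B hidden size (assumed)
NUM_PATCHES = 576   # 24x24 patch grid at 336px input (assumed, LLaVA-1.5)

# Two-layer MLP with GELU: the LLaVA-1.5 vision-language connector design.
# Random init stands in for trained weights in this sketch.
W1 = rng.standard_normal((VISION_DIM, LLM_DIM)) * 0.02
b1 = np.zeros(LLM_DIM)
W2 = rng.standard_normal((LLM_DIM, LLM_DIM)) * 0.02
b2 = np.zeros(LLM_DIM)

def gelu(x):
    # tanh approximation of GELU
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))

def project(patch_features):
    """Map [num_patches, VISION_DIM] CLIP features to LLM-space tokens."""
    return gelu(patch_features @ W1 + b1) @ W2 + b2

# Fake image features standing in for the CLIP encoder's output.
image_features = rng.standard_normal((NUM_PATCHES, VISION_DIM))
visual_tokens = project(image_features)
print(visual_tokens.shape)  # one LLM-space embedding per image patch
```

The resulting `visual_tokens` matrix is what gets concatenated with the embedded text prompt before being fed to the language model; the original LLaVA used a single linear layer here, and the MLP was the 1.5-series refinement.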

Specifications (4 models)

LLaVA model specifications comparison

Model               Released  Context  Parameters  Vision  Multimodal
LLaVA Vicuna 13B    2023-04   —        13B         No      No
LLaVA Llama 2 13B   2023-04   —        13B         No      No
LLaVA Llama 2 7B    2023-04   —        7B          No      No
LLaVA 13B           2023-04   4K       13B         Yes     Yes

Available From (1 provider)

Frequently Asked Questions

What is LLaVA?
LLaVA (Large Language and Vision Assistant) is a family of open-source large multimodal models from the University of Wisconsin-Madison, Microsoft Research, and Columbia University. It pairs a vision encoder such as CLIP ViT-L/14 with a large language model like Vicuna, trained end-to-end on GPT-4-generated multimodal instruction-following data for combined visual and language understanding.
How many models are in the LLaVA family?
The LLaVA family contains 4 models.
What is the latest LLaVA model?
The most recent model is LLaVA Vicuna 13B, released in April 2023.

Models (4)