LLM Reference

NeVA

NVIDIA AI · CC-BY-NC-4.0
3 models · 2024

About

NeVA is a family of multimodal vision-language models developed by NVIDIA within the NeMo Multimodal ecosystem. Each model pairs a large language model, such as NVGPT or LLaMA, with a vision encoder so that it can interpret both text and images and generate human-like responses to them. Training proceeds in two stages: feature alignment pre-training, followed by end-to-end fine-tuning. The models are aimed at tasks that require detailed understanding of visual content and precise instruction following, and they show capabilities akin to advanced multimodal models such as GPT-4, even on novel inputs. A variant, Video NeVA, extends this to video by converting each video into a sequence of image frames. NeVA is available in 8B, 22B, and 43B parameter versions and uses NeMo framework features such as model parallelism and activation checkpointing for efficient training. Some NeVA models are restricted to non-commercial use [1][2].
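
The description above says Video NeVA turns a video into a sequence of image frames, but it does not say how frames are chosen. The following is a minimal sketch of one common approach, evenly spaced sampling, and should not be read as NVIDIA's implementation. The function name `sample_frame_indices` and its parameters are hypothetical and used only for illustration.

```python
from typing import List


def sample_frame_indices(total_frames: int, num_samples: int) -> List[int]:
    """Pick frame indices spread evenly across a video.

    Assumption: uniform sampling. The source text only states that videos
    are converted into sequences of image frames, not how they are picked.
    """
    if total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= total_frames:
        # Short video: keep every frame.
        return list(range(total_frames))
    # Split the video into num_samples equal segments and take the
    # frame at the midpoint of each segment.
    step = total_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]


if __name__ == "__main__":
    # A 100-frame clip reduced to 4 representative frames.
    print(sample_frame_indices(100, 4))
```

Each selected frame would then be passed through the vision encoder as an ordinary image, so the language model receives the video as an ordered list of image inputs.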

Specifications (3 models)

NeVA model specifications comparison
Model       Released    Parameters
NeVA 43B    2024-03     43B
NeVA 8B     2024-03     8B

Available From (1 provider)

Frequently Asked Questions

What is NeVA?
NeVA is a family of multimodal vision-language models developed by NVIDIA within the NeMo Multimodal ecosystem. Each model pairs a large language model, such as NVGPT or LLaMA, with a vision encoder, and is trained in two stages: feature alignment pre-training followed by end-to-end fine-tuning. The family comes in 8B, 22B, and 43B parameter versions and includes a Video NeVA variant that processes video as a sequence of image frames. Some models are restricted to non-commercial use.
How many models are in the NeVA family?
The NeVA family contains 3 models.
What is the latest NeVA model?
The latest model is NeVA 43B, released in March 2024.

Models (3)