
NeVA 22B on NVIDIA NIM

NeVA · NVIDIA AI

Provisioned

Pricing

Type             Price (per 1M)
Input tokens     Free
Output tokens    Free

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, JSON Mode, Code Execution

About NeVA 22B

NeVA-22B is a vision-language model from NVIDIA that interprets and responds to instructions combining text and images. It pairs a GPT-based language model with a CLIP image encoder: image features are projected into the language model's text embedding space so the model can process them alongside text. Its training data includes image-caption pairs and synthetic data generated with GPT-4, and it targets tasks such as language generation and visual question answering. The model is optimized for NVIDIA hardware and uses Triton and TensorRT-LLM for inference. As with other generative models, its outputs can contain biases and inaccuracies and should be checked before use.
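The projection step described above can be illustrated with a toy sketch. This is not NeVA's actual implementation: the dimensions, the random weights, the single linear layer, the image-first ordering, and the helper names `project` and `build_input_sequence` are all assumptions chosen for illustration. It only shows the general idea of mapping image-encoder features into the text embedding space and handing the language model one combined sequence.

```python
import random


def project(features, weight, bias):
    """Apply a linear layer to each feature vector.

    weight has one row per output dimension; each row has one entry
    per input dimension.
    """
    return [
        [sum(f * w for f, w in zip(vec, row)) + b for row, b in zip(weight, bias)]
        for vec in features
    ]


def build_input_sequence(image_features, token_embeddings, weight, bias):
    """Project image features into the text space and prepend them to the text.

    Placing image vectors before text vectors is an assumption of this sketch.
    """
    projected = project(image_features, weight, bias)
    return projected + token_embeddings


random.seed(0)

# Toy sizes (hypothetical); real encoders use far larger dimensions.
IMAGE_DIM, TEXT_DIM = 4, 3
NUM_PATCHES, NUM_TOKENS = 2, 5

# Stand-ins for image-encoder patch features and text token embeddings.
image_features = [
    [random.uniform(-1, 1) for _ in range(IMAGE_DIM)] for _ in range(NUM_PATCHES)
]
token_embeddings = [
    [random.uniform(-1, 1) for _ in range(TEXT_DIM)] for _ in range(NUM_TOKENS)
]

# Projection parameters; random here, learned during training in a real model.
weight = [[random.uniform(-1, 1) for _ in range(IMAGE_DIM)] for _ in range(TEXT_DIM)]
bias = [0.0] * TEXT_DIM

sequence = build_input_sequence(image_features, token_embeddings, weight, bias)
print(len(sequence), len(sequence[0]))  # prints: 7 3
```

After projection, every vector in `sequence` has the text embedding width, which is what lets a decoder-only language model consume image and text positions uniformly.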


Model Specs

Released         2024-03-01
Parameters       22B
Architecture     Decoder Only