LLM Reference

NeVA 22B

Deprecated

About

NeVA-22B is a vision-language model from NVIDIA that interprets and responds to instructions involving both text and images. It pairs a GPT-based language model with a CLIP image encoder and projects the encoded image into the language model's text embedding space, so the language model can process image and text together. Its training data includes image-caption pairs and synthetic data generated with GPT-4, and it targets tasks such as language generation and visual question answering. The model is optimized for NVIDIA hardware and uses Triton and TensorRT-LLM for inference. Its outputs can still contain biases and inaccuracies, so users should review them with care.
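
The projection step described above can be illustrated with a toy sketch. It is not NeVA's actual implementation: the function names, the single linear layer, the tiny dimensions, and the choice to place image tokens before the text tokens are all assumptions made for illustration. It only shows the general idea of mapping image-encoder features into the text embedding width so both can form one input sequence.

```python
import random


def linear_project(features, weight, bias):
    """Map each feature vector (length d_in) to length d_out via y = W x + b."""
    return [
        [sum(w * x for w, x in zip(row, vec)) + b for row, b in zip(weight, bias)]
        for vec in features
    ]


def build_multimodal_sequence(image_features, text_embeddings, weight, bias):
    """Project image features to the text embedding width, then join them with
    the text token embeddings (image tokens first, an assumption of this sketch)."""
    image_tokens = linear_project(image_features, weight, bias)
    return image_tokens + text_embeddings


# Toy, hypothetical sizes; these are NOT NeVA's real dimensions.
rng = random.Random(0)
D_IMAGE, D_TEXT = 4, 6
NUM_PATCHES, NUM_TEXT_TOKENS = 3, 5

# Stand-ins for image-encoder patch features and text token embeddings.
image_features = [[rng.gauss(0, 1) for _ in range(D_IMAGE)] for _ in range(NUM_PATCHES)]
text_embeddings = [[rng.gauss(0, 1) for _ in range(D_TEXT)] for _ in range(NUM_TEXT_TOKENS)]

# Projection parameters with shape (D_TEXT, D_IMAGE); random here, learned in practice.
weight = [[rng.gauss(0, 0.1) for _ in range(D_IMAGE)] for _ in range(D_TEXT)]
bias = [0.0] * D_TEXT

sequence = build_multimodal_sequence(image_features, text_embeddings, weight, bias)
print(len(sequence), len(sequence[0]))  # 8 6
```

After projection every element of the sequence has the same width, which is what lets a decoder-only language model consume image and text tokens in a single pass.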

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution

Providers (1)

Provider | Input (per 1M) | Output (per 1M) | Type
NVIDIA NIM | - | - | Provisioned

Specifications

Family: NeVA
Released: 2024-03-01
Parameters: 22B
Architecture: Decoder Only
Specialization: general
Training: finetuning

Created by

NVIDIA

Accelerated AI for enterprise solutions

Santa Clara, California, United States
Founded 1993