LLaVA 1.6 Hermes Yi 34B
About
LLaVA-1.6 Hermes Yi 34B is the 34B variant of LLaVA-1.6, the successor to LLaVA 1.5. It is an open-source multimodal chatbot that accepts both text and image inputs. Compared with LLaVA 1.5, it supports input images with four times as many pixels and improves visual reasoning, OCR, visual conversation, and world knowledge. Its language backbone is Nous-Hermes-2-Yi-34B, which brings more favorable commercial licensing terms and bilingual support. LLaVA-1.6-34B is reported to outperform other open-source models and to compete with Google's Gemini Pro on some tasks. Training is efficient, taking about one day on 32 A100 GPUs, and an online demo covers chat, image captioning, and visual question answering.
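The two numeric claims above (fourfold resolution, one day on 32 A100 GPUs) can be checked with simple arithmetic. The sketch below is illustrative only: the 336 px base side for LLaVA 1.5 and the three maximum LLaVA-1.6 input sizes are assumptions taken from the project's public release notes, not from this page, and the helper names are hypothetical.

```python
# Assumed base input side for LLaVA 1.5 (336 x 336 pixels); not stated on this page.
BASE_SIDE = 336

# Assumed maximum input sizes for LLaVA-1.6 as (width, height); not stated on this page.
LLAVA_16_RESOLUTIONS = [(672, 672), (336, 1344), (1344, 336)]


def pixel_ratio(width, height, base_side=BASE_SIDE):
    """Return how many times more pixels an image has than the square base input."""
    return (width * height) / (base_side * base_side)


def gpu_hours(num_gpus, days):
    """Convert a training run of `days` on `num_gpus` GPUs into GPU-hours."""
    return num_gpus * days * 24


# Pixel ratio of each assumed LLaVA-1.6 resolution relative to the 336 x 336 base.
ratios = {f"{w}x{h}": pixel_ratio(w, h) for w, h in LLAVA_16_RESOLUTIONS}

# Training cost implied by "one day on 32 A100 GPUs".
training_gpu_hours = gpu_hours(32, 1)

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{name}: {ratio:.1f}x the pixels of {BASE_SIDE}x{BASE_SIDE}")
    print(f"Training budget: {training_gpu_hours} A100 GPU-hours")
```

Under these assumptions, "fourfold" refers to pixel count rather than side length: each supported shape holds the same number of pixels as four 336 x 336 tiles.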
Providers (2)
| Provider | Input (per 1M tokens) | Output (per 1M tokens) | Type |
|---|---|---|---|
| NVIDIA NIM | — | — | Provisioned |
| Fireworks AI Platform | — | — | Provisioned |