LLM Reference

LLaVA 13B

Open Source · Multimodal

About

Original LLaVA (Large Language-and-Vision Assistant) 13B model. A multimodal vision-language model that combines a CLIP vision encoder with a Vicuna language model, connected by a learned projection, for visual understanding tasks such as image captioning and visual question answering.
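The core mechanism can be sketched in a few lines: patch features from the frozen vision encoder are mapped into the language model's embedding space by a single learned linear projection, then prepended to the text token embeddings. A minimal toy sketch (dimensions here are illustrative, not the real 1024-to-5120 projection of the 13B model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumption, for illustration only).
n_patches, d_vision, d_model = 4, 8, 16
n_text_tokens = 5

patch_feats = rng.standard_normal((n_patches, d_vision))   # from the vision encoder
W_proj = rng.standard_normal((d_vision, d_model))          # learned linear projection
text_embeds = rng.standard_normal((n_text_tokens, d_model))  # embedded prompt tokens

# Project visual features into the LM embedding space, then prepend
# them to the text embeddings so the LM attends over both modalities.
visual_tokens = patch_feats @ W_proj                  # (n_patches, d_model)
lm_input = np.concatenate([visual_tokens, text_embeds], axis=0)
print(lm_input.shape)  # (9, 16): visual tokens followed by text tokens
```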

Capabilities

Vision · Multimodal · Reasoning · Function Calling · Tool Use · JSON Mode · Code Execution

Providers (1)

Provider        Input (per 1M)    Output (per 1M)    Type
Replicate API   —                 —                  Serverless
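Since the listed provider is Replicate's serverless API, a request can be sketched against Replicate's documented HTTP predictions endpoint. The endpoint and JSON envelope follow Replicate's REST API; the version hash and the input field names ("prompt", "image") are placeholders, not verified for this specific model:

```python
import json

# Replicate's predictions endpoint (documented REST API).
API_URL = "https://api.replicate.com/v1/predictions"

def build_prediction_request(version: str, prompt: str, image_url: str, token: str):
    """Return (url, headers, body) for a Replicate prediction request."""
    headers = {
        "Authorization": f"Bearer {token}",   # API token auth
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "version": version,                   # model version hash (placeholder)
        "input": {                            # field names are assumptions
            "prompt": prompt,
            "image": image_url,
        },
    })
    return API_URL, headers, body

url, headers, body = build_prediction_request(
    "VERSION_HASH", "What is in this image?",
    "https://example.com/photo.jpg", "r8_xxx")
print(json.loads(body)["input"]["prompt"])  # What is in this image?
```

The actual POST (e.g. via `urllib.request` or `requests`) is omitted; Replicate predictions are asynchronous, so a real client would also poll the returned prediction URL until the status is `succeeded`.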

Specifications

Family: LLaVA
Released: 2023-04-17
Parameters: 13B
Context: 4K
Architecture: Decoder-only
Specialization: General
Training: Fine-tuning
