Capabilities
- Vision
- Multimodal
- Reasoning
- Function Calling
- Tool Use
- JSON Mode
- Code Execution
About LLaVA 13B
The original LLaVA (Large Language-and-Vision Assistant) 13B model. LLaVA is a multimodal vision-language model that connects a CLIP ViT-L/14 vision encoder to the Vicuna-13B language model through a learned linear projection, enabling visual understanding and visual instruction-following tasks.
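The bridging idea above can be sketched in a few lines: per-patch vision features are mapped into the language model's embedding space by a single linear projection, then prepended to the text token embeddings before decoding. The sketch below uses NumPy with mocked random features; the dimensions (1024-dim CLIP features, 5120-dim Vicuna embeddings, 256 patches) reflect the published LLaVA v1 setup, but the tensors themselves are placeholders, not real model weights.

```python
import numpy as np

# Illustrative dimensions: CLIP ViT-L/14 patch features are 1024-dim,
# Vicuna-13B token embeddings are 5120-dim, and a 224x224 input yields
# 16x16 = 256 image patches.
VISION_DIM = 1024
LM_DIM = 5120
NUM_PATCHES = 256

rng = np.random.default_rng(0)

# Vision encoder output: one feature vector per image patch (mocked).
patch_features = rng.standard_normal((NUM_PATCHES, VISION_DIM))

# LLaVA v1 bridges the two models with one learned linear projection W.
W = rng.standard_normal((VISION_DIM, LM_DIM)) * 0.01
visual_tokens = patch_features @ W  # shape: (256, 5120)

# Embeddings of the text prompt from the LM's embedding table (mocked).
text_tokens = rng.standard_normal((16, LM_DIM))

# Projected visual tokens are prepended to the text embeddings; the
# combined sequence is fed to the decoder-only language model.
sequence = np.concatenate([visual_tokens, text_tokens], axis=0)
print(sequence.shape)  # (272, 5120)
```

In LLaVA 1.5 the single linear layer was replaced with a two-layer MLP projector, but the overall prepend-and-decode flow is the same.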
Model Specs
Released: 2023-04-17
Parameters: 13B
Context: 4K tokens
Architecture: Decoder-only