LLM Reference

Command A Vision (07-2025)

ProprietaryMultimodal

About

Cohere's first multimodal model capable of processing both text and images. Command A Vision excels in enterprise use cases such as analyzing charts, graphs, diagrams, table understanding, OCR, document Q&A, and object detection. Supports English, Portuguese, Italian, French, German, and Spanish.

Capabilities

VisionMultimodalReasoningFunction CallingTool UseJSON ModeCode Execution

Rankings

Specifications

FamilyCommand
Released2025-07-01
Context128k
Architecturetransformer
Specializationvisionchatagentsdocument-understanding
LicenseProprietary