Command A Vision (07-2025)
Proprietary, Multimodal
About
Command A Vision is Cohere's first multimodal model, capable of processing both text and images. It excels in enterprise use cases such as chart, graph, and diagram analysis; table understanding; OCR; document Q&A; and object detection. It supports English, Portuguese, Italian, French, German, and Spanish.
Capabilities
Vision, Multimodal, Reasoning, Function Calling, Tool Use, JSON Mode, Code Execution
Specifications
Family: Command
Released: 2025-07-01
Context: 128k
Architecture: transformer
Specialization: vision, chat, agents, document-understanding
License: Proprietary