LLM Reference

Palmyra Vision

About

Palmyra Vision is Writer's multimodal LLM for interpreting images and generating text from them, aimed at enterprise applications. Its capabilities include extracting text from handwritten notes, classifying objects in images, and analyzing visual data such as charts and graphs. It scores 84.4% on the VQAv2 benchmark, surpassing models such as GPT-4V and Gemini 1.0 Ultra. Palmyra Vision is designed for integration within Writer's AI platform, enabling custom application creation with minimal engineering. It supports use cases in compliance, e-commerce, finance, and healthcare. Pricing is $0.015 per image or per second of video, and $22.50 per million words of text.
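The pricing above can be turned into a rough cost estimate. The sketch below is illustrative only: the function name `estimate_cost` and its parameters are hypothetical, and it assumes strictly linear pricing at the quoted rates with no tiers, minimums, or rounding rules, none of which are confirmed here.

```python
from decimal import Decimal

# Rates quoted in the description above (USD).
PRICE_PER_IMAGE = Decimal("0.015")
PRICE_PER_VIDEO_SECOND = Decimal("0.015")
PRICE_PER_MILLION_WORDS = Decimal("22.50")


def estimate_cost(images=0, video_seconds=0, words=0):
    """Hypothetical helper: linear cost estimate in USD.

    Assumes no volume tiers, minimum charges, or rounding rules.
    """
    # Convert via str() so int and float inputs both map to exact decimals.
    image_cost = PRICE_PER_IMAGE * Decimal(str(images))
    video_cost = PRICE_PER_VIDEO_SECOND * Decimal(str(video_seconds))
    text_cost = PRICE_PER_MILLION_WORDS * Decimal(str(words)) / Decimal(1_000_000)
    return image_cost + video_cost + text_cost


if __name__ == "__main__":
    # 100 images, one minute of video, and one million words of text.
    print(estimate_cost(images=100, video_seconds=60, words=1_000_000))
```

Under these assumptions, 100 images ($1.50), 60 seconds of video ($0.90), and one million words ($22.50) come to $24.90 in total. `Decimal` is used instead of floats so that currency amounts are not affected by binary rounding.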

Capabilities

Multimodal, Function Calling, Tool Use, JSON Mode

Specifications

Released: 2024-02-27
Architecture: Decoder Only
Specialization: general