Phi-3 Vision on Azure OpenAI

Name: Phi-3 Vision on Azure OpenAI
Brand: Microsoft Research
Price: 0.28 USD per 1M input tokens

Provisioned

Pricing

Type	Price (USD per 1M tokens)
Input tokens	$0.28
Output tokens	$0.84
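
As a rough illustration of how the per-million rates above translate into a per-request cost, here is a minimal Python sketch. The function name `estimate_cost` and the example token counts are hypothetical, and the listing does not say how images are counted toward input tokens, so treat the result as an estimate for a known token count only.

```python
# Rates copied from the pricing table above (USD per 1M tokens).
INPUT_PRICE_PER_M = 0.28
OUTPUT_PRICE_PER_M = 0.84


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one request.

    Hypothetical helper: it assumes billing is strictly linear in token
    counts, which the listing implies but does not state explicitly.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: 200,000 input tokens and 50,000 output tokens.
    print(f"${estimate_cost(200_000, 50_000):.4f}")
```

Because output tokens cost three times as much as input tokens, long generated responses dominate the bill more quickly than long prompts do.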

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, JSON Mode, Code Execution

About Phi-3 Vision

Phi-3 Vision is a multimodal model from Microsoft that combines language and vision capabilities. Unlike text-only language models, it processes both text and images and can perform tasks such as optical character recognition, chart analysis, and image interpretation. Its architecture consists of an image encoder, a text-image connector, a projector for mapping image features, and the Phi-3 Mini language model. At 4.2 billion parameters it is relatively small, yet it competes with larger models and suits devices with limited computational power. Its context window of up to 128K tokens supports complex multimodal reasoning. The model was trained on high-quality and synthetic data and incorporates safety measures.

Model Specs

Released: 2024-05-21
Parameters: 4.2B
Context: 128K
Architecture: Decoder Only
