LLM Reference

DeepSeek OCR Models by DeepSeek

2 models2025–2026Up to 8K ctxFrom $0.03/1M input

About

DeepSeek OCR is DeepSeek's family of specialized vision-language models for optical character recognition and document parsing. DeepSeek-OCR introduces Contexts Optical Compression (arXiv 2510.18234) while DeepSeek-OCR-2 introduces Visual Causal Flow for improved document understanding (arXiv 2601.20552).

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

2 in view

Use when the workload needs vision, 8K context, and multimodal inputs.

2026-01vision8K contextmultimodal inputs

Use when the workload needs vision, 8K context, and multimodal inputs.

2025-10vision8K contextmultimodal inputs

Release Timeline

2 release groups
2026-01
1 current
DeepSeek OCR 2
vision8K contextmultimodal inputs
Current
2025-10
1 current
DeepSeek OCR
vision8K contextmultimodal inputs
Current

Specifications(2 models)

DeepSeek OCR model specifications comparison
ModelReleasedContextVisionMultimodal
DeepSeek OCR 22026-018KYesYes
DeepSeek OCR2025-108KYesYes

Available From(1 provider)

Pricing

DeepSeek OCR model pricing by provider
ModelProviderInput / 1MOutput / 1MType
DeepSeek OCRNovita AI$0.03$0.03Serverless
DeepSeek OCR 2Novita AI$0.03$0.03Serverless

Frequently Asked Questions

What is DeepSeek OCR used for?
DeepSeek OCR is used for vision and vision and multimodal work. The family description and listed model capabilities point to those workloads as the best fit.
How does DeepSeek OCR compare to Janus?
DeepSeek OCR by DeepSeek is strongest where you need vision, while Janus by DeepSeek is the closest related family to check for image generation. DeepSeek OCR has 2 listed variants and reaches up to 8K context, so compare the specs and pricing tables before choosing a production model.
Which DeepSeek OCR model should I use?
For the lowest listed input price, start with DeepSeek OCR through Novita AI at $0.03/1M input tokens. For the most capable/latest local choice, evaluate DeepSeek OCR 2 with 8K context and multimodal inputs.

Models(2)