DeepSeek VL

About

DeepSeek-VL is an open-source family of vision-language models designed for real-world applications. It comes in 1.3B and 7B parameter sizes, each with a "base" and a "chat" variant. Its hybrid vision encoder processes high-resolution 1024 x 1024 images while keeping computational cost relatively low. The training strategy is designed to preserve strong language abilities: vision-language data is integrated in a way that avoids degrading language performance. The pretraining data is drawn from Common Crawl, web code, e-books, and educational content, and the models achieve competitive or state-of-the-art results across a range of benchmarks. The stated aim is to narrow the performance gap between open-source and closed-source models while improving user experience and real-world applicability. The models are available on platforms such as Hugging Face.
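
As a small illustration of how the two sizes and two variants described above combine into four models, the sketch below enumerates one identifier per combination. The organization name `deepseek-ai` and the `deepseek-vl-<size>-<variant>` naming pattern are assumptions made for illustration and should be checked against the actual Hugging Face listing; `variant_ids` is a hypothetical helper, not part of any library.

```python
from itertools import product

# Sizes and variants as named in the description above.
SIZES = ("1.3b", "7b")
VARIANTS = ("base", "chat")

# Assumption: the organization name and the repo naming pattern are
# illustrative only; verify them against the model hub before use.
ORG = "deepseek-ai"


def variant_ids(org: str = ORG) -> list[str]:
    """Return one identifier per (size, variant) combination."""
    return [
        f"{org}/deepseek-vl-{size}-{variant}"
        for size, variant in product(SIZES, VARIANTS)
    ]


if __name__ == "__main__":
    for model_id in variant_ids():
        print(model_id)
```

Two sizes times two variants gives the four models counted on this page.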

Models (4)

Details

Researcher: DeepSeek
Models: 4