LLM Reference
InternLM-XComposer2

About

The InternLM-XComposer2 family is a suite of vision-language large models (VLLMs) built on the InternLM2 foundation model. The models handle advanced text-image comprehension and composition tasks: InternLM-XComposer2-VL excels on multimodal benchmarks, while InternLM-XComposer2 is tailored for sophisticated text-image composition. Within the series, InternLM-XComposer2.5 offers capabilities comparable to GPT-4V with only a 7B LLM backend, and InternLM-XComposer2-4KHD can comprehend images at up to 4K resolution. The models are open source and available on Hugging Face and GitHub, covering use cases from multimodal content creation to visual language understanding, with some versions optimized for resource-constrained devices through 4-bit quantization.

Models (3)

Details

Researcher: Intern-AI
Models: 3