Pricing
| Type | Price (per 1M tokens) |
|---|---|
| Input tokens | $0.05 |
| Output tokens | $0.25 |
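At these rates, per-request cost is simple arithmetic. A minimal sketch (the token counts and the `request_cost` helper are illustrative, not part of any official SDK):

```python
# Listed GLM-4V 9B rates, in USD per 1M tokens.
INPUT_PER_M = 0.05
OUTPUT_PER_M = 0.25

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_PER_M + (output_tokens / 1_000_000) * OUTPUT_PER_M

# Example: a request using the full 8K context plus 1,000 output tokens.
print(f"${request_cost(8_000, 1_000):.5f}")  # → $0.00065
```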
Capabilities
Vision, Multimodal, Reasoning, Function Calling, Tool Use, JSON Mode, Code Execution
About GLM-4V 9B
GLM-4V-9B is an open-source multimodal large language model developed by THUDM at Tsinghua University. Building on the GLM-4 series, it incorporates autoregressive blank infilling and hybrid pretraining objectives, enhancing its capabilities in both text and image processing. The model handles multi-round conversations in English and Chinese, image understanding, and high-resolution image input up to 1120 × 1120 pixels. It outperforms other leading models, such as GPT-4, on various benchmarks, and it supports a context window of up to 8K tokens, enabling comprehensive understanding of longer inputs. Its open-source license allows wide access and community collaboration.
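Since the model processes images up to 1120 × 1120 pixels, a client may want to downscale larger inputs before sending them. A minimal sketch, assuming the aspect ratio should be preserved (the model's own server-side preprocessing may differ; `fit_within_limit` is a hypothetical helper, not part of any official SDK):

```python
MAX_SIDE = 1120  # maximum resolution GLM-4V-9B processes, per the description above

def fit_within_limit(width: int, height: int, max_side: int = MAX_SIDE) -> tuple[int, int]:
    """Scale (width, height) down proportionally so neither side exceeds max_side.

    Images already within the limit are returned unchanged.
    """
    if width <= max_side and height <= max_side:
        return width, height
    scale = max_side / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))

print(fit_within_limit(2240, 1120))  # a 2:1 image is halved → (1120, 560)
print(fit_within_limit(800, 600))    # already within the limit → (800, 600)
```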
Model Specs
| Spec | Value |
|---|---|
| Released | 2024-06-05 |
| Parameters | 9B |
| Architecture | Decoder-only |