GLM Models by Tsinghua Knowledge Engineering Group (THUDM)
About
The GLM family of large language models (LLMs) is developed jointly by the GLM team at Zhipu AI and Tsinghua University [1]. The models range from 110 million to 130 billion parameters and are notable for their bilingual proficiency in English and Chinese [7]. The GLM architecture is distinguished by its use of autoregressive blank infilling as the core pre-training objective [10]. Within the family, GLM-130B matches or surpasses GPT-3 on various benchmarks and outperforms ERNIE TITAN 3.0 on Chinese language tasks [7]. Later models such as GLM-4 were trained on ten trillion tokens and are aligned with supervised fine-tuning and reinforcement learning from human feedback, which improves instruction following [1]. The GLM-4 series, including GLM-4 All Tools, can interpret user intent and autonomously decide when to use tools such as web browsers and Python interpreters to complete complex tasks [1]. Many GLM models have been released as open source and have accumulated millions of downloads [15].
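To make the blank-infilling idea concrete, the following is a minimal sketch of how one training example might be assembled: selected spans are collapsed to a single mask token in the corrupted text (Part A), and each span is then regenerated token by token (Part B). The token strings `[MASK]`, `[S]`, and `[E]`, the function name `blank_infilling_example`, and the choice to pass spans in explicitly are assumptions made for illustration. The sketch leaves out span sampling, span shuffling, and positional encoding, so it should not be read as the GLM team's implementation.

```python
from typing import List, Sequence, Tuple

# Placeholder token strings; the names are assumptions for this sketch.
MASK, START, END = "[MASK]", "[S]", "[E]"


def blank_infilling_example(
    tokens: Sequence[str],
    spans: Sequence[Tuple[int, int]],
) -> Tuple[List[str], List[str], List[str]]:
    """Build (part_a, part_b_inputs, part_b_targets) for one example.

    `spans` holds half-open (start, end) index pairs that are sorted
    and do not overlap.
    """
    part_a: List[str] = []
    part_b_inputs: List[str] = []
    part_b_targets: List[str] = []
    cursor = 0
    for start, end in spans:
        if start < cursor or end <= start or end > len(tokens):
            raise ValueError(f"invalid span: ({start}, {end})")
        # Keep the untouched text, then collapse the whole span to one mask.
        part_a.extend(tokens[cursor:start])
        part_a.append(MASK)
        span = list(tokens[start:end])
        # The span is generated left to right, so the inputs are the
        # targets shifted right by one position.
        part_b_inputs.extend([START] + span)
        part_b_targets.extend(span + [END])
        cursor = end
    part_a.extend(tokens[cursor:])
    return part_a, part_b_inputs, part_b_targets


tokens = ["x1", "x2", "x3", "x4", "x5", "x6"]
part_a, b_in, b_out = blank_infilling_example(tokens, [(2, 3), (4, 6)])
```

In this example the model would see Part A with two blanks and learn to produce `x3` for the first and `x5 x6` for the second, with each span ending in the end-of-span token.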
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| Together GLM-5 | Use when the workload needs 200k context, 744B parameters, and multimodal inputs. | 2026-02 | 200k context, 744B parameters, multimodal inputs | Current |
| GLM-130B | Use when the workload needs 2k context and 130B parameters. | 2024-01 | 2k context, 130B parameters | Current |
Release Timeline
2 release groups
Specifications (2 models)
| Model | Released | Context | Parameters | Multimodal |
|---|---|---|---|---|
| Together GLM-5 | 2026-02 | 200k | 744B | Yes |
| GLM-130B | 2024-01 | 2k | 130B | No |
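The use-when guidance can be expressed as a small filter over the specifications table above. The sketch below is illustrative only: the `ModelSpec` class, the `pick_model` function, and the rule of preferring the smallest qualifying model are assumptions, as is reading "200k" and "2k" as 200,000 and 2,000 tokens.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelSpec:
    name: str
    released: str        # YYYY-MM, as listed in the table
    context_tokens: int  # assumes "200k" = 200,000 and "2k" = 2,000
    parameters_b: int    # parameter count in billions
    multimodal: bool


# Values copied from the specifications table.
MODELS: List[ModelSpec] = [
    ModelSpec("Together GLM-5", "2026-02", 200_000, 744, True),
    ModelSpec("GLM-130B", "2024-01", 2_000, 130, False),
]


def pick_model(min_context: int, needs_multimodal: bool = False) -> Optional[ModelSpec]:
    """Return the smallest listed model that meets the requirements, or None."""
    candidates = [
        m for m in MODELS
        if m.context_tokens >= min_context and (m.multimodal or not needs_multimodal)
    ]
    # Among qualifying models, prefer the one with the fewest parameters.
    return min(candidates, key=lambda m: m.parameters_b, default=None)
```

For example, `pick_model(32_000)` selects Together GLM-5 because GLM-130B's 2k context is too short, while `pick_model(1_500)` selects GLM-130B.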
Frequently Asked Questions
- What is GLM used for?
  - GLM is used for vision and multimodal work. The family description and the listed model capabilities point to those workloads as the best fit.
- How does GLM compare to ChatGLM-4?
  - Both families come from the Tsinghua Knowledge Engineering Group (THUDM). GLM is strongest for vision and multimodal work, and ChatGLM-4 is the closest related family to check when choosing between adjacent models. GLM lists 2 variants and reaches up to 200k context, while ChatGLM-4 reaches up to 128k context, so compare the specs and pricing tables before choosing a production model.
- Which GLM model should I use?
  - If price is the main constraint, check the pricing table first, because the local data does not include complete provider pricing for GLM. For the most capable and most recent listed option, evaluate Together GLM-5, which offers 200k context and multimodal inputs.





