LLM Reference

GLM-4 Models by Tsinghua Knowledge Engineering Group (THUDM)

16 models · 2024–2026 · Up to 200k ctx · From $0.05/1M input

About

The GLM-4 family, developed by Zhipu AI and Tsinghua University, is an evolving series of large language models known for multilingual capability and strong benchmark performance. Building on the earlier ChatGLM generations, the models are pre-trained on ten trillion tokens across Chinese, English, and 24 other languages. Multi-stage post-training, which combines supervised fine-tuning with reinforcement learning from human feedback, enables them to rival or surpass GPT-4 on various benchmarks. The series includes GLM-4, GLM-4-Air, and GLM-4-9B, each tailored to different tasks and resource constraints. The GLM-4 All Tools model can autonomously use a web browser and a Python interpreter to complete complex tasks. Open-source variants include GLM-4-9B and its chat-optimized version, as well as the multimodal GLM-4V-9B, which adds image processing. A more recent addition is GLM-4-Voice, an end-to-end speech model that supports Chinese and English [1, 3, 5, 6, 7, 8].

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
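
The derivation described above could look roughly like the following sketch. It is an illustration only: the function name `use_when` and its three inputs are assumptions, and the page's actual logic (which also draws on release and replacement fields) is not shown in the source.

```python
from typing import Optional


def use_when(context: Optional[str], parameters: Optional[str],
             capabilities: list[str]) -> str:
    """Build a 'Use when ...' sentence from a model's metadata fields.

    Hypothetical helper: field names and ordering are assumptions that
    mirror the wording of the cards on this page.
    """
    needs = []
    if context:
        needs.append(f"{context} context")
    if parameters:
        needs.append(f"{parameters} parameters")
    needs.extend(capabilities)

    if not needs:
        # Fallback wording the page uses when no fields are populated.
        return "Use when provider availability and model metadata match the workload."
    if len(needs) == 1:
        joined = needs[0]
    elif len(needs) == 2:
        joined = " and ".join(needs)
    else:
        # Three or more items: comma-separated with a final ", and".
        joined = ", ".join(needs[:-1]) + ", and " + needs[-1]
    return f"Use when the workload needs {joined}."
```

For example, `use_when("128k", "9B", [])` yields the sentence shown for the 9B code model, and a model with no populated fields falls back to the generic guidance.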

GLM 4.7 (Current)

Use when the workload needs 200k context, tool use, and function calling.

2026-03 · 200k context · tool use · function calling

GLM 4.6V (Current)

Use when the workload needs 128k context, tool use, and function calling.

2026-02 · 128k context · tool use · function calling

GLM 4.5V (Current)

Use when the workload needs 64k context, tool use, and function calling.

2026-01 · 64k context · tool use · function calling

GLM-4 Code 9B (Current)

Use when the workload needs 128k context and 9B parameters.

2025-05 · 128k context · 9B parameters

GLM-4 Air 4B (Current)

Use when the workload needs 4B parameters.

2025-03 · 4B parameters

GLM-4 32B (Current)

Use when the workload needs 128k context, 32B parameters, and structured outputs.

2025-03 · 128k context · 32B parameters · structured outputs

GLM-4.7 (Current)

Use when the workload needs 128k context and structured outputs.

2025-01 · 128k context · structured outputs

GLM-4.5 (Current)

Use when the workload needs 128k context and structured outputs.

2025-01 · 128k context · structured outputs

GLM-4.7 Flash (Current)

Use when the workload needs 198k context and structured outputs.

2025-01 · 198k context · structured outputs

GLM-4.5-Air (Current)

Use when the workload needs 128k context and structured outputs.

2025-01 · 128k context · structured outputs

GLM-4.6 (Current)

Use when the workload needs 198k context and structured outputs.

2025-01 · 198k context · structured outputs

GLM-4-Extreme

Use when provider availability and model metadata match the workload.

2024-06

GLM-4-Air (Current)

Use when the workload needs 128k context.

2024-06 · 128k context

GLM-4-Flash (Current)

Use when the workload needs 128k context.

2024-06 · 128k context

GLM-4 9B (Current)

Use when the workload needs 131k context and 9B parameters.

2024-06 · 131k context · 9B parameters

GLM-4V 9B (Current)

Use when the workload needs 131k context, 9B parameters, and multimodal inputs.

2024-06 · 131k context · 9B parameters · multimodal inputs

Release Timeline

7 release groups
2026-03 · 1 current
GLM 4.7 (Current): 200k context · tool use · function calling

2026-02 · 1 current
GLM 4.6V (Current): 128k context · tool use · function calling

2026-01 · 1 current
GLM 4.5V (Current): 64k context · tool use · function calling

2025-05 · 1 current
GLM-4 Code 9B (Current): 128k context · 9B parameters

2025-03 · 2 current
GLM-4 32B (Current): 128k context · 32B parameters · structured outputs
GLM-4 Air 4B (Current): 4B parameters

2025-01 · 5 current
GLM-4.5 (Current): 128k context · structured outputs
GLM-4.5-Air (Current): 128k context · structured outputs
GLM-4.6 (Current): 198k context · structured outputs
GLM-4.7 (Current): 128k context · structured outputs
GLM-4.7 Flash (Current): 198k context · structured outputs

2024-06 · 5 current
GLM-4 9B (Current): 131k context · 9B parameters
GLM-4-Air (Current): 128k context
GLM-4-Flash (Current): 128k context
GLM-4V 9B (Current): 131k context · 9B parameters · multimodal inputs

Specifications (16 models)

GLM-4 model specifications comparison
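| Model | Released | Context | Parameters | Vision | Multimodal | Fn Calling | Tool Use | Structured Outputs | Code Exec |
|---|---|---|---|---|---|---|---|---|---|
| GLM 4.7 | 2026-03 | 200k | 358B (32B active) | No | No | Yes | Yes | Yes | Yes |
| GLM 4.6V | 2026-02 | 128k | 106B (12B active) | Yes | Yes | Yes | Yes | No | No |
| GLM 4.5V | 2026-01 | 64k | 106B (12B active) | Yes | Yes | Yes | Yes | No | No |
| GLM-4 Code 9B | 2025-05 | 128k | 9B | No | No | No | No | No | No |
| GLM-4 Air 4B | 2025-03 | | 4B | No | No | No | No | No | No |
| GLM-4 32B | 2025-03 | 128k | 32B | No | No | No | No | Yes | No |
| GLM-4.7 | 2025-01 | 128k | 358B (32B active) | No | No | No | No | Yes | No |
| GLM-4.5 | 2025-01 | 128k | 355B (32B active) | No | No | No | No | Yes | No |
| GLM-4.7 Flash | 2025-01 | 198k | 30B (3B active) | No | No | No | No | Yes | No |
| GLM-4.5-Air | 2025-01 | 128k | 106B (12B active) | No | No | No | No | Yes | No |
| GLM-4.6 | 2025-01 | 198k | 355B (32B active) | No | No | No | No | Yes | No |
| GLM-4-Extreme | 2024-06 | | | No | No | No | No | No | No |
| GLM-4-Air | 2024-06 | 128k | | No | No | No | No | No | No |
| GLM-4-Flash | 2024-06 | 128k | | No | No | No | No | No | No |
| GLM-4 9B | 2024-06 | 131k | 9B | No | No | No | No | No | No |
| GLM-4V 9B | 2024-06 | 131k | 9B | No | Yes | No | No | No | No |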
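
As a rough illustration of how the specification table can drive model selection, the sketch below filters a handful of rows copied from it. The `Spec` type and the `shortlist` function are hypothetical names introduced here, and only a subset of columns is modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Spec:
    name: str
    released: str               # YYYY-MM, as listed in the table
    context_k: Optional[int]    # context window in thousands of tokens; None if unlisted
    fn_calling: bool
    structured_outputs: bool


# A few rows copied from the specifications table (not the full list).
SPECS = [
    Spec("GLM 4.7", "2026-03", 200, True, True),
    Spec("GLM 4.6V", "2026-02", 128, True, False),
    Spec("GLM-4 32B", "2025-03", 128, False, True),
    Spec("GLM-4.6", "2025-01", 198, False, True),
    Spec("GLM-4-Extreme", "2024-06", None, False, False),
]


def shortlist(specs, min_context_k=0, need_fn_calling=False, need_structured=False):
    """Return names of models meeting every requirement, newest first."""
    matches = [
        s for s in specs
        # An unlisted context window is treated as 0 for comparison.
        if (s.context_k or 0) >= min_context_k
        and (s.fn_calling or not need_fn_calling)
        and (s.structured_outputs or not need_structured)
    ]
    # YYYY-MM strings sort chronologically, so a plain string sort works.
    return [s.name for s in sorted(matches, key=lambda s: s.released, reverse=True)]
```

Calling `shortlist(SPECS, min_context_k=190, need_structured=True)` keeps only the long-context rows that list structured outputs. Treating a missing context value as zero is a design choice made here so that unlisted models never satisfy a context requirement by accident.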

Available From (10 providers)

Pricing

GLM-4 model pricing by provider
| Model | Provider | Input / 1M | Output / 1M | Type |
|---|---|---|---|---|
| GLM-4V 9B | Replicate API | $0.05 | $0.25 | Serverless |
| GLM-4.7 Flash | Cloudflare Workers AI | $0.06 | $0.4 | Serverless |
| GLM-4.7 Flash | OpenRouter | $0.06 | $0.4 | Serverless |
| GLM-4.7 Flash | Vercel AI Gateway | $0.07 | $0.4 | Serverless |
| GLM-4.7 Flash | Novita AI | $0.07 | $0.4 | Serverless |
| GLM-4 32B | OpenRouter | $0.1 | $0.1 | Serverless |
| GLM-4 9B | AWS Bedrock | $0.1 | $0.1 | Serverless |
| GLM-4 9B | GCP Vertex AI | $0.1 | $0.1 | Serverless |
| GLM-4.5-Air | OpenRouter | $0.13 | $0.85 | Serverless |
| GLM-4.5-Air | Novita AI | $0.13 | $0.85 | Serverless |
| GLM-4 9B | Bitdeer AI | $0.14 | $0.42 | Serverless |
| GLM-4 9B | Fireworks AI | $0.2 | $0.2 | Serverless |
| GLM-4.5-Air | Vercel AI Gateway | $0.2 | $1.1 | Serverless |
| GLM 4.6V | Vercel AI Gateway | $0.3 | $0.9 | Serverless |
| GLM 4.6V | Novita AI | $0.3 | $0.9 | Serverless |
| GLM-4.7 | OpenRouter | $0.38 | $1.74 | Serverless |
| GLM-4.6 | OpenRouter | $0.39 | $1.9 | Serverless |
| GLM-4.6 | Novita AI | $0.55 | $2.2 | Serverless |
| GLM-4.7 | Fireworks AI | $0.6 | $2.2 | Serverless |
| GLM-4.7 | GCP Vertex AI | $0.6 | $2.2 | Serverless |
| GLM-4.5 | OpenRouter | $0.6 | $2.2 | Serverless |
| GLM 4.7 | Fireworks AI | $0.6 | $2.2 | Serverless |
| GLM-4.5 | Vercel AI Gateway | $0.6 | $2.2 | Serverless |
| GLM 4.5V | Vercel AI Gateway | $0.6 | $1.8 | Serverless |
| GLM-4.6 | Vercel AI Gateway | $0.6 | $2.2 | Serverless |
| GLM 4.7 | Novita AI | $0.6 | $2.2 | Serverless |
| GLM-4.5 | Novita AI | $0.6 | $2.2 | Serverless |
| GLM 4.5V | Novita AI | $0.6 | $1.8 | Serverless |
| GLM-4.5 | Fireworks AI | $0.9 | $0.9 | Serverless |
| GLM-4.7 Flash | Fireworks AI | $0.9 | $0.9 | Serverless |
| GLM-4.5-Air | Fireworks AI | $0.9 | $0.9 | Serverless |
| GLM-4.6 | Fireworks AI | $0.9 | $0.9 | Serverless |
| GLM 4.7 | Vercel AI Gateway | $2.25 | $2.75 | Serverless |
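
Because the table is sorted by input price, it can hide the fact that the cheapest provider depends on the ratio of input to output tokens. The sketch below estimates per-request cost from the listed rates for the three GLM-4.6 rows. The function names are hypothetical, and the estimate assumes cost is simply tokens multiplied by the listed rate, ignoring any caching discounts, minimums, or fees a provider may apply.

```python
from decimal import Decimal

# (model, provider) -> (input USD per 1M tokens, output USD per 1M tokens),
# copied from the GLM-4.6 rows of the pricing table.
PRICES = {
    ("GLM-4.6", "OpenRouter"): (Decimal("0.39"), Decimal("1.9")),
    ("GLM-4.6", "Novita AI"): (Decimal("0.55"), Decimal("2.2")),
    ("GLM-4.6", "Fireworks AI"): (Decimal("0.9"), Decimal("0.9")),
}


def request_cost(model, provider, input_tokens, output_tokens):
    """Estimated USD cost of one request at the listed per-1M-token rates."""
    in_rate, out_rate = PRICES[(model, provider)]
    million = Decimal(1_000_000)
    return (in_rate * input_tokens + out_rate * output_tokens) / million


def cheapest_provider(model, input_tokens, output_tokens):
    """Provider with the lowest estimated cost for this token mix."""
    candidates = [p for (m, p) in PRICES if m == model]
    return min(candidates,
               key=lambda p: request_cost(model, p, input_tokens, output_tokens))
```

With these rates, an input-heavy request (1,000,000 input tokens and 10,000 output tokens) is cheapest on OpenRouter, while an output-heavy request (100,000 input and 1,000,000 output) is cheapest on Fireworks AI, whose flat $0.9 rate has the lowest output price of the three. `Decimal` is used so that small per-token amounts do not pick up binary floating-point error.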


Frequently Asked Questions

What is GLM-4 used for?
GLM-4 is used for vision and multimodal work, agent workflows with tool use, and structured outputs. The family description and the listed model capabilities point to those workloads as the best fit.

How does GLM-4 compare to ChatGLM-4?
Both families come from Tsinghua Knowledge Engineering Group (THUDM). GLM-4 is the stronger fit for vision and multimodal work, while ChatGLM-4 is the closest related family to check when selecting among adjacent models. GLM-4 has 16 listed variants and reaches up to 200k context; ChatGLM-4 reaches up to 128k context. Compare the specs and pricing tables before choosing a production model.

Which GLM-4 model should I use?
For the lowest listed input price, start with GLM-4V 9B through Replicate API at $0.05/1M input tokens. For the most capable and most recent listed model, evaluate GLM 4.7, which offers 200k context, tool use, function calling, and structured outputs.

Models (16)