GLM-4 Models by Tsinghua Knowledge Engineering Group (THUDM)

Tsinghua Knowledge Engineering Group (THUDM)

16 models | 2024–2026 | Up to 200k context | From $0.05/1M input tokens

About

The GLM-4 family, developed by Zhipu AI and Tsinghua University, is a series of large language models with a focus on multilingual capability. Building on earlier ChatGLM generations, the models are pre-trained on ten trillion tokens across Chinese, English, and 24 other languages. They then go through multi-stage post-training with supervised fine-tuning and reinforcement learning from human feedback, and they are reported to rival or surpass GPT-4 on various benchmarks. The series includes GLM-4, GLM-4-Air, and GLM-4-9B, each aimed at different tasks and resource constraints. The GLM-4 All Tools model can autonomously use a web browser and a Python interpreter to complete complex tasks. Open-source variants include GLM-4-9B, its chat-optimized version, and the multimodal GLM-4V-9B, which adds image processing. A more recent addition is GLM-4-Voice, an end-to-end speech model that supports Chinese and English [1, 3, 5, 6, 7, 8].

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

Current GLM-4 variants with use-when guidance and lifecycle status
Model	Use when	Released	Signals	Status
GLM 4.7	Use when the workload needs 200k context, tool use, and function calling.	2026-03	200k context, tool use, function calling	Current
GLM 4.6V	Use when the workload needs 128k context, tool use, and function calling.	2026-02	128k context, tool use, function calling	Current
GLM 4.5V	Use when the workload needs 64k context, tool use, and function calling.	2026-01	64k context, tool use, function calling	Current
GLM-4 Code 9B	Use when the workload needs 128k context and 9B parameters.	2025-05	128k context, 9B parameters	Current
GLM-4 Air 4B	Use when the workload needs 4B parameters.	2025-03	4B parameters	Current
GLM-4 32B	Use when the workload needs 128k context, 32B parameters, and structured outputs.	2025-03	128k context, 32B parameters, structured outputs	Current
GLM-4.7	Use when the workload needs 128k context and structured outputs.	2025-01	128k context, structured outputs	Current
GLM-4.5	Use when the workload needs 128k context and structured outputs.	2025-01	128k context, structured outputs	Current
GLM-4.7 Flash	Use when the workload needs 198k context and structured outputs.	2025-01	198k context, structured outputs	Current
GLM-4.5-Air	Use when the workload needs 128k context and structured outputs.	2025-01	128k context, structured outputs	Current
GLM-4.6	Use when the workload needs 198k context and structured outputs.	2025-01	198k context, structured outputs	Current
GLM-4-Extreme	Use when provider availability and model metadata match the workload.	2024-06	—	Current
GLM-4-Air	Use when the workload needs 128k context.	2024-06	128k context	Current
GLM-4-Flash	Use when the workload needs 128k context.	2024-06	128k context	Current
GLM-4 9B	Use when the workload needs 131k context and 9B parameters.	2024-06	131k context, 9B parameters	Current
GLM-4V 9B	Use when the workload needs 131k context, 9B parameters, and multimodal inputs.	2024-06	131k context, 9B parameters, multimodal inputs	Current
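
The "Use when" sentences in the table follow a regular pattern over each model's signals. The sketch below reproduces that pattern as an illustration only: the function name `use_when` is hypothetical, and the joining rule is inferred from the rows above rather than taken from the catalog's actual generator.

```python
# Illustrative sketch: rebuild a "Use when" sentence from a list of signals.
# The wording rules are inferred from the table rows, not from any official source.

FALLBACK = "Use when provider availability and model metadata match the workload."


def use_when(signals):
    """Return a use-when sentence for a list of signal strings."""
    if not signals:
        # Models with no listed signals get the generic sentence.
        return FALLBACK
    if len(signals) == 1:
        joined = signals[0]
    elif len(signals) == 2:
        joined = " and ".join(signals)
    else:
        # Three or more signals: comma-separated with a final ", and".
        joined = ", ".join(signals[:-1]) + ", and " + signals[-1]
    return f"Use when the workload needs {joined}."


if __name__ == "__main__":
    print(use_when(["200k context", "tool use", "function calling"]))
    print(use_when(["128k context", "9B parameters"]))
    print(use_when([]))
```

Keeping the fallback sentence as a separate constant makes it easy to spot models, such as GLM-4-Extreme, that have no listed signals.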

Release Timeline

7 release groups

2026-03 (1 current)

GLM 4.7: 200k context, tool use, function calling (Current)

2026-02 (1 current)

GLM 4.6V: 128k context, tool use, function calling (Current)

2026-01 (1 current)

GLM 4.5V: 64k context, tool use, function calling (Current)

2025-05 (1 current)

GLM-4 Code 9B: 128k context, 9B parameters (Current)

2025-03 (2 current)

GLM-4 32B: 128k context, 32B parameters, structured outputs (Current)

GLM-4 Air 4B: 4B parameters (Current)

2025-01 (5 current)

GLM-4.5: 128k context, structured outputs (Current)

GLM-4.5-Air: 128k context, structured outputs (Current)

GLM-4.6: 198k context, structured outputs (Current)

GLM-4.7: 128k context, structured outputs (Current)

GLM-4.7 Flash: 198k context, structured outputs (Current)

2024-06 (5 current)

GLM-4 9B: 131k context, 9B parameters (Current)

GLM-4-Air: 128k context (Current)

GLM-4-Extreme (Current)

GLM-4-Flash: 128k context (Current)

GLM-4V 9B: 131k context, 9B parameters, multimodal inputs (Current)

Specifications (16 models)

GLM-4 model specifications comparison
Model	Released	Context	Parameters	Vision	Multimodal	Fn Calling	Tool Use	Structured Outputs	Code Exec
GLM 4.7	2026-03	200k	358B (32B active)	No	No	Yes	Yes	Yes	Yes
GLM 4.6V	2026-02	128k	106B (12B active)	Yes	Yes	Yes	Yes	No	No
GLM 4.5V	2026-01	64k	106B (12B active)	Yes	Yes	Yes	Yes	No	No
GLM-4 Code 9B	2025-05	128k	9B	No	No	No	No	No	No
GLM-4 Air 4B	2025-03	—	4B	No	No	No	No	No	No
GLM-4 32B	2025-03	128k	32B	No	No	No	No	Yes	No
GLM-4.7	2025-01	128k	358B (32B active)	No	No	No	No	Yes	No
GLM-4.5	2025-01	128k	355B (32B active)	No	No	No	No	Yes	No
GLM-4.7 Flash	2025-01	198k	30B (3B active)	No	No	No	No	Yes	No
GLM-4.5-Air	2025-01	128k	106B (12B active)	No	No	No	No	Yes	No
GLM-4.6	2025-01	198k	355B (32B active)	No	No	No	No	Yes	No
GLM-4-Extreme	2024-06	—	—	No	No	No	No	No	No
GLM-4-Air	2024-06	128k	—	No	No	No	No	No	No
GLM-4-Flash	2024-06	128k	—	No	No	No	No	No	No
GLM-4 9B	2024-06	131k	9B	No	No	No	No	No	No
GLM-4V 9B	2024-06	131k	9B	Yes	Yes	No	No	No	No
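
One way to use the specifications table is to shortlist models by required capabilities. The following is a minimal sketch under stated assumptions: the `Spec` class and `shortlist` function are hypothetical names, only a handful of rows are copied from the table, and a model with no listed context is treated as failing any minimum-context requirement.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Spec:
    name: str
    context_k: Optional[int]  # None where the table lists no context value
    tool_use: bool
    structured_outputs: bool
    vision: bool


# A subset of rows copied from the specifications table above.
SPECS = [
    Spec("GLM 4.7", 200, True, True, False),
    Spec("GLM 4.6V", 128, True, False, True),
    Spec("GLM 4.5V", 64, True, False, True),
    Spec("GLM-4 32B", 128, False, True, False),
    Spec("GLM-4.7 Flash", 198, False, True, False),
    Spec("GLM-4 Air 4B", None, False, False, False),
]


def shortlist(specs, min_context_k=0, tool_use=False,
              structured_outputs=False, vision=False):
    """Return names of models that meet every requested requirement."""
    names = []
    for s in specs:
        # Assumption: an unlisted context cannot satisfy a positive minimum.
        if min_context_k and (s.context_k is None or s.context_k < min_context_k):
            continue
        if tool_use and not s.tool_use:
            continue
        if structured_outputs and not s.structured_outputs:
            continue
        if vision and not s.vision:
            continue
        names.append(s.name)
    return names


if __name__ == "__main__":
    print(shortlist(SPECS, min_context_k=128, tool_use=True))
    print(shortlist(SPECS, vision=True))
```

A requirement left at its default is simply not checked, so calling `shortlist(SPECS)` returns every model in the subset.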

Available From (10 providers)

AWS Bedrock

Bitdeer AI

Cloudflare Workers AI

Pricing

GLM-4 model pricing by provider
Model	Provider	Input / 1M	Output / 1M	Type
GLM-4V 9B	Replicate API	$0.05	$0.25	Serverless
GLM-4.7 Flash	Cloudflare Workers AI	$0.06	$0.4	Serverless
GLM-4.7 Flash	OpenRouter	$0.06	$0.4	Serverless
GLM-4.7 Flash	Vercel AI Gateway	$0.07	$0.4	Serverless
GLM-4.7 Flash	Novita AI	$0.07	$0.4	Serverless
GLM-4 32B	OpenRouter	$0.1	$0.1	Serverless
GLM-4 9B	AWS Bedrock	$0.1	$0.1	Serverless
GLM-4 9B	GCP Vertex AI	$0.1	$0.1	Serverless
GLM-4.5-Air	OpenRouter	$0.13	$0.85	Serverless
GLM-4.5-Air	Novita AI	$0.13	$0.85	Serverless
GLM-4 9B	Bitdeer AI	$0.14	$0.42	Serverless
GLM-4 9B	Fireworks AI	$0.2	$0.2	Serverless
GLM-4.5-Air	Vercel AI Gateway	$0.2	$1.1	Serverless
GLM 4.6V	Vercel AI Gateway	$0.3	$0.9	Serverless
GLM 4.6V	Novita AI	$0.3	$0.9	Serverless
GLM-4.7	OpenRouter	$0.38	$1.74	Serverless
GLM-4.6	OpenRouter	$0.39	$1.9	Serverless
GLM-4.6	Novita AI	$0.55	$2.2	Serverless
GLM-4.7	Fireworks AI	$0.6	$2.2	Serverless
GLM-4.7	GCP Vertex AI	$0.6	$2.2	Serverless
GLM-4.5	OpenRouter	$0.6	$2.2	Serverless
GLM 4.7	Fireworks AI	$0.6	$2.2	Serverless
GLM-4.5	Vercel AI Gateway	$0.6	$2.2	Serverless
GLM 4.5V	Vercel AI Gateway	$0.6	$1.8	Serverless
GLM-4.6	Vercel AI Gateway	$0.6	$2.2	Serverless
GLM 4.7	Novita AI	$0.6	$2.2	Serverless
GLM-4.5	Novita AI	$0.6	$2.2	Serverless
GLM 4.5V	Novita AI	$0.6	$1.8	Serverless
GLM-4.5	Fireworks AI	$0.9	$0.9	Serverless
GLM-4.7 Flash	Fireworks AI	$0.9	$0.9	Serverless
GLM-4.5-Air	Fireworks AI	$0.9	$0.9	Serverless
GLM-4.6	Fireworks AI	$0.9	$0.9	Serverless
GLM 4.7	Vercel AI Gateway	$2.25	$2.75	Serverless
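
Because input and output tokens are priced separately, the provider with the lowest input price is not always the cheapest for a given workload. The sketch below illustrates the arithmetic with the four GLM-4.6 rows from the table above. The function names `cost_usd` and `cheapest` are hypothetical, and the token counts in the usage lines are made-up workloads.

```python
# (provider, input $ per 1M tokens, output $ per 1M tokens) for GLM-4.6,
# copied from the pricing table above.
GLM_4_6_PRICES = [
    ("OpenRouter", 0.39, 1.90),
    ("Novita AI", 0.55, 2.20),
    ("Vercel AI Gateway", 0.60, 2.20),
    ("Fireworks AI", 0.90, 0.90),
]


def cost_usd(input_tokens, output_tokens, input_price, output_price):
    """Cost in USD for a workload, given per-1M-token prices."""
    return (input_tokens / 1_000_000) * input_price + \
           (output_tokens / 1_000_000) * output_price


def cheapest(prices, input_tokens, output_tokens):
    """Return (cost, provider) for the lowest-cost provider in `prices`."""
    if not prices:
        raise ValueError("no price rows given")
    return min(
        (cost_usd(input_tokens, output_tokens, inp, out), provider)
        for provider, inp, out in prices
    )


if __name__ == "__main__":
    # Input-heavy workload: 10M input tokens, 1M output tokens.
    print(cheapest(GLM_4_6_PRICES, 10_000_000, 1_000_000))
    # Output-heavy workload: 1M input tokens, 10M output tokens.
    print(cheapest(GLM_4_6_PRICES, 1_000_000, 10_000_000))
```

With these rows, the input-heavy workload is cheapest on OpenRouter, while the output-heavy one is cheapest on Fireworks AI, whose flat $0.9 output price outweighs its higher input price.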

Frequently Asked Questions

What is GLM-4 used for?

GLM-4 is used for vision and multimodal work, agent workflows with tool use, and structured outputs. These are the workloads that the family description and the listed model capabilities support.

How does GLM-4 compare to ChatGLM-4?

Both families come from Tsinghua Knowledge Engineering Group (THUDM). GLM-4 has 16 listed variants, reaches up to 200k context, and includes vision and multimodal models. ChatGLM-4 is the closest related family and reaches up to 128k context. Compare the specifications and pricing tables before choosing a production model.

Which GLM-4 model should I use?

For the lowest listed input price, start with GLM-4V 9B through Replicate API at $0.05 per 1M input tokens. For the most capable and most recent listed model, evaluate GLM 4.7, which offers 200k context, tool use, function calling, and structured outputs.

Models (16)

GLM 4.7: 2026-03, 200k context, 358B (32B active), 3 providers

GLM 4.6V: 2026-02, 128k context, 106B (12B active), 2 providers, Multimodal

GLM 4.5V: 2026-01, 64k context, 106B (12B active), 2 providers

GLM-4 Code 9B

GLM-4 Air 4B

GLM-4 32B: 2025-03, 128k context, 32B, 1 provider, Open Source

GLM-4.7: 2025-01, 128k context, 358B (32B active), 4 providers

GLM-4.5: 2025-01, 128k context, 355B (32B active), 4 providers

GLM-4.7 Flash: 2025-01, 198k context, 30B (3B active), 5 providers

GLM-4.5-Air: 2025-01, 128k context, 106B (12B active), 4 providers

GLM-4.6: 2025-01, 198k context, 355B (32B active), 4 providers

GLM-4-Extreme: 2024-06

GLM-4-Air: 2024-06, 128k context

GLM-4-Flash: 2024-06, 128k context

GLM-4 9B: 2024-06, 131k context, 9B, 4 providers

GLM-4V 9B: 2024-06, 131k context, 9B, 1 provider, Multimodal
