GLM-5 on NVIDIA NIM

Name: GLM-5 on NVIDIA NIM
Brand: Zhipu AI
SKU: glm-5-nvidia-nim

GLM-5 · Zhipu AI

ServerlessOpen Source

Last refreshed 2026-06-30. Next refresh: weekly.

Why use GLM-5 on NVIDIA NIM?

NVIDIA NIM offers GLM-5 with competitive pricing. NVIDIA NIM is NVIDIA's deployment platform for GPU-accelerated inference microservices.

Compare GLM-5 across 7 providers to find the best fit for your use case

Input / 1M

Output / 1M

Cache

Not sourced

Batch

Not sourced

Setup recipe

Docs fallback

Install

Use the provider REST API or SDK

Auth

Create a provider API key

Call

model: z-ai/glm5

Model ID

z-ai/glm5

Request example

Curated snippets for this provider are not sourced yet. Use NVIDIA NIM documentation with model ID z-ai/glm5.

Gotchas

Use provider model ID "z-ai/glm5", not the LLMReference slug "glm-5".

Compare GLM-5 Across Providers

Provider	Input (per 1M)	Output (per 1M)
Fireworks AI	$1.00	$3.20
OpenRouter	$0.60	$2.08
Together AI	$1.00	$3.20
GCP Vertex AI	$1.00	$3.20
NVIDIA NIM	—	—

View all 7 providers →

Pricing

Type	Rate
GPU Hour Rate	$1.00/GPU·hr

Capabilities

ReasoningFunction CallingTool UseStructured OutputsPrompt Caching

About GLM-5

Flagship open-weight foundation model from Zhipu AI with 744B parameters (40B active per token) in Mixture of Experts architecture. Trained on 28.5T tokens using DeepSeek Sparse Attention on Huawei Ascend hardware. Achieves state-of-the-art performance on coding and agentic benchmarks (SWE-bench Verified: 77.8%). Supports autonomous planning, multi-step tool use, and self-correction.