LLM Reference

GLM-130B

About

GLM-130B is an open bilingual (English and Chinese) language model with 130 billion parameters, developed by Tsinghua University's Knowledge Engineering Group (KEG) together with Zhipu AI. It is built on the General Language Model (GLM) framework and trained mainly with an autoregressive blank-infilling objective: the model attends bidirectionally over the corrupted input while generating the masked spans autoregressively. Pre-trained on over 400 billion tokens, it matches or outperforms GPT-3 (175B) on a range of language understanding and generation benchmarks and is particularly strong in zero-shot settings. Inference optimizations make it up to 2.5 times faster, and INT4 weight quantization lets it run on commodity hardware such as four RTX 3090 (24 GB) GPUs. It handles a wide range of NLP tasks, including question answering, sentiment analysis, and machine translation.
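Since blank infilling is what distinguishes GLM from a plain left-to-right language model, a toy sketch of how a training example is constructed may help. This is an illustrative reconstruction, not the project's actual preprocessing code; the function name, special-token strings, and span choices are all hypothetical (the real GLM-130B pipeline at github.com/THUDM/GLM-130B differs in detail).

```python
# Toy illustration of GLM-style autoregressive blank infilling.
# The key architectural detail (bidirectional attention over Part A,
# causal attention over Part B) lives in the attention mask and is
# not shown here.
import random

def make_blank_infilling_example(tokens, spans,
                                 mask="[MASK]", start="[START]", end="[END]"):
    """Replace each (offset, length) span in `tokens` with a single [MASK]
    (Part A, which the model reads bidirectionally) and build the target
    sequence of masked spans (Part B, generated autoregressively)."""
    part_a, targets, cursor = [], [], 0
    for offset, length in sorted(spans):
        part_a += tokens[cursor:offset] + [mask]
        targets.append(tokens[offset:offset + length])
        cursor = offset + length
    part_a += tokens[cursor:]
    random.shuffle(targets)  # GLM predicts the spans in a random order
    part_b = [tok for span in targets for tok in [start] + span + [end]]
    return part_a, part_b

tokens = "GLM-130B is an open bilingual dense model".split()
part_a, part_b = make_blank_infilling_example(tokens, [(3, 2), (6, 1)])
print("Part A:", part_a)  # ['GLM-130B', 'is', 'an', '[MASK]', 'dense', '[MASK]']
print("Part B:", part_b)  # e.g. ['[START]', 'model', '[END]', '[START]', 'open', 'bilingual', '[END]']
```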

Capabilities

Multimodal
Function Calling
Tool Use
JSON Mode

Specifications

Family: GLM
Parameters: 130B
Architecture: Decoder Only
Specialization: General
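To see why the 130B parameter count above is compatible with the four-RTX-3090 claim in the About section, here is a rough weight-only memory estimate. It assumes weights dominate memory and ignores activations, KV cache, and framework overhead, so real requirements are somewhat higher.

```python
# Back-of-the-envelope weight memory for 130B parameters at common precisions.
PARAMS = 130e9
for precision, bytes_per_param in [("FP16", 2), ("INT8", 1), ("INT4", 0.5)]:
    gib = PARAMS * bytes_per_param / 2**30
    print(f"{precision}: {gib:,.0f} GiB")
# FP16 ~242 GiB, INT8 ~121 GiB, INT4 ~61 GiB -- the INT4 figure fits under
# the 96 GiB total of four 24 GiB RTX 3090s, consistent with the claim above.
```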