AI Glossary
Activation-aware Weight Quantization
AWQ
Definition
AWQ (Activation-aware Weight Quantization) is a post-training method that quantizes LLM weights to low bit-widths (typically 4-bit) while protecting the small fraction of weight channels that matter most for model performance. It identifies those salient channels from activation statistics rather than weight magnitudes, then scales them up before quantization (folding the inverse scale into the preceding operation) so their rounding error shrinks. This preserves accuracy far better than plain round-to-nearest quantization at the same bit-width, without any retraining.
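The scaling trick can be sketched in a few lines of NumPy. This is a simplified illustration under stated assumptions, not the library implementation: `quantize_rtn` and `awq_style_quantize` are hypothetical helper names, quantization is symmetric per output channel (real AWQ uses group-wise quantization and searches the scale factor via a grid over activation magnitudes), and the inverse scale is applied directly to the weights rather than folded into the preceding operation as a real kernel would.

```python
import numpy as np

def quantize_rtn(w, bits=4):
    """Per-output-channel (per-row) round-to-nearest quantization, then
    dequantize, so the return value is the low-bit approximation of w."""
    qmax = 2 ** (bits - 1) - 1            # symmetric signed range, e.g. +/-7 at 4-bit
    step = np.abs(w).max(axis=1, keepdims=True) / qmax
    step = np.where(step == 0, 1.0, step)
    return np.clip(np.round(w / step), -qmax, qmax) * step

def awq_style_quantize(w, act_mag, s=2.0, top_frac=0.01, bits=4):
    """Sketch of AWQ's core idea: scale up the ~1% of input channels that
    see the largest activations, quantize, then divide the inverse scale
    back out. Rounding error on the protected channels shrinks by ~1/s,
    provided the scaled values do not dominate the quantization range."""
    n_salient = max(1, int(top_frac * w.shape[1]))
    salient = np.argsort(act_mag)[-n_salient:]   # channels ranked by activation magnitude
    scales = np.ones(w.shape[1])
    scales[salient] = s
    return quantize_rtn(w * scales, bits) / scales
```

Note the trade-off the sketch exposes: if a scaled channel becomes the largest value in its row, it enlarges the quantization step for every other channel, which is why AWQ keeps the scale factor moderate and quantizes in small groups.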