
Attention mechanism

Definition

The attention mechanism allows large language models to weigh the importance of each token in a sequence relative to every other token when processing input, letting the model focus on relevant context regardless of position. It is the core of transformer architectures, enabling both parallel computation over the sequence and the modeling of long-range dependencies.
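The weighting described above can be sketched as scaled dot-product attention, the form used in transformers. This is a minimal NumPy illustration, not a production implementation; the function name and toy dimensions are illustrative.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Weight each value by how relevant its key is to each query,
    independent of token position."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # (seq_q, seq_k) relevance scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # softmax over keys
    return weights @ V                                   # weighted sum of values

# Toy self-attention: 3 tokens with 4-dimensional embeddings
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (3, 4)
```

Each output row is a mixture of all value vectors, so token 1 can attend to token 3 just as easily as to its immediate neighbor.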
