LLM Engineer Interview Questions: Transformer Architecture, Self-Attention, and Modern LLM Foundations

What is self-attention?


Self-attention is a mechanism that lets each token in a sequence attend to every other token when computing its own representation. For each token, the model produces a query, a key, and a value vector. Attention scores are computed as dot products of queries with keys, scaled and normalized with a softmax, and then used to take a weighted sum of the values. This is how transformers capture relationships between distant tokens.
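The mechanism above can be sketched in a few lines of NumPy. This is a minimal single-head illustration with made-up toy dimensions, not a production implementation; the projection matrices `Wq`, `Wk`, `Wv` stand in for learned parameters:

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over a sequence of token vectors X."""
    Q = X @ Wq  # one query vector per token
    K = X @ Wk  # one key vector per token
    V = X @ Wv  # one value vector per token
    d_k = K.shape[-1]
    # Compare every query with every key, scaled by sqrt(d_k)
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over keys turns scores into attention weights
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each token's new representation is a weighted sum of all values
    return weights @ V

# Toy example: 4 tokens, model and head dimension 8 (hypothetical sizes)
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # one updated 8-dim representation per token: (4, 8)
```

Because `weights` is a full token-by-token matrix, even the first and last tokens can influence each other directly, which is the property the answer highlights.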