What is the difference between encoder-only, decoder-only, and encoder-decoder transformers?
LLM Engineer Interview Questions: Transformer Architecture, Self-Attention, and Modern LLM Foundations
Nortren
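Encoder-only models like BERT attend to the full input bidirectionally, so every token sees both its left and right context; they are used for understanding tasks such as classification and token labeling. Decoder-only models like GPT and Llama generate text autoregressively: a causal mask restricts each position to earlier tokens, and the model predicts one token at a time. Encoder-decoder models like T5 first encode the input, then decode the output while cross-attending to the encoded input, which suits sequence-to-sequence tasks such as translation and summarization. Modern LLMs are predominantly decoder-only.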
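One way to see the difference is to look at which positions each architecture allows a token to attend to. The sketch below is illustrative only: the function names are hypothetical and not from any library, padding is ignored, and 1 means "may attend" while 0 means "masked out".

```python
def bidirectional_mask(n):
    """Encoder-style self-attention: every position sees every position."""
    return [[1] * n for _ in range(n)]


def causal_mask(n):
    """Decoder-style self-attention: position i sees only positions 0..i."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]


def cross_attention_mask(target_len, source_len):
    """Encoder-decoder cross-attention: each target position sees the whole encoded source."""
    return [[1] * source_len for _ in range(target_len)]


if __name__ == "__main__":
    print("encoder-only (bidirectional):")
    for row in bidirectional_mask(4):
        print(row)
    print("decoder-only (causal):")
    for row in causal_mask(4):
        print(row)
    print("encoder-decoder cross-attention (3 target tokens, 4 source tokens):")
    for row in cross_attention_mask(3, 4):
        print(row)
```

In an encoder-decoder model the decoder would typically combine both patterns: a causal mask over its own tokens plus full cross-attention over the encoder output.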
huggingface.co