Question

When should you use RAG over a long-context language model?

Accepted Answer

RAG wins when your knowledge base is much larger than the model's context window, when you need source attribution, or when cost matters at scale. With standard self-attention, compute grows roughly quadratically with sequence length, and you pay for every input token on every request. Long-context models can also suffer from the lost-in-the-middle effect, where facts buried in the middle of a long context are more likely to be missed. RAG keeps prompts small by retrieving only the few most relevant chunks, which cuts per-query token cost and lets you cite the specific chunks an answer was drawn from. Use long context for one-off analysis of documents that fit in the window, and RAG for production systems with frequently updated knowledge, because new documents only need to be re-indexed.
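To make the "retrieve only the few most relevant chunks" step concrete, here is a minimal sketch. It assumes a toy bag-of-words cosine score in place of a real embedding model, a whitespace tokenizer in place of a real model tokenizer, and a made-up four-entry knowledge base. The names retrieve, build_prompt, and KB are hypothetical and do not come from any library.

```python
import math
from collections import Counter

def tokenize(text):
    # Crude whitespace tokenizer; a stand-in for a real model tokenizer.
    return [w.strip(".,?!").lower() for w in text.split() if w.strip(".,?!")]

def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, chunks, k=2):
    # chunks is a list of (source_id, text); return the k best-scoring chunks.
    q = Counter(tokenize(query))
    ranked = sorted(chunks, key=lambda c: cosine(q, Counter(tokenize(c[1]))), reverse=True)
    return ranked[:k]

def build_prompt(query, chunks):
    # Tag each chunk with its source id so the answer can cite it.
    context = "\n".join(f"[{sid}] {text}" for sid, text in chunks)
    return f"{context}\n\nQuestion: {query}"

# Made-up knowledge base, for illustration only.
KB = [
    ("billing.md", "Refunds are issued within 14 days of a cancelled order."),
    ("shipping.md", "Standard shipping takes 5 business days."),
    ("accounts.md", "Passwords must be reset every 90 days."),
    ("returns.md", "Returned items must be unused and in original packaging."),
]

query = "How long do refunds take?"
top = retrieve(query, KB, k=2)
rag_prompt = build_prompt(query, top)      # only the retrieved chunks
full_prompt = build_prompt(query, KB)      # the long-context alternative

print("retrieved sources:", [sid for sid, _ in top])
print("prompt size (words):", len(tokenize(rag_prompt)), "vs", len(tokenize(full_prompt)))
```

The RAG prompt grows with k rather than with the size of the knowledge base, which is where the cost saving comes from. The bracketed source ids are what make attribution possible, although the model still has to be instructed to cite them.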