What is contextual compression in RAG?
LLM Engineer Interview Questions: Advanced RAG Techniques — Self-RAG, GraphRAG, Agentic RAG
Audio flashcard · Nortren
Contextual compression filters or rewrites retrieved chunks before they are sent to the LLM, removing irrelevant content and keeping only the parts that actually answer the query. This reduces prompt size, lowers cost, and helps the LLM focus on what matters. Compression can be done with a smaller model (for example, an embedding-similarity filter), an LLM prompted to extract the relevant passages, or a trained extractor.
python.langchain.com
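The filtering idea described in the answer can be illustrated with a minimal, dependency-free sketch. It is not the LangChain API: `tokenize`, `compress_chunk`, `compress_chunks`, the stop-word list, and the `min_overlap` threshold are all hypothetical names chosen for this example, and plain keyword overlap stands in for the relevance score that a smaller model, an LLM, or a trained extractor would provide in practice.

```python
import re
from typing import List, Set

# Hypothetical stop-word list for this sketch only.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "to", "and",
              "what", "how", "does", "do", "for", "on", "it", "by"}


def tokenize(text: str) -> Set[str]:
    """Return the lowercase word tokens of `text`, minus stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def compress_chunk(query: str, chunk: str, min_overlap: int = 1) -> str:
    """Keep only the sentences that share at least `min_overlap` terms with the query.

    Keyword overlap is an assumption standing in for a model-based relevance score.
    """
    query_terms = tokenize(query)
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", chunk.strip())
    kept = [s for s in sentences if len(tokenize(s) & query_terms) >= min_overlap]
    return " ".join(kept)


def compress_chunks(query: str, chunks: List[str], min_overlap: int = 1) -> List[str]:
    """Compress every retrieved chunk and drop the ones left empty."""
    compressed = (compress_chunk(query, c, min_overlap) for c in chunks)
    return [c for c in compressed if c]


if __name__ == "__main__":
    retrieved = [
        "Paris is the capital of France. The city hosts many museums. "
        "Croissants are popular.",
        "Bananas are rich in potassium. They grow in tropical climates.",
    ]
    print(compress_chunks("What is the capital of France?", retrieved))
    # ['Paris is the capital of France.']
```

Only the compressed list would be placed in the prompt, so the second chunk never reaches the LLM and the first shrinks to a single sentence. Swapping the overlap test for an embedding similarity or an LLM call changes the scoring step but not the overall shape of the pipeline.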