Memotiva · Prompt Engineering Patterns: Generated Knowledge, RAG Prompts, Citation and Grounding Techniques

What is the contextual compression technique?

Nortren

Contextual compression filters or rewrites retrieved chunks before they are sent to the LLM, removing irrelevant content and keeping only the parts that actually answer the query. This reduces prompt size, lowers cost, and helps the LLM focus on the relevant evidence. The compression step runs before the main generation call and can be done with a smaller model, an extractor, or another LLM call.
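As a rough illustration of the extractor variant, the sketch below keeps only the sentences in each retrieved chunk that share at least one non-stopword term with the query, and drops chunks that end up empty. The function names (`compress_chunk`, `compress_chunks`), the stopword list, and the `min_overlap` parameter are all hypothetical choices made for this example, not part of any library; a real system would more likely score sentences with embeddings or a smaller model.

```python
import re

# Assumption: a tiny, hand-picked stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and",
             "what", "how", "does", "do", "for", "on"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword terms."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOPWORDS}


def compress_chunk(query, chunk, min_overlap=1):
    """Keep only the sentences that share terms with the query."""
    query_terms = tokenize(query)
    # Naive sentence split on ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", chunk.strip())
    kept = [s for s in sentences
            if len(tokenize(s) & query_terms) >= min_overlap]
    return " ".join(kept)


def compress_chunks(query, chunks, min_overlap=1):
    """Compress every chunk and drop those with nothing relevant left."""
    compressed = (compress_chunk(query, c, min_overlap) for c in chunks)
    return [c for c in compressed if c]


retrieved = [
    "Our company was founded in 2010. The refund window is 30 days "
    "from delivery. We have offices in three cities.",
    "Shipping usually takes five business days.",
]
context = compress_chunks("What is the refund window?", retrieved)
print(context)  # ['The refund window is 30 days from delivery.']
```

Only `context` would then be placed in the prompt for the main generation call. Keyword overlap is a deliberately crude relevance test: it is cheap and deterministic, but it misses paraphrases, which is where a smaller model or a separate LLM call would do better.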
python.langchain.com