What is the difference between learned sparse and lexical sparse retrieval?
RAG & Vector DB Interview: Hybrid Search, BM25, Rerankers, ColBERT, RRF Explained
Audio flashcard · 0:28 · Nortren
Lexical sparse retrieval, such as BM25, treats each vocabulary term as a dimension and weights it with hand-crafted statistics (term frequency and inverse document frequency), so it needs no training. Learned sparse retrieval, such as SPLADE or uniCOIL, keeps the same vocabulary-indexed sparse structure but learns the term weights from query-document pairs, and often adds expansion terms that do not appear in the original text. As a result, learned sparse models still match exact keywords the way lexical methods do, while also closing vocabulary gaps the way dense retrieval does. Because the output is still a sparse vector over the vocabulary, it can be served from the same inverted index infrastructure as BM25.
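To make the contrast concrete, here is a minimal Python sketch, not a reference implementation. It assumes whitespace tokenization and one common BM25 variant with k1 = 1.2 and b = 0.75. The "learned" vectors are hypothetical hand-written numbers standing in for what a model such as SPLADE might emit; no real model is called, and the helper names (bm25_doc_vectors, build_inverted_index, search) are illustrative only. Both kinds of vector go through the same inverted index and the same dot-product scoring.

```python
import math
from collections import Counter, defaultdict


def bm25_doc_vectors(docs, k1=1.2, b=0.75):
    """Lexical sparse: term weights come from corpus statistics, no training."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(toks) for toks in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vec = {}
        for term, f in tf.items():
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = f + k1 * (1 - b + b * len(toks) / avgdl)
            vec[term] = idf * f * (k1 + 1) / norm
        vectors.append(vec)
    return vectors


def build_inverted_index(vectors):
    """Map each term to a postings list of (doc_id, weight)."""
    index = defaultdict(list)
    for doc_id, vec in enumerate(vectors):
        for term, weight in vec.items():
            index[term].append((doc_id, weight))
    return index


def search(index, query_vec):
    """Score documents by the sparse dot product of query and document weights."""
    scores = defaultdict(float)
    for term, q_weight in query_vec.items():
        for doc_id, d_weight in index.get(term, []):
            scores[doc_id] += q_weight * d_weight
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


docs = ["automobile repair manual", "banana bread recipe"]

# Lexical index: only terms that literally occur in each document.
lexical_index = build_inverted_index(bm25_doc_vectors(docs))

# Hypothetical learned weights (made up for illustration): original terms are
# reweighted and expansion terms such as "car" and "vehicle" are added.
learned_vectors = [
    {"automobile": 1.8, "repair": 1.5, "manual": 0.6, "car": 1.4, "vehicle": 1.1},
    {"banana": 1.9, "bread": 1.6, "recipe": 1.2, "baking": 1.0},
]
learned_index = build_inverted_index(learned_vectors)

query = {"car": 1.0}
lexical_hits = search(lexical_index, query)   # "car" never occurs, so no match
learned_hits = search(learned_index, query)   # expansion term bridges the gap
print(lexical_hits, learned_hits)
```

In this toy setup the query "car" finds nothing in the lexical index, while the learned index returns the automobile document through its expansion term. The index and scoring code are identical in both cases; only the source of the weights differs.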
arxiv.org