How would you handle a RAG system where the corpus updates hourly?
RAG & Vector DB Interview: Common RAG Mistakes, Pitfalls, System Design Questions
Audio flashcard · Nortren
Use incremental ingestion rather than full rebuilds, keeping a document-level hash or timestamp for each document so that changes can be detected cheaply. Queue and batch updates to amortize embedding and indexing cost. Vector databases like Pinecone, Qdrant, and Milvus support online upserts without downtime, so most of the work is on the ingest side. Delete or mark stale documents so that outdated content is not served. For time-sensitive queries such as news, add recency boosts or filters based on timestamp metadata. Monitor freshness latency, meaning the time from a source update until the change is visible at query time.
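The change-detection step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: `content_hash`, `IngestPlan`, `plan_incremental_ingest`, and `batched` are hypothetical names, the hash store is a plain dictionary, and the actual embedding and vector-database calls are left out because they depend on the client library in use.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field
from typing import Iterator


def content_hash(text: str) -> str:
    """Return a stable fingerprint of a document's content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass
class IngestPlan:
    """IDs that need work this cycle (hypothetical structure)."""
    upserts: list[str] = field(default_factory=list)  # new or changed documents
    deletes: list[str] = field(default_factory=list)  # documents gone from the source


def plan_incremental_ingest(
    source_docs: dict[str, str],   # doc_id -> current text from the source
    known_hashes: dict[str, str],  # doc_id -> hash stored at the last ingest
) -> IngestPlan:
    """Compare the current source snapshot against stored hashes."""
    plan = IngestPlan()
    for doc_id, text in source_docs.items():
        # A missing or different hash means the document is new or changed.
        if known_hashes.get(doc_id) != content_hash(text):
            plan.upserts.append(doc_id)
    # Anything indexed earlier but absent from the source is stale.
    plan.deletes = [doc_id for doc_id in known_hashes if doc_id not in source_docs]
    return plan


def batched(ids: list[str], size: int) -> Iterator[list[str]]:
    """Yield fixed-size batches so embedding and upsert calls are amortized."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]
```

In an hourly job, each batch from `batched(plan.upserts, size)` would be embedded and upserted, the IDs in `plan.deletes` would be deleted or marked stale, and the stored hashes would be updated only after the upsert succeeds, so a failed batch is retried on the next run.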
docs.pinecone.io