Memotiva LLM Engineer Interview Questions: RAG Pipeline Design, Chunking Strategies, Hybrid Retrieval

How do you choose the right chunk size?

Nortren

Chunk size depends on the embedding model's optimal input length, the granularity of your queries, and the structure of your documents. Common starting points are 256 to 512 tokens for question answering and 1024 to 2048 tokens for summarization. The best size is found empirically by measuring retrieval quality on a representative evaluation set.
docs.llamaindex.ai
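
The empirical step in the answer above can be sketched as a small sweep over candidate chunk sizes. This is a minimal illustration, not a production evaluator: it assumes whitespace-separated words as a rough proxy for model tokens, uses a bag-of-words cosine similarity as a stand-in for a real embedding model, and scores each size by hit rate at k (whether a top-k chunk contains the expected answer string). The function names `chunk_words`, `hit_rate`, and `sweep` are hypothetical.

```python
import math
from collections import Counter


def chunk_words(text, size, overlap=0):
    """Split text into windows of `size` whitespace tokens (a crude token proxy)."""
    words = text.split()
    step = max(1, size - overlap)
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]


def bow(text):
    """Bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def hit_rate(chunks, eval_set, k=1):
    """Fraction of (query, answer) pairs whose answer appears in a top-k chunk."""
    vectors = [bow(c) for c in chunks]
    hits = 0
    for query, answer in eval_set:
        q = bow(query)
        ranked = sorted(
            range(len(chunks)), key=lambda i: cosine(q, vectors[i]), reverse=True
        )
        if any(answer.lower() in chunks[i].lower() for i in ranked[:k]):
            hits += 1
    return hits / len(eval_set)


def sweep(text, eval_set, sizes, k=1):
    """Evaluate each candidate chunk size on the same evaluation set."""
    return {size: hit_rate(chunk_words(text, size), eval_set, k) for size in sizes}


if __name__ == "__main__":
    corpus = "paris is the capital of france berlin is the capital of germany"
    questions = [("capital of france", "paris"), ("capital of germany", "berlin")]
    for size, score in sweep(corpus, questions, sizes=[3, 6, 12]).items():
        print(f"chunk size {size}: hit rate {score:.2f}")
```

A containment-based hit rate tends to favor larger chunks, since a bigger chunk is more likely to include the answer string. In practice it is worth pairing it with a precision-oriented metric or an end-to-end answer quality check before settling on a size.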