Memotiva: RAG & Vector DB Interview: HNSW, IVF, Product Quantization, ANN Search Explained

What is scalar quantization and how does it differ from product quantization?

Nortren

Scalar quantization compresses each dimension of a vector independently, typically mapping a 32-bit float to an 8-bit integer, which gives a 4x memory reduction with minimal recall loss. Product quantization splits the vector into groups of dimensions (subvectors) and replaces each group with the index of its nearest centroid in a learned codebook, achieving 16x to 64x compression at the cost of higher recall loss and more complex training. Scalar quantization is simpler, needs no codebook training (only the value range of the data), and is a safe default in Qdrant and Milvus. Use product quantization only when memory pressure is extreme and some recall sacrifice is acceptable.
qdrant.tech
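
To make the 4x and 16x to 64x figures concrete, here is a minimal pure-Python sketch of scalar quantization plus the byte arithmetic behind both schemes. The function names (`fit_ranges`, `sq_encode`, `sq_decode`, `compression_ratio`) are hypothetical and written for illustration only; this is not how Qdrant or Milvus implement quantization internally, and it assumes simple per-dimension min/max calibration and one-byte product quantization codes (256 centroids per subvector).

```python
def fit_ranges(vectors):
    """Per-dimension (min, max) over the dataset.

    This is a calibration pass over the data, not codebook training.
    """
    dims = len(vectors[0])
    return [
        (min(v[d] for v in vectors), max(v[d] for v in vectors))
        for d in range(dims)
    ]


def sq_encode(vec, ranges):
    """Map each float independently to one byte (0..255)."""
    codes = []
    for x, (lo, hi) in zip(vec, ranges):
        step = (hi - lo) / 255 or 1.0  # avoid dividing by zero on a constant dimension
        q = round((x - lo) / step)
        codes.append(max(0, min(255, q)))  # clamp values outside the calibrated range
    return bytes(codes)


def sq_decode(codes, ranges):
    """Approximate reconstruction; the error is at most half a step per dimension."""
    return [lo + c * ((hi - lo) / 255) for c, (lo, hi) in zip(codes, ranges)]


def compression_ratio(dims, code_bytes):
    """Size of a float32 vector (4 bytes per dimension) divided by its code size."""
    return (dims * 4) / code_bytes


# Scalar quantization: one byte per dimension.
print(compression_ratio(128, 128))
# Product quantization (assumed 1-byte codes): 32 subvectors of 4 dimensions each.
print(compression_ratio(128, 32))
# Product quantization (assumed 1-byte codes): 8 subvectors of 16 dimensions each.
print(compression_ratio(128, 8))
```

The scalar path never looks at more than one dimension at a time, which is why it needs only a value range. Product quantization gets its extra compression by spending one code on several dimensions at once, and that is where both the codebook training and the additional recall loss come from.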