What is scalar quantization and how does it differ from product quantization?
RAG & Vector DB Interview: HNSW, IVF, Product Quantization, ANN Search Explained
Scalar quantization compresses each dimension of a vector independently, typically mapping 32-bit floats to 8-bit integers, which gives a 4x memory reduction with minimal recall loss. Product quantization splits the vector into groups of dimensions and replaces each group with the index of its nearest centroid in a learned codebook, achieving 16x to 64x compression at the cost of higher recall loss and more complex training. Scalar quantization is simpler, needs no codebook training (only the value range of the data), and is a safe default in Qdrant and Milvus. Use product quantization only when memory pressure is extreme and some recall sacrifice is acceptable.
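To make the contrast concrete, here is a minimal NumPy sketch of both schemes. It is an illustration under stated assumptions, not how Qdrant or Milvus implement quantization: the function names (scalar_quantize, train_pq, pq_encode) are hypothetical, the scalar quantizer uses a plain per-dimension min/max range, and the product quantizer trains its codebooks with a few naive k-means iterations on random data.

```python
import numpy as np

def scalar_quantize(vectors):
    """Map every float32 value to uint8 using a per-dimension min/max range."""
    lo = vectors.min(axis=0)
    hi = vectors.max(axis=0)
    scale = (hi - lo) / 255.0
    scale[scale == 0] = 1.0  # constant dimensions: avoid division by zero
    codes = np.round((vectors - lo) / scale).astype(np.uint8)
    return codes, lo, scale

def scalar_dequantize(codes, lo, scale):
    """Approximate reconstruction of the original floats."""
    return codes.astype(np.float32) * scale + lo

def train_pq(vectors, n_subvectors, n_centroids, n_iters=5, seed=0):
    """Learn one k-means codebook per group of dimensions (naive Lloyd steps)."""
    rng = np.random.default_rng(seed)
    n, d = vectors.shape
    sub_dim = d // n_subvectors  # assumes d is divisible by n_subvectors
    codebooks = []
    for m in range(n_subvectors):
        sub = vectors[:, m * sub_dim:(m + 1) * sub_dim]
        # Initialise centroids from randomly chosen training points.
        centroids = sub[rng.choice(n, n_centroids, replace=False)].copy()
        for _ in range(n_iters):
            dists = ((sub[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            assign = dists.argmin(axis=1)
            for k in range(n_centroids):
                members = sub[assign == k]
                if len(members):
                    centroids[k] = members.mean(axis=0)
        codebooks.append(centroids)
    return np.stack(codebooks)  # shape: (n_subvectors, n_centroids, sub_dim)

def pq_encode(vectors, codebooks):
    """Replace each group of dimensions with the index of its nearest centroid."""
    n_subvectors, _, sub_dim = codebooks.shape
    codes = np.empty((vectors.shape[0], n_subvectors), dtype=np.uint8)
    for m in range(n_subvectors):
        sub = vectors[:, m * sub_dim:(m + 1) * sub_dim]
        dists = ((sub[:, None, :] - codebooks[m][None, :, :]) ** 2).sum(axis=2)
        codes[:, m] = dists.argmin(axis=1)
    return codes

# Synthetic data: 1000 vectors of 64 float32 dimensions (256 bytes each).
rng = np.random.default_rng(42)
vectors = rng.standard_normal((1000, 64)).astype(np.float32)

# Scalar quantization: one byte per dimension, no codebook to learn.
sq_codes, lo, scale = scalar_quantize(vectors)

# Product quantization: 16 groups of 4 dimensions, 256 centroids per group,
# so each vector is stored as 16 one-byte centroid indices.
codebooks = train_pq(vectors, n_subvectors=16, n_centroids=256)
pq_codes = pq_encode(vectors, codebooks)

print("SQ compression:", vectors.nbytes / sq_codes.nbytes)
print("PQ compression:", vectors.nbytes / pq_codes.nbytes)
```

With these settings the scalar codes take 64 bytes per vector (4x smaller) and the product codes take 16 bytes per vector (16x smaller, ignoring the shared codebooks). Using 4 groups of 16 dimensions instead would give 64x. The scalar path needs only a min and a scale per dimension, while the product path has to run k-means first, which is the extra training cost the answer refers to.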
qdrant.tech