RAG & Vector DB Interview: Pinecone vs Qdrant vs Weaviate vs Milvus vs pgvector

This section compares five popular vector databases: Pinecone, Qdrant, Weaviate, Milvus, and pgvector. Knowing where each one is strong and where it falls short helps you choose the right store for a RAG workload.

Nortren·

What is the difference between Pinecone and Qdrant?

Pinecone is a fully managed proprietary vector database with serverless and pod-based deployment, focused on zero-ops simplicity and tight AWS, GCP, and Azure integration. Qdrant is an open-source vector database written in Rust that offers a managed cloud option and self-hosting, with strong filtering, payload support, and quantization features. Pinecone wins on ease of use and enterprise support, while Qdrant wins on cost at scale, self-hosting flexibility, and fine-grained control over index internals and metadata filtering.

What is the difference between Pinecone and Weaviate?

Pinecone is a closed-source managed service optimized for simple vector search at scale, with a minimal API and automatic infrastructure management. Weaviate is open-source, offers both cloud and self-hosted deployments, and bundles modules for vectorization, reranking, generative search, and multi-tenancy directly into the database. Weaviate uses GraphQL alongside REST, supports hybrid search natively, and targets teams that want more built-in AI features, while Pinecone targets teams that want pure vector infrastructure without extra moving parts.

What is the difference between Milvus and Qdrant?

Milvus is a distributed open-source vector database written in Go and C++, designed to scale horizontally across many nodes with separate compute and storage tiers, and it is often used for billion-vector deployments. Qdrant is written in Rust, prioritizes single-node efficiency with rich filtering and payload features, and added horizontal scaling more recently. Milvus wins on proven scale and cloud-native architecture with components like etcd and object storage, while Qdrant wins on single-node performance, memory efficiency, and simpler operations for small-to-medium deployments.

What is pgvector and when should you use it over a dedicated vector database?

pgvector is a PostgreSQL extension that adds a vector column type with HNSW and IVFFlat indexes, letting you run vector search alongside relational data in the same database. Use pgvector when vector search is secondary to relational workloads, when you already operate PostgreSQL, or when keeping one database simplifies your stack. Dedicated vector databases win at very large scale, when you need advanced features like complex hybrid search or GPU acceleration, or when vector workload size would hurt PostgreSQL performance on other queries.
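
As a rough illustration of running vector search alongside relational data, the sketch below builds the SQL a pgvector setup might use. The table name `documents`, its columns, the 3-dimension vector size, and the helper names are assumptions made for this example; the `vector` column type, the `hnsw` index method with `vector_cosine_ops`, and the `<=>` cosine-distance operator come from pgvector. The statements are only built and printed, not executed against a database.

```python
# Sketch only: builds SQL strings for a hypothetical `documents` table.
# Nothing here connects to PostgreSQL.

def to_vector_literal(values):
    """Format a Python sequence as a pgvector text literal such as '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"

# Hypothetical schema: relational columns and an embedding in the same table.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE documents (
    id bigserial PRIMARY KEY,
    tenant_id integer NOT NULL,
    body text,
    embedding vector(3)  -- 3 dimensions only to keep the example short
);
CREATE INDEX ON documents USING hnsw (embedding vector_cosine_ops);
"""

def nearest_documents_sql(query_embedding, tenant_id, limit=5):
    """Return a query that mixes a relational WHERE clause with vector ordering."""
    literal = to_vector_literal(query_embedding)
    return (
        "SELECT id, body FROM documents "
        f"WHERE tenant_id = {int(tenant_id)} "
        f"ORDER BY embedding <=> '{literal}' "
        f"LIMIT {int(limit)};"
    )

if __name__ == "__main__":
    print(SETUP_SQL)
    print(nearest_documents_sql([0.1, 0.2, 0.3], tenant_id=42))
```

In real code the values would be passed as bound parameters through a PostgreSQL driver rather than formatted into the string; the inline formatting here only keeps the sketch dependency-free.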

Which vector database has the best performance benchmarks?

Benchmark results depend heavily on dataset size, dimensionality, recall target, and hardware, so no single database wins universally. Qdrant and Milvus typically lead on query throughput and indexing speed in public benchmarks like ANN Benchmarks and the Qdrant vector-db-benchmark. Pinecone optimizes latency for serverless workloads with cold-start trade-offs. pgvector with HNSW lags dedicated databases on raw speed but wins on integration simplicity. Always run benchmarks on your own data distribution since synthetic datasets rarely predict production performance.
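
Because benchmark numbers are only comparable at a fixed recall target, it helps to measure recall on your own data. The sketch below is a minimal, assumed setup: it computes the exact top-k by brute force and scores an approximate result list against it. The function names and the tiny dataset are hypothetical, and the "approximate" list is written by hand to stand in for whatever an ANN index would return.

```python
import math

def exact_top_k(query, vectors, k):
    """Brute-force nearest neighbours by Euclidean distance; returns ids, best first."""
    ranked = sorted(vectors, key=lambda vid: math.dist(query, vectors[vid]))
    return ranked[:k]

def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k that the approximate result recovered."""
    if not exact_ids:
        return 0.0
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)

# Hypothetical toy data: id -> vector.
VECTORS = {
    "a": [0.0, 0.0],
    "b": [1.0, 0.0],
    "c": [0.0, 2.0],
    "d": [5.0, 5.0],
}

if __name__ == "__main__":
    query = [0.1, 0.0]
    truth = exact_top_k(query, VECTORS, k=3)
    approx = ["a", "b", "d"]  # stand-in for an ANN index result that missed "c"
    print("ground truth:", truth)
    print("recall@3:", recall_at_k(approx, truth))
```

Running the same measurement over a sample of real queries, at the recall level your application needs, gives a fairer basis for comparing throughput and latency than a published leaderboard.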

What is the difference between serverless and provisioned vector database pricing?

Serverless pricing charges for actual usage like reads, writes, and storage, with no fixed node cost, ideal for variable or low-volume workloads. Provisioned pricing charges a fixed rate for dedicated nodes or pods regardless of utilization, which is predictable and cheaper at steady high volume. Pinecone offers both models explicitly, Qdrant Cloud offers cluster-based pricing, and self-hosted deployments are purely provisioned. Serverless suits early-stage products with spiky traffic, while provisioned suits mature products with predictable load.
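
The trade-off can be made concrete with a small cost model. Every price below is an invented placeholder, not any vendor's actual rate, and the function and parameter names are hypothetical; the point is only the shape of the comparison: usage-based cost grows with traffic while provisioned cost stays flat.

```python
def serverless_cost(reads_millions, writes_millions, storage_gb,
                    price_per_million_reads=8.0,    # placeholder price
                    price_per_million_writes=2.0,   # placeholder price
                    price_per_gb=0.30):             # placeholder price
    """Monthly cost when billed purely on usage."""
    return (reads_millions * price_per_million_reads
            + writes_millions * price_per_million_writes
            + storage_gb * price_per_gb)

def provisioned_cost(nodes, price_per_node=150.0):  # placeholder price
    """Monthly cost for dedicated nodes, independent of utilization."""
    return nodes * price_per_node

def cheaper_model(reads_millions, writes_millions, storage_gb, nodes):
    """Return which pricing model is cheaper for the given monthly workload."""
    usage = serverless_cost(reads_millions, writes_millions, storage_gb)
    fixed = provisioned_cost(nodes)
    return "serverless" if usage < fixed else "provisioned"

if __name__ == "__main__":
    # Early-stage product: low, spiky traffic.
    print(cheaper_model(reads_millions=2, writes_millions=1, storage_gb=10, nodes=2))
    # Mature product: steady high volume on the same two nodes.
    print(cheaper_model(reads_millions=50, writes_millions=10, storage_gb=100, nodes=2))
```

Plugging in a provider's published rates and your own traffic forecast turns this into a break-even estimate for when to move from serverless to provisioned capacity.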

Which vector databases support hybrid search out of the box?

Qdrant, Weaviate, Milvus, Elasticsearch, and OpenSearch support hybrid search natively, combining dense vector similarity with sparse BM25 or learned sparse retrieval in a single query. Pinecone supports hybrid search through sparse-dense vectors. pgvector requires you to combine vector search manually with PostgreSQL full-text search using tsvector. Weaviate and Qdrant both support Reciprocal Rank Fusion (RRF) alongside a score-based fusion option, while Elasticsearch offers multiple fusion strategies including RRF and linear score combination.
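
Reciprocal Rank Fusion itself is simple enough to sketch in a few lines. The version below is a generic illustration, not any database's internal implementation: each document scores 1 / (k + rank) in every list it appears in, and the scores are summed. The constant k = 60 is the commonly cited default; the document ids and function name are hypothetical.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked id lists (each ordered best first) into one ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

if __name__ == "__main__":
    dense_hits = ["d1", "d2", "d3"]   # hypothetical vector-similarity ranking
    sparse_hits = ["d3", "d1", "d4"]  # hypothetical BM25 ranking
    print(reciprocal_rank_fusion([dense_hits, sparse_hits]))
```

Because RRF uses only ranks, it needs no score normalization between the dense and sparse retrievers, which is one reason it is a common choice when the two score scales are not comparable.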

Which vector database should you choose for a startup RAG prototype?

Choose pgvector if you already use PostgreSQL, since it adds vector search with zero new infrastructure and handles millions of vectors adequately. Choose Pinecone serverless for the fastest zero-ops setup with a generous free tier and no servers to manage. Choose Qdrant or Weaviate Cloud if you want open-source underneath with a managed experience and easy migration to self-hosting later. Avoid Milvus for small prototypes since its multi-component architecture is overkill until you reach tens of millions of vectors.

Which vector database handles metadata filtering best?

Qdrant has the most expressive filtering system, with nested conditions, geo filters, full-text match, and payload indexing that lets the cost of a filtered search track the number of points matching the filter rather than the size of the whole collection. Weaviate and Milvus both support filtered search with good performance when metadata fields are indexed. Pinecone supports metadata filtering but with simpler operators and historically slower performance on highly selective filters. pgvector inherits PostgreSQL's full SQL filtering, which is the most flexible but depends on how well the query planner combines the vector index scan with WHERE clauses.

What is pre-filtering versus post-filtering in vector search?

Pre-filtering applies metadata filters before vector search, restricting the candidate set the ANN index must consider. Post-filtering runs vector search first, then filters the top-k results by metadata, which is simpler but can return too few results when the filter is highly selective. Pre-filtering is efficient when metadata fields are indexed but requires the ANN algorithm to handle filtered traversal, which can degrade recall if implemented naively. Qdrant and Weaviate integrate filtering directly into HNSW traversal to maintain recall under filters.
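
The difference is easiest to see on a toy dataset. The sketch below uses brute-force distance sorting in place of a real ANN index, so it shows only the ordering of the two steps, not how any database implements filtered HNSW traversal. The data, field names, and function names are hypothetical.

```python
import math

def post_filter_search(query, points, predicate, k):
    """Take the k nearest points first, then drop those that fail the filter."""
    nearest = sorted(points, key=lambda p: math.dist(query, p["vector"]))[:k]
    return [p["id"] for p in nearest if predicate(p)]

def pre_filter_search(query, points, predicate, k):
    """Restrict candidates with the filter first, then take the k nearest."""
    candidates = [p for p in points if predicate(p)]
    nearest = sorted(candidates, key=lambda p: math.dist(query, p["vector"]))[:k]
    return [p["id"] for p in nearest]

# Hypothetical collection: the points closest to the query are all English.
POINTS = [
    {"id": 1, "vector": [0.0, 0.0], "lang": "en"},
    {"id": 2, "vector": [0.1, 0.0], "lang": "en"},
    {"id": 3, "vector": [0.2, 0.0], "lang": "en"},
    {"id": 4, "vector": [3.0, 0.0], "lang": "de"},
    {"id": 5, "vector": [4.0, 0.0], "lang": "de"},
]

def is_german(point):
    return point["lang"] == "de"

if __name__ == "__main__":
    query = [0.0, 0.0]
    print("post-filter:", post_filter_search(query, POINTS, is_german, k=2))
    print("pre-filter: ", pre_filter_search(query, POINTS, is_german, k=2))
```

With a selective filter, the post-filter variant discards every one of its top-k candidates, while the pre-filter variant still returns k matching points. A common workaround for post-filtering is to over-fetch (request several times k), at the cost of extra work per query.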

How do vector databases handle multi-tenancy?

Multi-tenancy isolates data between users or customers sharing the same cluster. Pinecone uses namespaces within an index, Weaviate has first-class multi-tenancy with per-tenant shards that can be hot or cold, Qdrant uses collections or payload-based tenant filtering, and Milvus supports databases and collections as isolation units. Strong multi-tenancy requires per-tenant quotas, isolated indexes for performance, and security guarantees that one tenant cannot access another's data. Weaviate's dedicated multi-tenancy feature is the most feature-complete for SaaS workloads.
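
The partition-per-tenant idea behind namespaces and per-tenant shards can be sketched with a small in-memory store. This is a conceptual toy with hypothetical class and method names, not the API of any of the databases above: each tenant gets its own partition, queries only ever scan the caller's partition, and a simple quota caps how many vectors one tenant may hold.

```python
import math

class TenantQuotaExceeded(Exception):
    """Raised when a tenant tries to store more vectors than its quota allows."""

class NamespacedVectorStore:
    def __init__(self, max_vectors_per_tenant=1000):
        self._partitions = {}  # tenant -> {vector_id: vector}
        self._max = max_vectors_per_tenant

    def upsert(self, tenant, vector_id, vector):
        partition = self._partitions.setdefault(tenant, {})
        # Overwriting an existing id does not count against the quota.
        if vector_id not in partition and len(partition) >= self._max:
            raise TenantQuotaExceeded(tenant)
        partition[vector_id] = list(vector)

    def query(self, tenant, vector, k=3):
        # Only the caller's partition is scanned, so results never cross tenants.
        partition = self._partitions.get(tenant, {})
        ranked = sorted(partition, key=lambda vid: math.dist(vector, partition[vid]))
        return ranked[:k]

if __name__ == "__main__":
    store = NamespacedVectorStore(max_vectors_per_tenant=2)
    store.upsert("acme", "a1", [0.0, 0.0])
    store.upsert("globex", "g1", [0.0, 0.0])
    print(store.query("acme", [0.0, 0.0]))
    print(store.query("globex", [0.0, 0.0]))
```

Payload-based tenant filtering takes the opposite approach: all tenants share one index and every query carries a mandatory tenant filter, which is cheaper for many small tenants but puts the isolation guarantee in application code.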

Which vector databases support on-disk and memory-mapped indexes?

Qdrant supports on-disk storage for both vectors and HNSW indexes, with explicit memory-mapping configuration per collection. Milvus supports DiskANN and memory-mapped indexes for large datasets. Weaviate supports a flat index on disk for cold tenants. pgvector stores its indexes on disk and caches them through PostgreSQL's standard buffer management. Pinecone handles this automatically in its serverless tier. On-disk indexes trade query latency for dramatically lower RAM cost, which matters when serving billions of vectors.
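
A quick back-of-the-envelope calculation shows why RAM cost dominates at this scale. The sketch below counts only raw vector storage and ignores index graph overhead, payloads, and replication; the 768-dimension figure and the function names are assumptions for illustration.

```python
def raw_vector_bytes(num_vectors, dimensions, bytes_per_component=4):
    """Bytes needed for the vectors alone (4 bytes per component for float32)."""
    return num_vectors * dimensions * bytes_per_component

def to_gib(num_bytes):
    """Convert bytes to gibibytes."""
    return num_bytes / (1024 ** 3)

if __name__ == "__main__":
    n, dim = 1_000_000_000, 768  # assumed: one billion 768-dimensional embeddings
    float32_bytes = raw_vector_bytes(n, dim)                     # full precision
    int8_bytes = raw_vector_bytes(n, dim, bytes_per_component=1)  # scalar-quantized
    print(f"float32: {to_gib(float32_bytes):,.0f} GiB")
    print(f"int8:    {to_gib(int8_bytes):,.0f} GiB")
```

Under these assumptions the float32 vectors alone come to about 3 TB, well beyond a single machine's memory, which is where on-disk and memory-mapped indexes, usually combined with quantization, become the practical option.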