RAG & Vector DB Interview: Weaviate Modules, Multi-tenancy, GraphQL, Hybrid Search

This section is part of a detailed comparison of popular vector databases such as Pinecone, Qdrant, Weaviate, and Milvus; the focus here is Weaviate. Understanding the strengths and trade-offs of these systems is crucial when choosing a retrieval backend for a RAG application.

12 audio · 5:41

What is Weaviate and what are its main features?

0:27
Weaviate is an open-source vector database written in Go that supports vector search, hybrid search, and a module system for built-in vectorization, reranking, and generative AI integration. It exposes both REST and GraphQL APIs and supports multi-tenancy, replication, and horizontal scaling. Weaviate can automatically vectorize data at ingest using configured embedding modules from OpenAI, Cohere, Hugging Face, or local models, removing the need for a separate embedding pipeline. It runs self-hosted or on Weaviate Cloud.

What is a Weaviate collection and how is it structured?

0:31
A Weaviate collection, formerly called a class, is a set of objects with a shared schema defining properties, vectorizer settings, and index configuration. Each object has properties like text, numbers, dates, or references to other objects, plus one or more vector embeddings. Collections define the data model up front, similar to a table in relational databases, with strong typing on property values. Weaviate supports cross-references between collections, enabling graph-like queries that combine vector similarity with structural navigation across related objects.
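The schema described above can be sketched as the JSON-style payload sent to Weaviate's REST schema endpoint, here built as a Python dict. The collection name, property names, and vectorizer choice are hypothetical examples, not part of the source.

```python
# Sketch of a collection (class) definition of the kind sent to
# POST /v1/schema. Names and the vectorizer choice are illustrative.
article_collection = {
    "class": "Article",               # collection name
    "vectorizer": "text2vec-openai",  # module that embeds text properties
    "properties": [
        {"name": "title", "dataType": ["text"]},
        {"name": "publishedAt", "dataType": ["date"]},
        {"name": "wordCount", "dataType": ["int"]},
        # a cross-reference property pointing at another collection
        {"name": "hasAuthor", "dataType": ["Author"]},
    ],
}
```

Note how each property is strongly typed, and how a reference property names the target collection instead of a scalar type.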

What are Weaviate modules and how do they extend the database?

0:28
Weaviate modules are plugins that add capabilities like vectorization, reranking, generative search, question answering, named entity recognition, and image classification to the database. Modules run inside or alongside Weaviate and are configured per collection, so one collection might use OpenAI embeddings while another uses a local Hugging Face model. The text2vec modules automatically embed text at ingest and at query time, eliminating client-side embedding code. Modules make Weaviate a more integrated AI platform than pure vector databases.

How does hybrid search work in Weaviate?

0:28
Weaviate runs a dense vector search and a BM25 keyword search in parallel, then fuses the results using Reciprocal Rank Fusion or a relative score ranking method. The query API accepts an alpha parameter between zero and one, where zero is pure keyword, one is pure vector, and values in between blend both. Weaviate also supports fusion-aware filtering, reranking on hybrid results, and per-property BM25 boosts. This makes hybrid search a single API call rather than client-side orchestration of two separate queries.
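The Reciprocal Rank Fusion step above can be sketched in a few lines of plain Python. The document IDs are made up, and k=60 is a commonly used constant, not a value from the source.

```python
# Minimal sketch of Reciprocal Rank Fusion (RRF): each result list
# contributes 1/(k + rank) per document, and fused results are
# ordered by the summed score.
def rrf(rankings, k=60):
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["doc3", "doc1", "doc7"]    # keyword ranking
vector_hits = ["doc1", "doc9", "doc3"]  # dense ranking
fused = rrf([bm25_hits, vector_hits])
print(fused[0])  # doc1: ranks well in both lists, so it wins
```

A document that appears near the top of both rankings beats one that tops only a single list, which is exactly the behavior hybrid search wants.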

What is Weaviate's multi-tenancy feature?

0:28
Weaviate supports first-class multi-tenancy where each tenant has its own shard within a collection, giving strong data isolation and independent scaling per tenant. Tenants can be in three states: active (hot), loaded in memory for fast queries; inactive (cold), on disk and loadable on demand; or offloaded (frozen), moved to object storage for minimal cost. This lets SaaS platforms serve thousands of tenants with mixed usage patterns efficiently, activating only the tenants currently being queried. Multi-tenancy is enabled per collection at creation time.
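Enabling multi-tenancy at collection creation time can be sketched as the schema payload below. The collection and tenant names are hypothetical.

```python
# Sketch of a collection created with multi-tenancy enabled,
# mirroring the REST schema payload shape.
docs_collection = {
    "class": "Document",
    "multiTenancyConfig": {"enabled": True},
    "properties": [{"name": "body", "dataType": ["text"]}],
}

# Each tenant then gets its own shard within the collection, and
# every read or write request must name the tenant it targets.
tenants = [{"name": "customer-a"}, {"name": "customer-b"}]
```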

How does Weaviate use GraphQL for queries?

0:29
Weaviate exposes a GraphQL API alongside REST, letting clients specify exactly which properties and references to fetch in a single query. GraphQL is particularly useful for complex queries that combine vector search, filters, and cross-references, since it returns structured data matching the schema without over-fetching. The GraphQL API supports nearVector, nearText, hybrid, bm25, and filter operations, plus aggregations like counts and averages. Most client libraries wrap GraphQL with typed builders, hiding the string-based query syntax.
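A query of the kind described above can be sketched as the raw GraphQL string a client would POST to the `/v1/graphql` endpoint. The collection name, properties, and search text are hypothetical.

```python
# Sketch of a hybrid-search GraphQL query as the raw string a client
# POSTs to /v1/graphql. alpha: 0.5 weights keyword and vector equally.
hybrid_query = """
{
  Get {
    Article(
      hybrid: { query: "vector database scaling", alpha: 0.5 }
      limit: 5
    ) {
      title
      _additional { score }
    }
  }
}
"""
```

Only the requested properties (`title` and the `_additional` score) come back, which is the no-over-fetching property GraphQL is used for here.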

What distance metrics does Weaviate support?

0:28
Weaviate supports cosine similarity, dot product, squared Euclidean, Manhattan, and Hamming distance. Cosine is the default for text embeddings because it compares angle regardless of magnitude. The metric is set per collection in the vector index configuration and cannot be changed without reindexing. Dot product is faster when vectors are pre-normalized, Euclidean suits some image and geometry embeddings, and Hamming works with binary vectors. Most text RAG workloads use cosine or dot product depending on the embedding model's training regime.
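The magnitude-invariance point above can be illustrated with two vectors pointing the same direction but at different scales; the vectors are made-up examples.

```python
# Sketch of why cosine compares angle only, while dot product
# grows with magnitude.
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm

v = [1.0, 2.0, 3.0]
w = [2.0, 4.0, 6.0]  # same direction, twice the magnitude

print(cosine_similarity(v, w))  # ~1.0: identical angle
print(dot(v, w))                # 28.0: scales with magnitude
```

If the embedding model outputs unit-length vectors, dot product and cosine rank results identically, which is why dot product is the cheaper choice for pre-normalized vectors.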

How does Weaviate handle replication and consistency?

0:29
Weaviate supports synchronous replication across nodes with configurable consistency levels for reads and writes, from ONE acknowledgment for low latency to ALL for strong consistency. The replication factor is set per collection and determines how many copies of each shard exist in the cluster. Writes are acknowledged by the configured number of replicas before returning success, giving tunable trade-offs between durability, latency, and availability. Weaviate uses Raft-based consensus for schema changes and cluster metadata, while vector data replicates through a separate path.
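The trade-off between the consistency levels can be sketched with the standard quorum-overlap rule: with replication factor n, a read is guaranteed to see the latest write when the write and read acknowledgment counts together exceed n. This is general quorum arithmetic, not Weaviate-specific code.

```python
# Sketch of tunable-consistency arithmetic: a read sees the latest
# write whenever the write set and read set must overlap, i.e.
# W + R > n.
def overlaps(n, write_level, read_level):
    return write_level + read_level > n

n = 3                 # replication factor
quorum = n // 2 + 1   # QUORUM = 2 of 3 replicas

print(overlaps(n, quorum, quorum))  # QUORUM writes + QUORUM reads: True
print(overlaps(n, 1, 1))            # ONE + ONE: False, stale reads possible
```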

What is generative search in Weaviate?

0:28
Generative search is a Weaviate feature where the database retrieves relevant objects and calls a generative language model on them in a single query, returning a generated answer alongside the retrieved objects. It wraps the full RAG pattern inside the database, eliminating the need to glue retrieval and generation in client code. Supported generator modules include OpenAI, Anthropic, Cohere, and local models through an Ollama module. This simplifies prototypes but may be less flexible than custom orchestration for production RAG with complex prompts.
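The retrieve-then-generate pattern above can be sketched as a single GraphQL query string with a `generate` clause; the collection name, search concepts, and prompt are hypothetical.

```python
# Sketch of a generative-search query: nearText retrieves objects,
# and the generate clause asks the configured generator module to
# produce text per result. The {title} placeholder interpolates a
# retrieved property into the prompt.
generative_query = """
{
  Get {
    Article(nearText: { concepts: ["vector databases"] }, limit: 3) {
      title
      _additional {
        generate(singleResult: { prompt: "Summarize: {title}" }) {
          singleResult
        }
      }
    }
  }
}
"""
```

Retrieval and generation happen in one round trip, which is the glue code this feature removes.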

How does Weaviate handle cross-references between collections?

0:29
Weaviate supports cross-references, which are typed pointers from one object to another collection's object, similar to foreign keys but without enforced integrity. Queries can follow references to fetch linked data in a single call, enabling graph-like navigation without a separate graph database. Cross-references are useful when your data has relationships like articles and authors or products and categories, and when you want retrieval to surface related objects along with the direct matches. They cost extra query time proportional to reference depth.
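An ingest payload carrying a cross-reference can be sketched as below: the reference property holds a beacon URI pointing at the target object's UUID. The names and UUID are hypothetical placeholders.

```python
# Sketch of an object whose reference property points at an Author
# object via a beacon URI (the target object's UUID is made up).
author_uuid = "36ddd591-2dee-4e7e-a3cc-eb86d30a4303"
article = {
    "class": "Article",
    "properties": {
        "title": "Hybrid search in practice",
        "hasAuthor": [
            {"beacon": f"weaviate://localhost/Author/{author_uuid}"}
        ],
    },
}
```

Nothing enforces that the target exists, which is the "foreign keys without enforced integrity" caveat from above.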

What is the Weaviate BYOV (Bring Your Own Vectors) approach?

0:27
BYOV, or Bring Your Own Vectors, lets you embed data outside Weaviate and send vectors directly with ingest requests, bypassing Weaviate's vectorizer modules. This is useful when you need fine-grained control over the embedding pipeline, use a model not supported by any module, or want to batch-embed data in a separate job for cost reasons. You can mix BYOV with module-based vectorization per collection. Queries that bypass the vectorizer must also provide query vectors directly rather than using nearText.
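A BYOV ingest payload can be sketched as follows: the object carries a `vector` field with a client-computed embedding, so no vectorizer module runs. The property values and embedding numbers are made-up placeholders.

```python
# Sketch of a BYOV object payload of the kind sent to POST /v1/objects:
# the "vector" field supplies a pre-computed embedding directly.
obj = {
    "class": "Article",
    "properties": {"title": "Self-embedded document"},
    # Placeholder values; a real embedding typically has hundreds to
    # thousands of dimensions, fixed by the model.
    "vector": [0.12, -0.03, 0.44, 0.08],
}
```

At query time the same rule applies: searches against this collection supply a query vector directly (nearVector) rather than text for a module to embed.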

How does Weaviate support filters during vector search?

0:29
Weaviate applies filters before or during vector search depending on index configuration and filter selectivity. Filters can target any indexed property with operators like Equal, GreaterThan, LessThan, Like for pattern matching, and ContainsAny for array fields. Weaviate offers two filtered-search strategies: ACORN, which evaluates the filter inside the HNSW traversal and suits highly restrictive filters, and sweeping, which restricts candidates before vector comparison. The strategy is set in the vector index configuration, trading recall against latency for a given filter cardinality.
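A compound filter in the where-filter shape described above can be sketched as a Python dict; the property names and values are hypothetical.

```python
# Sketch of a compound where-filter: path names the property,
# operator picks the comparison, and the value key is typed
# (valueText, valueInt, ...).
where_filter = {
    "operator": "And",
    "operands": [
        {"path": ["category"], "operator": "Equal", "valueText": "blog"},
        {"path": ["wordCount"], "operator": "GreaterThan", "valueInt": 500},
    ],
}
```

This structure is attached to a vector, BM25, or hybrid query, so filtering and similarity search run as one request rather than a post-filtering pass in client code.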