RAG & Vector DB Interview: Embeddings, Cosine Similarity, Dimensions, Models Compared

Why are most production embeddings normalized to unit length?

Audio flashcard · 0:29


Normalization to unit length lets you use the dot product instead of cosine similarity without changing the ranking, and dot product is faster to compute in vector databases. It also makes similarities comparable across texts, since two unit vectors cannot have inflated scores from large magnitudes. OpenAI text-embedding-3 models, Cohere embeddings, and most sentence-transformers models return normalized vectors by default. If your model does not normalize, divide each vector by its L2 norm before insertion; otherwise dot-product search will return rankings that differ from true cosine similarity.
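A minimal NumPy sketch of the claim above, using two made-up raw vectors in place of real model output: after L2 normalization, the plain dot product of the unit vectors equals the cosine similarity of the originals.

```python
import numpy as np

# Hypothetical raw embeddings from a model that does not normalize its output.
a = np.array([3.0, 4.0])
b = np.array([1.0, 2.0])

def l2_normalize(v: np.ndarray) -> np.ndarray:
    """Divide a vector by its L2 norm so it has unit length."""
    return v / np.linalg.norm(v)

a_hat, b_hat = l2_normalize(a), l2_normalize(b)

# Cosine similarity of the raw vectors...
cosine = a.dot(b) / (np.linalg.norm(a) * np.linalg.norm(b))
# ...equals the plain dot product of the normalized vectors.
dot_of_units = a_hat.dot(b_hat)

assert np.isclose(cosine, dot_of_units)
```

The same normalization step, applied once at insertion time, is what lets a vector database skip the per-query norm divisions entirely.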
platform.openai.com