infrastructure · intermediate

Vector Database

Also known as: vector store, vector search, ANN index, embedding database

A specialized database for storing and querying high-dimensional embedding vectors using similarity search rather than exact key lookup.


Definition

A vector database is a storage system built for high-dimensional embedding vectors and for fast approximate nearest-neighbor (ANN) search over them. Where traditional databases retrieve records by exact key or index match, a vector database retrieves records by similarity: it returns the vectors, and their associated documents, that lie closest to a query vector in the embedding space under a distance metric such as cosine similarity or Euclidean distance. Because embedding models place semantically related items near one another, this proximity acts as a proxy for semantic similarity. This makes vector databases a foundational infrastructure layer for retrieval-augmented generation (RAG) systems, semantic search, recommendation engines, duplicate detection, and long-term agent memory.
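To make the contrast with exact key lookup concrete, here is a minimal sketch of similarity search as an exhaustive (brute-force) scan using cosine similarity. The three-dimensional vectors are hand-made placeholders rather than the output of a real embedding model, and the names `store` and `nearest` are illustrative. A real vector database would replace the full scan with an ANN index.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query, store, k=2):
    """Return the k document ids whose vectors are most similar to the query.

    This scans every stored vector (exact search); an ANN index trades a
    little accuracy to avoid the full scan.
    """
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in store.items()]
    scored.sort(reverse=True)
    return [doc_id for _, doc_id in scored[:k]]

# Toy "embeddings" (assumption: hand-written, not produced by a model).
store = {
    "cat care guide": [0.9, 0.1, 0.0],
    "kitten feeding tips": [0.8, 0.2, 0.1],
    "tax filing deadline": [0.0, 0.1, 0.9],
}

query = [1.0, 0.0, 0.0]  # a query vector pointing in the "cat" direction
top = nearest(query, store, k=2)
print(top)
```

Note that nothing in `store` equals the query: the two cat-related entries are returned because their vectors point in nearly the same direction, which is the behavior an exact-match index cannot provide.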

Common vector database systems include Pinecone, Weaviate, Qdrant, Milvus, Chroma, and pgvector (a PostgreSQL extension). Each makes different trade-offs among index algorithm (HNSW, short for Hierarchical Navigable Small World, is the most widely used for low-latency, high-recall search), filtering capabilities (metadata predicates applied alongside vector queries), scalability, and operational complexity. Search quality is typically measured by recall@k, the fraction of the true top-k nearest neighbors that the index actually returns, and is balanced against query latency and indexing throughput.
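The recall@k metric can be sketched in a few lines. The example below assumes the ground truth comes from an exhaustive exact search and compares it with the ids an ANN index returned; the id lists and the function name `recall_at_k` are made up for illustration.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbors that appear in the ANN top-k."""
    true_top = set(exact_ids[:k])
    returned = set(approx_ids[:k])
    return len(true_top & returned) / k

# Hypothetical results for one query (ids are placeholders).
exact_ids = ["d1", "d2", "d3", "d4", "d5"]   # from an exhaustive scan
approx_ids = ["d1", "d3", "d2", "d7", "d5"]  # from an ANN index; d4 was missed

recall = recall_at_k(approx_ids, exact_ids, k=5)
print(recall)
```

Order within the top k does not affect the score, only membership does. In practice the value is averaged over many queries and reported together with latency, since most ANN indexes can raise recall at the cost of slower queries.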

As RAG architectures have matured, vector databases are typically paired with an embedding model (to generate vectors at both index time and query time) and often with a re-ranker (to refine the initial ANN results with a more precise cross-encoder pass). Vector search is also increasingly available inside general-purpose data stores (PostgreSQL via pgvector, Redis, Elasticsearch), which can reduce the need for a dedicated specialized system. Understanding vector databases is prerequisite knowledge for building most retrieval-augmented LLM systems.
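The embed, retrieve, and re-rank flow described above can be sketched as follows. Every component here is a toy stand-in chosen so the example runs without dependencies: `embed` counts words from a tiny fixed vocabulary instead of calling an embedding model, `retrieve` is a brute-force scan instead of an ANN index, and `rerank_score` uses word overlap instead of a cross-encoder. All of these names are hypothetical.

```python
import math

VOCAB = ["vector", "database", "index", "recipe", "pasta"]  # toy vocabulary (assumption)

def embed(text):
    """Stand-in for an embedding model: counts of VOCAB words in the text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def retrieve(query_vec, index, k):
    """First stage: brute-force scan standing in for an ANN index lookup."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

def rerank_score(query, doc):
    """Stand-in for a cross-encoder: fraction of query words found in the doc."""
    q_words = set(query.lower().split())
    return len(q_words & set(doc.lower().split())) / len(q_words)

docs = [
    "vector index tuning for a database",
    "a vector database stores an index of embeddings",
    "pasta recipe with tomato sauce",
]
index = [(doc, embed(doc)) for doc in docs]  # index time: embed every document

query = "how a vector database stores embeddings"
candidates = retrieve(embed(query), index, k=2)  # query time: embed, then search
results = sorted(candidates, key=lambda d: rerank_score(query, d), reverse=True)
print(results)
```

The first stage is cheap and coarse: with this toy vocabulary the two database documents get identical vectors, so it cannot tell them apart. The second stage reads the query and each candidate together and promotes the closer match, which is the role a re-ranker plays after ANN retrieval.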