Pick your vector database — 12 options across the four categories that decide most picks
Your vector store is the decision that compounds across every RAG retrieval, every semantic-search query, and every embedding pipeline you ship. The directory filters by embedded / managed-SaaS / self-hosted / multi-model, plus engine, pricing, and the hybrid-search and edge-ready flags that decide most picks. Every entry gives you a one-line summary, a concrete best-for, an honest skip-this-if, and a paragraph of opinion.
Milvus
Distributed vector DB built for billion-scale workloads. Heavy, capable, China-AI-aligned.
Qdrant
Rust-fast open-source vector engine. Cleaner API than Weaviate, smaller footprint.
Chroma
Embedded vector database for AI apps. Runs in-process like SQLite, prototype-first.
Weaviate
Open-source self-hostable vector database with hybrid search and module ecosystem.
pgvector
Vector search inside Postgres. The default for teams already on Postgres or Supabase.
LanceDB
Embedded multimodal vector + tabular database. Object-store-backed, Rust-fast.
Vespa
Mature, Yahoo-built distributed search + vector engine. Hybrid retrieval at extreme scale.
Marqo
End-to-end vector search engine with built-in embedders. Multimodal, model-aware.
Pinecone
The original managed vector database. Polished SDK, predictable latency, expensive at scale.
Turbopuffer
Object-store-backed serverless vector DB. Pay-per-query, cheap at idle, fast at scale.
MongoDB Atlas Vector Search
Vector search inside MongoDB Atlas. Document data + embeddings in one query.
Astra DB Vector
Cassandra-based managed vector store from DataStax. Wide-column + vector hybrid.
The vector-store choice is the easy half — your retrieval design is the hard one
Picking the vector database is the easy half. The hard half is your chunking strategy, your hybrid-retrieval recipe, your reranking pipeline, and the eval loop that tells you whether your RAG is getting better or worse week to week. The 30-min call is the right starting place: describe your corpus, your latency budget, and your scale, and I'll tell you what fits.
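To make the "hard half" concrete, here is a minimal sketch of two of those pieces: a hybrid-retrieval recipe and one eval metric. It assumes reciprocal rank fusion (RRF) as the way to merge keyword and vector results, and recall@k as the number you track week to week. Both are common choices, not the only ones. The function names, the document ids, and the `k=60` constant are illustrative assumptions and do not come from any specific vector store's API.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, not a tuned value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant ids that appear in the top-k retrieved ids."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


# Hypothetical results for one query; the ids are made up for illustration.
keyword_hits = ["doc3", "doc1", "doc7"]   # e.g. from a BM25 / full-text index
vector_hits = ["doc1", "doc5", "doc3"]    # e.g. from an embedding search

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
score = recall_at_k(fused, relevant=["doc1", "doc5"], k=2)
print(fused, score)
```

Run the same labeled queries through this every week and you have the start of an eval loop: if recall@k drops after a chunking or reranking change, you know before your users do. Documents that rank well in both lists rise to the top of the fused ranking, which is the property most hybrid recipes are after.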