About this tool
What this matrix answers
The Vector Database Comparison Matrix shows 10 of the most adopted vector databases side by side, with a filter to narrow down to your constraints. Use it when scoping a new RAG system, evaluating whether to migrate off your current vector store, or briefing leadership on vendor options.
The data is hand curated, and every entry carries a last-verified date. Vendor benchmarks are excluded because they tend to favor the vendor; latency and recall numbers should come from ANN Benchmarks or your own workload, not from a marketing page.
How to use it
Start with hosting. Pick managed if your team has no ops capacity for a stateful service, self-host if you have a strong infra culture, or embedded if you are prototyping or shipping a small-footprint app. Then toggle the must-have features: hybrid search, open source, free tier.
The matrix narrows to the options that match. Click through to the vendor for current pricing (which changes more often than this dataset can keep up with) and to the documentation for hybrid search behavior, metadata filter syntax, and any quotas that matter for your workload.
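Mechanically, the filtering is simple: keep only entries that offer the chosen hosting model and satisfy every toggled constraint. A minimal sketch in TypeScript; the entry shape below is hypothetical, and the real schema in src/data/vector-databases.ts may differ:

```typescript
// Hypothetical shape of an entry in the dataset -- field names are
// illustrative, not the actual schema in src/data/vector-databases.ts.
interface VectorDb {
  name: string;
  hosting: ("managed" | "self-hosted" | "embedded")[];
  hybridSearch: boolean;
  openSource: boolean;
  freeTier: boolean;
}

// Narrow the matrix: a database must offer the chosen hosting model
// and satisfy every must-have feature the user toggled on.
function filterMatrix(
  dbs: VectorDb[],
  hosting: VectorDb["hosting"][number],
  mustHave: { hybridSearch?: boolean; openSource?: boolean; freeTier?: boolean }
): VectorDb[] {
  return dbs.filter(
    (db) =>
      db.hosting.includes(hosting) &&
      (!mustHave.hybridSearch || db.hybridSearch) &&
      (!mustHave.openSource || db.openSource) &&
      (!mustHave.freeTier || db.freeTier)
  );
}
```

Untoggled constraints are simply ignored, so an empty set of must-haves returns everything that matches the hosting choice.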
How the dataset is sourced
Each database entry is sourced from the official documentation page and the vendor pricing page on the last-verified date shown in the footer. Pricing is stated as a starting tier; production usage typically lands well above starting tiers once you add replicas, backups, throughput, and storage growth.
The dataset lives in src/data/vector-databases.ts and is exposed as Schema.org Dataset markup so AEO crawlers can discover the comparison structure directly.
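As a rough illustration of what that markup looks like, a Schema.org Dataset object can be serialized into a JSON-LD script tag like this. All property values below are placeholders, not the live markup:

```typescript
// Minimal sketch of Schema.org Dataset JSON-LD for a comparison dataset.
// Values are illustrative placeholders, not the page's actual markup.
const datasetJsonLd = {
  "@context": "https://schema.org",
  "@type": "Dataset",
  name: "Vector Database Comparison Matrix",
  description:
    "Side-by-side comparison of 10 vector databases: hosting model, hybrid search, licensing, and starting price.",
  dateModified: "2025-01-01", // placeholder -- use the last-verified date
};

// Embedded in the page head so crawlers can parse the structure directly.
const jsonLdTag = `<script type="application/ld+json">${JSON.stringify(datasetJsonLd)}</script>`;
```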
Where vector database picks usually go wrong
The most common mistake is picking the most hyped option without modeling the workload. A database that is excellent at billion-vector scale may be wasteful and harder to operate at the few-million-vector scale most teams actually start at. Match the database to the workload, not to the conference talks.
The second mistake is underestimating hybrid search needs. Pure vector search misses exact term matches, which real user queries routinely include. If you ship a pure vector system into production, expect a wave of bug reports about queries that should have matched but did not.
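One common way hybrid systems merge a vector ranking with a keyword ranking is reciprocal rank fusion (RRF). A minimal sketch, assuming each retriever returns an ordered list of document IDs; the databases in the matrix may use weighted score fusion or other schemes instead:

```typescript
// Reciprocal rank fusion: merge several ranked lists into one hybrid
// ranking. Each document scores 1 / (k + rank) per list it appears in;
// k ~ 60 is the conventional damping constant.
function reciprocalRankFusion(rankings: string[][], k = 60): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((docId, rank) => {
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank + 1));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId]) => docId);
}
```

A document that ranks well in both lists (a semantic match that also contains the exact term) rises above documents that only one retriever found, which is precisely the behavior pure vector search lacks.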
When this matrix is the right starting point
Use this matrix at the discovery stage of a RAG project, when scoping infra for a new agentic AI workflow, or when explaining vector database tradeoffs to a non-technical stakeholder. It is fast, vendor neutral, and carries a last-verified date.
For final selection, pair the matrix with the RAG Cost & Sizing Calculator to estimate cost at your expected scale, and run a small benchmark of your own queries against your top two candidates before committing.
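A back-of-envelope model can make the self-host versus managed comparison concrete before you open the calculator. Every number and parameter below is an illustrative assumption, not a vendor price:

```typescript
// Back-of-envelope monthly cost of self-hosting a vector database.
// All inputs are assumptions you supply -- nothing here is a real price.
function selfHostMonthlyCost(opts: {
  nodeCostPerMonth: number;  // one node running the database
  replicas: number;          // nodes for high availability, e.g. 3
  backupCostPerMonth: number;
  opsHoursPerMonth: number;  // monitoring, upgrades, on-call
  engineerHourlyRate: number;
}): number {
  const infra = opts.nodeCostPerMonth * opts.replicas + opts.backupCostPerMonth;
  const ops = opts.opsHoursPerMonth * opts.engineerHourlyRate;
  return infra + ops;
}
```

With, say, three $200 nodes, $50 of backups, and ten engineer-hours at $100, ops time already outweighs infrastructure; that is the dynamic behind "open source is free, the infrastructure is not."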
Designing a RAG system from scratch?
The vector database is one decision. Chunking, embeddings, reranking, and observability are the rest. Bring the design for an architecture review.
Book an architecture review
Frequently asked questions
- Which vector database should I pick for a new RAG system?
- If your team is already on Postgres, start with pgvector. Zero new infra and good enough for the first million vectors. If you want fully managed and do not want to think about ops, Pinecone Serverless is the safest default. If you want open source with strong hybrid search and a managed option, Qdrant or Weaviate are equally good. The matrix lets you filter by these dimensions.
- What is hybrid search and why does it matter?
- Hybrid search combines vector similarity (semantic) with keyword matching (lexical). For most production RAG, hybrid beats pure vector because users still type exact terms (product codes, names, error messages) that semantic search misses. If your queries include identifiers or rare terms, prioritize databases that support hybrid search natively.
- Is open source actually free at scale?
- The software is free, but the infrastructure is not. Self hosting Qdrant or Weaviate at production scale needs replicas, backups, monitoring, and on call. The total cost often lands close to a managed offering when you add ops time. The savings are real for teams with strong infra culture and small ops surfaces.
- How accurate are the prices in this matrix?
- Prices are taken from official vendor pages on the last verified date shown in the footer. Vendor pricing changes regularly, especially for serverless tiers. Treat the prices as directional and click through to the official page before signing a contract or committing to a tier.
- Should I avoid pgvector for big workloads?
- Not necessarily. pgvector handles tens of millions of vectors with proper indexing (HNSW or IVFFlat) and a strong Postgres host. Above that scale, dedicated vector databases tend to outperform on query latency and recall. Benchmark your workload first; do not assume.
- What about embedded databases like Chroma and LanceDB?
- Embedded databases run inside the application process or as a sidecar. Great for prototyping, notebooks, and small production deployments where you want zero network round trips. They scale less well than managed offerings, so most teams move off embedded once concurrent users grow.
- Do you have benchmark numbers for these databases?
- Vendor benchmarks tend to favor the vendor. The community benchmark to trust is ANN Benchmarks, which measures recall versus latency across HNSW, IVF, and other ANN algorithms. For latency, tail percentiles matter more than median. Always benchmark your own workload before committing.
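Recall@k, the metric ANN Benchmarks reports against latency, is straightforward to compute for your own workload: the fraction of the true nearest neighbors that the approximate index actually returns in its top k. A minimal sketch:

```typescript
// Recall@k for a single query: how many of the exact nearest neighbors
// (from a brute-force search) the ANN index returned in its top k.
function recallAtK(exactTopK: string[], annTopK: string[], k: number): number {
  const truth = new Set(exactTopK.slice(0, k));
  const hits = annTopK.slice(0, k).filter((id) => truth.has(id)).length;
  return hits / Math.min(k, truth.size);
}
```

Average this over a representative sample of your real queries, alongside p95/p99 latency, and you have the two numbers that actually decide between your top candidates.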
Related services and reading
From comparison to design.