Section 01 · The Decision
Why vector database selection matters for RAG
Your vector database is the retrieval layer of your RAG pipeline. Its performance, operational model, and cost at scale determine whether your RAG system is reliable, maintainable, and economically viable.
Quick answer
Start with pgvector if you already run Postgres: it is production grade up to roughly 10 million vectors and requires no new service. Use Pinecone when you need managed scale beyond that. Use Weaviate when you need native hybrid search or self-hosted control at large scale.
Most engineers choose a vector database by reading comparison articles that rank all options on all dimensions simultaneously. The more useful frame is the migration path: which option should you start with, and what should trigger a migration? If you want to filter the full landscape side by side, the Vector Database Comparison Matrix covers ten options by hosting, hybrid search, and price.
The answer is almost always pgvector first. It is a Postgres extension that runs inside your existing database. No new infrastructure. No new ops burden. No separate service to pay for, only the extra memory and storage that the vectors and their index consume. Your existing backups, monitoring, and access controls cover it. At under 10 million vectors, which covers the vast majority of seed to Series A use cases, the performance is competitive with purpose-built vector stores.
Section 02 · Option 1
pgvector: start here unless you have a reason not to
pgvector adds a vector column type and approximate nearest-neighbor indexes (HNSW and IVFFlat) to PostgreSQL. You store vectors in a column alongside your existing data. Queries use SQL with a vector distance operator. The entire stack (vectors, metadata, relational data) lives in one database with one connection, one backup, one monitoring setup.
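As a rough sketch of what that looks like in practice, the snippet below builds the SQL involved: enabling the extension, creating a table with an embedding column, adding an HNSW index, and querying by cosine distance. The table name `documents`, the column names, and the `%(q)s` / `%(k)s` placeholders (psycopg-style named parameters) are assumptions chosen for illustration. The statements are returned as strings so they can be run through whichever Postgres driver you already use.

```python
from typing import Sequence


def to_vector_literal(values: Sequence[float]) -> str:
    """Format floats as a pgvector text literal, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def schema_statements(table: str = "documents", dim: int = 1536) -> list[str]:
    """DDL for a table with an embedding column and an HNSW cosine index.

    The table layout is a hypothetical example, not a required schema.
    """
    return [
        "CREATE EXTENSION IF NOT EXISTS vector;",
        f"CREATE TABLE IF NOT EXISTS {table} ("
        f"id bigserial PRIMARY KEY, content text, embedding vector({dim}));",
        f"CREATE INDEX IF NOT EXISTS {table}_embedding_hnsw "
        f"ON {table} USING hnsw (embedding vector_cosine_ops);",
    ]


def nearest_neighbors_sql(table: str = "documents") -> str:
    """Top-k query; <=> is pgvector's cosine distance operator."""
    return (
        f"SELECT id, content, embedding <=> %(q)s::vector AS distance "
        f"FROM {table} ORDER BY embedding <=> %(q)s::vector LIMIT %(k)s;"
    )


if __name__ == "__main__":
    for statement in schema_statements(dim=3):
        print(statement)
    print(nearest_neighbors_sql())
    print(to_vector_literal([0.1, 0.2, 0.3]))
```

The query repeats the distance expression in `ORDER BY` rather than sorting on the alias so that the index-friendly form stays explicit.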
Use pgvector when
You already run Postgres. Your dataset is under 10 million vectors. You want to minimize infrastructure complexity. Managed Postgres services such as Supabase, Neon, and Amazon RDS offer pgvector as a supported extension. Companies including Instacart run pgvector in production at significant scale.
Migrate away from pgvector when
Your dataset exceeds 10 to 50 million vectors and single-node Postgres is showing latency degradation. You need native hybrid search without composing it manually with a BM25 index. You need multi-tenant vector isolation at scale.
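One way to keep these triggers honest is to encode them as an explicit check against measured numbers. The sketch below is illustrative only: the `Workload` fields, the latency budget, and the default 10 million ceiling (the lower bound of the range above) are assumptions you would replace with your own measurements and thresholds.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    vector_count: int
    p95_latency_ms: float            # measured in production, not projected
    latency_budget_ms: float         # hypothetical target you set yourself
    needs_native_hybrid: bool = False
    needs_tenant_isolation_at_scale: bool = False


def migration_triggers(w: Workload, ceiling: int = 10_000_000) -> list[str]:
    """Return the reasons, if any, to move off pgvector."""
    reasons = []
    # Size alone is not a trigger; it has to come with observed degradation.
    if w.vector_count > ceiling and w.p95_latency_ms > w.latency_budget_ms:
        reasons.append("dataset past ceiling with latency over budget")
    if w.needs_native_hybrid:
        reasons.append("native hybrid search required")
    if w.needs_tenant_isolation_at_scale:
        reasons.append("multi-tenant isolation at scale required")
    return reasons


if __name__ == "__main__":
    small = Workload(vector_count=2_000_000, p95_latency_ms=40, latency_budget_ms=100)
    large = Workload(vector_count=30_000_000, p95_latency_ms=250, latency_budget_ms=100)
    print(migration_triggers(small))
    print(migration_triggers(large))
```

An empty list means stay on pgvector. Note that a large dataset with latency still inside budget produces no trigger.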
Performance at 1 million vectors: pgvector achieves approximately 640 QPS with HNSW at 95% recall. Purpose-built vector stores achieve 1,600 QPS or more at the same recall level. At 1 million vectors, this difference rarely matters — query latency is low and throughput is rarely the bottleneck. At 50 million vectors, the gap becomes significant.
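To see why the gap rarely matters at this size, it helps to compare the quoted throughput with an actual peak load. The arithmetic below uses the two QPS figures from the text; the peak of 50 queries per second is a hypothetical workload picked for illustration, so substitute your own measurement.

```python
# Throughput figures quoted in the text (1M vectors, HNSW, 95% recall).
PGVECTOR_QPS = 640
PURPOSE_BUILT_QPS = 1600

# Hypothetical peak load for illustration; replace with a measured value.
PEAK_QPS = 50


def utilization(peak_qps: float, capacity_qps: float) -> float:
    """Fraction of available throughput consumed at peak."""
    return peak_qps / capacity_qps


pg_util = utilization(PEAK_QPS, PGVECTOR_QPS)
pb_util = utilization(PEAK_QPS, PURPOSE_BUILT_QPS)
throughput_ratio = PURPOSE_BUILT_QPS / PGVECTOR_QPS

print(f"pgvector utilization at peak:      {pg_util:.1%}")
print(f"purpose-built utilization at peak: {pb_util:.1%}")
print(f"throughput ratio:                  {throughput_ratio:.1f}x")
```

Under that assumed load, both options run at under a tenth of their quoted capacity, so the 2.5x throughput ratio does not translate into a user-visible difference.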
Section 03 · Option 2
Pinecone: the managed path to 100 million-plus vectors
Pinecone is a fully managed, serverless vector database. You create an index, insert vectors, and query — no infrastructure to configure or maintain. It scales transparently to hundreds of millions of vectors without operational changes. The SLA and support are the strongest of the three options.
Use Pinecone when
You need to scale beyond pgvector's practical ceiling and want the fastest time to production at scale without investing in infrastructure operations. Teams that have migrated from pgvector to Pinecone report the transition taking hours, not days — the API surface is straightforward.
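The overall shape of such a migration is usually a batched copy: read ids, embeddings, and metadata out of Postgres and write them to the new store in fixed-size chunks. The sketch below shows only that batching loop. The `upsert` callable is a hypothetical wrapper you would write around your vector store client, the record layout is an assumption, and the batch size of 100 is arbitrary.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator

# (id, embedding values, metadata); an assumed layout for illustration.
Record = tuple[str, list[float], dict]


def batched(records: Iterable[Record], size: int) -> Iterator[list[Record]]:
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while batch := list(islice(it, size)):
        yield batch


def migrate(
    records: Iterable[Record],
    upsert: Callable[[list[Record]], None],
    batch_size: int = 100,
) -> int:
    """Send records to the target store in batches; return how many were sent."""
    total = 0
    for batch in batched(records, batch_size):
        upsert(batch)  # hypothetical wrapper around the target client's write call
        total += len(batch)
    return total


if __name__ == "__main__":
    sent_batches: list[list[Record]] = []
    rows = [(str(i), [0.0, 1.0], {"source": "demo"}) for i in range(250)]
    count = migrate(rows, sent_batches.append, batch_size=100)
    print(count, [len(b) for b in sent_batches])
```

Because `records` can be any iterable, it can be fed from a server-side cursor so the full dataset never has to sit in memory.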
Consider alternatives when
Cost is a primary constraint. Pinecone's serverless pricing is competitive at moderate scale but higher than self-hosted alternatives at large scale. If you can operate infrastructure reliably, Qdrant or Weaviate self-hosted will be cheaper per query at very high volumes.
Section 04 · Option 3
Weaviate: native hybrid search and self-hosted control
Weaviate ships hybrid search — BM25 plus vector similarity, fused with Reciprocal Rank Fusion — natively. You do not need to compose a separate BM25 index alongside your vector index. For production RAG systems that need hybrid retrieval (which is most of them), this is a significant operational advantage.
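For readers unfamiliar with Reciprocal Rank Fusion, the idea is that each result list contributes 1 / (k + rank) to a document's score, and the summed scores decide the final order. The following is a minimal standalone sketch of that formula, not Weaviate's internal implementation, which may differ in details. The constant k = 60 is a commonly used default, and the document ids are made up.

```python
from collections import defaultdict
from typing import Hashable, Iterable, Sequence


def reciprocal_rank_fusion(
    rankings: Iterable[Sequence[Hashable]], k: int = 60
) -> list[tuple[Hashable, float]]:
    """Fuse ranked lists of document ids into one list sorted by RRF score."""
    scores: dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the top result in a list contributes 1 / (k + 1).
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    keyword_hits = ["a", "b", "c"]   # e.g. from a BM25 index
    vector_hits = ["b", "d", "a"]    # e.g. from a vector index
    for doc_id, score in reciprocal_rank_fusion([keyword_hits, vector_hits]):
        print(doc_id, round(score, 5))
```

Documents that appear in both lists ("a" and "b" here) outrank documents that appear in only one, which is the behavior hybrid retrieval relies on.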
Use Weaviate when
You need native hybrid search without composing it manually. You want a self-hosted option for data sovereignty, compliance, or cost reasons. You are building a multi-tenant RAG system where vector spaces need to be isolated per tenant.
Consider alternatives when
You want the simplest possible managed service and do not need self-hosting. Weaviate's managed cloud offering is good, but Pinecone has a simpler API and a stronger SLA for teams that want fully managed without operational involvement.
Section 05 · Head-to-Head
The numbers that matter in production
| Dimension | pgvector | Pinecone | Weaviate |
|---|---|---|---|
| Deployment model | Postgres extension (self-hosted or managed Postgres) | Fully managed, serverless | Self-hosted or managed cloud |
| Hybrid search | Manual (compose with BM25 index) | Supported (added 2025) | Native — ships out of the box |
| Performance at 1M vectors | ~640 QPS, 95% recall | ~1,600+ QPS, 95% recall | ~1,600+ QPS, 95% recall |
| Practical scale ceiling | ~10 to 50M vectors (single node) | Hundreds of millions | Self-hosted: node-dependent; Managed: high |
| Cost model | No added license cost (you pay for Postgres resources) | Usage-based (starts ~$70/mo) | Open source self-hosted (infrastructure costs); managed pricing |
| Multi-tenant support | Schema-level isolation | Namespace-based | Class-level isolation — strong |
| Migration from Postgres | Already there | Hours | Days |
At 1 million vectors, the quality differences between the three are small: all three reach 95% recall, and the throughput gap rarely matters in practice. Choose based on your operational model preference and hybrid search requirements. At 50 million vectors, pgvector requires careful tuning and may need a migration; Pinecone and Weaviate are designed to handle that volume without architectural changes.
FAQ
Frequently asked questions
Should I use pgvector or Pinecone for a new RAG application?
Start with pgvector if you already run Postgres. It is production grade for datasets under 10 million vectors, costs nothing to add, and lets you manage your data in one place. Migrate to Pinecone when you outgrow pgvector's ceiling — the migration is straightforward and Pinecone's managed service eliminates infrastructure operations at scale.
What is the performance difference between pgvector and Pinecone at 1 million vectors?
At 1 million vectors with 95% recall, pgvector achieves approximately 640 QPS and purpose-built stores like Pinecone and Weaviate achieve 1,600 QPS or more. In most production RAG systems, this difference does not matter — query latency is well within acceptable bounds for both options.
Does pgvector support hybrid search?
Not natively. pgvector handles vector similarity search. To add keyword search, you need to compose a separate BM25 or full-text search index in Postgres and merge the results manually. Weaviate ships hybrid search out of the box. Pinecone added hybrid search in 2025. For production RAG that needs hybrid retrieval, Weaviate or Pinecone is operationally simpler.
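To give a sense of what composing it manually involves, here is one possible sketch that builds a single SQL query combining a pgvector similarity search with Postgres full-text search, merged by Reciprocal Rank Fusion. It makes several assumptions: a `documents` table with `id`, `content`, and `embedding` columns, psycopg-style named parameters (`%(q_vec)s`, `%(q_text)s`, `%(k)s`), 20 candidates per side, and an RRF constant of 60. Postgres's built-in `ts_rank_cd` is not BM25, so this approximates the keyword side rather than reproducing it.

```python
def hybrid_search_sql(
    table: str = "documents", candidates: int = 20, rrf_k: int = 60
) -> str:
    """Build a hybrid query: vector search plus full-text search, fused with RRF.

    Table and column names are hypothetical; adjust them to your schema.
    """
    return f"""
WITH semantic AS (
    SELECT id, RANK() OVER (ORDER BY embedding <=> %(q_vec)s::vector) AS rank
    FROM {table}
    ORDER BY embedding <=> %(q_vec)s::vector
    LIMIT {candidates}
),
keyword AS (
    SELECT id,
           RANK() OVER (
               ORDER BY ts_rank_cd(to_tsvector('english', content), query) DESC
           ) AS rank
    FROM {table}, plainto_tsquery('english', %(q_text)s) AS query
    WHERE to_tsvector('english', content) @@ query
    ORDER BY ts_rank_cd(to_tsvector('english', content), query) DESC
    LIMIT {candidates}
)
SELECT COALESCE(semantic.id, keyword.id) AS id,
       COALESCE(1.0 / ({rrf_k} + semantic.rank), 0.0)
     + COALESCE(1.0 / ({rrf_k} + keyword.rank), 0.0) AS score
FROM semantic
FULL OUTER JOIN keyword ON semantic.id = keyword.id
ORDER BY score DESC
LIMIT %(k)s;
""".strip()


if __name__ == "__main__":
    print(hybrid_search_sql())
```

In production you would typically also add a GIN index on the `to_tsvector` expression so the keyword side does not scan the whole table. That extra index, plus the query above, is the operational overhead that native hybrid search removes.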
When should I migrate from pgvector to Pinecone or Weaviate?
Migrate when: your dataset exceeds 10 to 50 million vectors and pgvector is showing latency degradation, you need native hybrid search without composing it manually, or you need multi-tenant vector isolation at scale. Do not migrate in anticipation of scale you have not reached — premature migration adds operational complexity with no benefit.