Self-hosting ParadeDB lets you run vector search inside your own PostgreSQL instance for full control and lower latency.
Self-hosting keeps embeddings and user data inside your VPC, avoids SaaS fees, and lets you tune PostgreSQL exactly for your workload.
ParadeDB ships a single Docker image that bundles PostgreSQL, the vectors extension, and ParadeDB’s SQL procedures. No extra build steps.
Create docker-compose.yml and run docker compose up -d. ParadeDB starts on port 5432 with a persistent volume mapped to /var/lib/postgresql/data.
services:
  paradedb:
    image: paradedb/paradedb:latest
    ports:
      - "5432:5432"
    environment:
      POSTGRES_PASSWORD: secret
    volumes:
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata:
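After docker compose up -d returns, the container may still need a moment before it accepts connections. The following is a minimal sketch, using only the Python standard library, of a readiness check against the published port. The function name wait_for_port and its defaults are illustrative and not part of ParadeDB.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect only proves the port is open; it does not
            # prove that PostgreSQL has finished initializing.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

Calling wait_for_port("localhost", 5432) before running migrations or tests avoids racing the container start. A stricter check would open a real database session, which needs a PostgreSQL driver and is left out here.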
Mount your certs and set POSTGRES_SSL=on. ParadeDB uses the standard PostgreSQL parameters (ssl_cert_file, ssl_key_file) to locate them.
Yes. Copy the paradedb--*.sql files to $PGDATA/extension, or use CREATE EXTENSION if the package is in share/extension.
Connect as superuser, then:
CREATE EXTENSION IF NOT EXISTS vectors;
CREATE EXTENSION IF NOT EXISTS paradedb;
Add a vector column to products and create an ivfflat index.
ALTER TABLE products ADD COLUMN embedding vector(384);
CREATE INDEX idx_products_embedding ON products USING ivfflat (embedding vector_l2_ops) WITH (lists = 100);
Use ParadeDB’s helper to embed the search phrase, then sort by L2 distance.
SELECT id, name
FROM products
ORDER BY embedding <-> paradedb_embed('lightweight laptop')
LIMIT 5;
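To make the ranking concrete, here is a small brute-force sketch in Python of what sorting by L2 distance with a LIMIT computes. The three-dimensional vectors and product names are made up for illustration, and the hand-written query vector stands in for the embedding helper's output. An ivfflat index returns an approximation of this exact result.

```python
import math


def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(rows, query, limit=5):
    """Exact equivalent of ORDER BY distance LIMIT n over (id, name, embedding) rows."""
    ranked = sorted(rows, key=lambda row: l2_distance(row[2], query))
    return [(row_id, name) for row_id, name, _ in ranked[:limit]]


# Hypothetical 3-dimensional embeddings; the column above holds 384 dimensions.
products = [
    (1, "ultrabook", [0.9, 0.1, 0.0]),
    (2, "gaming desktop", [0.1, 0.9, 0.2]),
    (3, "tablet", [0.7, 0.2, 0.1]),
]
query_embedding = [1.0, 0.0, 0.0]  # stand-in for the embedded search phrase

print(nearest(products, query_embedding, limit=2))
```

Comparing the database's answers against an exact scan like this on a small sample is one way to estimate how much recall the index gives up.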
Because ParadeDB is plain PostgreSQL, use pg_dump, pg_basebackup, or any managed-storage snapshot strategy.
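As one possible way to script the pg_dump option, the sketch below assembles the command line in Python. The defaults (localhost, port 5432, user postgres) and the function name pg_dump_command are assumptions for illustration. The flags are standard pg_dump options.

```python
def pg_dump_command(dbname, outfile, host="localhost", port=5432, user="postgres"):
    """Build an argv list for a custom-format logical backup of one database."""
    return [
        "pg_dump",
        "--host", host,
        "--port", str(port),
        "--username", user,
        "--format=custom",   # compressed archive, restorable with pg_restore
        "--file", outfile,
        dbname,
    ]


print(" ".join(pg_dump_command("postgres", "paradedb.dump")))
```

Passing the list to subprocess.run(cmd, check=True) runs the backup. The password would normally come from PGPASSWORD or a .pgpass file rather than from the command line.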
Set shared_buffers to 25% of RAM and work_mem to at least 32MB to avoid temporary files during ANN queries.
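The rule of thumb above can be turned into concrete values with a few lines of Python. This is a sketch only: the function names are illustrative, and the 32MB figure is treated as a floor to raise when queries still spill to disk.

```python
def suggest_memory_settings(total_ram_mb: int, work_mem_floor_mb: int = 32) -> dict:
    """Apply the guideline: shared_buffers = 25% of RAM, work_mem at least 32MB."""
    if total_ram_mb <= 0:
        raise ValueError("total_ram_mb must be positive")
    return {
        "shared_buffers": f"{total_ram_mb // 4}MB",
        "work_mem": f"{work_mem_floor_mb}MB",
    }


def as_conf_lines(settings: dict) -> str:
    """Render settings in postgresql.conf syntax (name = 'value')."""
    return "\n".join(f"{name} = '{value}'" for name, value in settings.items())


# Example: a host with 16 GB of RAM.
print(as_conf_lines(suggest_memory_settings(16 * 1024)))
```

Keep in mind that work_mem applies per sort or hash operation, so many concurrent queries can multiply the memory actually used.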
Missing ivfflat analyze: run VACUUM ANALYZE products; after bulk inserts so the planner uses the index.
Too few lists: a low lists setting puts more vectors in each list, which slows queries. Reindex with 100-200 lists for >1M vectors.
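To illustrate what the lists setting controls, here is a toy sketch of the inverted-list idea behind ivfflat, written in plain Python. It is not the extension's implementation: the centroids are hand-picked rather than learned with k-means, and the data is one-dimensional so the assignments are easy to follow.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_lists(vectors, centroids):
    """Assign every vector to the list of its nearest centroid."""
    lists = [[] for _ in centroids]
    for vec in vectors:
        nearest = min(range(len(centroids)), key=lambda i: l2(vec, centroids[i]))
        lists[nearest].append(vec)
    return lists


def search(query, centroids, lists, probes=1, limit=1):
    """Scan only the `probes` lists whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda i: l2(query, centroids[i]))
    candidates = [vec for i in order[:probes] for vec in lists[i]]
    candidates.sort(key=lambda vec: l2(query, vec))
    return candidates[:limit], len(candidates)


# Hypothetical 1-D data: 8 points, indexed once with 2 lists and once with 4.
points = [[float(x)] for x in range(8)]
coarse = [[1.5], [5.5]]
fine = [[0.5], [2.5], [4.5], [6.5]]

query = [2.2]
hits_coarse, scanned_coarse = search(query, coarse, build_lists(points, coarse))
hits_fine, scanned_fine = search(query, fine, build_lists(points, fine))
print(hits_coarse, scanned_coarse)
print(hits_fine, scanned_fine)
```

Both layouts find the same nearest point here, but the four-list layout scans half as many candidates per probe. The trade-off is that a query near a list boundary can miss true neighbors stored in an adjacent list unless more lists are probed.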
Yes. The Docker image is rebuilt for every minor release. For manual installs, compile the vectors extension against Postgres 16.
Absolutely. Index embeddings with vector_cosine_ops and use embedding <=> paradedb_embed('query').
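For reference, cosine distance is one minus the cosine of the angle between two vectors, so it ignores vector length. The sketch below computes it in plain Python on made-up two-dimensional vectors. It illustrates the metric only and is not how the extension computes it internally.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity: 0 means same direction, 2 means opposite."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


# Scaling a vector changes its L2 distance to others but not its cosine distance.
short = [1.0, 1.0]
scaled = [10.0, 10.0]
other = [1.0, 0.0]

print(cosine_distance(short, scaled))  # approximately 0: same direction
print(cosine_distance(short, other))   # 45 degrees apart
```

This is why cosine distance suits embeddings whose magnitude carries no meaning, while L2 distance also penalizes differences in length.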
Yes, AGPL-3.0-licensed on GitHub at paradedb/paradedb.