ParadeDB on AWS adds fast vector search to Amazon-hosted PostgreSQL by installing the ParadeDB extension, letting you store embeddings and run similarity queries.
ParadeDB extends PostgreSQL with a native vector type and ANN indexes. Running it on Amazon RDS or Aurora gives you managed backups and scaling while keeping vectors close to transactional data.
Create a custom parameter group that adds paradedb to shared_preload_libraries, reboot the instance, then run CREATE EXTENSION paradedb; from an rds_superuser role.
After reboot, connect with the master user and execute:
CREATE EXTENSION IF NOT EXISTS paradedb;
GRANT USAGE ON SCHEMA paradedb TO app_user;
Add a vector column to products to hold a 768-dimensional embedding:
ALTER TABLE products
ADD COLUMN embedding vector(768);
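Application code has to hand each embedding to PostgreSQL in the bracketed text form that the query later in this guide uses ('[0.12,0.03,...]'). The following is a minimal sketch of that formatting step in Python, using only the standard library. The helper name to_vector_literal and the constant EMBEDDING_DIM are hypothetical and not part of ParadeDB; the dimension check simply mirrors the vector(768) column above.

```python
import math
from typing import Sequence

# Assumption: matches the vector(768) column declared in the ALTER TABLE statement.
EMBEDDING_DIM = 768


def to_vector_literal(values: Sequence[float], dim: int = EMBEDDING_DIM) -> str:
    """Format an embedding as the bracketed text form '[v1,v2,...]'."""
    if len(values) != dim:
        raise ValueError(f"expected {dim} dimensions, got {len(values)}")
    if not all(math.isfinite(v) for v in values):
        raise ValueError("embedding contains NaN or infinity")
    # repr() keeps enough digits for the float to round-trip exactly.
    return "[" + ",".join(repr(float(v)) for v in values) + "]"
```

The resulting string would typically be passed as a bound query parameter and cast with ::vector on the SQL side, rather than concatenated into the statement.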
Load the vector data first, then create an IVFFlat or HNSW index. This example builds an IVFFlat index for cosine distance:
CREATE INDEX idx_products_embedding
ON products USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
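The lists = 100 setting above is a tuning choice that depends on table size. The sketch below encodes the starting-point rule of thumb from the pgvector documentation (rows / 1000 for up to one million rows, the square root of the row count beyond that). Whether that guidance carries over unchanged to this setup is an assumption, and the function name suggest_ivfflat_lists is hypothetical.

```python
import math


def suggest_ivfflat_lists(row_count: int) -> int:
    """Starting value for `lists`: rows / 1000 up to 1M rows, sqrt(rows) above."""
    if row_count < 0:
        raise ValueError("row_count must be non-negative")
    if row_count <= 1_000_000:
        # Never return fewer than one list, even for very small tables.
        return max(1, row_count // 1000)
    return math.isqrt(row_count)
```

Under this heuristic, a table of about 100,000 products gives the lists = 100 value used in the CREATE INDEX statement. Treat the result as a first guess and adjust it after measuring recall and latency.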
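Find the five products closest to a query vector. Use the cosine-distance operator <=> so the query matches the vector_cosine_ops index; <-> computes Euclidean distance and would not use that index:
SELECT id, name, price
FROM products
ORDER BY embedding <=> '[0.12,0.03,...]'::vector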
LIMIT 5;
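Because the index is built with vector_cosine_ops, it helps to see what a cosine-distance ranking computes. The sketch below is an exact, brute-force version in plain Python. It is not how the index works internally; it is only a small reference that could serve as a baseline when checking the recall of approximate results. The names cosine_distance and nearest are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def nearest(query: Sequence[float],
            rows: Dict[int, Sequence[float]],
            k: int = 5) -> List[int]:
    """Return the ids of the k rows with the smallest cosine distance to query."""
    return sorted(rows, key=lambda row_id: cosine_distance(query, rows[row_id]))[:k]
```

Comparing the ids returned by nearest against the ids returned by the indexed SQL query on a sample of rows gives a rough recall figure for the chosen index settings.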
Choose memory-optimized r6g instances (or general-purpose m6g) so the vector index fits in RAM. Keep work_mem ≥ 64MB for vector sorts. Monitor paradedb_ivfflat_searches_total in CloudWatch.
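To size the instance, a rough estimate of how much space the raw embeddings occupy is a useful first step. The sketch below assumes the storage layout documented for pgvector (4 bytes per dimension plus 8 bytes of overhead per vector); that this applies to the vector type described here is an assumption, and index and row overhead are not included. The function names are hypothetical.

```python
def raw_vector_bytes(row_count: int, dims: int = 768) -> int:
    """Estimate bytes used by the embeddings alone (no index, no row overhead)."""
    if row_count < 0 or dims <= 0:
        raise ValueError("row_count must be >= 0 and dims must be > 0")
    # Assumption: 4 bytes per float32 dimension plus 8 bytes of per-vector overhead.
    return row_count * (4 * dims + 8)


def bytes_to_gib(n_bytes: int) -> float:
    """Convert a byte count to gibibytes."""
    return n_bytes / 2**30
```

For one million 768-dimensional embeddings this comes to roughly 3 GB before any index is built, which gives a lower bound when comparing instance memory sizes.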
Lack of superuser role: RDS blocks CREATE EXTENSION for normal users. Use rds_superuser or ask AWS Support to enable the extension.
Index too early: Populate most embeddings before building IVFFlat, because the index derives its list centroids from the rows present at build time; otherwise you’ll need REINDEX later.
Yes. ParadeDB is open-source and battle-tested. Enable automated snapshots, run a nightly VACUUM ANALYZE, and keep the ParadeDB version in sync with major Postgres upgrades.
Yes. Follow the same parameter-group and extension steps used for RDS.
ParadeDB supports HNSW with USING hnsw; choose it for higher recall at the cost of memory.
Run UPDATE on the vector column. IVFFlat indexes track changes automatically, although the list centroids stay as computed at build time; for HNSW, consider a periodic REINDEX.