Use pg_stat_statements, EXPLAIN ANALYZE, and logging parameters to locate and optimize slow SQL in ParadeDB.
Most delays come from missing indexes, scanning huge vectors, or inefficient joins. pg_stat_statements quickly surfaces the worst offenders by total execution time.
Add pg_stat_statements to shared_preload_libraries, restart the server so the library is loaded, then create the extension:
ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';  -- append if other libraries are already listed
-- restart ParadeDB: a preload change needs a restart, pg_reload_conf() is not enough
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
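As an optional sanity check after the restart, confirm the library is loaded and the view is queryable:
SHOW shared_preload_libraries;            -- should list pg_stat_statements
SELECT count(*) FROM pg_stat_statements;  -- errors out if the library was not preloaded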
The extension exposes a view that aggregates runtime statistics for every normalized statement; on the PostgreSQL versions ParadeDB builds on, the timing columns are total_exec_time and mean_exec_time:
SELECT query, calls, total_exec_time, mean_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;
The result shows the 10 costliest queries so you can target them first.
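Total time can hide statements that run rarely but are slow on every call; a variant of the same query ordered by mean time per call surfaces those (the calls > 10 filter is just an illustrative cutoff):
SELECT query, calls, mean_exec_time, rows
FROM pg_stat_statements
WHERE calls > 10
ORDER BY mean_exec_time DESC
LIMIT 10;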
Copy the query text, then run EXPLAIN with ANALYZE and BUFFERS to see the real execution plan:
EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT o.id, SUM(oi.quantity * p.price) AS order_total
FROM Orders o
JOIN OrderItems oi ON oi.order_id = o.id
JOIN Products p ON p.id = oi.product_id
WHERE o.order_date > CURRENT_DATE - INTERVAL '30 days'
GROUP BY o.id;
The plan highlights sequential scans, high I/O, or costly vector distance calculations.
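If the plan shows a sequential scan driven by the date filter above, an index on that column is often the quickest win; a sketch assuming the unquoted table names fold to lowercase (orders, orderitems):
CREATE INDEX IF NOT EXISTS idx_orders_order_date ON orders (order_date);
CREATE INDEX IF NOT EXISTS idx_orderitems_order_id ON orderitems (order_id);
ANALYZE orders;
ANALYZE orderitems;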
Set log_min_duration_statement to a threshold in milliseconds:
ALTER SYSTEM SET log_min_duration_statement = 500; -- logs > 0.5 s
SELECT pg_reload_conf();
ParadeDB will now write any query that takes longer than 500 ms to the server log for later review.
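Optionally, the auto_explain module that ships with PostgreSQL can log the plan next to the duration, saving a manual EXPLAIN; a per-session sketch, which can be made global by adding auto_explain to shared_preload_libraries:
LOAD 'auto_explain';
SET auto_explain.log_min_duration = 500;  -- milliseconds, same threshold as above
SET auto_explain.log_analyze = on;        -- include actual row counts and timings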
Once the worst offenders are identified, the usual fixes are:
1. Create GIN or HNSW indexes on vector or text columns (see the sketch after this list).
2. Add composite B-tree indexes for common join predicates.
3. Avoid SELECT *; project only the columns you need.
4. Batch inserts to reduce heap churn, and run ANALYZE afterwards so planner statistics stay fresh.
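A minimal sketch of items 1 and 2, using hypothetical table and column names (documents.body as text, items.embedding as a pgvector column, orderitems for the join); adapt names and operator classes to your schema:
-- full-text search: GIN over a tsvector expression
CREATE INDEX idx_documents_body_fts ON documents USING gin (to_tsvector('english', body));
-- vector similarity (pgvector): HNSW with L2 distance
CREATE INDEX idx_items_embedding_hnsw ON items USING hnsw (embedding vector_l2_ops);
-- composite B-tree for a frequent join predicate
CREATE INDEX idx_orderitems_order_product ON orderitems (order_id, product_id);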
When you are ready to re-measure, reset the collected statistics:
SELECT pg_stat_statements_reset();
Clearing the stats lets you verify performance gains against a clean baseline.
The overhead of pg_stat_statements is usually below 1%, since it only stores aggregated statistics in shared memory; the insight gained far outweighs the cost. Apart from the one-time shared_preload_libraries change, which requires a restart, the views and logging parameters described here can be toggled with ALTER SYSTEM and pg_reload_conf(), so no downtime is needed.
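For example, slow-query logging can be switched off again at any time; a value of -1 disables it:
ALTER SYSTEM SET log_min_duration_statement = -1;  -- disable slow-query logging
SELECT pg_reload_conf();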