How to Test Queries in ParadeDB PostgreSQL

How do I test vector-search queries in ParadeDB?

The test_queries() helper in ParadeDB quickly benchmarks vector-search SQL against a chosen index.

Why should I run `test_queries()` in ParadeDB?

Use test_queries() to measure the recall and latency of your vector index before shipping a semantic-search feature. The function runs a set of query embeddings against the target index, times each search, and returns aggregated metrics.
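
The text does not say how recall is calculated. A common definition is recall@k: the fraction of the exact top-k neighbors that the approximate index also returned. The Python sketch below assumes that definition; the function names are illustrative, and test_queries() may compute the metric differently.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k neighbors that the approximate search returned."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    found = truth.intersection(approx_ids[:k])
    return len(found) / len(truth)


def mean_recall(approx_results, exact_results, k):
    """Average recall@k over a batch of queries (both lists aligned by query)."""
    if len(approx_results) != len(exact_results):
        raise ValueError("result lists must have the same length")
    scores = [recall_at_k(a, e, k) for a, e in zip(approx_results, exact_results)]
    return sum(scores) / len(scores) if scores else 0.0
```

The exact neighbor IDs would typically come from a brute-force (sequential) search over the same table, which is slow but gives the ground truth to compare against.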

What is the basic workflow?

1) Load or generate query embeddings.
2) Call test_queries() with the index name, the embedding array, and k, the number of nearest neighbors to retrieve.
3) Inspect the returned table for avg_ms, p95_ms, and recall.
4) Tune index parameters if needed, then re-run the test.
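
To make the latency columns concrete, here is a minimal Python sketch of how avg_ms and p95_ms could be derived from per-query timings. It is not ParadeDB's implementation: run_query is a hypothetical callable that executes one search, and the nearest-rank percentile method is an assumption.

```python
import math
import statistics
import time


def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def time_queries(run_query, queries):
    """Run each query once and return latency statistics in milliseconds."""
    if not queries:
        raise ValueError("queries must not be empty")
    latencies_ms = []
    for query in queries:
        start = time.perf_counter()
        run_query(query)  # hypothetical: executes one vector search
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "avg_ms": statistics.fmean(latencies_ms),
        "p95_ms": percentile(latencies_ms, 95),
        "queries": len(latencies_ms),
    }
```

Reporting p95 alongside the average matters because a handful of slow queries can hide behind a healthy-looking mean.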

How do I install ParadeDB?

Install the extension, then create it in your database:

CREATE EXTENSION paradedb;

ParadeDB requires PostgreSQL 15 or later and the vector extension (pgvector).

What parameters does `test_queries()` accept?

index_name TEXT – the name of the vector index.
query_embeddings VECTOR[] – array of query embeddings.
k INT – number of neighbors to retrieve.
Optional named args mirror the underlying ANN parameters (ef_search, metric, etc.).

How do I interpret the result?

The function returns one row per query plus an aggregate row labeled _summary. Key columns: avg_ms, p95_ms, recall, rows_scanned. Aim for an average latency under 50 ms and a recall of at least 0.9.
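
The targets above can be turned into an automated check, for example in a CI job. The sketch below assumes the _summary row has already been fetched into a Python dict keyed by the column names listed above; that dict shape and the function name are assumptions for illustration.

```python
def check_summary(summary, max_avg_ms=50.0, min_recall=0.9):
    """Return a list of problems; an empty list means both targets are met."""
    problems = []
    # "Sub-50 ms" is read strictly: exactly 50 ms does not pass.
    if summary["avg_ms"] >= max_avg_ms:
        problems.append(
            f"avg latency {summary['avg_ms']:.1f} ms is not under {max_avg_ms:.0f} ms"
        )
    if summary["recall"] < min_recall:
        problems.append(
            f"recall {summary['recall']:.2f} is below {min_recall:.2f}"
        )
    return problems
```

Returning a list of messages rather than a single boolean makes it clear which target was missed when a run fails.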

Should I test on production data?

Yes. Clone a production snapshot into a staging database and run the tests there. Realistic benchmarks depend on accurate cardinality and data distribution.

How can I speed up slow queries?

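Lower ef_search (faster searches, though recall may drop), increase the index's M parameter (a denser graph, at the cost of build time and memory), or add a pre-filter (e.g., product category) before the vector condition. Re-run test_queries() after each change.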
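
The tune-and-re-run loop can be sketched as a small parameter sweep. In the Python sketch below, benchmark is a hypothetical callable that runs the test for one ef_search value and returns a dict with avg_ms and recall; the function picks the fastest setting that still meets the recall target.

```python
def pick_ef_search(benchmark, candidates, min_recall=0.9):
    """Return (ef_search, metrics) for the fastest candidate meeting min_recall, or None."""
    best = None
    for ef_search in candidates:
        # hypothetical: benchmark(ef_search) -> {"avg_ms": float, "recall": float}
        metrics = benchmark(ef_search)
        if metrics["recall"] < min_recall:
            continue  # too inaccurate, skip regardless of speed
        if best is None or metrics["avg_ms"] < best[1]["avg_ms"]:
            best = (ef_search, metrics)
    return best
```

Filtering on recall first and only then comparing latency keeps the sweep from choosing a setting that is fast because it returns poor results.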
