The test_queries() helper in ParadeDB quickly benchmarks vector-search SQL against a chosen index.
Why use test_queries() in ParadeDB?
Use test_queries() to measure the recall and latency of your vector index before shipping a semantic-search feature. The function runs a set of query embeddings against the target index, captures timing, and returns aggregated metrics.
1) Load or generate query embeddings.
2) Call test_queries() with the index name, the embedding array, and k (the number of nearest neighbors to retrieve).
3) Inspect the returned table for avg_ms, p95_ms, and recall.
4) Tune index parameters if needed.
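To make the measuring step concrete, here is a minimal Python sketch of the kind of timing loop such a benchmark performs. It does not call ParadeDB: the search argument is a hypothetical callable standing in for whatever runs the vector query, and the nearest-rank percentile is an assumption, since the text does not say how p95_ms is computed.

```python
import math
import time


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct between 0 and 100)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def benchmark(search, queries, k):
    """Time search(query, k) for every query and return latency stats in ms.

    `search` is a hypothetical stand-in for the code that queries the index.
    """
    timings_ms = []
    for query in queries:
        start = time.perf_counter()
        search(query, k)
        timings_ms.append((time.perf_counter() - start) * 1000)
    return {
        "avg_ms": sum(timings_ms) / len(timings_ms),
        "p95_ms": percentile(timings_ms, 95),
    }


def brute_force_search(corpus):
    """Build an exact in-memory search over `corpus` (squared L2 distance)."""
    def search(query, k):
        def dist(vec):
            return sum((a - b) ** 2 for a, b in zip(query, vec))
        return sorted(range(len(corpus)), key=lambda i: dist(corpus[i]))[:k]
    return search


corpus = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
stats = benchmark(brute_force_search(corpus), [[0.1, 0.1], [4.0, 4.0]], k=2)
```

In practice the stand-in would be replaced by a function that sends each embedding to the database, so that the timings include the real index lookup.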
Install the extension, then create it in your database: CREATE EXTENSION paradedb;
ParadeDB requires PostgreSQL ≥15 and the vector extension.
What arguments does test_queries() accept?
index_name TEXT – the name of the vector index.
query_embeddings VECTOR[] – array of query embeddings.
k INT – number of neighbors to retrieve.
Optional named args mirror the underlying ANN parameters (ef_search, metric, etc.).
The function returns one row per query plus an aggregate row labeled _summary. Key columns: avg_ms, p95_ms, recall, rows_scanned. Aim for sub-50 ms average latency and recall ≥0.9.
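The text does not spell out how recall is computed. A common definition, assumed here, is recall@k: the share of the exact top-k neighbors (from a brute-force scan) that the approximate index also returned. The Python sketch below computes it and checks a summary against the two targets mentioned above; the function names and the dictionary layout are illustrative, not part of ParadeDB.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k neighbors that the index also returned."""
    truth = set(exact_ids[:k])
    if not truth:
        return 0.0
    return len(truth & set(approx_ids[:k])) / len(truth)


def mean_recall(approx_results, exact_results, k):
    """Average recall@k over paired per-query result lists."""
    scores = [recall_at_k(a, e, k) for a, e in zip(approx_results, exact_results)]
    return sum(scores) / len(scores)


def meets_targets(summary, max_avg_ms=50.0, min_recall=0.9):
    """True when average latency is under the limit and recall is high enough."""
    return summary["avg_ms"] < max_avg_ms and summary["recall"] >= min_recall
```

Averaging recall per query rather than pooling all neighbors keeps one unusually hard query from being hidden by many easy ones.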
Yes, test against production-like data: clone a production snapshot into a staging database. Accurate cardinality and distribution are essential for realistic benchmarks.
Lower ef_search, increase index M, or add a pre-filter (e.g., product category) before the vector condition. Re-run test_queries() after each change.
Does test_queries() modify my data?
No, it only reads from the index and records timing in memory.
Call the function in a UNION ALL query, passing each index name separately, then compare the _summary rows.
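Once the summary rows are fetched, the comparison itself is simple. One possible rule, sketched below in Python, is to keep only indexes that reach the recall target and then pick the lowest average latency. The index names and numbers are made-up placeholders, and the selection rule is an assumption rather than something ParadeDB prescribes.

```python
def pick_index(summaries, min_recall=0.9):
    """Return the fastest index whose recall meets the target, or None."""
    eligible = {
        name: row for name, row in summaries.items()
        if row["recall"] >= min_recall
    }
    if not eligible:
        return None
    return min(eligible, key=lambda name: eligible[name]["avg_ms"])


# Illustrative placeholder values, not measured results.
summaries = {
    "idx_a": {"avg_ms": 18.0, "recall": 0.86},
    "idx_b": {"avg_ms": 31.0, "recall": 0.94},
    "idx_c": {"avg_ms": 44.0, "recall": 0.97},
}
best = pick_index(summaries)
```

Here idx_a is the fastest but is excluded for missing the recall target, so the rule selects idx_b.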
Twenty to fifty diverse queries usually provide stable latency and recall estimates.