How to Test Queries in ParadeDB PostgreSQL

Galaxy Glossary

How do I test vector-search queries in ParadeDB?

The test_queries() helper in ParadeDB quickly benchmarks vector-search SQL against a chosen index.

Sign up for the latest in SQL knowledge from the Galaxy Team!
Welcome to the Galaxy, Guardian!
You'll be receiving a confirmation email

Follow us on twitter :)
Oops! Something went wrong while submitting the form.

Description

Table of Contents

Why should I run test_queries() in ParadeDB?

Use test_queries() to measure recall and latency of your vector index before shipping a semantic-search feature. The function executes a set of embeddings against the target index, captures timing, and returns aggregated metrics.

What is the basic workflow?

1) Load or generate query embeddings. 2) Call test_queries() with the index name, embedding array, and k-nearest neighbors. 3) Inspect the returned table for avg_ms, p95_ms, and recall. 4) Tune index parameters if needed.

How do I install ParadeDB?

Install the extension, then create it in your database: CREATE EXTENSION paradedb; ParadeDB requires PostgreSQL ≥15 and the vector extension.

What parameters does test_queries() accept?

index_name TEXT – the name of the vector index.
query_embeddings VECTOR[] – array of query embeddings.
k INT – number of neighbors to retrieve.
Optional named args mirror the underlying ANN parameters (ef_search, metric, etc.).

How do I interpret the result?

The function returns one row per query plus an aggregate row labeled _summary. Key columns: avg_ms, p95_ms, recall, rows_scanned. Aim for sub-50 ms avg latency and ≥0.9 recall.

Should I test on production data?

Yes—clone a production snapshot into a staging database. Accurate cardinality and distribution are essential for realistic benchmarks.

How can I speed up slow queries?

Lower ef_search, increase index M, or add a pre-filter (e.g., product category) before the vector condition. Re-run test_queries() after each change.

Why How to Test Queries in ParadeDB PostgreSQL is important

How to Test Queries in ParadeDB PostgreSQL Example Usage


-- Benchmark semantic product search
SELECT *
FROM paradedb.test_queries(
    index_name       => 'products_name_embedding_idx',
    query_embeddings => ARRAY[
        pgml.embed('usb-c charger')::vector,
        pgml.embed('gaming headset')::vector
    ],
    k => 10
) AS t;
-- Expect a summary row with avg_ms and recall columns

How to Test Queries in ParadeDB PostgreSQL Syntax


SELECT paradedb.test_queries(
    index_name       => 'products_name_embedding_idx',
    query_embeddings => ARRAY[
       pgml.embed('wireless mouse')::vector,
       pgml.embed('mechanical keyboard')::vector
    ],
    k                => 5
);

Common Mistakes

Frequently Asked Questions (FAQs)

Does test_queries() modify my data?

No, it only reads from the index and records timing in memory.

Can I test multiple indexes at once?

Call the function in a UNION ALL query, passing each index name separately, then compare summary rows.

How many query embeddings should I use?

Twenty to fifty diverse queries usually provide stable latency and recall estimates.

Want to learn about other SQL terms?

Trusted by top engineers on high-velocity teams
Aryeo Logo
Assort Health
Curri
Rubie Logo
Bauhealth Logo
Truvideo Logo
Welcome to the Galaxy, Guardian!
You'll be receiving a confirmation email

Follow us on twitter :)
Oops! Something went wrong while submitting the form.