Export Snowflake data to files, create matching tables, and COPY the files into ParadeDB (or load them through a foreign data wrapper).
ParadeDB runs on PostgreSQL, giving you open-source flexibility, full-text and vector search, and lower-cost storage. Moving workloads off Snowflake eliminates credit spend and lets engineers use standard psql tooling.
Parquet keeps types intact and loads quickly. CSV works if you need human-readable dumps. Use the same format for every table to simplify your pipeline.
Define a named file format, then run COPY INTO against an S3 or internal stage. By default each COPY splits the unload across multiple files in parallel; set SINGLE = TRUE to get one file per table.
-- Reusable Parquet file format for every unload
CREATE OR REPLACE FILE FORMAT parquet_fmt TYPE = PARQUET;

-- Unload the orders table to one Parquet file on the stage;
-- raise MAX_FILE_SIZE if the table exceeds the 16 MB default
COPY INTO @my_stage/orders.parquet
FROM orders
FILE_FORMAT = (FORMAT_NAME = parquet_fmt)
SINGLE = TRUE
OVERWRITE = TRUE;
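If some tables go out as CSV instead (per the format note above), a companion file format might look like this; the option values are common choices, not requirements:

-- Gzipped CSV with quoted fields for human-readable dumps
CREATE OR REPLACE FILE FORMAT csv_fmt
  TYPE = CSV
  FIELD_OPTIONALLY_ENCLOSED_BY = '"'
  COMPRESSION = GZIP;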
Mount the files locally or use an object-storage FDW. Create a foreign table that points at the Parquet file, then INSERT the rows into the real table.
-- parquet_fdw exposes Parquet files as foreign tables
CREATE EXTENSION IF NOT EXISTS parquet_fdw;
CREATE SERVER parquet_srv FOREIGN DATA WRAPPER parquet_fdw;

-- Columns must match the schema Snowflake wrote to the file
CREATE FOREIGN TABLE orders_ext (
    id           bigint,
    customer_id  bigint,
    order_date   date,
    total_amount numeric
) SERVER parquet_srv
OPTIONS (filename '/mnt/snowflake_dump/orders.parquet');

-- Load the rows into the real table
INSERT INTO orders SELECT * FROM orders_ext;
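If the target table does not exist yet, one option is to derive it from the foreign table before running the INSERT. This copies column names and types but not constraints or indexes, which you add afterwards:

-- Empty table with the same shape as the foreign table
CREATE TABLE orders AS
SELECT * FROM orders_ext LIMIT 0;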
Generate COPY and CREATE TABLE statements from Snowflake’s INFORMATION_SCHEMA, then feed the list to a bash script that loops through the tables, exports each one, copies the files over with scp, and runs psql.
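As a sketch, a query along these lines (reusing the stage and file format from the export step) emits one COPY INTO statement per base table for the script to execute:

SELECT 'COPY INTO @my_stage/' || LOWER(table_name) || '.parquet '
    || 'FROM ' || table_name
    || ' FILE_FORMAT = (FORMAT_NAME = parquet_fmt) SINGLE = TRUE OVERWRITE = TRUE;'
FROM information_schema.tables
WHERE table_schema = 'PUBLIC'
  AND table_type = 'BASE TABLE';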
Count rows, compare aggregates such as SUM(total_amount), and sample 100 random IDs from both systems. Differences usually point to NULL handling or timezone mismatches.
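For example, run the same checks on both sides and diff the results; the table and column names follow the earlier orders example:

-- Row count plus an aggregate checksum; run in Snowflake and in ParadeDB
SELECT COUNT(*) AS row_count, SUM(total_amount) AS amount_sum
FROM orders;

-- 100 random ids to spot-check against the other system
SELECT id FROM orders ORDER BY RANDOM() LIMIT 100;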
Disable indexes and constraints before the INSERT, then re-create them afterwards. Increase maintenance_work_mem so the index rebuilds run faster, and use COPY ... FREEZE for CSV files to avoid extra vacuum work; FREEZE requires that the table was created or truncated in the same transaction.
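A minimal sketch of that pattern for a CSV dump; the index name and file path are placeholders:

-- Drop secondary indexes before the bulk load
DROP INDEX IF EXISTS orders_customer_id_idx;

-- More memory for the index rebuild below
SET maintenance_work_mem = '2GB';

BEGIN;
TRUNCATE orders;  -- satisfies the FREEZE same-transaction requirement
COPY orders FROM '/mnt/snowflake_dump/orders.csv'
  WITH (FORMAT csv, HEADER true, FREEZE true);
COMMIT;

CREATE INDEX orders_customer_id_idx ON orders (customer_id);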
Incremental sync is possible. Schedule Snowflake tasks that unload new or changed rows to S3 (a stream on the source table can capture the changes), then pick the files up in ParadeDB with an FDW or a scheduled COPY.
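A minimal sketch of the Snowflake side, assuming the table has an updated_at column and reusing the stage and file format from earlier. The 15-minute window is illustrative; deduplicate on load to absorb boundary overlap:

-- Unload recently changed rows every 15 minutes
CREATE TASK export_orders_delta
  WAREHOUSE = my_wh
  SCHEDULE = '15 MINUTE'
AS
  COPY INTO @my_stage/orders_delta/
  FROM (SELECT * FROM orders
        WHERE updated_at > DATEADD(minute, -15, CURRENT_TIMESTAMP()))
  FILE_FORMAT = (FORMAT_NAME = parquet_fmt);

-- Tasks start suspended; resume to begin the schedule
ALTER TASK export_orders_delta RESUME;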
Encrypt the Parquet files with server-side encryption in S3, restrict access with narrowly scoped IAM roles, and require TLS for psql connections. ParadeDB inherits PostgreSQL’s security model, so roles, grants, and row-level security all apply.
ParadeDB’s extensions are needed only for vector or full-text search. Basic table storage and SQL work out of the box in standard PostgreSQL once the data is loaded.
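If you do want those features, enabling them is one statement each; the names below assume a standard ParadeDB build, where pg_search provides BM25 full-text search and pgvector provides the vector type:

-- BM25 full-text search (ParadeDB’s pg_search)
CREATE EXTENSION IF NOT EXISTS pg_search;
-- Vector similarity search (pgvector)
CREATE EXTENSION IF NOT EXISTS vector;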