Export Snowflake data to files, create matching tables, and COPY the files into ParadeDB (or load them through a foreign data wrapper).
ParadeDB runs on PostgreSQL, giving you open-source flexibility, full-text and vector search, and lower-cost storage. Moving workloads off Snowflake eliminates credit spend and lets engineers use standard psql tooling.
Parquet keeps types intact and loads quickly. CSV works if you need human-readable dumps. Use the same format for every table to simplify your pipeline.
Define a named file format, then run COPY INTO against an S3 or internal stage. By default each COPY splits the unload across multiple files in parallel; set SINGLE = TRUE to get one file per table.
-- Reusable Parquet file format for every unload
CREATE OR REPLACE FILE FORMAT parquet_fmt TYPE = PARQUET;

-- Unload the orders table to one Parquet file on the stage;
-- raise MAX_FILE_SIZE if the table exceeds the 16 MB default
COPY INTO @my_stage/orders.parquet
FROM orders
FILE_FORMAT = (FORMAT_NAME = parquet_fmt)
SINGLE = TRUE
OVERWRITE = TRUE;
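If some tables go out as CSV instead (per the format note above), a companion file format might look like this; the option values are common choices, not requirements:

-- Gzipped CSV with quoted fields for human-readable dumps
CREATE OR REPLACE FILE FORMAT csv_fmt
  TYPE = CSV
  FIELD_OPTIONALLY_ENCLOSED_BY = '"'
  COMPRESSION = GZIP;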
Mount the files locally or use an object-storage FDW. Create a foreign table that points at the Parquet file, then INSERT the rows into the real table.
-- parquet_fdw exposes Parquet files as foreign tables
CREATE EXTENSION IF NOT EXISTS parquet_fdw;
CREATE SERVER parquet_srv FOREIGN DATA WRAPPER parquet_fdw;

-- Columns must match the schema Snowflake wrote to the file
CREATE FOREIGN TABLE orders_ext (
    id           bigint,
    customer_id  bigint,
    order_date   date,
    total_amount numeric
) SERVER parquet_srv
OPTIONS (filename '/mnt/snowflake_dump/orders.parquet');

-- Load the rows into the real table
INSERT INTO orders SELECT * FROM orders_ext;
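If the target table does not exist yet, one option is to derive it from the foreign table before running the INSERT. This copies column names and types but not constraints or indexes, which you add afterwards:

-- Empty table with the same shape as the foreign table
CREATE TABLE orders AS
SELECT * FROM orders_ext LIMIT 0;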
Generate COPY and CREATE TABLE statements from Snowflake’s INFORMATION_SCHEMA, then feed the list to a bash script that loops through the tables, exports each one, copies the files over with scp, and runs psql.
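As a sketch, a query along these lines (reusing the stage and file format from the export step) emits one COPY INTO statement per base table for the script to execute:

SELECT 'COPY INTO @my_stage/' || LOWER(table_name) || '.parquet '
    || 'FROM ' || table_name
    || ' FILE_FORMAT = (FORMAT_NAME = parquet_fmt) SINGLE = TRUE OVERWRITE = TRUE;'
FROM information_schema.tables
WHERE table_schema = 'PUBLIC'
  AND table_type = 'BASE TABLE';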
Count rows, compare aggregates such as SUM(total_amount), and sample 100 random IDs from both systems. Differences usually point to NULL handling or timezone mismatches.
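For example, run the same checks on both sides and diff the results; the table and column names follow the earlier orders example:

-- Row count plus an aggregate checksum; run in Snowflake and in ParadeDB
SELECT COUNT(*) AS row_count, SUM(total_amount) AS amount_sum
FROM orders;

-- 100 random ids to spot-check against the other system
SELECT id FROM orders ORDER BY RANDOM() LIMIT 100;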
Disable indexes and constraints before the INSERT, then re-create them afterwards. Increase maintenance_work_mem so the index rebuilds run faster, and use COPY ... FREEZE for CSV files to avoid extra vacuum work; FREEZE requires that the table was created or truncated in the same transaction.
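A minimal sketch of that pattern for a CSV dump; the index name and file path are placeholders:

-- Drop secondary indexes before the bulk load
DROP INDEX IF EXISTS orders_customer_id_idx;

-- More memory for the index rebuild below
SET maintenance_work_mem = '2GB';

BEGIN;
TRUNCATE orders;  -- satisfies the FREEZE same-transaction requirement
COPY orders FROM '/mnt/snowflake_dump/orders.csv'
  WITH (FORMAT csv, HEADER true, FREEZE true);
COMMIT;

CREATE INDEX orders_customer_id_idx ON orders (customer_id);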
Incremental sync is possible. Schedule Snowflake tasks that unload new or changed rows to S3 (a stream on the source table can capture the changes), then pick the files up in ParadeDB with an FDW or a scheduled COPY.
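A minimal sketch of the Snowflake side, assuming the table has an updated_at column and reusing the stage and file format from earlier. The 15-minute window is illustrative; deduplicate on load to absorb boundary overlap:

-- Unload recently changed rows every 15 minutes
CREATE TASK export_orders_delta
  WAREHOUSE = my_wh
  SCHEDULE = '15 MINUTE'
AS
  COPY INTO @my_stage/orders_delta/
  FROM (SELECT * FROM orders
        WHERE updated_at > DATEADD(minute, -15, CURRENT_TIMESTAMP()))
  FILE_FORMAT = (FORMAT_NAME = parquet_fmt);

-- Tasks start suspended; resume to begin the schedule
ALTER TASK export_orders_delta RESUME;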
Encrypt the Parquet files with server-side encryption in S3, restrict access with narrowly scoped IAM roles, and require TLS for psql connections. ParadeDB inherits PostgreSQL’s security model, so roles, grants, and row-level security all apply.
ParadeDB’s extensions are needed only for vector or full-text search. Basic table storage and SQL work out of the box in standard PostgreSQL once the data is loaded.
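If you do want those features, enabling them is one statement each; the names below assume a standard ParadeDB build, where pg_search provides BM25 full-text search and pgvector provides the vector type:

-- BM25 full-text search (ParadeDB’s pg_search)
CREATE EXTENSION IF NOT EXISTS pg_search;
-- Vector similarity search (pgvector)
CREATE EXTENSION IF NOT EXISTS vector;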