Staging Fivetran data in Snowflake involves configuring secure pipelines that land raw extracts from diverse sources into Snowflake schemas, allowing analytics teams to query and transform the data efficiently.
Staging data from Fivetran into Snowflake is the process whereby Fivetran-managed connectors extract data from source systems, land the files in an intermediate cloud storage bucket, and then load them—via Snowpipe—into Snowflake tables that live in a dedicated <source>_RAW
schema. The staged tables preserve the original structure and change data, providing a reliable, auditable foundation for downstream modeling.
Modern data stacks rely on automated ingestion and scalable cloud warehouses. Fivetran eliminates EL (extract-load) engineering work, while Snowflake offers elastic storage and compute. Together they enable:
Fivetran connectors query the source (e.g., MySQL, Salesforce, Stripe) and write JSON or CSV files to a Fivetran-managed S3, GCS, or Azure Blob bucket.
Fivetran creates an EXTERNAL STAGE
in Snowflake that points to the bucket and sets up an AUTO_INGEST
Snowpipe. A service managed by Fivetran posts cloud-storage events so Snowflake knows when to ingest new files.
Snowpipe executes COPY INTO
commands that stream the data into raw tables inside a schema such as FIVETRAN_DB.SALESFORCE_RAW
. Fivetran adds metadata columns (e.g., _FIVETRAN_SYNCED
, _FIVETRAN_DELETED
).
Optional dbt jobs or Snowflake Tasks model the staged data into analytics-ready marts.
CREATE ROLE FIVETRAN;
GRANT ROLE FIVETRAN TO USER fivetran_user;
CREATE DATABASE FIVETRAN_DB;
GRANT ALL ON DATABASE FIVETRAN_DB TO ROLE FIVETRAN;
SHOPIFY_RAW
, POSTGRES_RAW
, etc.).Key characteristics:
_FIVETRAN_DELETED = TRUE
._FIVETRAN_SYNCED
mark the last successful load._FIVETRAN_ID
if the original table lacks one.Point Fivetran at a small, cost-effective warehouse (e.g., XS
) since loads are frequent but lightweight.
Grant USAGE
and CREATE
only on the dedicated database. Analysts should read from modeled schemas, not RAW.
Apply TAGS
on the Fivetran warehouse and pipe to track credit consumption. Use ACCOUNT_USAGE.PIPE_USAGE_HISTORY
for monitoring.
The RAW data is immutable; a 1- to 3-day DATA_RETENTION_TIME_IN_DAYS
usually suffices.
To explore the latest state of a table:
SELECT *
FROM FIVETRAN_DB.SALESFORCE_RAW.OPPORTUNITY
WHERE _FIVETRAN_DELETED = FALSE
ORDER BY _FIVETRAN_SYNCED DESC
LIMIT 100;
For slowly changing dimensions, leverage Snowflake streams:
CREATE OR REPLACE STREAM OPPORTUNITY_CHANGES ON TABLE FIVETRAN_DB.SALESFORCE_RAW.OPPORTUNITY;
The RAW schema contains soft deletes and subtly changing column types. Always transform into a clean ANALYTICS
layer (dbt is ideal).
Giving analysts OWNERSHIP
on RAW objects lets them accidentally drop tables, breaking the connector. Stick to SELECT
.
Leaving the Fivetran warehouse running 24/7 wastes credits. Set AUTO_SUSPEND = 60
and AUTO_RESUME = TRUE
.
Once data is staged, engineers need to discover column names, write tests, and iterate on transforms. Galaxy’s lightning-fast SQL editor and context-aware AI copilot accelerate this workflow:
SALESFORCE_RAW
table metadata as soon as Fivetran creates it.Staging data from Fivetran into Snowflake replaces brittle in-house pipelines with a fully managed, secure, and near-real-time system. By following least-privilege roles, warehouse sizing best practices, and modeling raw tables into clean marts, teams unlock rapid analytics while controlling cost. Modern editors like Galaxy further streamline query development and collaboration on top of the staged data.
Fivetran plus Snowflake removes the EL burden and enables near-real-time analytics. Understanding how data lands in the RAW schema, what metadata columns mean, and how to secure and monitor the pipeline ensures reliability, cost control, and accurate downstream reporting. Without this knowledge, teams risk corrupted data models, runaway warehouse spend, and compliance gaps.
The default sync frequency is 15 minutes, but many connectors support 5-minute intervals or real-time CDC. You can adjust the schedule per connector in the Fivetran UI.
Yes. Fivetran creates an EXTERNAL STAGE
that points to its cloud bucket and an AUTO_INGEST
Snowpipe to stream files into tables.
Assign Fivetran a small warehouse with AUTO_SUSPEND
, monitor PIPE_USAGE_HISTORY
, and limit DATA_RETENTION_TIME_IN_DAYS
in RAW schemas.
Galaxy auto-discovers the new RAW schemas, offers AI-driven autocomplete, and lets teams share vetted staging queries in Collections, streamlining exploration and transformation.