What is dbt seed?

dbt seed is a dbt command that loads static CSV files from your project’s seeds/ folder (named data/ in dbt versions before 1.0) into your data warehouse as managed tables, making small reference datasets version-controlled and easy to query.


Description

What Is dbt seed?

dbt seed loads CSV files stored in a project’s seeds/ directory (data/ before dbt 1.0) into your warehouse as tables, giving you version-controlled reference or lookup data with one command.

Why Use dbt seed Instead of SQL COPY?

dbt seed automates table creation, can rebuild a table when its columns change (via dbt seed --full-refresh), and ties the data to Git history. Manual COPY commands live outside version control and require repetitive boilerplate.

How Does dbt seed Work Under the Hood?

During dbt seed, dbt reads each CSV, infers column types unless column_types overrides them, generates CREATE TABLE and INSERT statements, and records a file checksum in manifest.json so that state-based selection (for example state:modified) can detect changed seeds.

How to Configure dbt seed?

Add settings in dbt_project.yml under seeds: to set database, schema, delimiter, quote_columns, and column_types, either project-wide or as per-seed overrides.

YAML Example

seeds:
  my_project:
    +schema: staging
    +quote_columns: false
    users:  # seed name, i.e. users.csv without the extension
      +column_types:
        id: integer
        plan: varchar(10)

How Do You Run dbt seed Selectively?

Use dbt seed --select my_seed or dbt seed --exclude large_seed to control which CSVs load, saving build time in CI pipelines.
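When several seeds should always load or skip together, tagging them can be easier than listing names. The fragment below is a minimal sketch, assuming a seeds/reference/ subfolder and a tag called reference; both names are hypothetical and not part of the original project.

```yaml
# dbt_project.yml (fragment)
seeds:
  my_project:
    reference:               # hypothetical subfolder: seeds/reference/
      +tags: ["reference"]   # hypothetical tag name

# With the tag in place, one selector covers every CSV in that folder:
#   dbt seed --select tag:reference
#   dbt seed --exclude tag:reference
```

Tags are inherited by everything under the configured path, so new CSVs dropped into the folder are picked up by the same selector without editing the CI job.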

What Are Best Practices for dbt seed?

Keep files under ~100k rows, store only static or slowly changing data, set explicit column types, and add tests such as unique and not_null on key columns. Row-count checks are not built into dbt, so they need a package or a custom test.
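The typing and testing advice above could look like this in a seed properties file. This is a sketch only: the file path, the seed country_codes, and its columns are hypothetical, and only dbt’s built-in unique and not_null tests are used.

```yaml
# seeds/properties.yml (hypothetical file name)
version: 2

seeds:
  - name: country_codes          # hypothetical seed, loaded from country_codes.csv
    config:
      column_types:              # explicit types instead of inference
        country_code: varchar(2)
        country_name: varchar(100)
    columns:
      - name: country_code
        tests:                   # newer dbt versions also accept the key data_tests
          - unique
          - not_null
      - name: country_name
        tests:
          - not_null
```

Running dbt test --select country_codes after dbt seed would then check the key column on every build.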

When Should You Avoid dbt seed?

Avoid dbt seed for large fact tables or frequently updated data; use proper ELT pipelines or warehouse-native staging instead.

How Does dbt seed Integrate with Galaxy?

In Galaxy’s SQL editor, seeded tables appear instantly in the sidebar metadata, letting you autocomplete against them and share validated seed queries inside Collections.

Why dbt seed is important

Version-controlled reference data eliminates hidden CSV uploads, keeps dev, CI, and prod in sync, and speeds testing by guaranteeing deterministic lookup tables.

dbt seed Example Usage


dbt seed --select marketing_campaigns

dbt seed Syntax



Common Mistakes

Frequently Asked Questions (FAQs)

Can dbt seed update existing rows?

No. By default dbt seed truncates the table and reloads every row on each run; with --full-refresh it drops and recreates the table instead. Use incremental models for updates.

Where should seed CSV files live?

Place them in the project’s seeds/ folder (data/ before dbt 1.0), or in any directory listed under seed-paths in dbt_project.yml, so dbt auto-detects them.

How do I prevent a seed from deploying to production?

Use dbt seed --exclude seed_name (the CSV file name without its .csv extension) in your prod job, or configure env-specific select flags.
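An alternative to command-line flags is to disable the seed by target in configuration. The fragment below is a sketch, assuming the production target is named prod and the seeds to skip live in a dev_fixtures subfolder; both names are hypothetical.

```yaml
# dbt_project.yml (fragment)
seeds:
  my_project:
    dev_fixtures:                               # hypothetical subfolder: seeds/dev_fixtures/
      # Rendered per run: false when the active target is named "prod" (assumed name)
      +enabled: "{{ target.name != 'prod' }}"
```

With this in place a plain dbt seed in the prod job skips those files, so the exclusion does not depend on every job remembering the right flag.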

Does Galaxy work with dbt seed tables?

Yes. Galaxy indexes seeded tables, offering instant autocomplete, query sharing, and AI explanations just like any other table.
