How to Integrate Snowflake with Apache Spark


How do I integrate Snowflake with Apache Spark?

Configures the Snowflake Connector for Spark so you can read from and write to Snowflake directly from Apache Spark jobs.



Why integrate Snowflake with Spark?

Unified workloads let you crunch Snowflake data with Spark

Why integrating Snowflake with Apache Spark is important

How to Integrate Snowflake with Apache Spark Example Usage


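# Join Spark DataFrames and write to Snowflake
# Assumes an active SparkSession named `spark` with the Snowflake connector on the classpath.

# Shared connection options (placeholder values; substitute your own)
base_options = {
  "sfURL": "<account>.snowflakecomputing.com",
  "sfUser": "SF_USER",
  "sfPassword": "SF_PASS",
  "sfDatabase": "ECOMMERCE",
  "sfSchema": "PUBLIC",
  "sfWarehouse": "COMPUTE_WH",
}

# Load both source tables from Snowflake
orders_df = spark.read.format("snowflake").options(
  **base_options,
  dbtable="ORDERS"
).load()

items_df = spark.read.format("snowflake").options(
  **base_options,
  dbtable="ORDERITEMS"
).load()

# Join on the shared "id" column, then total the amounts per customer
revenue_df = orders_df.join(items_df, "id").groupBy("customer_id").sum("total_amount")

# Append the aggregated result to a Snowflake table
revenue_df.write.format("snowflake").options(
  **base_options,
  dbtable="CUSTOMER_REVENUE"
).mode("append").save()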

How to Integrate Snowflake with Apache Spark Syntax


# Reading from Snowflake
# The connector expects sf-prefixed option names, and sfURL takes the
# account hostname without a jdbc: prefix.
# "snowflake" is the short name for net.snowflake.spark.snowflake.
df = (
  spark.read.format("snowflake")
    .option("sfURL", "<account>.snowflakecomputing.com")
    .option("sfUser", "SF_USER")
    .option("sfPassword", "SF_PASS")
    .option("dbtable", "ORDERS")
    .option("sfWarehouse", "COMPUTE_WH")
    .option("sfDatabase", "ECOMMERCE")
    .option("sfSchema", "PUBLIC")
    .option("sfRole", "ANALYST")
    .load()
)

# Writing to Snowflake
# mode("overwrite") replaces the existing contents of the target table.
df.write.format("snowflake") \
  .option("sfURL", "<account>.snowflakecomputing.com") \
  .option("sfUser", "SF_USER") \
  .option("sfPassword", "SF_PASS") \
  .option("dbtable", "ORDERITEMS") \
  .option("sfWarehouse", "COMPUTE_WH") \
  .option("sfDatabase", "ECOMMERCE") \
  .option("sfSchema", "PUBLIC") \
  .mode("overwrite") \
  .save()
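
The read and write calls above repeat the same connection settings. As a minimal sketch, the shared options could be built once from environment variables so credentials stay out of the job code. The function name `snowflake_options` and the variable names `SNOWFLAKE_ACCOUNT`, `SNOWFLAKE_USER`, and `SNOWFLAKE_PASSWORD` are hypothetical choices for this illustration; the option keys are the connector's `sf`-prefixed names.

```python
import os


def snowflake_options(env=None, database="ECOMMERCE", schema="PUBLIC",
                      warehouse="COMPUTE_WH"):
    """Build a reusable options dict for the Snowflake Connector for Spark.

    The environment variable names below are hypothetical; rename them
    to match however your deployment stores secrets.
    """
    env = os.environ if env is None else env
    required = ("SNOWFLAKE_ACCOUNT", "SNOWFLAKE_USER", "SNOWFLAKE_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail early with a clear message instead of a connector error later.
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {
        "sfURL": env["SNOWFLAKE_ACCOUNT"] + ".snowflakecomputing.com",
        "sfUser": env["SNOWFLAKE_USER"],
        "sfPassword": env["SNOWFLAKE_PASSWORD"],
        "sfDatabase": database,
        "sfSchema": schema,
        "sfWarehouse": warehouse,
    }
```

The returned dict can then be passed to `.options(**opts)` on both the reader and the writer, with `dbtable` supplied per call.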

Common Mistakes

Frequently Asked Questions (FAQs)

Does the connector support pushdown?

Yes. Snowflake executes filters, projections, and aggregations server-side.
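
One explicit way to keep work inside Snowflake is the connector's `query` option, which takes a SQL statement in place of `dbtable`. The sketch below is illustrative only: the helper name `read_snowflake_query` is hypothetical, and the SQL assumes `ORDERS` has `customer_id` and `total_amount` columns, as the example usage above suggests.

```python
def read_snowflake_query(spark, options, sql):
    """Return a DataFrame holding the result of `sql`, run inside Snowflake.

    `spark` is an active SparkSession and `options` holds the sf-prefixed
    connection settings (sfURL, sfUser, and so on).
    """
    return (
        spark.read.format("snowflake")
        .options(**options)
        .option("query", sql)  # used in place of dbtable
        .load()
    )


# Assumed column names; adjust to your schema.
REVENUE_SQL = (
    "SELECT customer_id, SUM(total_amount) AS revenue "
    "FROM ORDERS GROUP BY customer_id"
)
```

Calling `read_snowflake_query(spark, base_options, REVENUE_SQL)` would return only the aggregated rows to Spark, rather than the full `ORDERS` table.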
