Moving databases, schema, and workloads from an on-premise Microsoft SQL Server instance to Azure Synapse Analytics using assessment, schema conversion, data movement, and optimization tools.
Migrating an on-premise SQL Server to Azure Synapse Analytics unlocks virtually limitless scalability, integrates cloud-native analytics services, and reduces hardware maintenance. This guide walks through every phase—from assessing legacy workloads to validating results—so you can execute a smooth, low-risk transition.
Azure Synapse (formerly SQL Data Warehouse) is Microsoft’s flagship cloud analytics platform. It combines a massively parallel processing (MPP) relational engine with native integration to Azure Data Lake, Spark, Power BI, and Azure Machine Learning. By moving the data warehouse from on-prem SQL Server to Synapse you gain elastic, near-limitless scalability, integrated cloud-native analytics services, and freedom from hardware maintenance.
Successful migrations follow a predictable life-cycle:
Use Azure Migrate or Data Migration Assistant (DMA) to scan your SQL Server. Capture:
Tip: Synapse does not support some SQL Server features (e.g., UNIQUE constraints, CDC). DMA will flag these so you can redesign early.
Decide between Dedicated SQL Pool (MPP engine) and Serverless SQL (pay-per-query). Most data-warehouse migrations land in Dedicated SQL Pools for predictable performance.
Select the initial DWU (Data Warehouse Unit) level by matching current CPU/memory usage and concurrency. You can scale later without reloading data.
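Resizing a dedicated SQL pool is a single T-SQL statement run against the logical server's master database. A minimal sketch (the database name MyWarehouse is a placeholder):

```sql
-- Scale the dedicated SQL pool to DW1000c; no data reload required.
-- Run against the master database of the logical server.
ALTER DATABASE MyWarehouse
MODIFY (SERVICE_OBJECTIVE = 'DW1000c');
```

The pool is briefly unavailable while compute is reallocated, so scaling is usually scheduled outside load windows.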
Run Azure Synapse Pathway or SQL Server Migration Assistant (SSMA) to generate Synapse-compatible DDL. Address flagged items: replace IDENTITY columns with SEQUENCE objects, and convert DATETIME to DATETIME2 to avoid precision loss.
Synapse tables require a distribution style: HASH, ROUND_ROBIN, or REPLICATE.
Define clustered columnstore indexes for large append-only tables and clustered/non-clustered B-tree indexes for small tables with frequent lookups.
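The distribution and indexing guidance above can be sketched in dedicated SQL pool DDL. Table and column names here are illustrative, not from the source schema:

```sql
-- Large append-only fact table: hash-distribute on a high-cardinality key
-- and store as clustered columnstore for scan-heavy analytics.
CREATE TABLE dbo.Sales
(
    SaleKey     BIGINT        NOT NULL,
    CustomerKey INT           NOT NULL,
    SaleDate    DATETIME2     NOT NULL,
    Amount      DECIMAL(18,2) NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(CustomerKey),
    CLUSTERED COLUMNSTORE INDEX
);

-- Small, frequently joined dimension: replicate a full copy to every
-- compute node and keep a B-tree index for point lookups.
CREATE TABLE dbo.Currency
(
    CurrencyCode CHAR(3)      NOT NULL,
    CurrencyName NVARCHAR(50) NOT NULL
)
WITH
(
    DISTRIBUTION = REPLICATE,
    CLUSTERED INDEX (CurrencyCode)
);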
Bulk load data into Azure Data Lake Storage Gen2 (ADLS) using AzCopy, a CLI utility for petabyte-scale transfers. Once data files (.parquet, .csv) are in ADLS, ingest with:
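A staging transfer with AzCopy looks roughly like this; the local path is illustrative and the storage account placeholder matches the COPY INTO example below:

```shell
# Authenticate, then recursively copy exported files into the ADLS staging container.
azcopy login
azcopy copy "./export/sales/*" \
  "https://<storage>.dfs.core.windows.net/stage/sales/" --recursive
```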
COPY INTO dbo.Sales
FROM 'https://<storage>.dfs.core.windows.net/stage/sales/'
WITH (
FILE_TYPE = 'CSV',
MAXERRORS = 0,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
FIRSTROW = 2
);
The COPY INTO command parallel-loads files across all Synapse nodes for maximum throughput.
Transact-SQL syntax is almost identical, but key differences exist: MERGE with OUTPUT is unsupported, so use staging tables with INSERT/UPDATE instead; WHILE loops that reference table variables larger than 8 MB should be switched to set-based operations; and CROSS APPLY row-by-row logic should be replaced with JOIN or window functions. Leverage Synapse Pipelines + Spark notebooks for complex transformations that require Python or Scala.
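The staging-table replacement for MERGE with OUTPUT can be sketched as two set-based statements (table names are illustrative; note that older dedicated-pool versions restricted ANSI joins in UPDATE, in which case a CTAS-based rewrite applies instead):

```sql
-- 1. Update rows that already exist in the target.
UPDATE dbo.Sales
SET    Amount = stg.Amount
FROM   stage.Sales AS stg
WHERE  dbo.Sales.SaleKey = stg.SaleKey;

-- 2. Insert rows present only in the staging table.
INSERT INTO dbo.Sales (SaleKey, CustomerKey, SaleDate, Amount)
SELECT stg.SaleKey, stg.CustomerKey, stg.SaleDate, stg.Amount
FROM   stage.Sales AS stg
WHERE  NOT EXISTS (SELECT 1
                   FROM   dbo.Sales AS tgt
                   WHERE  tgt.SaleKey = stg.SaleKey);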
Run UPDATE STATISTICS after large loads. Use DBCC PDW_SHOWSPACEUSED to verify distribution skew stays under 10%.
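Both checks are one-liners against a loaded table (dbo.Sales is a placeholder):

```sql
-- Refresh optimizer statistics after a bulk load.
UPDATE STATISTICS dbo.Sales;

-- Report per-distribution row counts; large variation between
-- distributions indicates skew worth redistributing.
DBCC PDW_SHOWSPACEUSED('dbo.Sales');
```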
Create Workload Groups for concurrency control:
CREATE WORKLOAD GROUP etl
WITH (MIN_PERCENTAGE_RESOURCE = 40);
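A workload group does nothing until requests are routed into it; a classifier maps a login to the group. A sketch, assuming a hypothetical etl_user service account:

```sql
-- Route the ETL service account's requests into the 'etl' workload group.
CREATE WORKLOAD CLASSIFIER etl_classifier
WITH
(
    WORKLOAD_GROUP = 'etl',
    MEMBERNAME     = 'etl_user'
);
```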
-- Row counts should match source
SELECT 'Sales' AS TableName, COUNT(*) AS SynapseCount
FROM dbo.Sales;
Automate row-count and checksum comparisons before sign-off.
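The comparison step can be automated with a small script: pull per-table row counts from each system with whatever driver you use, then flag mismatches before sign-off. A minimal sketch in Python (function and table names are illustrative; the same shape works for checksums):

```python
def compare_counts(source_counts, target_counts):
    """Return {table: (source, target)} for every table whose
    row count differs between the source and the Synapse target.

    A table missing from the target shows up with target value None,
    so dropped tables are caught too.
    """
    mismatches = {}
    for table, src in source_counts.items():
        tgt = target_counts.get(table)
        if tgt != src:
            mismatches[table] = (src, tgt)
    return mismatches
```

Run it against the two count dictionaries and require an empty result before cut-over.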
Plan a dual-write window where on-prem changes replicate to both systems via DMS. When downstream dashboards align, redirect BI tools to Synapse.
Post-cut-over, enable Intelligent Performance in Azure Advisor and monitor queries in Synapse Studio. Scale DWU up or down based on usage patterns.
If your data team writes verification or reporting queries during migration, Galaxy’s desktop SQL editor can improve speed and collaboration. The AI Copilot autocompletes Synapse SQL dialect, while Collections let analysts share validation queries without pasting code into Slack. Although Galaxy is not a migration tool, it accelerates the testing and optimization phases where fast, trustworthy query iteration is critical.
The problem: Default ROUND_ROBIN distribution can produce skewed joins, causing hotspots.
Fix: Choose a HASH key with high cardinality; monitor with sys.dm_pdw_nodes_db_partition_stats.
Procedures using unsupported T-SQL will break. Plan for dedicated refactor sprints.
Attempting full reloads of multi-terabyte tables increases downtime. Implement CDC-style watermarks in ADF to load only changed data.
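The watermark pattern is simple: remember the highest change timestamp loaded so far, select only rows modified after it, and advance the watermark once the load commits. A minimal sketch in Python (field name modified_at and both function names are illustrative; in ADF the same logic lives in a lookup + filter activity):

```python
def rows_since_watermark(rows, watermark, key="modified_at"):
    """Keep only rows changed after the last successfully loaded watermark."""
    return [r for r in rows if r[key] > watermark]


def advance_watermark(rows, watermark, key="modified_at"):
    """New watermark = highest change timestamp seen; unchanged if no rows."""
    return max((r[key] for r in rows), default=watermark)
```

Persist the watermark only after the target load commits, so a failed run simply re-selects the same delta.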
A fintech company migrated a 40 TB on-prem SQL Server DW. They refactored IDENTITY and UNIQUE constraints, then bulk-loaded with COPY INTO at 15 TB/hr on a DW3000c instance. Their analysts now query both historical and streaming data in a single Synapse workspace, reducing report runtime from 45 min to 3 min.
On-premise SQL Server data warehouses struggle to scale and require costly hardware refreshes. Azure Synapse provides cloud elasticity, integrated analytics, and simplified security compliance. Understanding the migration process minimizes downtime and ensures workloads remain performant, enabling organizations to unlock modern BI and machine-learning capabilities without rewriting every application.
Yes. Use Azure Database Migration Service in online mode to replicate ongoing changes while the bulk of data is being moved. When lag drops to seconds, cut over applications.
With proper parallelism, bulk data copy can finish in 4–8 hours. Most project time is spent on schema conversion and stored procedure refactoring, which can take weeks.
No. Basic SELECT, INSERT, UPDATE syntax is identical, but features like CLR, MERGE with OUTPUT, and certain hints are unsupported. Use Synapse Pathway to find what must change.
Galaxy’s SQL editor accelerates validation and performance tuning by offering instant autocomplete for Synapse syntax and shared query Collections, ensuring the whole team tests against the same code.