Migrating On-Prem SQL Server to Azure Synapse

Galaxy Glossary

How do I migrate an on-premise SQL Server to Azure Synapse?

Moving databases, schema, and workloads from an on-premise Microsoft SQL Server instance to Azure Synapse Analytics using assessment, schema conversion, data movement, and optimization tools.

Sign up for the latest in SQL knowledge from the Galaxy Team!
Welcome to the Galaxy, Guardian!
You'll be receiving a confirmation email

Follow us on twitter :)
Oops! Something went wrong while submitting the form.

Description

Table of Contents

Migrating an on-premise SQL Server to Azure Synapse Analytics unlocks virtually limitless scalability, integrates cloud-native analytics services, and reduces hardware maintenance. This guide walks through every phase—from assessing legacy workloads to validating results—so you can execute a smooth, low-risk transition.

Why Migrate to Azure Synapse?

Azure Synapse (formerly SQL Data Warehouse) is Microsoft’s flagship cloud analytics platform. It combines a massively parallel processing (MPP) relational engine with native integration to Azure Data Lake, Spark, Power BI, and Azure Machine Learning. By moving the data warehouse from on-prem SQL Server to Synapse you gain:

  • Elastic compute – scale up for heavy workloads and pause when idle.
  • Unified analytics workspace – query data across lake and warehouse with a single service.
  • Lower TCO – no physical hardware to maintain or refresh.
  • Advanced security – built-in encryption, network isolation, and Managed Private Endpoints.

Migration Overview

Successful migrations follow a predictable life-cycle:

  1. Assessment & Planning
  2. Schema Conversion & Compatibility Fixes
  3. Data Movement
  4. ETL / Stored Procedure Refactoring
  5. Performance Tuning & Validation
  6. Cut-over & Monitoring

1. Assessment & Planning

Inventory and Baseline

Use Azure Migrate or Data Migration Assistant (DMA) to scan your SQL Server. Capture:

  • Database size, growth trends, and largest tables
  • Feature usage (CLR, CDC, Service Broker)
  • Index and partition strategy
  • ETL schedules and downstream consumers

Tip: Synapse does not support some SQL Server features (e.g., UNIQUE constraints, CDC). DMA will flag these so you can redesign early.

Choose a Synapse Architecture

Decide between Dedicated SQL Pool (MPP engine) and Serverless SQL (pay-per-query). Most data-warehouse migrations land in Dedicated SQL Pools for predictable performance.

Select the initial DWU (Data Warehouse Unit) level by matching current CPU/memory usage and concurrency. You can scale later without data reload.

2. Schema Conversion

Automated Conversion

Run Azure Synapse Pathway or SQL Server Migration Assistant (SSMA) to generate Synapse-compatible DDL. Address flagged items:

  • Drop unsupported features (e.g., UNIQUE constraints)
  • Replace IDENTITY with SEQUENCE
  • Remap DATETIME to DATETIME2 to avoid precision loss

Distribution and Index Strategy

Synapse tables require a distribution style:

  • Hash – best for large fact tables queried by a join key
  • Round-robin – good default for staging tables
  • Replicate – duplicate small dimension tables across nodes

Define clustered columnstore indexes for large append-only tables and clustered/non-clustered B-tree indexes for small tables with frequent lookups.

3. Data Movement

Staging to Azure

Bulk load data into Azure Data Lake Storage Gen2 (ADLS) using:

  • AzCopy – CLI utility for petabyte-scale transfers
  • Azure Data Factory or Synapse Pipelines – GUI-driven, supports incremental loads
  • Database Migration Service (DMS) – online, near-zero-downtime replication

Loading into Synapse

Once data files (.parquet, .csv) are in ADLS, ingest with:

COPY INTO dbo.Sales
FROM 'https://<storage>.dfs.core.windows.net/stage/sales/'
WITH (
FILE_TYPE = 'CSV',
MAXERRORS = 0,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
FIRSTROW = 2
);

The COPY INTO command parallel-loads files across all Synapse nodes for maximum throughput.

4. ETL and Stored Procedure Refactoring

Transact-SQL syntax is almost identical, but key differences exist:

  • No MERGE with OUTPUT; use staging tables and INSERT/UPDATE
  • No WHILE loops that reference table variables > 8 MB; switch to set-based operations
  • Replace CROSS APPLY row-by-row logic with JOIN or window functions

Leverage Synapse Pipelines + Spark notebooks for complex transformations that require Python or Scala.

5. Performance Tuning & Validation

Statistics and Caching

Run UPDATE STATISTICS after large loads. Use DBCC PDW_SHOWSPACEUSED to verify distribution skew < 10%.

Workload Management

Create Workload Groups for concurrency control:

CREATE WORKLOAD GROUP etl
WITH (MIN_PERCENTAGE_RESOURCE = 40);

Data Validation Queries

-- Row counts should match source
SELECT 'Sales' AS TableName, COUNT(*) AS SynapseCount
FROM dbo.Sales;

Automate row-count and checksum comparisons before sign-off.

6. Cut-Over & Monitoring

Plan a dual-write window where on-prem changes replicate to both systems via DMS. When downstream dashboards align, redirect BI tools to Synapse.

Post-cut-over, enable Intelligent Performance in Azure Advisor and monitor queries in Synapse Studio. Scale DWU up / down based on usage patterns.

Galaxy and SQL Migration

If your data team writes verification or reporting queries during migration, Galaxy’s desktop SQL editor can improve speed and collaboration. The AI Copilot autocompletes Synapse SQL dialect, while Collections let analysts share validation queries without pasting code into Slack. Although Galaxy is not a migration tool, it accelerates the testing and optimization phases where fast, trustworthy query iteration is critical.

Common Mistakes

Ignoring Data Skew

The problem: Default ROUND_ROBIN distribution can produce skewed joins, causing hotspots.
Fix: Choose a HASH key with high cardinality; monitor with sys.dm_pdw_nodes_db_partition_stats.

Underestimating ETL Rewrite Effort

Procedures using unsupported T-SQL will break. Plan for dedicated refactor sprints.

No Incremental Load Strategy

Attempting full reloads of multi-terabyte tables increases downtime. Implement CDC-style watermarks in ADF to load only changed data.

Best Practices

  • Stage data in ADLS Gen2; never load directly over VPN.
  • Create separate dev/test Synapse workspaces for dry runs.
  • Automate schema drift checks with Azure DevOps CI/CD.
  • Use PolyBase external tables for early report validation before full cut-over.

Real-World Example

A fintech company migrated a 40 TB on-prem SQL Server DW. They:

  1. Ran DMA: 220 objects flagged, mostly IDENTITY and UNIQUE constraints.
  2. Used AzCopy to push nightly exports (Parquet) into ADLS.
  3. Loaded via COPY INTO at 15 TB/hr using DW3000c.
  4. Refactored 80 stored procs to set-based logic; runtime dropped 60%.
  5. Cut-over with 30-minute dual-write window; zero data loss.

Their analysts now query both historical and streaming data in a single Synapse workspace, reducing report runtime from 45 min to 3 min.

Why Migrating On-Prem SQL Server to Azure Synapse is important

On-premise SQL Server data warehouses struggle to scale and require costly hardware refreshes. Azure Synapse provides cloud elasticity, integrated analytics, and simplified security compliance. Understanding the migration process minimizes downtime and ensures workloads remain performant, enabling organizations to unlock modern BI and machine-learning capabilities without rewriting every application.

Migrating On-Prem SQL Server to Azure Synapse Example Usage



Migrating On-Prem SQL Server to Azure Synapse Syntax



Common Mistakes

Frequently Asked Questions (FAQs)

Can I perform a zero-downtime migration to Synapse?

Yes. Use Azure Database Migration Service in online mode to replicate ongoing changes while the bulk of data is being moved. When lag drops to seconds, cut over applications.

How long does a typical 10 TB migration take?

With proper parallelism, bulk data copy can finish in 4–8 hours. Most project time is spent on schema conversion and stored procedure refactoring, which can take weeks.

Do I need to rewrite all T-SQL?

No. Basic SELECT, INSERT, UPDATE syntax is identical, but features like CLR, MERGE with OUTPUT, and certain hints are unsupported. Use Synapse Pathway to find what must change.

How does Galaxy help during migration?

Galaxy’s SQL editor accelerates validation and performance tuning by offering instant autocomplete for Synapse syntax and shared query Collections, ensuring the whole team tests against the same code.

Want to learn about other SQL terms?

Trusted by top engineers on high-velocity teams
Aryeo Logo
Assort Health
Curri
Rubie Logo
Bauhealth Logo
Truvideo Logo
Welcome to the Galaxy, Guardian!
You'll be receiving a confirmation email

Follow us on twitter :)
Oops! Something went wrong while submitting the form.