Self Join SQL

Galaxy Glossary

How can you join a table to itself in SQL?

A self join allows you to compare rows within the same table based on a related column. This is useful for finding relationships between different instances of the same entity. For example, finding employees who report to the same manager.
Sign up for the latest in SQL knowledge from the Galaxy Team!
Thank you! Your submission has been received!
Oops! Something went wrong while submitting the form.

Description

A self join is a type of join operation in SQL where a table is joined to itself. This is different from joining two separate tables. Instead, you're essentially creating a virtual copy of the table to compare rows within the same table. This is particularly useful when you need to find relationships between different rows of the same table, such as finding hierarchical relationships (e.g., reporting structures in an organization), or identifying pairs of items with a specific relationship (e.g., finding customers who purchased the same product). The key is identifying a common column that links rows in the self-joined table. The syntax is similar to a regular join, but you're joining the table to itself. Crucially, you need to give different aliases to the table to distinguish between the two copies of the table in the join condition. This is essential for avoiding ambiguity in the query.

Why Self Join SQL is important

Self joins are crucial for analyzing relationships within a single dataset. They enable complex queries that would be difficult or impossible to perform with other join types. They are essential for tasks like finding hierarchical structures, identifying cycles, or comparing data within the same table.

Example Usage


-- Sample table: Sales
CREATE TABLE Sales (
    SalesPerson VARCHAR(50),
    Date DATE,
    SalesAmount DECIMAL(10, 2)
);

INSERT INTO Sales (SalesPerson, Date, SalesAmount) VALUES
('Alice', '2024-01-01', 100),
('Alice', '2024-01-02', 150),
('Bob', '2024-01-01', 120),
('Bob', '2024-01-02', 180),
('Alice', '2024-01-03', 200);

-- Calculate running total of sales for each salesperson
SELECT
    SalesPerson,
    Date,
    SalesAmount,
    SUM(SalesAmount) OVER (PARTITION BY SalesPerson ORDER BY Date) AS RunningTotal
FROM
    Sales;

Common Mistakes

Want to learn about other SQL terms?