The essential SQL topics and sample questions most frequently encountered in data-analyst interviews, along with explanations, examples, and best-practice solutions.
Companies rely on SQL interviews to assess whether a data-analyst candidate can extract, clean, and analyze data efficiently. Most interviews focus on a recurring set of topics—joins, aggregations, window functions, sub-queries, CTEs, and performance tuning. Mastering these areas not only improves interview performance but also prepares you for real-world analytics work.
SQL remains the lingua franca of data. From product analytics to marketing attribution, analysts use SQL every day to transform raw tables into actionable insights. Interview questions replicate those tasks under time pressure. By studying the common patterns, you can:
Typical question: “Return all customers who purchased in January but not in February.”
Interviewers test your understanding of LEFT JOIN
, ANTI JOIN
, and set operations.
WITH jan AS (
SELECT DISTINCT customer_id FROM sales WHERE month = '2024-01'),
feb AS (
SELECT DISTINCT customer_id FROM sales WHERE month = '2024-02')
SELECT j.customer_id
FROM jan j
LEFT JOIN feb f USING (customer_id)
WHERE f.customer_id IS NULL;
Typical question: “Find the top 3 products by total revenue per region.”
Candidates must know GROUP BY
, ORDER BY
, and sometimes correlated sub-queries or window functions for ranking.
Typical question: “For each user, return the 3 most recent sessions.”
This assesses familiarity with ROW_NUMBER()
, PARTITION BY
, and ORDER BY
inside window clauses.
Typical question: “Calculate month-over-month growth in active users.”
Often solved with multi-step CTEs for readability—highly valued in code reviews and modern SQL editors like Galaxy.
Typical question: “Count users retained 7 days after signup.”
Requires proficiency with date arithmetic and join conditions involving offsets.
Typical question: “Normalize country codes and compute revenue per normalized country.”
Typical question: “Why is this query slow and how would you fix it?”
Interviewers look for discussion on indexing, selective predicates, and avoiding SELECT *
.
AVG()
silently drop NULLs, leading to biased metrics.COALESCE
or filter NULLs deliberately.ORDER BY
can yield random results.session_id
or created_at
.The following query answers a favorite interview prompt: “Compute D7 retention—what percentage of users logged in seven days after signup?”
WITH signups AS (
SELECT user_id, MIN(event_time::date) AS signup_date
FROM events
WHERE event_type = 'signup'
GROUP BY user_id ),
login_day7 AS (
SELECT e.user_id, MIN(e.event_time::date) AS login_date
FROM events e
JOIN signups s ON e.user_id = s.user_id
WHERE e.event_type = 'login'
AND e.event_time::date = s.signup_date + INTERVAL '7 days'
GROUP BY e.user_id )
SELECT COUNT(DISTINCT l.user_id)::decimal / COUNT(DISTINCT s.user_id) AS d7_retention_rate
FROM signups s
LEFT JOIN login_day7 l USING (user_id);
Why it works: The signup cohort is isolated first, then joined to logins exactly seven days later, ensuring one-to-one comparison and avoiding double counts.
When practicing these queries in Galaxy:
{{target_date}}
) to practice against rolling dates without manual edits.SQL interview questions for data analysts revolve around a predictable toolkit: joins, aggregations, window functions, CTEs, and performance tuning. Master these concepts, practice under timed settings, and leverage modern tools like Galaxy to streamline your workflow. Doing so will not only help you ace interviews but also make you a more effective analyst on day one.
Interviews often condense real-world data challenges into timed exercises. Knowing the standard SQL patterns lets analysts demonstrate competence quickly, reduces anxiety, and directly translates to on-the-job efficiency. Firms assume that someone who writes clear, optimized SQL will maintain dashboards, debug metrics, and support decision-making with minimal supervision.
Start with clarifying questions, outline your approach verbally, build the query using readable CTEs, and discuss performance trade-offs after achieving a correct result.
Joins, aggregations, window functions, sub-queries/CTEs, date manipulation, and basic optimization techniques appear in nearly every analyst interview.
Galaxy’s AI Copilot auto-completes complex syntax, suggests optimizations, and lets you store reusable snippets in Collections, accelerating iteration and learning.
LeetCode, StrataScratch, Mode Analytics, and company's own open-source question banks provide a wide range of real-world scenarios to hone your skills.