Techniques and best practices for speeding up SQL queries whose bottleneck is an expensive ORDER BY clause.
Optimizing Slow ORDER BY Queries
Learn why ORDER BY can become the dominant cost in otherwise simple SQL statements, and discover concrete indexing, pagination, and query-rewriting strategies to bring sort times from minutes down to milliseconds.
ORDER BY
Is Often SlowSorting a result set feels trivial in everyday programming languages, yet in a relational database it can be the single most expensive step of an execution plan. Unlike procedural code, SQL sorts must work over potentially millions of rows, respect collation rules, and often need to spill to disk when memory limits are reached. Worse, poorly chosen indexes, implicit data-type conversions, and unnecessary broad SELECT lists can force the database engine into a full table scan followed by an external sort.
ORDER BY
LIMIT
and cursor pagination are in play.ETL pipelines, BI dashboards, and customer-facing APIs regularly require ordered data—top N leaderboards, latest events, chronological reports, etc. A slow ORDER BY delays downstream jobs, increases infrastructure cost, and degrades user experience. Since ORDER BY performance closely ties to index and storage design, optimizing it often yields system-wide speedups.
ORDER BY
If an index’s leading column(s) exactly match the ORDER BY keys and the sort direction, the database can read rows in pre-sorted order, eliminating the sort step entirely. For example:
-- Slow query
SELECT id, email, created_at
FROM users
WHERE status = 'active'
ORDER BY created_at DESC;
-- Fast query after covering index
CREATE INDEX idx_users_status_created_at_desc
ON users(status, created_at DESC);
Because status
filters the rows and created_at DESC
orders them, the composite index satisfies both, producing an Index Only Scan in PostgreSQL or a range scan in MySQL.
Mixing ASC and DESC on different columns can block index use. If you must mix directions, explore database-specific features such as CREATE INDEX … (col1 ASC, col2 DESC)
(PostgreSQL 11+, SQL Server) or maintain two indexes—one for each dominant access pattern.
Always filter as many rows as possible before ORDER BY. In SQL that means:
-- Anti-pattern: wrapping filter around a derived table
SELECT *
FROM (
SELECT * FROM events ORDER BY occurred_at DESC
) AS t
WHERE user_id = 42;
-- Better: filter first
SELECT *
FROM events
WHERE user_id = 42
ORDER BY occurred_at DESC;
LIMIT
+ Proper Indexing to Avoid Full SortsWhen you only need the first N rows, adding LIMIT
is not enough. Without an index the database still sorts the entire set. Combine LIMIT with a matching index or TIES
/FETCH FIRST
semantics so the planner stops once enough rows are produced.
Traditional OFFSET pagination forces the engine to skip O(N²) rows: each page re-reads the previous ones. Seek pagination uses a WHERE col < :last_seen
predicate:
-- Seek method
SELECT id, body, created_at
FROM comments
WHERE post_id = :post_id
AND created_at < :cursor_created_at
ORDER BY created_at DESC
LIMIT 50;
This keeps the index scan linear and avoids large OFFSET counts.
If the index does not cover all selected columns, the engine performs a back lookup for each row, adding latency. In PostgreSQL add the remaining columns as INCLUDE; in MySQL/SQL Server you specify them as included columns. Only relevant rows and columns are read in order.
For huge tables, partitioning by time or tenant allows each partition’s local index to satisfy ORDER BY quickly. For analytics engines (BigQuery, Redshift, Snowflake) clustering/sort keys serve a similar role.
Raising PostgreSQL’s work_mem
, MySQL’s sort_buffer_size
, or SQL Server’s tempdb memory grants can help, but curing root-cause indexing issues is nearly always more effective than throwing memory at the problem.
Expressions like ORDER BY DATE(created_at)
defeat index ordering. Create a computed/persisted column or store a truncated timestamp field instead, then index it.
Tools such as EXPLAIN ANALYZE
(PostgreSQL), EXPLAIN FORMAT=JSON
(MySQL 8), or SQL Server’s Actual Execution Plan pinpoint whether the cost lies in a sort operator, an external sort, or a spill to tempdb. Validate each optimization against the plan—not assumptions.
Given a SaaS application with a payments
table:
CREATE TABLE payments (
id BIGSERIAL PRIMARY KEY,
account_id BIGINT NOT NULL,
amount_cents INT NOT NULL,
state TEXT NOT NULL,
paid_at TIMESTAMP NOT NULL
);
The product team needs an API endpoint that returns the most recent 100 successful payments of an account. The naïve query:
SELECT id, amount_cents, paid_at
FROM payments
WHERE account_id = 42 AND state = 'paid'
ORDER BY paid_at DESC
LIMIT 100;
Performance tests show 2.2 seconds at p95. Using the optimization checklist:
CREATE INDEX idx_payments_account_state_paid_at_desc
ON payments(account_id, state, paid_at DESC);
EXPLAIN ANALYZE
. The plan is now an Index Only Scan with no Sort node; execution drops to 5 ms.Inside Galaxy’s desktop SQL editor the workflow is streamlined:
CREATE INDEX
statement.Because Galaxy shares execution plans and history across the workspace, other engineers instantly understand why the index was added and avoid re-introducing the antipattern.
LIMIT
AloneAssuming LIMIT speeds things up without an index; the engine still sorts the full set. Fix by adding a matching index.
OFFSET
PaginationHigh OFFSET values grow linearly in cost. Replace with seek pagination.
Text sorts under a non-default collation can’t use a B-tree index built with the default collation. Build the index with the same collation.
ORDER BY is ubiquitous in analytics—top-N reports, chronological feeds, leaderboards. When it turns into a disk-based external sort, query latency explodes, dashboards time out, and API SLAs break. Optimizing ORDER BY therefore yields large wins across the data stack, cuts cloud spend, and keeps end-users happy.
Because the database must still sort all qualifying rows before applying LIMIT unless an index already provides the rows in sorted order. Add a matching index or rewrite the query to avoid the broad sort.
Use seek pagination: keep the last seen value on the ordered column (e.g., id or timestamp) and filter with WHERE col < :cursor
for descending order. It scales linearly and fully leverages the index.
Look for nodes labelled Sort, External Sort, or filesort with a high cost or large “Memory”/“Spilled” stats. If the plan instead shows an Index Scan and no Sort, the ORDER BY is already optimized.
Galaxy’s AI Copilot analyzes the EXPLAIN plan, flags heavy sort nodes, and suggests covering indexes or query rewrites. You can then commit the fix and endorse the optimized query so the whole team benefits.