How to CREATE INDEX in Redshift

Galaxy Glossary

How do I create an index in Amazon Redshift?

Redshift ignores CREATE INDEX; use SORT KEY, DIST KEY, and materialized views instead.

Sign up for the latest in SQL knowledge from the Galaxy Team!

Description

Does Amazon Redshift support the CREATE INDEX command?

Redshift is a column-oriented, massively parallel analytic database that does not implement secondary B-tree indexes. Any attempt to run CREATE INDEX returns an error. Instead, Redshift relies on SORT KEYs, DIST KEYs, zone maps, and result caching to accelerate queries.

What is the standard CREATE INDEX syntax?

PostgreSQL uses:

CREATE INDEX index_name ON table_name (column1, column2);

Redshift’s SQL parser lacks this statement, so the same syntax is unsupported.

How can I speed up lookups without indexes?

Use SORT KEY for range filters and ORDER BY

A SORT KEY physically orders data blocks, enabling zone maps to skip entire blocks during scans. Choose columns frequently filtered by range or used in ORDER BY, such as order_date.

CREATE TABLE Orders (
id BIGINT IDENTITY(1,1),
customer_id BIGINT,
order_date DATE,
total_amount NUMERIC(12,2)
)
SORTKEY (order_date);

Add DIST KEY for large joins

Set DISTKEY on a high-cardinality column used in joins—customer_id for Orders and Customers tables keeps related rows on the same node and avoids network shuffles.

CREATE TABLE OrderItems (
id BIGINT IDENTITY(1,1),
order_id BIGINT,
product_id BIGINT,
quantity INT
)
DISTKEY(order_id)
SORTKEY(order_id);

Example: speeding up recent-orders lookup

The query below filters the last 7 days. With order_date as SORT KEY, only the most-recent storage blocks are scanned, delivering sub-second results.

SELECT o.id,
c.name,
o.total_amount
FROM Orders o
JOIN Customers c
ON c.id = o.customer_id
WHERE o.order_date >= CURRENT_DATE - INTERVAL '7 day';

Best practices when indexes are unavailable

  • Store time-series data in append-only tables with order_date as SORT KEY.
  • Use EVEN distribution when no clear join key exists.
  • Refresh materialized views on heavy aggregations instead of creating indexes.

Common mistakes and fixes

Running CREATE INDEX and waiting for it to appear

The command fails because Redshift does not implement it. Rewrite the table with proper SORT and DIST keys.

Choosing a low-cardinality column as SORT KEY

A column with few distinct values, like status, clusters similar rows together, forcing full scans. Pick high-cardinality or time-based columns.

Why How to CREATE INDEX in Redshift is important

How to CREATE INDEX in Redshift Example Usage


-- Find recent orders with optimized SORT KEY
SELECT o.id,
       c.name,
       o.total_amount
FROM   Orders o
JOIN   Customers c ON c.id = o.customer_id
WHERE  o.order_date >= CURRENT_DATE - INTERVAL '30 day'
ORDER  BY o.order_date DESC;

How to CREATE INDEX in Redshift Syntax


-- Unsupported in Redshift (works in PostgreSQL)
CREATE INDEX idx_orders_customer_id ON Orders (customer_id);

-- Redshift alternative: create table with keys
CREATE TABLE Orders (
    id            BIGINT      IDENTITY(1,1),
    customer_id   BIGINT,
    order_date    DATE,
    total_amount  NUMERIC(12,2)
)
DISTKEY(customer_id)
SORTKEY(order_date);

Common Mistakes

Frequently Asked Questions (FAQs)

Why does Redshift ignore primary and foreign keys?

Key constraints exist only for query planners; Redshift does not enforce them. They help the optimizer skip joins but offer no physical index.

Is there any way to add an index-like structure later?

You can’t add secondary indexes, but you can ALTER TABLE to add a compound SORT KEY or create a materialized view to pre-aggregate data.

When should I use a compound versus interleaved SORT KEY?

Use compound when queries filter on the leading column. Use interleaved when multiple columns appear in different filter combinations.

Want to learn about other SQL terms?

Trusted by top engineers on high-velocity teams
Aryeo Logo
Assort Health
Curri
Rubie
BauHealth Logo
Truvideo Logo