How to Query JSON in Redshift

Galaxy Glossary

How can I query JSON columns in Amazon Redshift?

Retrieve, parse, and manipulate JSON data stored in Amazon Redshift tables using built-in functions or the SUPER data type.

Sign up for the latest in SQL knowledge from the Galaxy Team!

Welcome to the Galaxy, Guardian!

Oops! Something went wrong while submitting the form.

Description

What JSON capabilities does Redshift offer?<\/h2>Redshift supports two approaches: classic JSON string columns queried with JSON_EXTRACT_* functions, and the newer SUPER data type with PartiQL for semi-structured data. Both work in all RA3 node types.<\/p>

How do I load JSON data into Redshift?<\/h2>Use COPY with FORMAT AS JSON. Provide either 'auto' or a JSONPaths file to map fields to columns. When targeting SUPER, use FORMAT AS JSON 'auto'.<\/p>

Example COPY with JSONPaths<\/h3>`COPY orders FROM 's3://bucket/orders/' IAM_ROLE 'arn:aws:iam::123:role/redshift' FORMAT AS JSON 's3://bucket/jsonpaths/orders_paths.json';<\/code><\/p>`

`How can I query JSON string columns?<\/h2>`

Call JSON_EXTRACT_PATH_TEXT or JSON_EXTRACT_ARRAY_ELEMENT_TEXT. These return VARCHAR, so cast when you need numeric or date types.<\/p>

Using JSON_EXTRACT_PATH_TEXT<\/h3>`SELECT id, JSON_EXTRACT_PATH_TEXT(extra,'shipping','city') AS city FROM orders;<\/code><\/p>`

`How can I query SUPER columns with PartiQL?<\/h2>`

SUPER stores raw JSON. Query using the table.col.key<\/code> or col['key']<\/code> syntax. PartiQL automatically un-nests arrays and objects.<\/p>

`Example PartiQL query<\/h3>`

SELECT o.id, o.details.shipping.city FROM orders AS o WHERE o.details.total_amount > 100;<\/code><\/p>

`Best practices for JSON in Redshift<\/h2>`

Prefer typed columns for high-value fields. Store the rest in a SUPER or JSON column. Create late-materialized views to project JSON keys as columns for BI tools.<\/p>

How do I update values inside JSON?<\/h2>For JSON strings, rebuild the JSON in application code. For SUPER, use `JSON_SET<\/code> or ||<\/code> to merge objects: UPDATE orders SET details = details || {'status':'shipped'};<\/code><\/p>`

Why How to Query JSON in Redshift is important

How to Query JSON in Redshift Example Usage


-- Find top cities by total sales using JSON
SELECT JSON_EXTRACT_PATH_TEXT(o.extra,'shipping','city') AS city,
       SUM(o.total_amount)                               AS revenue
FROM   Orders AS o
GROUP  BY city
ORDER  BY revenue DESC
LIMIT 5;

How to Query JSON in Redshift Syntax


-- Load JSON file into a normal table
COPY Orders(id, customer_id, order_date, total_amount, extra)
FROM 's3://bucket/orders/'
IAM_ROLE 'arn:aws:iam::123:role/redshift'
FORMAT AS JSON 'auto';

-- Query JSON string column
SELECT id,
       JSON_EXTRACT_PATH_TEXT(extra,'shipping','city')        AS city,
       JSON_EXTRACT_PATH_TEXT(extra,'shipping','postal_code') AS zip
FROM   Orders;

-- Create table with SUPER
CREATE TABLE OrdersSuper (
    id           INT,
    customer_id  INT,
    order_date   DATE,
    details      SUPER
);

-- Query SUPER with PartiQL
SELECT o.id,
       o.details.payment.method        AS pay_method,
       o.details.items[0].product_id   AS first_item
FROM   OrdersSuper AS o;

Common Mistakes

Using VARCHAR to store large JSON blobs without compression. Fix: Compress columns with ENCODE ZSTD or switch to SUPER for efficient storage and querying.
Casting JSON_EXTRACT_PATH_TEXT directly to INT without validation. Fix: First check ISVALIDJSON or use TRY_CAST to avoid runtime errors when keys are missing.

Frequently Asked Questions (FAQs)

Can I index JSON data in Redshift?<\/h3>No traditional indexes exist in Redshift. Use sort keys on frequently filtered columns or projection views for JSON keys.<\/p>

Does Redshift support JSONB like PostgreSQL?<\/h3>Redshift offers SUPER + PartiQL instead of JSONB. SUPER delivers similar functionality with automatic schema discovery.<\/p>

When should I denormalize JSON into columns?<\/h3>Extract high-cardinality keys that appear in most queries—such as customer_id or status—to typed columns. Keep rarely queried attributes in JSON.<\/p>

Want to learn about other SQL terms?

dbt tests

dbt tests are declarative assertions in dbt that automatically validate data quality during model builds.

dbt utils

dbt utils is a community-maintained dbt package that adds time-saving utility macros for analytics engineers, such as generating surrogate keys, safely unioning models, and automating date spines.

dbt Tutorial Glossary: Build, Test & Deploy SQL Models

dbt is an open-source framework that lets data teams transform raw warehouse tables into tested, documented, production-ready datasets using version-controlled SQL.

Trusted by top engineers on high-velocity teams

Assort Health

Curri

Rubie

Welcome to the Galaxy, Guardian!

Oops! Something went wrong while submitting the form.