Install the AWS CLI, create a Redshift cluster, connect with psql, and load data, all from any Linux distribution.
Running Redshift from Linux lets you automate cluster creation, load data, and query warehouses directly in scripts and CI pipelines. The steps below work on Ubuntu, Debian, RHEL, CentOS, and Amazon Linux.
Install AWS CLI v2 for cluster management, the PostgreSQL client (psql) for SQL access, and optionally the Amazon Redshift ODBC/JDBC driver for GUI or BI tools. You also need an AWS account and an IAM user with the AmazonRedshiftFullAccess managed policy attached.
Download the installer with curl -O https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip (on ARM hosts, use the awscli-exe-linux-aarch64.zip archive instead), unzip it, then run sudo ./aws/install. On Debian-based systems, install psql via sudo apt install postgresql-client; on RPM-based systems, use sudo yum install postgresql.
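The install steps can be scripted roughly as follows. This is a minimal sketch for x86_64 hosts; detect_pkg_manager, psql_install_cmd, and install_aws_cli are hypothetical helper names introduced here, and the package names are the ones given above.

```shell
# Sketch: install AWS CLI v2 and the psql client on an x86_64 Linux host.
# All function names below are illustrative helpers, not part of any tool.

detect_pkg_manager() {
    # Print the first supported package manager found on PATH.
    for pm in apt-get yum; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    return 1
}

psql_install_cmd() {
    # $1: package manager name. Prints the command that installs psql.
    case "$1" in
        apt-get) echo "sudo apt-get install -y postgresql-client" ;;
        yum)     echo "sudo yum install -y postgresql" ;;
        *)       echo "unsupported package manager: $1" >&2; return 1 ;;
    esac
}

install_aws_cli() {
    # Download, unpack, and install AWS CLI v2, then print its version.
    curl -O "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" &&
    unzip -q awscli-exe-linux-x86_64.zip &&
    sudo ./aws/install &&
    aws --version
}
```

A typical run would call install_aws_cli and then execute the command printed by psql_install_cmd "$(detect_pkg_manager)".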
Run aws configure and enter your access key, secret key, default region, and output format. The keys are stored in ~/.aws/credentials; the region and output format are stored in ~/.aws/config.
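In scripts and CI jobs the interactive prompt is inconvenient. The sketch below uses aws configure set to write the same values non-interactively. It assumes the keys are already exported as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY; the profile name passed in and the us-east-1 fallback are example values.

```shell
# Sketch: non-interactive equivalent of `aws configure` for a named profile.

configure_profile() {
    # $1: profile name. Keys are read from the environment, never hard-coded.
    aws configure set aws_access_key_id "${AWS_ACCESS_KEY_ID:?set AWS_ACCESS_KEY_ID}" --profile "$1"
    aws configure set aws_secret_access_key "${AWS_SECRET_ACCESS_KEY:?set AWS_SECRET_ACCESS_KEY}" --profile "$1"
    aws configure set region "${AWS_DEFAULT_REGION:-us-east-1}" --profile "$1"
    aws configure set output json --profile "$1"
}

verify_profile() {
    # Prints the account and identity the stored credentials resolve to.
    aws sts get-caller-identity --profile "$1"
}
```

For example, configure_profile redshift-admin followed by verify_profile redshift-admin confirms that the credentials work before any cluster command runs.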
Execute aws redshift create-cluster with a cluster identifier, node type, node count, master username, password, security-group IDs, and IAM role. The cluster endpoint appears once the cluster status becomes available.
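A create-and-wait sequence might look like the sketch below. The node type (ra3.xlplus), node count, and database name dev are example values, and the password, security group ID, and role ARN are assumed to be supplied through the environment variables named in the code.

```shell
# Sketch: create a two-node cluster and print its endpoint once available.
CLUSTER_ID="ecommerce-cluster"

create_cluster() {
    # Secrets and account-specific IDs come from the environment so that
    # nothing sensitive is stored in the script itself.
    aws redshift create-cluster \
        --cluster-identifier "$CLUSTER_ID" \
        --cluster-type multi-node \
        --node-type ra3.xlplus \
        --number-of-nodes 2 \
        --master-username admin \
        --master-user-password "${REDSHIFT_PASSWORD:?set REDSHIFT_PASSWORD}" \
        --vpc-security-group-ids "${SECURITY_GROUP_ID:?set SECURITY_GROUP_ID}" \
        --iam-roles "${COPY_ROLE_ARN:?set COPY_ROLE_ARN}" \
        --db-name dev
}

wait_for_endpoint() {
    # Block until the cluster is available, then print its endpoint host name.
    aws redshift wait cluster-available --cluster-identifier "$CLUSTER_ID" &&
    aws redshift describe-clusters \
        --cluster-identifier "$CLUSTER_ID" \
        --query 'Clusters[0].Endpoint.Address' \
        --output text
}
```

Calling create_cluster and then ENDPOINT=$(wait_for_endpoint) leaves the host name in a variable for the psql step.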
Modify the attached security group to allow inbound TCP 5439 from your IP. Alternatively, keep the port closed and connect through an SSH tunnel.
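Opening the port for a single address can be sketched as below. The helper names are hypothetical, and looking up the public IP through checkip.amazonaws.com is one option among several; any method that yields your IPv4 address works.

```shell
# Sketch: allow inbound Redshift traffic (TCP 5439) from one IPv4 address.

my_public_ip() {
    # checkip.amazonaws.com returns the caller's public IPv4 address as text.
    curl -s https://checkip.amazonaws.com
}

allow_redshift_from() {
    # $1: security group ID, $2: IPv4 address (a /32 CIDR is a single host).
    aws ec2 authorize-security-group-ingress \
        --group-id "$1" \
        --protocol tcp \
        --port 5439 \
        --cidr "$2/32"
}
```

For example: allow_redshift_from "$SECURITY_GROUP_ID" "$(my_public_ip)".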
Run psql -h ENDPOINT -p 5439 -U admin -d dev. If SSL errors appear, require SSL by setting PGSSLMODE=require in the environment or by passing sslmode=require in a connection string.
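One way to carry the SSL setting along is a libpq connection string. The sketch below builds one from the endpoint; redshift_conninfo and check_connection are hypothetical helpers, and the password is assumed to come from PGPASSWORD or ~/.pgpass.

```shell
# Sketch: connect with SSL required, using a libpq connection string.

redshift_conninfo() {
    # $1: cluster endpoint host name. Database "dev" and user "admin"
    # match the psql command shown in the text.
    printf 'host=%s port=5439 dbname=dev user=admin sslmode=require\n' "$1"
}

check_connection() {
    # Runs a trivial query to confirm that login and SSL both work.
    psql "$(redshift_conninfo "$1")" -c 'SELECT current_database(), version();'
}
```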
After connecting, run CREATE TABLE statements for Customers, Orders, Products, and OrderItems. Use DISTKEY on customer_id so that rows joined on that column are stored on the same slice, and SORTKEY on order_date to speed up date-range filters.
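As an illustration, two of the four tables could be defined as below. The column lists are assumptions made for this sketch (the text names only customer_id and order_date), and schema_sql is a hypothetical helper that prints the DDL so it can be piped into psql.

```shell
# Sketch: print DDL for two example tables. Columns are illustrative only.

schema_sql() {
    cat <<'SQL'
CREATE TABLE customers (
    customer_id INTEGER NOT NULL,
    name        VARCHAR(100),
    email       VARCHAR(255)
)
DISTKEY (customer_id);

CREATE TABLE orders (
    order_id     INTEGER NOT NULL,
    customer_id  INTEGER NOT NULL,
    order_date   DATE NOT NULL,
    total_amount DECIMAL(12,2)
)
DISTKEY (customer_id)
SORTKEY (order_date);
SQL
}
```

Running schema_sql | psql -v ON_ERROR_STOP=1 "CONNECTION_STRING" applies the statements and stops at the first error.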
Stage CSV or Parquet files in S3, then execute COPY with an IAM role ARN. Redshift ingests the files in parallel; for CSV files, IGNOREHEADER 1 skips the header row.
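The COPY statements for the two formats differ slightly, since IGNOREHEADER applies to text formats such as CSV and not to Parquet. A minimal sketch follows, with hypothetical helper names; table names, S3 URIs, and role ARNs are passed in and should come from trusted input only, because they are interpolated without escaping.

```shell
# Sketch: build COPY statements for CSV and Parquet sources.

copy_csv_sql() {
    # $1: target table, $2: S3 URI or prefix, $3: IAM role ARN
    printf "COPY %s FROM '%s' IAM_ROLE '%s' FORMAT AS CSV IGNOREHEADER 1;\n" "$1" "$2" "$3"
}

copy_parquet_sql() {
    # Parquet files carry their own schema, so no header option is used.
    printf "COPY %s FROM '%s' IAM_ROLE '%s' FORMAT AS PARQUET;\n" "$1" "$2" "$3"
}
```

After uploading a file with aws s3 cp, the printed statement can be piped to psql in the same way as any other SQL.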
Store commands in shell scripts or CI jobs. Delete non-production clusters via aws redshift delete-cluster --cluster-identifier NAME --skip-final-cluster-snapshot to cut costs.
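A cleanup job can guard against deleting the wrong cluster. The sketch below assumes a naming convention in which production identifiers contain "prod"; that convention and the helper name delete_dev_cluster are assumptions, not something the AWS CLI enforces.

```shell
# Sketch: delete a non-production cluster without a final snapshot.

delete_dev_cluster() {
    # $1: cluster identifier. Refuse anything that looks like production.
    case "$1" in
        *prod*)
            echo "refusing to delete $1: identifier contains 'prod'" >&2
            return 1
            ;;
    esac
    aws redshift delete-cluster \
        --cluster-identifier "$1" \
        --skip-final-cluster-snapshot &&
    aws redshift wait cluster-deleted --cluster-identifier "$1"
}
```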
Wrong AWS region: a psql timeout or a cluster that the CLI cannot find can mean the cluster lives in a different region from the one your CLI is using. Re-run commands with --region set correctly.
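When it is unclear which region holds a cluster, a short loop over candidate regions can find it. This sketch relies on describe-clusters exiting with a non-zero status when the identifier does not exist in a region; find_cluster_region is a hypothetical helper.

```shell
# Sketch: print the first region in which a cluster identifier exists.

find_cluster_region() {
    # $1: cluster identifier; remaining arguments: regions to try in order.
    cluster="$1"
    shift
    for region in "$@"; do
        if aws redshift describe-clusters \
            --cluster-identifier "$cluster" \
            --region "$region" >/dev/null 2>&1; then
            echo "$region"
            return 0
        fi
    done
    return 1
}
```

For example, find_cluster_region ecommerce-cluster us-east-1 us-west-2 eu-west-1 prints the matching region, or returns a non-zero status if none matches.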
Missing IAM role in COPY: the load fails with an authorization error when the role isn't associated with the cluster. Fix with aws redshift modify-cluster-iam-roles --add-iam-roles.
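In AWS CLI v2, the subcommand that associates a role with a provisioned cluster is modify-cluster-iam-roles. The sketch below attaches a role and then lists the cluster's roles with their apply status; the helper names are hypothetical.

```shell
# Sketch: attach an IAM role to a cluster and check that it was applied.

attach_copy_role() {
    # $1: cluster identifier, $2: IAM role ARN
    aws redshift modify-cluster-iam-roles \
        --cluster-identifier "$1" \
        --add-iam-roles "$2"
}

list_cluster_roles() {
    # Prints one line per role: the ARN and its apply status.
    aws redshift describe-clusters \
        --cluster-identifier "$1" \
        --query 'Clusters[0].IamRoles[].[IamRoleArn,ApplyStatus]' \
        --output text
}
```

COPY should be retried only after the role's status reads in-sync.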
Yes, Redshift Serverless can be managed the same way. Create a namespace with aws redshift-serverless create-namespace, then a workgroup in it with create-workgroup. Connect with the endpoint returned by get-workgroup.
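A serverless setup might be scripted as in the sketch below. The namespace has to exist before a workgroup can reference it, so it is created first. The names ecommerce-ns and ecommerce-wg, the admin user, and the database dev are example values, and the password is assumed to come from the environment.

```shell
# Sketch: create a Redshift Serverless namespace and workgroup.
NAMESPACE="ecommerce-ns"
WORKGROUP="ecommerce-wg"

create_serverless() {
    # The workgroup is created only if the namespace call succeeds.
    aws redshift-serverless create-namespace \
        --namespace-name "$NAMESPACE" \
        --admin-username admin \
        --admin-user-password "${REDSHIFT_PASSWORD:?set REDSHIFT_PASSWORD}" \
        --db-name dev &&
    aws redshift-serverless create-workgroup \
        --workgroup-name "$WORKGROUP" \
        --namespace-name "$NAMESPACE"
}

serverless_endpoint() {
    # Prints the host name to pass to psql -h.
    aws redshift-serverless get-workgroup \
        --workgroup-name "$WORKGROUP" \
        --query 'workgroup.endpoint.address' \
        --output text
}
```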
Absolutely. To reach a cluster that is not publicly accessible, create a bastion host in the same VPC and establish an SSH tunnel: ssh -L 5439:ENDPOINT:5439 ec2-user@BASTION. Then point psql at localhost on port 5439.
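The tunnel and the connection through it can be sketched as two small helpers (hypothetical names). The sketch assumes key-based SSH login to the bastion as ec2-user and reuses the dev database and admin user from the earlier psql command.

```shell
# Sketch: forward local port 5439 to the cluster through a bastion host.

open_tunnel() {
    # $1: cluster endpoint, $2: bastion host. -f sends ssh to the background
    # after authentication; -N skips running a remote command.
    ssh -f -N -L "5439:$1:5439" "ec2-user@$2"
}

connect_via_tunnel() {
    # The local end of the tunnel stands in for the cluster endpoint.
    psql "host=localhost port=5439 dbname=dev user=admin sslmode=require"
}
```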
A multi-node RA3 cluster typically takes 10-15 minutes to reach status available. Smaller DC2 clusters finish in 5-10 minutes.
Yes. After connecting with psql, create an external schema linked to an AWS Glue Data Catalog database and query data in S3 alongside native tables (this feature is Redshift Spectrum).
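The external schema statement could be generated as in the sketch below. The schema name, Glue database name, and role ARN are passed in as arguments; external_schema_sql is a hypothetical helper, and the role is assumed to have access to both Glue and the S3 data.

```shell
# Sketch: print a CREATE EXTERNAL SCHEMA statement for a Glue database.

external_schema_sql() {
    # $1: schema name in Redshift, $2: Glue database name, $3: IAM role ARN
    cat <<SQL
CREATE EXTERNAL SCHEMA IF NOT EXISTS $1
FROM DATA CATALOG
DATABASE '$2'
IAM_ROLE '$3'
CREATE EXTERNAL DATABASE IF NOT EXISTS;
SQL
}
```

Piping the output to psql creates the schema, after which external tables can be referenced as schema_name.table_name in ordinary queries and joins.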
Run aws redshift delete-cluster --cluster-identifier ecommerce-cluster --skip-final-cluster-snapshot. This deletes the compute nodes and, because of the skip flag, takes no final snapshot. Manual snapshots you created earlier are kept until you delete them.