Redshift data types control storage size, precision, and performance for each table column.
Redshift stores every value in a column according to a specific data type such as INT, DECIMAL, or VARCHAR. The type fixes how many bytes are used, the valid range, and how values are compared and sorted.
Use SMALLINT (-32,768 to 32,767, 2 bytes) for flags or small counters, INT (-2.1 billion to 2.1 billion, 4 bytes) for IDs, and BIGINT (8 bytes, roughly ±9.2 quintillion) for large sequences such as order IDs across shards. Smaller types reduce disk and memory usage.
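A minimal sketch of right-sizing integer columns; the order_events table and its columns are hypothetical examples, not a prescribed schema:

  CREATE TABLE order_events (
      order_id     BIGINT,   -- may exceed the 2.1 billion INT ceiling across shards
      customer_id  INTEGER,  -- fits comfortably in 4 bytes
      retry_count  SMALLINT, -- small counter, 2 bytes is enough
      is_gift      SMALLINT  -- 0/1 flag
  );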
Choose DECIMAL(p,s) when you need fixed precision, e.g., DECIMAL(10,2) for prices up to 99,999,999.99. Avoid FLOAT for currency because binary rounding causes inaccuracies.
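A short illustration of the difference, assuming a hypothetical products table; the second query contrasts a floating-point sum with an exact decimal sum:

  CREATE TABLE products (
      product_id BIGINT,
      price      DECIMAL(10,2)  -- exact cents, up to 99,999,999.99
  );
  -- FLOAT accumulates binary rounding error; the float_sum below is not exactly 0.30.
  SELECT 0.1::FLOAT8 + 0.2::FLOAT8          AS float_sum,
         0.1::DECIMAL(10,2) + 0.2::DECIMAL(10,2) AS decimal_sum;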
VARCHAR(n) stores variable-length strings up to n bytes without trailing space padding, ideal for names and emails; multibyte UTF-8 characters count against that byte limit. CHAR(n) always pads to n and accepts only single-byte characters, useful for fixed-length codes like ISO country codes.
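A sketch of typical string sizing, using a hypothetical customers table; the 320-byte email width is an illustrative choice:

  CREATE TABLE customers (
      customer_id  BIGINT,
      email        VARCHAR(320),  -- variable length, no trailing padding
      country_code CHAR(2)        -- fixed two-character ISO 3166-1 alpha-2 code
  );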
DATE stores calendar dates, TIMESTAMP stores date+time without zone, and TIMESTAMPTZ stores with zone. Convert client times to UTC on load for consistency.
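A minimal sketch of the three temporal types and a UTC conversion on the way in; the page_views table and the America/New_York source zone are hypothetical:

  CREATE TABLE page_views (
      view_date    DATE,
      viewed_at    TIMESTAMP,    -- no zone; store values already converted to UTC
      viewed_at_tz TIMESTAMPTZ   -- zone-aware; normalised to UTC internally
  );
  -- Convert a client-local timestamp to UTC during load or transformation.
  SELECT CONVERT_TIMEZONE('America/New_York', 'UTC', viewed_at) AS viewed_at_utc
  FROM page_views;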
The SUPER data type lets you ingest JSON or Parquet and query with PartiQL. Ideal for flexible product attributes or event logs.
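A small example of loading JSON into a SUPER column and reading nested fields with PartiQL; the product_catalog table and its attributes are hypothetical:

  CREATE TABLE product_catalog (
      product_id BIGINT,
      attributes SUPER   -- holds arbitrary semi-structured attributes
  );
  INSERT INTO product_catalog
  VALUES (1, JSON_PARSE('{"color": "red", "sizes": ["S", "M", "L"]}'));
  -- PartiQL dot and bracket navigation into the SUPER value.
  SELECT product_id, attributes.color, attributes.sizes[0]
  FROM product_catalog;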
Smaller types speed up scans and reduce costs. Always profile the actual maximum value before choosing a type; see the profiling queries below.
Over-allocating VARCHAR(65535) wastes memory during query processing, because Redshift sizes in-memory buffers from the declared column width. Size VARCHAR columns to the real-world maximum plus a margin.
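One way to profile actual ranges and string lengths before fixing column widths; the staging_orders table and its columns are hypothetical:

  -- Largest numeric value: does it still fit in INT, or does it need BIGINT?
  SELECT MAX(order_id) AS max_order_id FROM staging_orders;
  -- Longest string in bytes: size the VARCHAR to this plus a margin.
  SELECT MAX(OCTET_LENGTH(email)) AS max_email_bytes FROM staging_orders;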
Compare columns against values of the same type so predicates stay SARGable; an implicit cast or a function applied to the column prevents Redshift from using zone maps to prune blocks.
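A contrast between a prunable and a non-prunable predicate, reusing the hypothetical order_events table from the earlier sketch:

  -- Same-type comparison: zone maps let Redshift skip blocks that cannot match.
  SELECT COUNT(*) FROM order_events WHERE order_id = 1000000042;
  -- Casting or wrapping the column defeats pruning and forces a scan of every row.
  SELECT COUNT(*) FROM order_events WHERE CAST(order_id AS VARCHAR) = '1000000042';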
You can change a column's type with ALTER TABLE ... ALTER COLUMN ... TYPE, but Redshift supports this in place only for widening VARCHAR columns. For other conversions, copy the data into a new table with the desired types and swap names for a near-zero-downtime migration.
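A sketch of both paths, reusing the hypothetical customers and order_events tables from earlier; note that CREATE TABLE AS does not automatically carry over distribution and sort keys, so declare them if you need them:

  -- Widening a VARCHAR in place (the case ALTER COLUMN ... TYPE supports directly).
  ALTER TABLE customers ALTER COLUMN email TYPE VARCHAR(512);
  -- Other conversions: rebuild into a new table, then swap names.
  CREATE TABLE order_events_new AS
  SELECT order_id, customer_id::BIGINT AS customer_id, retry_count, is_gift
  FROM order_events;
  ALTER TABLE order_events RENAME TO order_events_old;
  ALTER TABLE order_events_new RENAME TO order_events;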
The maximum VARCHAR length is 65,535 bytes. Remember that multibyte UTF-8 characters use more than one byte each, so the effective character capacity can be lower.
A native ARRAY column type is not supported; use SUPER with PartiQL, or normalise arrays into child tables for high-performance analytics.
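A minimal PartiQL unnesting sketch, reusing the hypothetical product_catalog table from the SUPER example above:

  -- Unnest a SUPER array into rows by navigating it in the FROM clause.
  SELECT p.product_id, size_value
  FROM product_catalog AS p, p.attributes.sizes AS size_value;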