SELECT retrieves, filters, aggregates, and transforms rows stored in ClickHouse tables.
SELECT reads data from one or more tables, applies filters, joins, aggregations, and functions, and returns the resulting rows to the client. Because ClickHouse is column-oriented, a query reads only the columns it references, which keeps analytical scans fast.
Use SELECT * to return every column, or list columns explicitly to reduce network and CPU usage. Example: SELECT id, name, price FROM Products.
Add a WHERE clause to filter rows. ClickHouse can skip entire data parts and granules when the condition uses the table's partition key, primary key, or a data-skipping index. Example: SELECT * FROM Orders WHERE order_date >= '2024-01-01'.
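As a rough sketch of how a filter lines up with the table's keys, assume Orders is a MergeTree table partitioned and sorted by order_date. The schema below is hypothetical (the column types and the id column are assumptions, since the table definition is not shown here); EXPLAIN indexes = 1 reports how many parts and granules each index selected.

```sql
-- Hypothetical schema: column types and the id column are assumptions.
CREATE TABLE Orders
(
    id           UInt64,
    customer_id  UInt64,
    order_date   Date,
    total_amount Decimal(12, 2)
)
ENGINE = MergeTree
PARTITION BY toYYYYMM(order_date)
ORDER BY (order_date, customer_id);

-- The filter is on the partition key and the leading sorting-key column,
-- so partitions and granules outside the range can be skipped.
SELECT id, customer_id, total_amount
FROM Orders
WHERE order_date >= '2024-01-01';

-- Inspect which parts and granules each index kept for the same query.
EXPLAIN indexes = 1
SELECT id, customer_id, total_amount
FROM Orders
WHERE order_date >= '2024-01-01';
```

A filter on a column that is in neither key, for example total_amount alone, would typically have to scan every granule unless a data-skipping index covers it.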
GROUP BY aggregates huge datasets quickly. Every selected column that is not wrapped in an aggregate function must appear in GROUP BY; add WITH TOTALS to get an extra row with grand totals. Example: SELECT customer_id, sum(total_amount) FROM Orders GROUP BY customer_id.
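A short sketch of both forms, using the Orders columns from the examples above. The aliases order_count and revenue are illustrative only.

```sql
-- One row per customer, plus a separate totals row across all customers.
SELECT
    customer_id,
    count() AS order_count,
    sum(total_amount) AS revenue
FROM Orders
GROUP BY customer_id
    WITH TOTALS
ORDER BY revenue DESC;

-- Aggregate functions without GROUP BY collapse the whole table into one row.
SELECT sum(total_amount) AS revenue
FROM Orders;
```

How the totals row is presented depends on the output format, so client code should not assume it arrives as an ordinary result row.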
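SELECT can join tables. ClickHouse supports INNER, LEFT, RIGHT, FULL, and CROSS joins; ANY is a strictness modifier (as in LEFT ANY JOIN) that keeps at most one matching row per key rather than a separate join type. Keep join keys small and low-cardinality for performance.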
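A minimal sketch of joining Orders with Customers. It assumes Customers has id and name columns and Orders has an id column; none of these are confirmed by the text above.

```sql
-- Assumption: Customers(id, name) and Orders(id, ...) exist with these columns.
SELECT
    o.id AS order_id,
    o.order_date,
    c.name AS customer_name,
    o.total_amount
FROM Orders AS o
INNER JOIN Customers AS c ON c.id = o.customer_id
WHERE o.order_date >= '2024-01-01';
```

Customers is placed on the right-hand side because the hash join algorithm builds its in-memory table from the right-hand input, so the smaller table usually belongs there.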
Use ORDER BY and LIMIT to sort and cut result size. When LIMIT accompanies ORDER BY, ClickHouse only has to keep the top rows while sorting, which reduces memory usage. Example: SELECT * FROM Products ORDER BY price DESC LIMIT 10.
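ARRAY JOIN expands an array column into one row per array element; GLOBAL JOIN evaluates the right-hand table once on the initiating server and sends it to every node in the cluster. Both can multiply the rows processed or the data moved between nodes, so use them only when necessary.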
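The two clauses might look like this in practice. Both queries rest on assumptions: a tags Array(String) column on Products, and a Distributed table named Orders_dist sitting over Orders. Neither appears in the text above.

```sql
-- Assumption: Products has a tags Array(String) column.
-- Each product yields one output row per element of tags.
SELECT id, name, tag
FROM Products
ARRAY JOIN tags AS tag;

-- Assumption: Orders_dist is a Distributed table over Orders.
-- Customers is read once on the initiating server and shipped to every shard.
SELECT
    o.customer_id,
    c.name,
    sum(o.total_amount) AS revenue
FROM Orders_dist AS o
GLOBAL INNER JOIN Customers AS c ON c.id = o.customer_id
GROUP BY o.customer_id, c.name;
```

Rows whose array is empty are dropped by ARRAY JOIN; LEFT ARRAY JOIN keeps them with a default value instead.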
• Always project only needed columns.
• Filter early with WHERE.
• Prefer pre-aggregated materialized views for heavy dashboards.
• Check system.query_log for slow queries and add indexes.
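Two of these practices can be sketched in SQL. The first query lists the slowest finished queries of the last day from system.query_log. The second defines a pre-aggregated materialized view whose name, engine, and sorting key are hypothetical choices for a daily revenue dashboard.

```sql
-- Ten slowest finished queries since yesterday.
SELECT
    query_start_time,
    query_duration_ms,
    read_rows,
    formatReadableSize(memory_usage) AS memory,
    query
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_date >= today() - 1
ORDER BY query_duration_ms DESC
LIMIT 10;

-- Hypothetical view: keeps per-day, per-customer revenue up to date
-- for rows inserted into Orders after the view is created.
CREATE MATERIALIZED VIEW daily_revenue_mv
ENGINE = SummingMergeTree
ORDER BY (order_date, customer_id)
AS
SELECT
    order_date,
    customer_id,
    sum(total_amount) AS revenue
FROM Orders
GROUP BY order_date, customer_id;

-- Merges happen in the background, so re-aggregate when reading.
SELECT order_date, sum(revenue) AS revenue
FROM daily_revenue_mv
GROUP BY order_date
ORDER BY order_date;
```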
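Query results can be stored in a table. Use CREATE TABLE new_tbl AS SELECT ... (with an ENGINE clause, unless a default table engine is configured) to create and fill a table in one step, or INSERT INTO new_tbl SELECT ... to write the results into an existing table.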
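A sketch of both statements follows. The table names orders_2024 and orders_archive, the engine, and the sorting key are illustrative assumptions.

```sql
-- Create and fill a new table in one statement.
-- Column names and types are taken from the SELECT result.
CREATE TABLE orders_2024
ENGINE = MergeTree
ORDER BY (order_date, customer_id)
AS
SELECT *
FROM Orders
WHERE order_date >= '2024-01-01' AND order_date < '2025-01-01';

-- Append query results to a table that already exists.
-- Assumption: orders_archive was created earlier with matching columns.
INSERT INTO orders_archive
SELECT *
FROM Orders
WHERE order_date < '2024-01-01';
```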
SELECT also supports subqueries. You can use scalar or table subqueries in FROM, JOIN, or WHERE clauses for complex logic.
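The following sketch shows one subquery of each kind against the Orders table. The revenue threshold of 1000 is an arbitrary illustrative value, and the id column on Orders is an assumption.

```sql
-- Scalar subquery in WHERE: orders placed on the most recent order date.
SELECT id, customer_id, total_amount
FROM Orders
WHERE order_date = (SELECT max(order_date) FROM Orders);

-- Table subquery in FROM: filter on an aggregate computed in the inner query.
SELECT customer_id, revenue
FROM
(
    SELECT customer_id, sum(total_amount) AS revenue
    FROM Orders
    GROUP BY customer_id
)
WHERE revenue > 1000;

-- Subquery with IN: orders from customers who also ordered before 2024.
SELECT id, customer_id, order_date
FROM Orders
WHERE order_date >= '2024-01-01'
  AND customer_id IN
  (
      SELECT customer_id
      FROM Orders
      WHERE order_date < '2024-01-01'
  );
```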
Query system.processes to see currently running queries and system.query_log for historical performance metrics.
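As a small example, the query below lists running queries, longest-running first. It uses only columns of system.processes; the selection and ordering are one reasonable choice, not a prescribed one.

```sql
-- Currently executing queries, longest-running first.
SELECT
    query_id,
    user,
    elapsed,
    read_rows,
    formatReadableSize(memory_usage) AS memory,
    query
FROM system.processes
ORDER BY elapsed DESC;
```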