Star and snowflake schemas are design patterns for organizing data-warehouse tables to balance query speed, storage, and maintainability.
A star schema stores one central fact table (e.g., OrderItems) surrounded by denormalized dimension tables such as Customers, Products, and Dates. Each dimension is a single table, giving joins a simple spoke-like layout.
A snowflake schema further normalizes those dimensions. For example, a Customers dimension may split into Customers, Countries, and Regions tables. The extra layers reduce duplication but add more joins.
Use a star schema when BI queries must run fast, the data volumes fit on modern disks, and dimensions are relatively small. Fewer joins mean simpler SQL and quicker aggregation.
Pick a snowflake schema when dimension tables are large or change frequently. Normalization keeps updates consistent and saves storage, at the cost of slightly slower queries.
Create a wide fact table OrderItems_Fact with foreign keys to denormalized Customers_Dim, Products_Dim, and Dates_Dim. Copy descriptive columns (name, price) into the dimension tables so analysts avoid additional joins.
Normalize the dimension tables: split Customers_Dim into Customers_Dim, Cities_Dim, and Countries_Dim. Store only surrogate keys in child tables. Update ETL pipelines to load these tables in the correct order.
Star schemas usually win for read-heavy analytics because MariaDB’s optimizer handles fewer joins. Snowflake schemas may match performance if you add covering indexes on foreign keys and use partitioning on the fact table.
Yes. Many teams denormalize critical dimensions (star) and normalize infrequently queried ones (snowflake). Evaluate each dimension on its update rate, size, and query frequency.
Yes. The optimizer can reorder joins and push down predicates, especially when indexes exist on foreign keys.
Surrogate integer keys speed joins, decouple ETL from source system IDs, and simplify slowly changing dimensions.