Glossary/OLTP, OLAP & Workload Types

Snowflake Schema

A snowflake schema is a data warehouse design pattern extending star schemas by normalizing dimension tables into multiple related tables, reducing redundancy at the cost of additional joins.

Star schemas denormalize dimensions to simplify querying, storing attributes in single-level tables. Snowflake schemas apply normalization, breaking dimensions into related tables. A product dimension in a star schema contains product_id, name, category, and category_description. In a snowflake schema, product and category are separate tables; products reference categories through foreign keys.

This normalization reduces storage by eliminating redundant category descriptions. When a category description changes, snowflake updates one row; star schema updates thousands of product rows. However, snowflake adds complexity: queries require additional joins (product to category), and the schema becomes less intuitive for business users unfamiliar with normalization.

Snowflake schemas are less common than star schemas in modern data warehousing. Cloud storage is inexpensive, making denormalization cost-effective. Query engines efficiently handle additional joins. Business users and BI tools expect the simplicity of star schemas. Snowflake schemas remain useful in storage-constrained environments or when data volumes are enormous, making redundancy elimination significant.

Key Characteristics

  • Normalize dimension tables into multiple related tables
  • Reduce data redundancy through normalization
  • Require additional joins to reconstruct dimension attributes
  • More complex schema structure than star schemas
  • Save storage at cost of query complexity
  • Support hierarchical dimension relationships

Why It Matters

  • Reduce storage requirements for large dimensional hierarchies
  • Simplify dimension maintenance by updating one table per change
  • Support hierarchical dimensional relationships explicitly
  • Enable flexible drill-down analysis through normalized hierarchies
  • Reduce data inconsistency through single-source-of-truth dimensions
  • Optimize for environments with strict storage constraints

Example

`
-- Star schema (denormalized) vs Snowflake (normalized)

-- STAR SCHEMA (single-level dimensions)
CREATE TABLE dim_product_star (
  product_id INT PRIMARY KEY,
  product_name VARCHAR(255),
  category VARCHAR(100),
  category_description TEXT,
  subcategory VARCHAR(100),
  subcategory_description TEXT,
  department VARCHAR(50),
  supplier_id INT
);

-- SNOWFLAKE SCHEMA (normalized dimensions)
CREATE TABLE dim_category (
  category_id INT PRIMARY KEY,
  category_name VARCHAR(100),
  category_description TEXT
);

CREATE TABLE dim_subcategory (
  subcategory_id INT PRIMARY KEY,
  subcategory_name VARCHAR(100),
  category_id INT REFERENCES dim_category,
  subcategory_description TEXT
);

CREATE TABLE dim_department (
  department_id INT PRIMARY KEY,
  department_name VARCHAR(50)
);

CREATE TABLE dim_product_snowflake (
  product_id INT PRIMARY KEY,
  product_name VARCHAR(255),
  subcategory_id INT REFERENCES dim_subcategory,
  department_id INT REFERENCES dim_department,
  supplier_id INT
);

-- Query comparison:

-- Star schema query (fewer joins)
SELECT 
  p.category,
  p.department,
  SUM(f.revenue) as total_revenue
FROM fct_sales f
JOIN dim_product_star p ON f.product_id = p.product_id
GROUP BY p.category, p.department;

-- Snowflake schema query (more joins)
SELECT 
  c.category_name,
  d.department_name,
  SUM(f.revenue) as total_revenue
FROM fct_sales f
JOIN dim_product_snowflake p ON f.product_id = p.product_id
JOIN dim_subcategory s ON p.subcategory_id = s.subcategory_id
JOIN dim_category c ON s.category_id = c.category_id
JOIN dim_department d ON p.department_id = d.department_id
GROUP BY c.category_name, d.department_name;
`

Coginiti Perspective

Coginiti's SMDL can model snowflake schema structures by defining entities for each normalized dimension table and declaring relationships (one_to_one, one_to_many, many_to_one) between them. Semantic SQL resolves these relationship chains into correct multi-table joins automatically, so analysts query across normalized dimensions without manually constructing the join paths. CoginitiScript publication can materialize both the fact and normalized dimension tables as part of a single pipeline with dependency ordering handled through block references.

See Semantic Intelligence in Action

Coginiti operationalizes business meaning across your entire data estate.