Glossary/OLTP, OLAP & Workload Types

Fact Table

A fact table is a database table in a star or snowflake schema that stores measures (quantitative data) and foreign keys to dimensions, representing events or transactions in a business process.

Fact tables are the central structure in dimensional modeling, recording the occurrence of business events. A sales fact table has one row per transaction: product sold, customer, store, date, revenue, quantity. A web analytics fact table has one row per page view: user, page, time, duration. Each row is an atomic, measurable occurrence.

Fact tables contain two types of columns: measures (numeric values to aggregate) and foreign keys (references to dimensions). Measures answer "how much" questions (revenue, units sold, page duration). Foreign keys enable "drill-down" analysis (by product, region, time). Good fact table design separates what was measured (facts) from how to analyze (dimensions).

Fact tables are typically very large, containing billions of rows for years of operational data. Indexing is critical: foreign keys need indexes for joins, and date indexes enable time-based partitioning. Columnar storage excels for fact tables, reading only relevant measures and dimensions from disk. Modern data warehouses use fact tables as the foundation for analytics, with aggregated versions (hourly, daily) for performance.

Key Characteristics

  • Store measures (numeric, aggregatable values) for business events
  • Include foreign keys to dimension tables
  • One row per atomic business event or transaction
  • Often very large (billions to trillions of rows)
  • Indexed on foreign keys for efficient joins
  • Partitioned by date or other time dimension

Why It Matters

  • Enable aggregation of business metrics across any dimension
  • Provide foundation for analytical queries and reporting
  • Support efficient analysis of large event streams
  • Enable tracking of business process metrics over time
  • Facilitate drill-down analysis from summary to detail
  • Support real-time and historical analytics on event data

Example

`
-- Fact table: Sales transactions

CREATE TABLE fct_sales (
  -- Surrogate key (optional but common)
  sale_id BIGINT PRIMARY KEY,
  
  -- Foreign keys to dimensions
  date_id INT NOT NULL,
  product_id INT NOT NULL,
  customer_id INT NOT NULL,
  store_id INT NOT NULL,
  
  -- Measures (numeric, aggregatable values)
  revenue DECIMAL(10, 2),
  quantity INT,
  discount_amount DECIMAL(10, 2),
  tax_amount DECIMAL(10, 2),
  
  -- Optional: Slowly changing dimension keys for history
  product_key INT,  -- Current product reference
  product_history_key INT  -- Historical product version
  
  -- Indexes for joins
  FOREIGN KEY (date_id) REFERENCES dim_date,
  FOREIGN KEY (product_id) REFERENCES dim_product,
  FOREIGN KEY (customer_id) REFERENCES dim_customer,
  FOREIGN KEY (store_id) REFERENCES dim_store
);

CREATE INDEX idx_fct_sales_date ON fct_sales(date_id);
CREATE INDEX idx_fct_sales_product ON fct_sales(product_id);
CREATE INDEX idx_fct_sales_customer ON fct_sales(customer_id);
CREATE INDEX idx_fct_sales_store ON fct_sales(store_id);

-- Analytical queries aggregate fact measures by dimension attributes

-- Query 1: Total revenue by product
SELECT 
  p.category,
  p.product_name,
  SUM(f.revenue) as total_revenue,
  SUM(f.quantity) as units_sold
FROM fct_sales f
JOIN dim_product p ON f.product_id = p.product_id
GROUP BY p.category, p.product_name
ORDER BY total_revenue DESC;

-- Query 2: Revenue trend over time
SELECT 
  d.year,
  d.month,
  SUM(f.revenue) as monthly_revenue
FROM fct_sales f
JOIN dim_date d ON f.date_id = d.date_id
GROUP BY d.year, d.month
ORDER BY d.year, d.month;

-- Query 3: Multi-dimensional analysis
SELECT 
  d.year,
  c.region,
  p.category,
  SUM(f.revenue) as region_category_revenue,
  COUNT(*) as transaction_count
FROM fct_sales f
JOIN dim_date d ON f.date_id = d.date_id
JOIN dim_customer c ON f.customer_id = c.customer_id
JOIN dim_product p ON f.product_id = p.product_id
WHERE d.year = 2024
GROUP BY d.year, c.region, p.category;
`

Coginiti Perspective

Coginiti models fact tables as SMDL entities with measures that specify aggregation types such as sum, count_distinct, avg, or custom expressions. The MEASURE() function in Semantic SQL handles correct aggregation at any grain, so queries against fact tables produce accurate results even when analysts group by different dimension combinations. Incremental publication strategies (append, merge, merge_conditionally) allow CoginitiScript pipelines to update fact tables efficiently without full reloads, which matters given the volume of rows fact tables typically accumulate.

Related Concepts

Dimension TableStar SchemaSnowflake SchemaData WarehouseMeasures and DimensionsFact ConsolidationGrain (Fact Table)

See Semantic Intelligence in Action

Coginiti operationalizes business meaning across your entire data estate.