Fact Table
A fact table is a database table in a star or snowflake schema that stores measures (quantitative data) and foreign keys to dimensions, representing events or transactions in a business process.
Fact tables are the central structure in dimensional modeling, recording the occurrence of business events. A sales fact table has one row per transaction: product sold, customer, store, date, revenue, quantity. A web analytics fact table has one row per page view: user, page, time, duration. Each row is an atomic, measurable occurrence.
Fact tables contain two types of columns: measures (numeric values to aggregate) and foreign keys (references to dimensions). Measures answer "how much" questions (revenue, units sold, page duration). Foreign keys enable "drill-down" analysis (by product, region, time). Good fact table design separates what was measured (facts) from how to analyze (dimensions).
Fact tables are typically very large, containing billions of rows for years of operational data. Indexing is critical: foreign keys need indexes for joins, and date indexes enable time-based partitioning. Columnar storage excels for fact tables, reading only relevant measures and dimensions from disk. Modern data warehouses use fact tables as the foundation for analytics, with aggregated versions (hourly, daily) for performance.
Key Characteristics
- ▶Store measures (numeric, aggregatable values) for business events
- ▶Include foreign keys to dimension tables
- ▶One row per atomic business event or transaction
- ▶Often very large (billions to trillions of rows)
- ▶Indexed on foreign keys for efficient joins
- ▶Partitioned by date or other time dimension
Why It Matters
- ▶Enable aggregation of business metrics across any dimension
- ▶Provide foundation for analytical queries and reporting
- ▶Support efficient analysis of large event streams
- ▶Enable tracking of business process metrics over time
- ▶Facilitate drill-down analysis from summary to detail
- ▶Support real-time and historical analytics on event data
Example
` -- Fact table: Sales transactions CREATE TABLE fct_sales ( -- Surrogate key (optional but common) sale_id BIGINT PRIMARY KEY, -- Foreign keys to dimensions date_id INT NOT NULL, product_id INT NOT NULL, customer_id INT NOT NULL, store_id INT NOT NULL, -- Measures (numeric, aggregatable values) revenue DECIMAL(10, 2), quantity INT, discount_amount DECIMAL(10, 2), tax_amount DECIMAL(10, 2), -- Optional: Slowly changing dimension keys for history product_key INT, -- Current product reference product_history_key INT -- Historical product version -- Indexes for joins FOREIGN KEY (date_id) REFERENCES dim_date, FOREIGN KEY (product_id) REFERENCES dim_product, FOREIGN KEY (customer_id) REFERENCES dim_customer, FOREIGN KEY (store_id) REFERENCES dim_store ); CREATE INDEX idx_fct_sales_date ON fct_sales(date_id); CREATE INDEX idx_fct_sales_product ON fct_sales(product_id); CREATE INDEX idx_fct_sales_customer ON fct_sales(customer_id); CREATE INDEX idx_fct_sales_store ON fct_sales(store_id); -- Analytical queries aggregate fact measures by dimension attributes -- Query 1: Total revenue by product SELECT p.category, p.product_name, SUM(f.revenue) as total_revenue, SUM(f.quantity) as units_sold FROM fct_sales f JOIN dim_product p ON f.product_id = p.product_id GROUP BY p.category, p.product_name ORDER BY total_revenue DESC; -- Query 2: Revenue trend over time SELECT d.year, d.month, SUM(f.revenue) as monthly_revenue FROM fct_sales f JOIN dim_date d ON f.date_id = d.date_id GROUP BY d.year, d.month ORDER BY d.year, d.month; -- Query 3: Multi-dimensional analysis SELECT d.year, c.region, p.category, SUM(f.revenue) as region_category_revenue, COUNT(*) as transaction_count FROM fct_sales f JOIN dim_date d ON f.date_id = d.date_id JOIN dim_customer c ON f.customer_id = c.customer_id JOIN dim_product p ON f.product_id = p.product_id WHERE d.year = 2024 GROUP BY d.year, c.region, p.category; `
Coginiti Perspective
Coginiti models fact tables as SMDL entities with measures that specify aggregation types such as sum, count_distinct, avg, or custom expressions. The MEASURE() function in Semantic SQL handles correct aggregation at any grain, so queries against fact tables produce accurate results even when analysts group by different dimension combinations. Incremental publication strategies (append, merge, merge_conditionally) allow CoginitiScript pipelines to update fact tables efficiently without full reloads, which matters given the volume of rows fact tables typically accumulate.
Related Concepts
More in OLTP, OLAP & Workload Types
Analytical Workload
An analytical workload is a class of database queries that examine, aggregate, and analyze large volumes of historical data to extract business insights and support decision-making.
Dimension Table
A dimension table is a database table in a star or snowflake schema that stores descriptive attributes used to filter, group, and drill-down in analytical queries.
HTAP (Hybrid Transactional/Analytical Processing)
HTAP is a database architecture that supports both transactional workloads and analytical workloads on the same data system, enabling real-time analytics without separate data warehouses.
Mixed Workload
A mixed workload is a database system handling both transactional and analytical queries simultaneously, requiring architecture balancing responsive operational performance with efficient aggregate analysis.
OLAP (Online Analytical Processing)
OLAP is a database workload class optimized for rapid execution of complex queries that aggregate and analyze large datasets across multiple dimensions.
OLTP (Online Transaction Processing)
OLTP is a database workload class optimized for rapid execution of small, focused transactions that insert, update, or query individual records in operational systems.
See Semantic Intelligence in Action
Coginiti operationalizes business meaning across your entire data estate.