
Semantic Drift

Semantic drift occurs when the meaning or definition of a data element diverges from its documented semantic definition, causing metric inconsistencies and analysis errors.

Semantic drift is a gradual corruption of semantic consistency. It starts innocently: a developer adds a new revenue type that's technically included in the revenue sum but violates the original definition (revenue should exclude service fees, but the new type includes them). An analyst applies a filter inconsistently (filtering out test accounts in most dashboards but forgetting it in one). Over time, the documented definition and the actual data usage diverge, and teams unknowingly use different versions of the same metric.
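The filter example above can be made concrete. This is a hypothetical sketch (the account schema and function names are invented for illustration) showing how two reports that both claim to compute "active customers" silently diverge when one omits the test-account filter:

```python
# Two computations of the same documented metric, "active customers".
# Dashboard A follows the definition; Dashboard B forgot the test-account
# filter, so the "same" metric now has two values.

accounts = [
    {"id": 1, "active": True, "is_test": False},
    {"id": 2, "active": True, "is_test": True},   # internal test account
    {"id": 3, "active": True, "is_test": False},
]

def active_customers_a(rows):
    # Documented definition: active accounts, excluding test accounts.
    return sum(1 for r in rows if r["active"] and not r["is_test"])

def active_customers_b(rows):
    # Drifted version: the test-account filter was forgotten.
    return sum(1 for r in rows if r["active"])

print(active_customers_a(accounts))  # 2
print(active_customers_b(accounts))  # 3
```

Neither query errors out, which is exactly why the divergence goes unnoticed until the two dashboards are compared side by side.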

Semantic drift causes serious problems because it's invisible: queries execute without errors and produce plausible numbers, just wrong ones. Revenue reports look correct yet disagree between teams because each applies a different test account filter. Customer count appears stable while it is actually declining, because the rule for marking accounts as active quietly changed.

Semantic drift is detected through data observability, metric anomaly detection, and periodic semantic audits. When a metric unexpectedly changes, it could signal semantic drift (the definition has implicitly changed) rather than business change. Preventing drift requires strong governance: formal definition updates, change approvals, testing, and monitoring. Organizations with strong semantic layers and data observability catch drift early.
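The metric anomaly detection mentioned above can be sketched as a simple trailing-window z-score check. This is a minimal illustration, not a production monitor; the window size, threshold, and sample data are assumptions:

```python
from statistics import mean, stdev

def flag_anomalies(values, window=7, z_threshold=3.0):
    """Flag indices where a metric deviates more than z_threshold
    standard deviations from its trailing-window mean, a possible
    sign that the metric's definition has implicitly changed."""
    flagged = []
    for i in range(window, len(values)):
        hist = values[i - window:i]
        mu, sigma = mean(hist), stdev(hist)
        if sigma > 0 and abs(values[i] - mu) / sigma > z_threshold:
            flagged.append(i)
    return flagged

# A sudden jump after a quiet week: the last day's query started
# including a new revenue type.
daily_revenue = [100, 102, 98, 101, 99, 103, 100, 100, 147]
print(flag_anomalies(daily_revenue))  # [8]
```

A flag like this cannot distinguish drift from genuine business change on its own; it tells the team where to look, and a semantic audit of the metric's definition does the rest.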

Key Characteristics

  • Gradual divergence between documented and actual semantics
  • Often invisible: queries execute without errors
  • Caused by ad-hoc logic changes or undocumented assumptions
  • Detectable through metric anomalies and lineage tracking
  • Compounds over time as more definitions drift
  • Preventable through strict governance and testing

Why It Matters

  • Accuracy: Undetected drift leads to analysis errors and wrong decisions
  • Reconciliation: Metric disagreements become difficult to investigate
  • Governance: Indicates governance failures and control gaps
  • Trust: Erodes confidence in data and analytics
  • Compliance: Can violate regulatory definitions of metrics

Example

A revenue metric is defined as "paid subscriptions only." Over several months, free trial revenue, partner revenue, and professional services revenue are added to the underlying query. The metric definition document still says "paid subscriptions," but the calculation now sums four revenue types. Every report built on the metric has drifted from its documented meaning.

Coginiti Perspective

Coginiti reduces semantic drift by making SMDL the single source of truth for business definitions. Because Semantic SQL queries resolve against SMDL at runtime, there is no separate copy of metric logic that can diverge from the authoritative definition. The Analytics Catalog's promotion workflow (personal, shared, project hub) adds a review checkpoint that catches unintended definition changes before they reach production. The #+test framework provides an additional detection mechanism, allowing teams to assert expected metric values and flag drift when pipeline results deviate.
