Data Contracts

A data contract is a formal agreement specifying the expectations between data producers and consumers, including schema, quality guarantees, freshness SLAs, and remediation obligations.

A data contract documents the explicit guarantees a data producer makes to consumers: "This table will have these columns with these types, will be updated daily, will be 99% complete, and breaches will be escalated within 2 hours." Contracts prevent surprises: rather than assuming what a dataset looks like, consumers rely on stated guarantees. If a producer violates the contract (missing data, a schema change, a missed SLA), the consumer has objective grounds to escalate.

Data contracts emerged from the realization that informal expectations between producers and consumers lead to constant friction. A producer assumes they can add columns freely; consumers break when the schema changes. A producer assumes freshness doesn't matter; a consumer's report goes stale. Contracts make expectations explicit and enforceable. They also enable decoupling: once a contract is agreed, the producer can evolve their system as long as the contract is maintained.

Data contracts typically include: schema (columns, types, constraints), completeness guarantees (acceptable null rates), timeliness (freshness SLA), uniqueness (acceptable duplicate rate), lineage (how the data is derived), and escalation procedures (who to contact if the contract is breached). Some data contracts are formal documents; others are expressed as code (schema registries, dbt contracts). Organizations increasingly treat data contracts like API contracts in software: violations are serious and must be remedied.
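
As an illustration of the "contract as code" form, the components listed above could be captured in a small Python structure. This is only a sketch: the class names, field names, and example values (Column, DataContract, "analytics.events", and so on) are hypothetical and do not correspond to any particular tool's format.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class Column:
    """One column in the contract schema (hypothetical structure)."""
    name: str
    dtype: str
    max_null_rate: float = 1.0  # completeness: fraction of rows allowed to be null


@dataclass(frozen=True)
class DataContract:
    """Groups the typical contract components in one versionable object."""
    dataset: str
    version: str
    owner: str
    columns: tuple[Column, ...]          # schema
    freshness_sla: timedelta             # timeliness
    max_duplicate_rate: float            # uniqueness
    upstream_sources: tuple[str, ...]    # lineage
    escalation_contact: str              # who to contact on breach

    def column_names(self) -> list[str]:
        return [c.name for c in self.columns]


# Illustrative values only; none of these names come from a real system.
example = DataContract(
    dataset="analytics.events",
    version="1.0.0",
    owner="Events team",
    columns=(
        Column("event_id", "int", max_null_rate=0.0),
        Column("note", "string"),
    ),
    freshness_sla=timedelta(days=1),
    max_duplicate_rate=0.0,
    upstream_sources=("raw.events",),
    escalation_contact="events-oncall",
)
```

Keeping the contract in a frozen, plain data structure like this makes it easy to store under version control and to compare one version against the next.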

Key Characteristics

  • Documents expectations between producers and consumers
  • Specifies schema, quality, and freshness guarantees
  • Includes SLA targets and escalation procedures
  • Version-controlled and communicated
  • Violations tracked and remediated
  • Enables decoupling and agility

Why It Matters

  • Clarity: Explicit expectations prevent misunderstandings
  • Reliability: Consumers know what to expect and can plan
  • Accountability: Violations are objective and can be escalated
  • Agility: Producers can change implementation as long as contract holds
  • Trust: Contracts build confidence between producers and consumers

Example

Data contract: Orders table.

  • Schema: (order_id: int, customer_id: int, amount: decimal, created_at: timestamp)
  • Freshness SLA: updated hourly, max lag 4 hours
  • Quality: customer_id null rate < 0.1%, amount >= 0 always
  • Completeness: >= 99%
  • Owner: Order Systems team
  • Escalation: violations escalate to data engineering on-call
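
A minimal sketch of how the Orders contract above might be checked in Python, assuming rows arrive as dictionaries and the caller supplies the table's last update time. The function name check_orders and the violation message format are hypothetical. The completeness target is left out because the example does not say what it is measured against.

```python
from datetime import datetime, timedelta, timezone
from decimal import Decimal

# Thresholds taken from the Orders contract example.
EXPECTED_TYPES = {
    "order_id": int,
    "customer_id": int,
    "amount": Decimal,
    "created_at": datetime,
}
MAX_LAG = timedelta(hours=4)
MAX_CUSTOMER_ID_NULL_RATE = 0.001  # null rate must stay below 0.1%


def check_orders(rows, last_updated, now):
    """Return a list of human-readable contract violations (empty if none)."""
    violations = []

    # Freshness: the table may lag by at most 4 hours.
    if now - last_updated > MAX_LAG:
        violations.append("freshness: lag exceeds 4 hours")

    null_customer_ids = 0
    for i, row in enumerate(rows):
        # Schema: every contracted column must be present.
        missing = EXPECTED_TYPES.keys() - row.keys()
        if missing:
            violations.append(f"schema: row {i} missing {sorted(missing)}")
            continue
        # Schema: non-null values must have the contracted type.
        for col, expected in EXPECTED_TYPES.items():
            value = row[col]
            if value is not None and not isinstance(value, expected):
                violations.append(
                    f"schema: row {i} column {col} is not {expected.__name__}"
                )
        if row["customer_id"] is None:
            null_customer_ids += 1
        # Quality: amount must never be negative.
        amount = row["amount"]
        if isinstance(amount, Decimal) and amount < 0:
            violations.append(f"quality: row {i} has negative amount")

    # Quality: customer_id null rate must be strictly below 0.1%.
    if rows and null_customer_ids / len(rows) >= MAX_CUSTOMER_ID_NULL_RATE:
        violations.append("quality: customer_id null rate is not below 0.1%")

    return violations
```

In practice, a non-empty result would be routed to the escalation contact named in the contract rather than silently logged.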

Coginiti Perspective

Coginiti supports data contract enforcement through multiple mechanisms. SMDL entity definitions formalize the schema and semantics that consumers depend on, acting as a contract between the semantic layer and its users. CoginitiScript #+test blocks can encode contract assertions (expected columns, value ranges, row counts) that run as part of publication pipelines. Incremental publication strategies with unique_key and update_on_changes_in parameters formalize update contracts between pipeline stages.
