Why governance fails at the implementation stage and how to avoid it with dbt

Webinar recap: data governance in the Modern Data Stack, end-to-end lineage, data contracts and KPC field feedback.

#Data #Data Governance #DbtLabs #AI&GenAI #metadata

CONTEXT

In 2026, data teams have industrialized their pipelines. Yet governance remains a separate project, disconnected from transformation tools, which results in dead documentation, unguaranteed quality, and a lack of traceability. This is the observation made by Matthieu Augier (KPC) and Hicham Babahmed (dbt Labs) during their joint webinar on May 26, 2026.

A problem still all too common in 2026

Whether it's an SME with data scattered across Excel files or a large corporation with a complex IT system and conflicting definitions between divisions, the observation is the same: trust in the data is lacking. AI projects fail for lack of control over data semantics, strategic decisions rely on unreconciled figures, and regulatory requirements (GDPR, Solvency II, PII) remain difficult to address without traceability.

In the field, the same tasks come up engagement after engagement: clarifying roles and responsibilities (data owner, data steward, data custodian), pragmatically improving quality, building a living catalog, and addressing regulatory auditability.

Unreliable data: The question "what constitutes quality data?" remains without an operational answer in most organizations.
Poorly controlled compliance: constraints (GDPR, Solvency II, PII) are tightening while traceability is still not in place.
AI that fails fast: AI projects fail quickly when the semantics of the data have not been stabilized beforehand.

KPC APPROACH

The Data Governance Office: a system within a system

At KPC, the answer to these challenges lies in building a Data Governance Office (DGO), a dedicated unit that coordinates roles, stakeholders, quality, cataloging, lineage, and compliance. Three organizational models coexist:

Centralized (most frequent): A single DGO unit serves all divisions. Single vision, homogeneous governance.
Federated: Shared foundation plus autonomy by vertical branch, balancing central coherence with local agility by domain.
Decentralized: Independent data flows per domain. Requires excellent data literacy. Risk of duplication and divergent definitions without a strong DGO.

« The idea is to capture the documentation only once, share it with everyone who produces or consumes the data, and make them as autonomous as possible. »

Matthieu Augier – Director of Data Governance, KPC

KPC x dbt Labs Webinar

In each of these models, the data actors (data owners, stewards, custodians, engineers) have distinct scopes but run into the same point of friction: without shared, living documentation, everyone ends up reconstructing their own truth. This is precisely where transformation tooling changes the game.

Webinar Replay

KPC X dbt Labs Data Governance

Find the full video on YouTube!

LIVE DEMO

How dbt incorporates governance into pipelines

In practice, each governance pillar is implemented using native features of dbt Platform, without external tools. The live demonstration walked through them pillar by pillar.

1. End-to-end lineage and column-by-column traceability

The dbt DAG (Directed Acyclic Graph) offers a complete visual map, from raw data sources to end uses (Power BI and Tableau dashboards, AI models). In a multi-project environment with central, finance, and marketing teams, each team sees only the public models: models not declared public are automatically protected and stay hidden from the other teams. Lineage goes down to column level, showing precisely how each field has evolved through the transformations.
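As an illustration of the public/protected split described above, here is a minimal sketch of a model properties file. The file path, model names, and descriptions are hypothetical; only the `access` key and its `public` and `protected` values come from dbt itself, and depending on the dbt version `access` may sit under `config:` rather than at the model's top level.

```yaml
# models/marts/_finance__models.yml  (hypothetical path and model names)
models:
  - name: fct_invoices              # hypothetical model shared with other teams
    access: public                  # may be referenced from other dbt projects
    description: "One row per issued invoice."

  - name: int_invoices_enriched     # hypothetical intermediate model
    # No `access` key: dbt applies its default, `protected`,
    # so the model can be referenced inside its own project only.
    description: "Intermediate join, internal to the finance project."
```

With this sketch, a marketing project could build on `fct_invoices`, while `int_invoices_enriched` would remain an implementation detail of the finance project.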

« One morning, your dashboard no longer displays the expected data. With dbt Catalog, you can return to the model, check the status of the last orchestration, the test results, and the lineage, without involving the data engineer. »

Hicham Babahmed – Partner Solutions Architect, dbt Labs

KPC x dbt Labs Webinar

2. Data contracts: governance before production

Defined in the YAML files of the dbt project, data contracts are checked even before the model is built: they block the production of any data that would violate the contract (incorrect column type, value out of range, IBAN too long…). The difference from data tests is structural: the contract states what was decided upstream with the business; the test verifies what is observed during execution.

Data contracts are shared and replicable: a contract validated for one object can be duplicated and adapted to a similar object, accelerating data quality improvement at scale. This is particularly powerful in a federated model where multiple teams feed the same data platform.
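The contract/test distinction can be sketched in a single properties file. This is a minimal example under stated assumptions, not the configuration shown in the webinar: the path, model, and column names are hypothetical, the range test assumes the `dbt_utils` package is installed, and the `data_tests` key assumes a recent dbt version (older versions use `tests`). Which constraints are actually enforced also depends on the data platform.

```yaml
# models/marts/_payments__models.yml  (hypothetical path, model and column names)
models:
  - name: fct_payments
    config:
      contract:
        enforced: true              # dbt compares the model's column names and
                                    # data types with this declaration before building it
    columns:
      - name: payment_id
        data_type: integer
        constraints:
          - type: not_null          # contract side: part of the agreed shape
      - name: iban
        data_type: varchar(34)      # 34 characters is the maximum IBAN length;
                                    # the declared length goes into the table definition
                                    # and rejecting longer values is left to the warehouse
      - name: amount
        data_type: numeric(12, 2)
        data_tests:                 # test side: runs against the data after the build
          - dbt_utils.accepted_range:   # assumes the dbt_utils package
              min_value: 0
```

Because the whole agreement lives in one YAML entry, copying it to a similar model and adjusting the column list is the kind of reuse described above.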

3. Catalog, metadata and semantic layer

dbt Catalog centralizes descriptions, types, tests, orchestration status, and recommendations for each model. The semantic layer lets teams define business KPIs once, reuse them across all teams, and query them in natural language via connected LLMs. Role-based access control governs who can consume the semantic layer, which matters particularly for compute-intensive uses.
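To make the idea of a reusable KPI concrete, here is a minimal sketch of a semantic model and one metric. All names (`fct_payments`, `paid_at`, `total_payments`, and so on) are hypothetical, and the semantic layer YAML specification has changed between dbt versions, so the keys below should be read as indicative rather than as the exact setup used in the demo.

```yaml
# models/semantic/_payments__semantic.yml  (hypothetical path and names)
semantic_models:
  - name: payments
    description: "Payments, one row per payment."
    model: ref('fct_payments')          # hypothetical underlying dbt model
    defaults:
      agg_time_dimension: paid_at
    entities:
      - name: payment
        type: primary
        expr: payment_id
    dimensions:
      - name: paid_at
        type: time
        type_params:
          time_granularity: day
    measures:
      - name: payment_amount
        agg: sum                        # the aggregation is defined once, here
        expr: amount

metrics:
  - name: total_payments                # the KPI that every team would reuse
    label: Total payments
    description: "Sum of all payment amounts."
    type: simple
    type_params:
      measure: payment_amount
```

Defining the metric once in the project means every consumer queries the same definition instead of rewriting the aggregation in each tool.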

« More and more, business professionals are coming to dbt Platform, not to code, but to verify that the data they will present in meetings is correct. »

Hicham Babahmed – Partner Solutions Architect, dbt Labs

KPC x dbt Labs Webinar

KPC'S POINT OF VIEW

Reduce friction between data and business teams

In KPC engagements, the contribution of dbt comes down to one specific point: reducing the friction between data and business teams. For years, this friction stemmed from a lack of communication: business requirements were not formalized, the data produced did not meet expectations, and no one shared a single source of truth.

dbt then acts as a translator between the two worlds. Business users can write their requirements directly into the platform, see what the models produce, and challenge data engineers on quality, all without intermediaries. Experts are no longer overwhelmed: they document once, and the platform keeps answering questions from then on.

Another operational benefit was highlighted: dbt operates as a common layer across multiple data platforms (Snowflake, Databricks, BigQuery, Redshift). In companies that have adopted several data clouds, it unifies governance without requiring a migration.

« The Data Governance Office is a system within a system: it coordinates roles and stakeholders—technical, business, or hybrid: data owner, data steward, data custodian, data engineer. And it approaches quality primarily as a matter of process, not just tools. »

Matthieu Augier – Director of Data Governance, KPC

KPC x dbt Labs Webinar

Key takeaways from this webinar

Effective governance does not reside in a disconnected catalog tool — it resides in the dbt pipelines themselves.

Automatic lineage + integrated documentation + data contracts = the three pillars for traceable and trustworthy data, right from production.

Data contracts and data tests are complementary but distinct: the contract frames what is decided with the business upstream; the test observes what happens during execution.

dbt adapts to the three organizational models (centralized, federated, decentralized); the centralized model remains the one most frequently encountered with clients.

The dbt platform acts as a bridge between business and data — experts document once, the platform responds continuously.