dbt core

The Kensu collector for dbt core is designed to enhance data observability in Snowflake and BigQuery environments. By collecting metadata from dbt as well as from the data sources used by its models, it provides valuable insights into data lineage, schema changes, and metrics.

How it works

Step 1: Extraction of dbt Artifacts

The Kensu collector retrieves the essential metadata generated by dbt through the already implemented dbt plugin (see the sketch after this list), which includes:

  • Metadata related to the models and jobs.
  • A comprehensive list of data sources, such as Snowflake tables or BigQuery datasets.
  • Detailed Lineage, encompassing the input/output relationships among data sources.
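
As a rough illustration, the sketch below reads the standard dbt artifact file (manifest.json, written to the project's target/ directory) to list data sources and derive model-level lineage. It is a minimal sketch of what the dbt plugin provides; the actual collector's implementation may differ.

```python
import json
from pathlib import Path

# dbt writes its artifacts to the project's target/ directory by default.
TARGET_DIR = Path("target")

# manifest.json describes models, sources, seeds, and the dependency graph.
manifest = json.loads((TARGET_DIR / "manifest.json").read_text())

# Data sources referenced by the project (e.g. Snowflake tables / BigQuery datasets).
sources = {
    unique_id: f'{node["database"]}.{node["schema"]}.{node["name"]}'
    for unique_id, node in {**manifest["nodes"], **manifest["sources"]}.items()
    if node["resource_type"] in ("model", "source", "seed")
}

# Lineage: each model's parents define its input -> output relationships.
lineage = {
    unique_id: manifest["parent_map"].get(unique_id, [])
    for unique_id, node in manifest["nodes"].items()
    if node["resource_type"] == "model"
}

for model_id, parents in lineage.items():
    print(sources.get(model_id), "<-", [sources.get(p) for p in parents])
```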

Step 2: Metadata Extraction from Data Sources

  • Data Source Schema Extraction: For each table identified in the lineage, the collector retrieves schema details, such as column names and data types. This helps in understanding the structure of the data being manipulated.
  • Quality Metrics Collection: Alongside schema information, the collector gathers a range of quality metrics for the data sources involved (sketched below). These metrics may include column-level statistics such as null counts, unique value counts, and distribution summaries, providing a comprehensive view of data quality.
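
A minimal sketch of both steps, assuming a Snowflake backend and the snowflake-connector-python package; the connection parameters, table names, and metric choices are placeholders, not the collector's actual implementation.

```python
import snowflake.connector  # assumes the snowflake-connector-python package

# Hypothetical connection parameters; the real collector reads these from its configuration.
conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="...",
    warehouse="MY_WH", database="MY_DB",
)

def extract_schema(cur, database, schema, table):
    """Column names and data types for one table identified in the lineage."""
    cur.execute(
        f"""SELECT column_name, data_type
            FROM {database}.information_schema.columns
            WHERE table_schema = %s AND table_name = %s""",
        (schema, table),
    )
    return cur.fetchall()

def collect_metrics(cur, database, schema, table, columns):
    """Simple column-level quality metrics: row count, null counts, distinct counts."""
    selects = ["COUNT(*) AS row_count"]
    for col, _dtype in columns:
        selects.append(f"COUNT_IF({col} IS NULL) AS {col}_nulls")
        selects.append(f"COUNT(DISTINCT {col}) AS {col}_distinct")
    cur.execute(f"SELECT {', '.join(selects)} FROM {database}.{schema}.{table}")
    return dict(zip([d[0] for d in cur.description], cur.fetchone()))

with conn.cursor() as cur:
    cols = extract_schema(cur, "MY_DB", "ANALYTICS", "ORDERS")
    metrics = collect_metrics(cur, "MY_DB", "ANALYTICS", "ORDERS", cols)
    print(cols, metrics)
```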

Step 3: Transmission to Kensu Core

After collection, all observations—including lineage information, schema details, and quality metrics—are transmitted to Kensu Core, which processes and analyzes the metadata to surface actionable insights such as anomalies, data quality issues, and performance bottlenecks.
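
The exact Kensu Core ingestion API is not shown here; the sketch below only illustrates the general shape of such a transmission, using a hypothetical HTTP endpoint, token, and payload layout.

```python
import json
import urllib.request

# Hypothetical endpoint and token; the real collector uses the Kensu agent/API
# configured for your instance.
KENSU_INGESTION_URL = "https://kensu.example.com/api/ingest"
KENSU_TOKEN = "..."

# Illustrative payload combining the observations gathered in the previous steps.
observations = {
    "lineage": {"model.analytics.orders": ["source.raw.raw_orders"]},
    "schemas": {"MY_DB.ANALYTICS.ORDERS": [["ORDER_ID", "NUMBER"], ["STATUS", "TEXT"]]},
    "metrics": {"MY_DB.ANALYTICS.ORDERS": {"row_count": 1200, "STATUS_nulls": 0}},
}

request = urllib.request.Request(
    KENSU_INGESTION_URL,
    data=json.dumps(observations).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {KENSU_TOKEN}",
    },
    method="POST",
)
with urllib.request.urlopen(request) as response:
    print(response.status)
```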

This process works in tandem with an agent. In the next pages, see how you can configure it with Elementary data for Snowflake or Elementary data for BigQuery.