The Marketing department of the city store regularly organizes a phone campaign to communicate sales offers to customers who have bought an article in recent months.
The list of customers and phone numbers that the marketing team has to contact is recorded in a Postgres table which is created on demand by a pipeline.
The pipeline uses Python and dbt programs.
When triggered, the pipeline runs:
1. data_load.py, which comes in two versions, load_first_campain.py and load_second_campain.py, and loads two files into Postgres tables: the articles ordered during the period and the list of customers. The first table contains information about the orders, while the second one contains personal information about the customers.
2. Marketing, a dbt project with 2 models:
- The first model, orders_and_customers.sql, joins both data tables created by data_load
- The second one, contact_list.sql, creates a list of customers the team will contact
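The two pipeline stages above can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: it uses an in-memory SQLite database as a stand-in for Postgres, plain SQL statements as a stand-in for the dbt models, and hypothetical sample data and column names (order_id, customer_id, article, name, phone) that do not come from the real files.

```python
import csv
import io
import sqlite3

# Hypothetical sample data: the real files and their columns are assumptions.
ORDERS_CSV = """order_id,customer_id,article
1,10,shoes
2,11,hat
3,10,scarf
"""
CUSTOMERS_CSV = """customer_id,name,phone
10,Ada,555-0100
11,Bob,555-0101
12,Cy,555-0102
"""


def load_csv(conn, table, text):
    """Create `table` from CSV text and insert every row (role of data_load)."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    conn.execute(f"CREATE TABLE {table} ({', '.join(header)})")
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", data)


def build_contact_list(conn):
    """Run two SQL steps that mimic the roles of the two dbt models."""
    # Role of orders_and_customers.sql: join both loaded tables.
    conn.execute(
        """
        CREATE TABLE orders_and_customers AS
        SELECT o.order_id, o.article, c.customer_id, c.name, c.phone
        FROM orders AS o
        JOIN customers AS c ON o.customer_id = c.customer_id
        """
    )
    # Role of contact_list.sql: one row per customer who placed an order.
    conn.execute(
        """
        CREATE TABLE contact_list AS
        SELECT DISTINCT customer_id, name, phone
        FROM orders_and_customers
        """
    )
    return conn.execute(
        "SELECT name, phone FROM contact_list ORDER BY name"
    ).fetchall()


def run_pipeline():
    """Load both files, then build the contact list."""
    conn = sqlite3.connect(":memory:")  # stand-in for the Postgres database
    load_csv(conn, "orders", ORDERS_CSV)
    load_csv(conn, "customers", CUSTOMERS_CSV)
    return build_contact_list(conn)


if __name__ == "__main__":
    print(run_pipeline())
```

In this sketch, a customer with no orders in the period (Cy) does not appear in the contact list, because the join keeps only customers that match at least one order.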
Here is the flow:
In this example, after having performed the , you will be able to:
- data observability information embedded in the pipeline. This includes:
- Data sources metadata
- Technical data lineage
- Data profiling (metrics)
- Rules for dbt outcome
- Use the technical data lineage to analyze the propagation of an error and