Kensu Documentation

Prevent further issues

It has been several months since the last campaign. The Marketing team asks you to execute the pipeline in order to prepare a new campaign.

In your terminal window, run the second script:

Shell
|

Detect an issue with Kensu

You log into the Kensu platform, and you see that a new ticket has been created!

1️⃣ Click on the Ticket title to see the list of tickets

Document image

2️⃣ By clicking on the + icon, you can access the details and context of the ticket. You will see in which project, application, and environment the rule has been violated.

Document image

3️⃣ Click on the data source name, testme.testme_schema.contact_list, in order to be redirected to the corresponding data source page.

Document image

Analyze the issue

The collected metadata and observability metrics will help you identify the source of the issue.

1️⃣ The first step is to use the lineage to find the origin of the data. Click on the customer_list.csv rectangle representing the data source.

Document image

2️⃣ Click on View data source details in order to navigate to this data source.

3️⃣ Once on the customer_list.csv data source page, a shortcut lets you see the missing values observability metrics. Click on the button to display the chart.

Document image



In the table below the chart, you can see both executions, ordered by descending timestamp. You can observe that in the latest execution about 49% of the values are null for the column of interest, phone (metric phone.nullrows).
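To make the metric concrete, here is a minimal sketch of how a count of null rows per column, and the corresponding percentage, could be computed from a CSV file. It is not Kensu's implementation: the sample data, the `null_rows` helper, and the rule that an empty string counts as null are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical sample data; the real customer_list.csv is not shown in this guide.
SAMPLE_CSV = """id,phone,email
1,555-0101,a@example.com
2,,b@example.com
3,555-0103,
4,,d@example.com
"""


def null_rows(rows, column):
    """Count rows whose value for `column` is missing or empty (assumed definition of null)."""
    return sum(1 for row in rows if not (row.get(column) or "").strip())


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
total = len(rows)

for column in ("phone", "email"):
    missing = null_rows(rows, column)
    # Mirrors the naming used in the guide: <column>.nullrows
    print(f"{column}.nullrows = {missing} ({missing / total:.0%} of {total} rows)")
```

With this sample, half of the rows have no phone number, which is the same kind of situation the chart reveals for the latest execution.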

In this case, it means that half of the customers cannot be contacted by phone. Customer behavior may have changed, with fewer people providing their phone number, whereas the number of null rows for email seems to be stable. The marketing team decides to change the process and contact the customers by email instead. Therefore, they ask the data team to provide them with the email addresses.

Create a variability rule on the number of missing email addresses

In order to avoid further issues, you decide to create a variability rule. This rule will detect high volatility in the number of missing values of the email field, email.nullrows, so that the situation you had with the phone numbers does not happen again with the email addresses.

1️⃣ In the Rules section, hover the mouse over the Add rule button and click Add Variability.

Document image

2️⃣ Set these parameters:

  • Statistic Name: Use the drop-down to select email.nullrows. You can also filter the values by typing the first letter in the text field.
  • Maximum variation value: Enter 20, meaning you want to be notified if the variation exceeds 20%.
  • Leave the For no more than the past field blank.
Document image

After clicking OK, you will see a new rule in the Rules section:

Document image

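As the latest execution contained 16 null rows in the email column, a 20% variation corresponds to 3.2 rows. The next execution will therefore be linked to a ticket if email.nullrows falls outside the range 12.8 to 19.2, that is, if it reaches 20 or more, or drops to 12 or fewer.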
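The arithmetic behind these thresholds can be sketched as follows. This is only an illustration under the assumption that the rule compares the relative change against the previous execution's value; the function names are hypothetical and the exact formula Kensu applies is not documented on this page.

```python
def variability_bounds(previous, max_variation_pct):
    """Return the (lower, upper) values allowed around `previous`.

    Assumption: the allowed band is previous +/- max_variation_pct percent.
    """
    delta = previous * max_variation_pct / 100
    return previous - delta, previous + delta


def violates(previous, current, max_variation_pct):
    """Return True if `current` varies from `previous` by more than the allowed percentage."""
    if previous == 0:
        # Assumption: with no baseline, any non-zero value counts as a violation.
        return current != 0
    variation_pct = abs(current - previous) / previous * 100
    return variation_pct > max_variation_pct


# Values from this guide: 16 null rows in the latest execution, 20% maximum variation.
lower, upper = variability_bounds(16, 20)
print(f"allowed range: {lower:.1f} to {upper:.1f}")

for candidate in (12, 13, 16, 19, 20):
    status = "ticket" if violates(16, candidate, 20) else "ok"
    print(f"email.nullrows = {candidate}: {status}")
```

Because row counts are whole numbers, the first values that would trigger a ticket under this assumption are 20 on the high side and 12 on the low side.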

Updated 07 Nov 2022