Kensu Documentation


Create a monitoring rule in Kensu

Run the program for the first time

In this section, you will execute the pipeline by running the data_ingestion.py and reporting.py scripts. You can find both scripts in the python_code folder of the repository you cloned.

These programs manipulate CSV files using standard Pandas together with the Kensu Pandas layer. This extra layer augments the installed Pandas version with Data Observability capabilities: it traces and logs your data usage, and it profiles the data consumed and produced (that is, it computes metrics on it).
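To make "profiling" more concrete, here is a minimal sketch of the kind of metrics that could be computed on a DataFrame. It uses plain Pandas only and is not the Kensu implementation. The `profile` helper, the `nrows` and `nulls` metric names, and the sample values are assumptions made for illustration; the `<column>.<statistic>` naming simply mirrors statistic names such as Intraday_Delta.std that appear later on this page.

```python
import pandas as pd

# Hypothetical sample data: the column name mirrors the Intraday_Delta
# statistic used later on this page, the values are made up.
df = pd.DataFrame({"Intraday_Delta": [0.01, -0.02, 0.03, 0.00]})


def profile(frame: pd.DataFrame) -> dict:
    """Return flat '<column>.<statistic>' metrics for the numeric columns."""
    metrics = {"nrows": len(frame)}
    for column in frame.select_dtypes(include="number").columns:
        series = frame[column]
        metrics[f"{column}.min"] = float(series.min())
        metrics[f"{column}.max"] = float(series.max())
        metrics[f"{column}.mean"] = float(series.mean())
        # Pandas returns the sample standard deviation (ddof=1) by default.
        metrics[f"{column}.std"] = float(series.std())
        metrics[f"{column}.nulls"] = int(series.isna().sum())
    return metrics


stats = profile(df)
for name, value in stats.items():
    print(name, value)
```

An observability layer would typically send such metrics to a server along with lineage information, rather than print them.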

Notes

Both applications follow the Data Observability Driven Development (DODD) principles.



For general information on how to use Kensu with Python, see Configure the Python Agent.

To execute the pipeline, run the Python or Docker commands for one of the following setups:

  • Docker Pandas
  • Docker PySpark
  • Local Pandas
  • Local PySpark

These commands will run both Python scripts, one after the other, using data from November 2021.


🎉 Congratulations! 🎉 You've sent your first data observability information to Kensu! You can now go to the Kensu data sources page and review the data.

In the next section, you will learn how to use the Kensu UI to view the data observations and work with them, for example to troubleshoot data problems.

Create a Min-Max rule in Kensu

Suppose that the Risk Officers have agreed to track several quality metrics for the business.

The first metric is the monthly volatility of the returns for the Buzzfeed stock. If the volatility exceeds 20%, the risk officers will reduce the amount of Buzzfeed stock in the portfolio.

Definition



The volatility of the "returns", also called the risk, is the standard deviation of the returns.
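As an illustration of this definition, the following sketch computes the standard deviation of a small series of returns with Pandas and compares it to the 20% limit. The return values are made up for the example and do not come from the tutorial data.

```python
import pandas as pd

# Made-up returns for illustration; real values come from the report data.
returns = pd.Series([0.05, -0.10, 0.30, -0.25, 0.15])

# Pandas computes the sample standard deviation by default (ddof=1).
volatility = returns.std()

THRESHOLD = 0.2  # the 20% limit agreed by the risk officers
exceeds_limit = bool(volatility > THRESHOLD)

print(f"volatility={volatility:.4f}, exceeds limit: {exceeds_limit}")
```

Note that both Pandas `std` and PySpark `stddev` return the sample standard deviation, so the two statistics should be comparable on the same data.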

To monitor this metric, the reporting officer adds a new rule to the report_buzzfeed.csv data source. The rule fires an alert when the standard deviation of Intraday_Delta exceeds 20% (0.2).

How to Create the Rule

1️⃣ Go to the Kensu main page.

2️⃣ Click on Data Sources.


3️⃣ In the Logical Data Source Name column, select the data source to which you want to add a rule: report_buzzfeed.csv.


4️⃣ Hover the mouse over the Add rule button and click Add Min-Max.


5️⃣ Enter these parameters:

  • Statistic Name: Use the drop-down to select Intraday_Delta.std (for Pandas) or Intraday_Delta.stddev (for PySpark). You can also filter the values by typing the first letters in the text field.
  • Maximum valid value: Enter 0.2, meaning you want to be notified if the value exceeds 0.2.

After clicking OK, you will see the new rule in the Rules section.
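Conceptually, a Min-Max rule compares an observed statistic against optional lower and upper bounds. The sketch below shows that logic in plain Python. It is not Kensu's implementation: the `check_min_max` function and the observed value are hypothetical, and treating a value exactly equal to the bound as valid is an assumption based on the wording "exceeds 0.2".

```python
from typing import Optional


def check_min_max(value: float,
                  minimum: Optional[float] = None,
                  maximum: Optional[float] = None) -> bool:
    """Return True when the value lies inside the valid range.

    A bound left as None is not checked. Bounds are treated as
    inclusive here, which is an assumption of this sketch.
    """
    if minimum is not None and value < minimum:
        return False
    if maximum is not None and value > maximum:
        return False
    return True


# Rule from this page: only a maximum valid value of 0.2 is set.
observed = {"Intraday_Delta.std": 0.25}  # made-up observed statistic
is_valid = check_min_max(observed["Intraday_Delta.std"], maximum=0.2)

if not is_valid:
    print("ALERT: Intraday_Delta.std is above the maximum valid value 0.2")
```

Leaving the minimum unset, as in the rule created above, means that only values above the maximum trigger an alert.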

On the next page, we will trigger an alert for this rule. Then we will see how to use Kensu to find the root cause.

Updated 07 Nov 2022