Financial data report
Using sample data from an asset management company, this example runs a data pipeline that prepares financial reports.
The financial reports include an inventory of key assets, including stocks. Those reports are used by Risk Officers to make decisions on the portfolio.
The pipeline uses Python programs.
Every month, the pipeline runs:
- data_ingestion.py reads data on all the stocks in the portfolio, joins the data frames, and stores the results in the master data. It uses pandas and Kensu-py.
- reporting.py, prepared by the Reporting Officer, extracts stock data and creates a new column, Intraday_delta, which holds the daily return for a single stock. This program runs with pandas, and you can also try it with pyspark.
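The two steps above can be sketched in plain pandas. This is a minimal illustration, not the actual data_ingestion.py or reporting.py: the column names (Date, Symbol, Open, Close), the sample values, the use of row-wise concatenation to combine the frames, and the definition of Intraday_delta as Close minus Open are all assumptions made for the example. The Kensu-py instrumentation is left out.

```python
import pandas as pd

# Hypothetical per-stock inputs; the real pipeline reads them from files.
aapl = pd.DataFrame({
    "Date": ["2021-01-04", "2021-01-05"],
    "Symbol": ["AAPL", "AAPL"],
    "Open": [100.0, 101.0],
    "Close": [102.5, 99.5],
})
msft = pd.DataFrame({
    "Date": ["2021-01-04", "2021-01-05"],
    "Symbol": ["MSFT", "MSFT"],
    "Open": [200.0, 201.0],
    "Close": [201.0, 203.0],
})


def ingest(frames):
    """Combine the per-stock frames into one master frame (assumed: row-wise concat)."""
    return pd.concat(frames, ignore_index=True)


def report(master, symbol):
    """Extract one stock and add Intraday_delta (assumed here to be Close - Open)."""
    stock = master.loc[master["Symbol"] == symbol].copy()
    stock["Intraday_delta"] = stock["Close"] - stock["Open"]
    return stock.reset_index(drop=True)


master = ingest([aapl, msft])
aapl_report = report(master, "AAPL")
print(aapl_report)
```

In the monthly run, the output of the ingestion step would be stored as the master data, and the reporting step would read it back rather than receive it in memory.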
Here is the flow:

Representation of the Financial data report pipeline including data observability
In this example, after completing the Installation and Configuration, you will:
- Generate data observability information embedded in a Python pipeline. This includes:
  - Data sources metadata
  - Technical data lineage
  - Data profiling (metrics)
- Add data monitoring rules:
  - From the UI
  - Programmatically
- Use the technical data lineage to find the root cause of data issues flagged by a validation rule.
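To make the ideas of data profiling and a validation rule concrete, here is a small sketch in plain pandas. It does not use the Kensu API: the functions profile and check_rules, the metric names, and the rule format (metric name, minimum, maximum) are hypothetical and only illustrate the concept.

```python
import pandas as pd


def profile(df):
    """Collect simple metrics: row count, plus mean and null count per numeric column."""
    metrics = {"nrows": len(df)}
    for col in df.select_dtypes(include="number").columns:
        metrics[f"{col}.mean"] = float(df[col].mean())
        metrics[f"{col}.nullrows"] = int(df[col].isna().sum())
    return metrics


def check_rules(metrics, rules):
    """Return the names of metrics that are missing or outside their [min, max] range."""
    violations = []
    for name, low, high in rules:
        value = metrics.get(name)
        if value is None or not (low <= value <= high):
            violations.append(name)
    return violations


# Hypothetical report output with one missing Intraday_delta value.
report_df = pd.DataFrame({
    "Symbol": ["AAPL", "AAPL", "AAPL"],
    "Intraday_delta": [2.5, None, -1.5],
})

# Hypothetical rules: at least one row, and no missing Intraday_delta values.
rules = [
    ("nrows", 1, float("inf")),
    ("Intraday_delta.nullrows", 0, 0),
]

metrics = profile(report_df)
violations = check_rules(metrics, rules)
print(metrics)
print(violations)
```

When a rule like this is violated, the same metrics computed on the upstream data sources help narrow down where the issue was introduced, which is where the technical data lineage comes in.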
Start with the Installation and configuration


Updated 10 May 2022