Financial data report
Using sample data from an asset management company, this example runs a data pipeline that prepares financial reports.
The reports include an inventory of key assets, such as stocks. Risk Officers use them to make decisions on the portfolio.
The pipeline uses Python programs.
Every month, the pipeline runs:
- data_ingestion.py reads data on all the stocks in the portfolio, joins the data frames, and stores the results in the master data. It uses pandas and Kensu-py.
- reporting.py, prepared by the Reporting Officer, extracts stock data and creates a new column, Intraday_delta, which holds the daily return for a single stock. This program runs with pandas, and you can also try it with pyspark.
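The two monthly steps above can be sketched with plain pandas. This is a minimal illustration, not the actual content of data_ingestion.py or reporting.py: the column names (Date, Symbol, Open, Close), the sample values, the use of `pd.concat` to combine the frames, and the definition of Intraday_delta as Close minus Open are all assumptions, and the Kensu-py instrumentation is left out.

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
apple = pd.DataFrame({
    "Date": ["2021-01-04", "2021-01-05"],
    "Symbol": ["AAPL", "AAPL"],
    "Open": [100.0, 101.0],
    "Close": [102.5, 99.5],
})
msft = pd.DataFrame({
    "Date": ["2021-01-04", "2021-01-05"],
    "Symbol": ["MSFT", "MSFT"],
    "Open": [200.0, 202.0],
    "Close": [201.0, 204.0],
})


def ingest(frames):
    """Stand-in for data_ingestion.py: combine per-stock frames into one master table."""
    return pd.concat(frames, ignore_index=True)


def report(master, symbol):
    """Stand-in for reporting.py: extract one stock and add Intraday_delta."""
    stock = master[master["Symbol"] == symbol].copy()
    # Assumed definition of the daily return: closing price minus opening price.
    stock["Intraday_delta"] = stock["Close"] - stock["Open"]
    return stock


master = ingest([apple, msft])
print(report(master, "AAPL"))
```

In the real pipeline the master data would be written to storage between the two steps. Here it is passed in memory to keep the sketch short.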
Here is the flow:
In this example, after having proceeded with the , you will:
- embedded in a Python pipeline. This includes:
- Data sources metadata
- Technical data lineage
- Data profiling (metrics)
- Add data monitoring rules:
- Use the technical data lineage to find the root cause of data issues flagged by a validation rule.
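To make the ideas in this list concrete, here is a rough sketch in plain pandas of what profiling metrics, a monitoring rule, and a lineage lookup could look like. It does not use or reproduce the Kensu-py API: the function names, the metric names, the file names, and the dictionary-based lineage are all hypothetical and chosen only for illustration.

```python
import pandas as pd


def profile(df):
    """Compute basic metrics: row count, plus null count and mean per numeric column."""
    metrics = {"nrows": len(df)}
    for col in df.select_dtypes("number").columns:
        metrics[f"{col}.nullrows"] = int(df[col].isna().sum())
        metrics[f"{col}.mean"] = float(df[col].mean())
    return metrics


def check_rule(metrics, name, max_value):
    """Hypothetical monitoring rule: pass when the named metric does not exceed max_value."""
    return metrics[name] <= max_value


# Hypothetical lineage: each data source maps to the sources it was derived from.
LINEAGE = {
    "report.csv": ["master.csv"],
    "master.csv": ["apple.csv", "msft.csv"],
}


def upstream(source, lineage):
    """List every source upstream of `source`, nearest first (breadth-first walk)."""
    seen, queue = [], list(lineage.get(source, []))
    while queue:
        node = queue.pop(0)
        if node not in seen:
            seen.append(node)
            queue.extend(lineage.get(node, []))
    return seen


report_df = pd.DataFrame({"Close": [1.0, None, 3.0]})
metrics = profile(report_df)
if not check_rule(metrics, "Close.nullrows", 0):
    # The rule failed on the report, so inspect the sources that feed it.
    print("Check upstream sources:", upstream("report.csv", LINEAGE))
```

When a rule fails on a report, walking the lineage from that report back to its inputs narrows down which upstream source to inspect first.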
Start with the