An extract-transform-load (ETL) pipeline collects the business data. The data are processed with exploratory data analysis (EDA) and predictive models to generate insights and forecasts. The results of the analyses are stored in a PostgreSQL database that feeds a Power BI dashboard.
Airflow automates the entire flow of data from the source to the Postgres database. The pipeline is scheduled to run monthly, at 6 AM on the first day of each month.
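A minimal sketch of what the DAG definition behind this schedule might look like. Only the DAG id (`sales_forecast_pipeline`) and the monthly 6 AM schedule come from this README; the task names, callables, and three-step extract/transform/load split are illustrative assumptions (the `schedule` parameter shown here is the Airflow 2.4+ spelling; older versions use `schedule_interval`):

```python
# Hypothetical sketch of the sales_forecast_pipeline DAG.
# Task names and callables are assumptions; the DAG id and the
# cron schedule (6 AM, first day of each month) come from the README.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    """Pull the raw business data from the source (placeholder)."""


def transform():
    """Run EDA / forecasting steps on the extracted data (placeholder)."""


def load():
    """Write results into the PostgreSQL database (placeholder)."""


with DAG(
    dag_id="sales_forecast_pipeline",
    schedule="0 6 1 * *",  # 06:00 on day 1 of every month
    start_date=datetime(2023, 1, 1),
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> transform_task >> load_task
```

This is a declarative pipeline definition rather than standalone code; it only runs inside an Airflow deployment such as the Docker setup described below.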
- Build the Docker image:

  ```shell
  docker-compose build
  ```

- Initialize Airflow:

  ```shell
  docker-compose up airflow-init
  ```

- Start the services:

  ```shell
  docker-compose up
  ```

- Open `localhost:8080` in your browser, enter the credentials, and look for the `sales_forecast_pipeline` DAG.
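Before opening the browser, it can be handy to confirm that the webserver is actually up. A small stdlib-only sketch that polls Airflow's `/health` endpoint (the base URL is the default from the steps above; the helper name is made up for this example):

```python
import urllib.error
import urllib.request


def airflow_is_up(base_url: str = "http://localhost:8080") -> bool:
    """Return True if the Airflow webserver answers its /health endpoint."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused / timeout: the webserver is not (yet) reachable.
        return False
```

If this returns `False` right after `docker-compose up`, the containers may simply still be starting; give them a minute and retry.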