This notebook is an example of how to use payment data to classify payers. In this scenario we looked at several years of payments from tenants across several houses in California, with the aim of identifying the tenants who are at risk of not paying their rent.
The notebook showcases the typical stages of a classification workflow. First we load, clean and explore the raw data. We then create features that we can use for our analysis. We apply seven different models and compare them against each other by measuring their accuracy. Once we have chosen a model, we do a final calibration of the data used, in an attempt to get better results.
This is just an example of how to use pandas and scikit-learn in a quick classification challenge.
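To give a flavour of that workflow, here is a minimal sketch of the feature-building and model-comparison stages. It is not the notebook's code: the payment table is synthetic, the column names (`tenant_id`, `days_late`, `paid_fraction`) and the "at risk" label are hypothetical, and only three illustrative scikit-learn classifiers are compared rather than the seven models used in the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Synthetic payment records (hypothetical schema, NOT the project's data).
n_tenants, n_months = 200, 12
risk = rng.uniform(0.0, 1.0, size=n_tenants)  # hidden tendency to pay late
risk_per_row = np.repeat(risk, n_months)      # one value per payment row

payments = pd.DataFrame({
    "tenant_id": np.repeat(np.arange(n_tenants), n_months),
    "days_late": rng.poisson(lam=risk_per_row * 10),
})
# Riskier tenants occasionally pay only half of what is due.
underpaid = rng.uniform(size=len(payments)) < risk_per_row * 0.3
payments["paid_fraction"] = np.where(underpaid, 0.5, 1.0)

# Feature creation: collapse the payment history to one row per tenant.
features = payments.groupby("tenant_id").agg(
    mean_days_late=("days_late", "mean"),
    max_days_late=("days_late", "max"),
    share_late=("days_late", lambda s: (s > 0).mean()),
    mean_paid_fraction=("paid_fraction", "mean"),
)

# Assumed label for this sketch: tenants with a high hidden risk are "at risk".
labels = (risk > 0.6).astype(int)

# Candidate models (illustrative choices only).
models = {
    "logistic_regression": make_pipeline(StandardScaler(), LogisticRegression()),
    "decision_tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Compare the models on 5-fold cross-validated accuracy.
results = {
    name: cross_val_score(model, features, labels, cv=5, scoring="accuracy").mean()
    for name, model in models.items()
}
best = max(results, key=results.get)

for name, accuracy in results.items():
    print(f"{name}: {accuracy:.3f}")
print("best model:", best)
```

Cross-validated accuracy is used here so that every model is scored on data it was not fitted on. With real payment data the classes are likely to be imbalanced, in which case a metric other than accuracy may be worth considering.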
GitHub has an open issue with rendering .ipynb files, so it may not work for you. But don't fear, you can:
The work here was carried out by Ernesto Monroy, LiLib Koo, Ollie Mirnezemi and Victor Torrent as part of an assignment for Imperial Business School.
The intent of sharing the project is to show how to perform an analysis, not to provide a final solution that can be applied to your case. However, if you do find some of the code useful, go ahead and take it!
The MIT license applies to this project.