Datagotchi Health is an algorithm designed to predict mental health outcomes based on lifestyle behaviors and provide recommendations for improving mental health.
Before you begin, ensure you have met the following requirements:
- Python 3.9
- R
- Poetry (to manage the Python virtual environment)
- Make (to run the Makefile)
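The requirements above can be checked before installing anything. The following is a minimal sketch, not part of the repository: the helper names `missing_tools` and `python_ok` are hypothetical, the executable names `Rscript`, `poetry`, and `make` are assumptions about how the tools are installed, and the exact match on Python 3.9 is an assumption (relax it if newer versions also work for you).

```python
import shutil
import sys

# Assumed executable names for R, Poetry, and Make; adjust to your installation.
REQUIRED_TOOLS = ("Rscript", "poetry", "make")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


def python_ok(version=None, required=(3, 9)):
    """Return True if the major.minor version equals `required`.

    Uses the running interpreter's version when `version` is not given.
    """
    version = sys.version_info if version is None else version
    return tuple(version[:2]) == tuple(required)
```

Calling `missing_tools()` returns a list of the tools that still need to be installed, and `python_ok()` reports whether the current interpreter is Python 3.9.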
To install the required dependencies, follow these steps:
- Clone the repository:
git clone https://github.com/yourusername/datagotchi-health.git
- Navigate to the project directory:
cd datagotchi-health
- Install the dependencies using Poetry:
poetry install
To get started with the project, follow these steps:
- Request access to the data by contacting the repository owner.
- Create a .env file in the project root directory with the DATA_PATH environment variable set to the location of your data:
DATA_PATH='path/to/data'
- Edit the config.py file to suit your specific needs.
- Run the ML pipeline:
- Step 1: Create features:
make create-features
- Step 2: Select features:
make select-features
- Step 3: Run the cross-validation:
make run-crossval
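The three pipeline steps can also be chained from Python so that a failure in one step stops the rest. This is a minimal sketch under stated assumptions, not the project's own tooling: `read_env_file` and `run_pipeline` are hypothetical helpers, the .env parser handles only simple KEY=VALUE lines, and whether the Makefile itself reads .env is not documented here, so the sketch passes DATA_PATH to each step explicitly.

```python
import os
import subprocess
from pathlib import Path

# Make targets listed in the steps above, in execution order.
PIPELINE_TARGETS = ("create-features", "select-features", "run-crossval")


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding single or double quotes from the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def run_pipeline(targets=PIPELINE_TARGETS, env_file=".env", runner=subprocess.run):
    """Run each Make target in order and stop at the first failure."""
    # Variables already set in the environment take precedence over .env.
    env = {**read_env_file(env_file), **os.environ}
    if "DATA_PATH" not in env:
        raise RuntimeError("DATA_PATH is not set in the environment or in .env")
    for target in targets:
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit, so later steps never run on stale outputs.
        runner(["make", target], check=True, env=env)
    return list(targets)
```

Running `run_pipeline()` from the project root is then equivalent to running the three make commands one after another. The `runner` parameter exists only so the function can be exercised without invoking Make.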
The repository is organized as follows:
code/
├── cleaning/
│   └── (various cleaning functions)
├── eda/
│   └── (exploratory data analysis scripts)
└── ml/
    └── (machine learning workflow from raw data to evaluated predictions)
- cleaning: Contains scripts for data cleaning.
- eda: Contains scripts for exploratory data analysis to explore and understand the data.
- ml: Contains scripts for the machine learning workflow, including data processing, model training, and evaluation.