This is the code I demonstrated during the talk I gave in the Molecular Simulations class (CHE596). These IPython notebooks are an example of an end-to-end ML workflow that includes the following steps:
- Data Collection (get_data.ipynb) & Curation (clean_data.ipynb)
- Feature Representation (generate_features.ipynb)
- Model Training & Evaluation (train_model.ipynb)
- Model Explanation (with_chemml.ipynb)
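The steps above can also be run in sequence from a script once the environment described below is set up. This is only a minimal sketch: it assumes the notebooks sit in the repository root, that they are meant to run in the order listed, and that `jupyter` with `nbconvert` is available in the environment (none of this is confirmed by the repository itself). The helper names `build_command` and `run_all` are made up for illustration.

```python
import subprocess
from pathlib import Path

# Notebook order copied from the workflow list above (assumed run order).
NOTEBOOKS = [
    "get_data.ipynb",
    "clean_data.ipynb",
    "generate_features.ipynb",
    "train_model.ipynb",
    "with_chemml.ipynb",  # needs ChemML's extra dependencies
]


def build_command(notebook: str) -> list[str]:
    """Return the nbconvert command that executes one notebook in place."""
    return [
        "jupyter", "nbconvert",
        "--to", "notebook",
        "--execute", "--inplace",
        notebook,
    ]


def run_all(notebooks=NOTEBOOKS, repo_dir: str = ".") -> None:
    """Execute each notebook in order, stopping at the first failure."""
    for name in notebooks:
        if not (Path(repo_dir) / name).exists():
            print(f"skipping {name}: not found in {repo_dir}")
            continue
        # check=True raises CalledProcessError if a notebook fails to run.
        subprocess.run(build_command(name), cwd=repo_dir, check=True)
```

Calling `run_all()` from the repository root executes every notebook it finds. Passing `NOTEBOOKS[:-1]` as the first argument skips the ChemML notebook if its dependencies are not installed.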
To run the notebooks, create a conda environment and install the necessary packages by following the steps below:
- Open a terminal.
- Create a new conda environment. Replace envname with the name you want to give to your environment:
conda create --name envname python=3.12
- Activate the conda environment:
conda activate envname
- Clone this repository and navigate to it:
cd path/to/your/project
- Install the necessary packages using pip:
pip install -r requirements.txt
- To run the 'with_chemml.ipynb' notebook, go to https://hachmannlab.github.io/chemml/#installation-and-dependencies to install ChemML's dependencies.
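After the install step, it can be useful to confirm that everything in requirements.txt actually landed in the active environment. The following is a rough sketch using only the Python standard library. The function names are hypothetical, and the parser only handles simple lines such as `name==version`. Editable installs, URLs, and environment markers are not handled properly.

```python
import re
from importlib import metadata
from pathlib import Path


def parse_requirement_names(text: str) -> list[str]:
    """Extract package names from simple requirements.txt-style text."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names: list[str]) -> list[str]:
    """Return the names that are not installed in the current environment."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


def check_requirements(path: str = "requirements.txt") -> list[str]:
    """Read a requirements file and report which packages are missing."""
    return missing_packages(parse_requirement_names(Path(path).read_text()))
```

Running `check_requirements()` from the repository root with the conda environment active returns an empty list when every listed package is installed.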
I would like to acknowledge Dr. Patrick Walters. Some of the code here is taken from his invaluable Practical Cheminformatics Tutorials (https://github.com/PatWalters/practical_cheminformatics_tutorials).