I created this repository to track current trends in the development of computational models in toxicology, especially QSARs/QSPRs. It contains personal notes summarising the main ideas of recently published, peer-reviewed articles that serve as promising examples.
Quantitative structure-activity relationship (QSAR) and quantitative structure-property relationship (QSPR) models are mathematical models used to assess chemical-induced toxicity. They produce continuous predictions (regression, e.g., LD50) or discrete predictions (classification, binary or multi-class) from molecular descriptors. These descriptors describe the structure of the chemical and are either calculated computationally with a range of software or determined experimentally from the molecules themselves, e.g., physico-chemical or biochemical properties such as LogP. The model is built on the correlation between the variable of interest (the target toxicity) and the chemical structure and its associated properties.
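To make the idea concrete, here is a minimal sketch of a one-descriptor QSAR in plain Python. It is only an illustration: the LogP values, the toxicity values, the classification threshold, and the helper names (`fit_line`, `predict`, `classify`) are all hypothetical and do not come from any of the articles summarised here. It fits a linear regression between the descriptor and a continuous endpoint, then turns the continuous prediction into a binary class.

```python
# Hypothetical training data: one molecular descriptor (LogP) and a
# made-up continuous toxicity endpoint. Values are illustrative only.
logp = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
toxicity = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0]


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Continuous (regression) prediction for a descriptor value."""
    return slope * x + intercept


def classify(value, threshold):
    """Discrete (binary) prediction derived from a continuous one."""
    return "toxic" if value >= threshold else "non-toxic"


slope, intercept = fit_line(logp, toxicity)
predicted = predict(1.75, slope, intercept)
# The threshold of 3.0 is an arbitrary, assumed cut-off.
print(round(predicted, 2), classify(predicted, threshold=3.0))
```

Real QSAR models use many descriptors and one of the algorithms listed in the table below; the single-descriptor fit only shows the structure of the problem.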
Reference | LR | KNN | SVM | RF | NB | XGB | ANN | DNN | GCN | Endpoint(s) |
---|---|---|---|---|---|---|---|---|---|---|
Mansouri et al (2019) | Y | Y | Y | pKa | ||||||
Zaslavskyi et al (2019) | Y | Y | Y | Bioactivity | ||||||
Li et al (2020) | Y | Y | Y | Y | Acute toxicity | |||||
Xu et al (2020) | Y | Y | Y | Y | Y | 14 organs | ||||
Garcia de Lomana et al (2021) | Y | Y | Y | Y | Y | MIEs thyroid | ||||
Jain et al (2021) | Y | Y | Y | Acute toxicity | ||||||
Li et al (2021) | Y | Y | Y | Y | Y | Y | DILI | |||
Liu et al (2021) | Y | Y | Y | BBB | ||||||
Rathman et al (2021) | Y | Y | Y | Y | Y | DILI | ||||
Ulrich et al (2021) | Y | LogP | ||||||||
Zhoue et al (2021) | Y | Y | Y | Y | DIR |
LR: Linear Regression; KNN: k-Nearest Neighbors; SVM: Support Vector Machine; RF: Random Forest; NB: Naïve Bayes; XGB: eXtreme Gradient Boosting; ANN: Artificial Neural Network; DNN: Deep Neural Network; GCN: Graph Convolutional Network. Y: Yes. pKa = −log10(Ka), where Ka is the acid dissociation constant (also called the protonation or ionization constant). MIEs: Molecular Initiating Events. DILI: Drug-Induced Liver Injury. BBB: Blood-Brain Barrier. LogP: log Kow (Kow is the octanol–water partition coefficient). DIR: Drug-Induced Rhabdomyolysis.
- Multi-task modelling
- Batteries of in silico models and combinations of ML algorithms
- Ensemble learning (e.g., models trained on different fingerprints/descriptors)
- Combination of continuous regression with classification modeling approaches
- Consensus model (the predicted toxicity is estimated by taking an average of the predicted toxicities from each single model)
- Model interpretation and explainability including model benchmark (comparison with other models, datasets)
- The impact of the choice of molecular descriptors on model performance
- Sequential feature selection strategy
- Uncertainty quantification (e.g., using Dempster-Shafer decision theory)
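
As a small illustration of the consensus idea in the list above (averaging the predictions of the individual models), here is a hedged sketch in plain Python. The model names and predicted values are hypothetical, and the function names `consensus_regression` and `consensus_classification` are introduced only for this example. Averaging class probabilities and applying a 0.5 threshold is one common way to extend the same idea to classifiers; it is an assumption here, not a method taken from a specific article.

```python
def consensus_regression(predictions):
    """Consensus value: the mean of the single-model predictions."""
    values = list(predictions.values())
    return sum(values) / len(values)


def consensus_classification(probabilities, threshold=0.5):
    """Average the predicted probabilities, then apply a threshold.

    Returns the mean probability and the resulting binary call.
    """
    values = list(probabilities.values())
    mean_prob = sum(values) / len(values)
    return mean_prob, mean_prob >= threshold


# Hypothetical predicted toxicities for one chemical from three models.
single_model_predictions = {"RF": 2.0, "SVM": 2.5, "XGB": 3.0}
print(consensus_regression(single_model_predictions))

# Hypothetical predicted probabilities of the "toxic" class.
single_model_probabilities = {"RF": 0.75, "SVM": 0.5, "XGB": 0.25}
print(consensus_classification(single_model_probabilities))
```

A plain average weights every model equally; weighting by each model's validation performance is a frequent variation.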
Books:
- In Silico Toxicology: Principles and Applications
- Recent Advances in QSAR Studies: Methods and Applications
- In Silico Methods for Predicting Drug Toxicity
- The History of Alternative Test Methods in Toxicology
Other links: