Skip to content

maranasgroup/CatPred-DB

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

25 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

This repository contains the datasets and scripts pertaining to the publication

CatPred: A comprehensive framework for deep learning in vitro enzyme kinetic parameters kcat, Km and Ki

DOI

Citation CatPred biorxiv pre-print:
@article {Boorla2024.03.10.584340,
	author = {Veda Sheersh Boorla and Costas D. Maranas},
	title = {CatPred: A comprehensive framework for deep learning in vitro enzyme kinetic parameters kcat, Km and Ki},
	elocation-id = {2024.03.10.584340},
	year = {2024},
	doi = {10.1101/2024.03.10.584340},
	publisher = {Cold Spring Harbor Laboratory},
	URL = {https://www.biorxiv.org/content/early/2024/03/26/2024.03.10.584340},
	eprint = {https://www.biorxiv.org/content/early/2024/03/26/2024.03.10.584340.full.pdf},
	journal = {bioRxiv}
}
Table of contents

CatPred-DB datasets

Datasets referenced in the publication can be found in these directories

./datasets/
├── processed               		# processed datasets one each for kcat, Km and Ki
├── splits                  		# training/validation/test dataset splits for kcat, Km and Ki
├── all_natural_metabolite_names.json 		# compiled names of natural metabolites from BRENDA
├── metabolite_inchi_smiles_brenda_pubchem.tsv 	# compiled list of inchi and smiles strings for brenda molecules

CatPred-DB scripts

Custom scripts used for processing, splitting and analysis can be found in these directories

./scripts/
├── processing               	# for processing brenda/sabio datasets and merging them
├── pdb                  		# for creating files required for datasets including structural information (pdb)
├── add_pdb_paths.py 		# adds pdb paths to a dataset file
├── count_natural.py 		# counts natural vs. non-natural entries
├── create_pdbrecords.py 		# create pdb records files to go along with datasets required for training
├── make_splits.py 			# create train/valid/test splits from processed datasets

License

This source code is licensed under the MIT license found in the LICENSE file in the root directory of this source tree.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published