wild type sequence for stability dataset #131

Open
lzhangUT opened this issue Aug 15, 2022 · 1 comment

@lzhangUT

Hi,
I am interested in using your TAPE data for deep learning, but we would like to generate a PDB file from the wild type sequence first. We worked out that the wild type sequence for the fluorescence data is the one with num_mutation=0, but we couldn't identify the wild type sequence for the stability data: when we looked into it, several sequences had stability_score=1.
Would you mind sharing the wild type sequence for the stability data?
Thank you.

@agitter
agitter commented Jan 17, 2023

I've been looking into the original data files from the Rocklin 2017 stability paper, and the wild type sequences in the saturation mutagenesis (ssm2) experiment are clearly indicated there. Their entries in the name column of the Rocklin file are:

EEHEE_rd3_0037.pdb
EEHEE_rd3_1498.pdb
EEHEE_rd3_1702.pdb
EEHEE_rd3_1716.pdb
EHEE_0882.pdb
EHEE_rd2_0005.pdb
EHEE_rd3_0015.pdb
HEEH_rd2_0779.pdb
HEEH_rd3_0223.pdb
HEEH_rd3_0726.pdb
HEEH_rd3_0872.pdb
HHH_0142.pdb
HHH_rd2_0134.pdb
HHH_rd3_0138.pdb
Pin1
hYAP65
villin

If you want to find them in stability_test.json in the TAPE data, match those to the id entries or look for id entries that don't have an additional underscore specifying the mutation. For instance, EEHEE_rd3_0037.pdb is a wild type instance but EEHEE_rd3_0037.pdb_A19D is a mutation.
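As a rough sketch of that matching step, the snippet below filters records by exact match against the wild type names listed above. It assumes, without having checked the file, that stability_test.json parses to a top-level list of objects that each carry a string `id` field; `find_wild_types` and `load_records` are hypothetical helper names, not part of TAPE.

```python
import json

# Wild type names copied from the list above
# (the name column of the Rocklin file).
WILD_TYPE_NAMES = {
    "EEHEE_rd3_0037.pdb", "EEHEE_rd3_1498.pdb", "EEHEE_rd3_1702.pdb",
    "EEHEE_rd3_1716.pdb", "EHEE_0882.pdb", "EHEE_rd2_0005.pdb",
    "EHEE_rd3_0015.pdb", "HEEH_rd2_0779.pdb", "HEEH_rd3_0223.pdb",
    "HEEH_rd3_0726.pdb", "HEEH_rd3_0872.pdb", "HHH_0142.pdb",
    "HHH_rd2_0134.pdb", "HHH_rd3_0138.pdb", "Pin1", "hYAP65", "villin",
}


def find_wild_types(records):
    """Return the records whose id exactly matches a wild type name.

    Assumption: each record is a dict with a string "id" key.
    Mutant ids such as "EEHEE_rd3_0037.pdb_A19D" do not match exactly,
    so they are filtered out.
    """
    return [record for record in records if record.get("id") in WILD_TYPE_NAMES]


def load_records(path):
    """Load the JSON file (assumed to hold a top-level list of records)."""
    with open(path) as handle:
        return json.load(handle)
```

If the file does have that layout, `find_wild_types(load_records("stability_test.json"))` would return the wild type entries. Exact matching against the known names is used here instead of counting underscores, since the wild type names themselves contain underscores.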
