Incorrectly indexed interacting residues #148

Open
kaw97 opened this issue Dec 12, 2023 · 2 comments


kaw97 commented Dec 12, 2023

Bug description: not all interacting residues are indexed properly. When I run PLIP and check the indexing, most of them are correct. However, a few are incorrect. In some cases, this appears to be due to PLIP's numbering not taking into consideration sequences that start with "X".

Expected behavior: the positions of interacting residues should all agree with the indexed sequence of the chain.

To reproduce the bug, one can run PLIP, parse its output, and compare the interacting residues to the parent sequence. This was fairly involved, so I am providing the practice test data set I've been using and a Jupyter notebook with code that will take a test dataset and reproduce the error.

I think you should be able to just fill in the path for the practice data and hit "Run all", but I'm including the full notebook just in case. I've put the files on my Google Drive here: https://drive.google.com/drive/folders/1biFRoGwM9PeC_ZNMIAEek0ZNQAAREA4Q?usp=sharing
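The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code from the notebook: `find_mismatches`, its arguments, and the `first_resnum` parameter are hypothetical names, and the sketch assumes the residue numbers run contiguously from `first_resnum` with no insertion codes.

```python
# Hypothetical helper (not taken from the notebook): checks whether residues
# reported as (three-letter name, residue number) agree with a one-letter sequence.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def find_mismatches(sequence, residues, first_resnum=1):
    """Return (name, resnum, observed) for residues that disagree with the sequence.

    Assumes residue numbers are contiguous and that `first_resnum` is the
    number of the first residue in `sequence`.
    """
    mismatches = []
    for name, resnum in residues:
        index = resnum - first_resnum  # 0-based position in the sequence
        expected = THREE_TO_ONE.get(name.upper())
        observed = sequence[index] if 0 <= index < len(sequence) else None
        if expected is None or observed != expected:
            mismatches.append((name, resnum, observed))
    return mismatches
```

For example, `find_mismatches("MKQTISR", [("LYS", 2), ("GLY", 3)])` flags only the second entry, because position 3 of that toy sequence is Q rather than G.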

Thanks a bunch for PLIP! It's a massive help to my project.

@phosphoTig

I am facing the same issue. The indexing is off for most of the interactions when running PLIP (locally or on the web interface). When looking at the PLIP interface on Swissmodel, the indexing is correct. I'm unsure why this is happening, but it seems to be a recent issue. I really like PLIP, it is a wonderful tool, and I would really like to continue using it.

Contributor

kalinni commented Feb 29, 2024

Hi @kaw97

I had a quick look at one of your examples and I don't think this is an issue within PLIP. For PDB ID 6kjr, the sequence you obtain in your Colab starts with MGSSHHHHHHSS. Looking at the PDB file, the first three residues present are NPA.

The PDB file contains a full sequence starting with MGS… as well. Additionally, I find the following information:

REMARK 465 MISSING RESIDUES                                                     
REMARK 465 THE FOLLOWING RESIDUES WERE NOT LOCATED IN THE                       
REMARK 465 EXPERIMENT. (M=MODEL NUMBER; RES=RESIDUE NAME; C=CHAIN               
REMARK 465 IDENTIFIER; SSSEQ=SEQUENCE NUMBER; I=INSERTION CODE.)                
REMARK 465                                                                      
REMARK 465   M RES C SSSEQI                                                     
REMARK 465     MET A   -19                                                      
REMARK 465     GLY A   -18                                                      
REMARK 465     SER A   -17                                                      
REMARK 465     SER A   -16                                                      
REMARK 465     HIS A   -15                                                      
REMARK 465     HIS A   -14                                                      
REMARK 465     HIS A   -13                                                      
REMARK 465     HIS A   -12                                                      
REMARK 465     HIS A   -11                                                      
REMARK 465     HIS A   -10                                                      
REMARK 465     SER A    -9                                                      
REMARK 465     SER A    -8                                                      
REMARK 465     GLY A    -7                                                      
REMARK 465     LEU A    -6                                                      
REMARK 465     VAL A    -5                                                      
REMARK 465     PRO A    -4                                                      
REMARK 465     ARG A    -3                                                      
REMARK 465     GLY A    -2                                                      
REMARK 465     SER A    -1                                                      
REMARK 465     HIS A     0                                                      
REMARK 465     MET A     1                                                      
REMARK 465     LYS A     2                                                      
REMARK 465     GLN A     3                                                      
REMARK 465     THR A     4                                                      
REMARK 465     ILE A     5                                                      
REMARK 465     SER A     6                                                      
REMARK 465     HIS A   366                                                      
REMARK 465     HIS A   367                                                      
REMARK 465     HIS A   368                                                      
REMARK 465     HIS A   369                                                      
REMARK 465     HIS A   370                                                      
REMARK 465     HIS A   371   

I think your code does not account for these negative residue numbers. The Arg at position 166, which forms the salt bridge that your Colab code can't find, is present at that position in the PDB file.
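One way to account for this is sketched below. It is only an illustration: `parse_remark_465` and `sequence_index` are hypothetical helpers, not part of PLIP or the Colab code. The sketch assumes that the full sequence starts at the lowest residue number listed under REMARK 465 for the chain, that numbering is contiguous, and that there are no insertion codes (records whose number does not parse as an integer are skipped).

```python
def parse_remark_465(lines):
    """Return (residue name, chain, residue number) for each missing-residue record."""
    records = []
    for line in lines:
        if not line.startswith("REMARK 465"):
            continue
        tokens = line.split()
        if len(tokens) < 5:
            continue  # blank or free-text header lines
        name, chain, number = tokens[-3:]
        try:
            records.append((name, chain, int(number)))
        except ValueError:
            continue  # column header line, or a number with an insertion code
    return records


def sequence_index(resnum, first_resnum):
    """0-based index into the full sequence, assuming contiguous numbering."""
    return resnum - first_resnum


# A few of the records quoted above, used as example input.
example = [
    "REMARK 465   M RES C SSSEQI",
    "REMARK 465     MET A   -19",
    "REMARK 465     GLY A   -18",
    "REMARK 465     HIS A     0",
    "REMARK 465     MET A     1",
]
missing = parse_remark_465(example)
first_resnum = min(number for _, chain, number in missing if chain == "A")
print(first_resnum)                       # -19
print(sequence_index(166, first_resnum))  # 185
```

Under these assumptions, residue 166 of chain A sits at 0-based index 185 of the full sequence, not at index 165 as counting from 1 would suggest.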
