Cannot get valid molecule from running sample.py for a long time #4

simmed00 · 2022-02-09T07:27:58Z

I read your paper, which looks great.
I tried to run sample.py according to the instructions, using sample.yml and the two model checkpoints downloaded from Google Drive. I waited a long time (about half an hour), and it still could not produce a valid molecule. It shows [Pool] Queue 300 | Finished 0 | Failed 55.

I tried to debug and found that most generated graphs do not pass the if statement "if data_next.status == STATUS_FINISHED:".
Even the few that do pass it fail during "rdmol = reconstruct_from_generated(data_next)", which throws an exception reported as "Ignoring, because reconstruction error encountered."

luost26 · 2022-02-09T07:42:50Z

Hi, can you provide the index of the failed test data?

simmed00 · 2022-02-09T08:48:59Z

Thank you for your prompt reply. I used the default index=0 and also tried index=1, but the result is the same.
I stepped through the code and found that the reconstruction fails at the line "rd_mol = convert_ob_mol_to_rd_mol(mol)".

simmed00 · 2022-02-09T09:01:33Z

I checked further downstream; it fails at "Chem.SanitizeMol(rd_mol,Chem.SANITIZE_ALL^Chem.SANITIZE_KEKULIZE)".
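For readers hitting the same sanitization failure, here is a minimal sketch of how the failing step could be inspected with RDKit. It is not code from this repository: the helper name `diagnose_sanitization` and the pentavalent-carbon test molecule are illustrative assumptions. Only the sanitize flags match the call quoted above.

```python
from rdkit import Chem

# Same flags as the failing call quoted above: everything except kekulization.
SANITIZE_OPS = Chem.SANITIZE_ALL ^ Chem.SANITIZE_KEKULIZE


def diagnose_sanitization(mol):
    """Hypothetical helper: report why sanitization fails without raising.

    Returns (failed_op, messages), where failed_op is the first sanitization
    step that failed (SANITIZE_NONE if all succeeded) and messages lists the
    chemistry problems RDKit detected.
    """
    problems = Chem.DetectChemistryProblems(mol, SANITIZE_OPS)
    messages = [f"{p.GetType()}: {p.Message()}" for p in problems]
    # Work on a copy so the caller's molecule is left untouched.
    failed_op = Chem.SanitizeMol(Chem.Mol(mol), SANITIZE_OPS, catchErrors=True)
    return failed_op, messages


if __name__ == "__main__":
    # Illustrative input only: a carbon with five bonds, parsed unsanitized.
    bad = Chem.MolFromSmiles("C(C)(C)(C)(C)C", sanitize=False)
    failed_op, messages = diagnose_sanitization(bad)
    print(failed_op)
    for line in messages:
        print(line)
```

Printing the detected problems for a few failed samples may show whether the failures share one cause (for example, consistently wrong valences after bond assignment) or are scattered.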

luost26 · 2022-02-09T09:04:30Z

Thank you for the information. I am looking into it.

jwelliavSG · 2022-05-16T07:20:53Z

Hi, I am having the exact same issue. Is this simply because the model takes a long time to sample valid molecules? I am getting 25 valid molecules after about 12 hours; 6590 were discarded because they were either duplicates or invalid.

jwelliavSG · 2022-05-16T07:21:26Z

Thanks in advance, and great paper! I learned a lot from it.

tdby · 2022-06-08T06:52:41Z

Hi, I am running into a similar issue.
I have run sample.py for a few days. Many samples_%d.pt files are created, but it always shows [Pool] Queue 300 | Finished 0 | Failed 0.
When I debugged it, I found that no sample passes the statement "if data_next.status == STATUS_FINISHED:".
I tried monitoring "y_frontier" from the function "get_next" in sample.py: the frontiers always exist, so the status never becomes "finished".
Could you tell me how to solve this? Thank you!

luost26 · 2022-06-08T07:01:24Z

Can you try another random seed?

Btw, please check out our latest work on SBDD (accepted to ICML 2022) here: https://github.com/pengxingang/Pocket2Mol
It is significantly faster and generates molecules of higher quality.

tdby · 2022-06-08T07:30:32Z

Thanks for your reply.
I have tried several seeds, both with the example "4yhj.pdb" in sample_for_pdb.py and with several different data_id values from "test_list.tsv" in sample.py. The result is the same, so I don't think the seed is the problem.

luost26 · 2022-06-08T07:40:40Z

What are the versions of RDKit and OpenBabel in your environment?
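One way to answer this is with a minimal sketch that uses only the standard library. It is not part of this repository. The distribution names tried below (`rdkit`, `rdkit-pypi`, `openbabel`, `openbabel-wheel`) are assumptions about how the packages may have been installed, and some conda installs may not register metadata at all; in that case `rdkit.__version__` or `conda list` is a reasonable fallback.

```python
from importlib import metadata

# Assumed distribution names; adjust to match how the packages were installed.
CANDIDATES = {
    "RDKit": ("rdkit", "rdkit-pypi"),
    "OpenBabel": ("openbabel", "openbabel-wheel"),
}


def installed_version(names):
    """Return the version of the first installed distribution in names, else None."""
    for name in names:
        try:
            return metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return None


def report():
    """Map each library label to its installed version, or None if not found."""
    return {label: installed_version(names) for label, names in CANDIDATES.items()}


if __name__ == "__main__":
    for label, version in report().items():
        print(f"{label}: {version or 'not found via package metadata'}")
```

Comparing this output against the versions pinned in the repository's environment instructions makes a mismatch easy to spot.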

tdby · 2022-06-08T07:48:35Z

RDKit==2022.03.3 and OpenBabel==3.1.1.
Indeed, these differ from your standard versions.
Thanks for your advice. I will change them and try again.

luost26 · 2022-06-08T07:51:20Z

Thanks!
Again, I strongly recommend that you check out our latest work here: https://github.com/pengxingang/Pocket2Mol

tdby · 2022-06-08T07:57:00Z

I will read it carefully. You and your team have been very helpful to me.

HaotianZhangAI4Science · 2022-09-27T12:08:40Z

I thought it was caused by OpenBabel under Windows.
