Jarvis data #54
Comments
Hi @Nokimann, which property/task did you try to reproduce, and how large a difference did you find?
I got an MAE of 0.029 (~600 epochs) on the unnormalized U0 data and 0.002 (~300 epochs) on the normalized data. The std of the QM9 U0 targets is ~10, i.e., one order of magnitude, so the gap between the two results is consistent with that.
You are right. We didn't multiply the corresponding MAEs by the std, but we should have. For some QM9 properties (those with std < 1) the ALIGNN model performance becomes better than currently reported, but for properties such as U0 it becomes worse. We are working on an erratum right now and will update the arXiv preprint as well as the README file soon. The performance on the JARVIS-DFT and MP datasets remains intact. We note that if we train for 1000 or so epochs, we can get the U0 MAE down to 0.014 eV. For reference, the std for QM9 tasks: Thanks @Nokimann for catching this mistake.
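The correction described above amounts to multiplying an MAE computed on standardized targets by the target's standard deviation to recover real units. A minimal sketch of that identity, using made-up target values (not actual QM9 data):

```python
import numpy as np

# Illustration only: targets standardized as y_norm = (y - mean) / std.
# The MAE in real units equals the MAE on standardized targets times std.
y_true = np.array([10.0, 12.0, 9.5, 11.0])   # hypothetical targets in eV
y_pred = np.array([10.1, 11.8, 9.7, 11.2])   # hypothetical predictions in eV

mean, std = y_true.mean(), y_true.std()
y_true_norm = (y_true - mean) / std
y_pred_norm = (y_pred - mean) / std

mae_norm = np.abs(y_true_norm - y_pred_norm).mean()
mae_ev = mae_norm * std  # multiply by std to recover eV

# Same value as the MAE computed directly in eV
assert np.isclose(mae_ev, np.abs(y_true - y_pred).mean())
```

This is why a standardized MAE of 0.002 on U0 (std ~10) corresponds to roughly 0.02 in real units.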
It seems like this impacts one of the main claims in the paper, but unfortunately there has been no update in the last month. The paper, README, and arXiv still show the wrong results. Would you have any update on the progress of fixing this?
@klicperajo We have updated the README file now with the 1000-epoch run and with the MAEs multiplied by the corresponding standard deviations. On a related point ( usnistgov/jarvis#202 (comment) ), I see that using datasets from different packages such as PyG or DGL might give you different graphs. Hence, we chose to learn directly from xyz/POSCAR files. Our goal is that after we train a model, a user can feed a POSCAR/xyz file to get predictions using pretrained.py, which might be possible but not too easy with PyG/DGL-based datasets.
That is great to see, thank you! Any progress on arXiv and npj? I was not suggesting to use the PyG or DGL datasets, but rather to provide the non-standardized data (in eV or similar). I have seen this mistake of reporting standardized error instead of real units several times now. We should make sure that the straightforward way of evaluation is the correct one. Otherwise this error will be repeated again.
@gasteigerjo Author correction is now available at: https://www.nature.com/articles/s41524-022-00913-5 |
Thank you for making the effort to amend these numbers! |
Thank you for your work on this efficient ML method for predicting properties of molecular systems.
However, I couldn't reproduce the paper.
I found that JARVIS ships the QM9 datasets with normalization applied, and I opened an issue in jarvis about it.
I tested ALIGNN, but cannot reproduce the results on the unnormalized QM9 data.
Only the normalized QM9 dataset provided by JARVIS reproduces the prediction values in the paper.