Consider use of fingerprint distance instead of StructureMatcher for comparison between generated and test
#39
Comments
Compare matminer featurizers with hash approach
Ok, how do we design the experiment? Potential fingerprints:
Then, look at what the distribution of pairwise distances looks like. How many pairs fall within the 1st-percentile distance, etc.?
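A minimal sketch of that experiment, assuming each structure has already been reduced to a fixed-length numeric fingerprint. The helper names (`pairwise_distances`, `percentile`, `fraction_within`) and the toy vectors are made up for illustration; real fingerprints would come from whichever featurizer is chosen.

```python
import math
from itertools import combinations


def pairwise_distances(fingerprints):
    """Euclidean distance for every unordered pair of fingerprint vectors."""
    # math.dist requires Python 3.8+
    return [math.dist(a, b) for a, b in combinations(fingerprints, 2)]


def percentile(values, q):
    """Nearest-rank percentile of a non-empty list, with q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def fraction_within(values, threshold):
    """Fraction of distances that are less than or equal to the threshold."""
    return sum(v <= threshold for v in values) / len(values)


# Toy 2-D "fingerprints" (hypothetical numbers, not real featurizer output)
fingerprints = [
    [0.0, 0.0],
    [3.0, 4.0],
    [6.0, 8.0],
    [0.0, 1.0],
]

dists = pairwise_distances(fingerprints)
cutoff = percentile(dists, 1)  # distance at the 1st percentile
print(f"{len(dists)} pairs, 1st-percentile distance = {cutoff:.3f}")
print(f"fraction of pairs within cutoff = {fraction_within(dists, cutoff):.3f}")
```

With only a handful of structures the 1st percentile is just the smallest distance; the statistic becomes informative once there are thousands of pairs.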
Found this in the CDVAE manuscript, which is right in line with what you mentioned previously:
They used Euclidean distances between Magpie feature vectors for the compositional distance and between CrystalNN fingerprints for the structural distance, and a "match" meant that both the compositional and structural (Euclidean) distances were lower than (somewhat) arbitrarily chosen thresholds. I lean towards using ElMD for the compositional distance via chem_wasserstein, and maybe Earth Mover's Distance for the CrystalNN fingerprint as well via dist-matrix. For now, for simplicity, maybe stick with CDVAE's implementation? If eventually we do go with
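The matching rule described above can be sketched roughly as follows. This is not CDVAE's code: `is_match` is a hypothetical helper, the two cutoff values are placeholders rather than thresholds taken from the manuscript, and the inputs are assumed to be plain numeric vectors (e.g. a Magpie feature vector and a CrystalNN fingerprint computed elsewhere).

```python
import math

# Placeholder thresholds for illustration only (assumed, not from the source)
COMP_CUTOFF = 10.0
STRUCT_CUTOFF = 0.4


def is_match(comp_a, comp_b, struct_a, struct_b,
             comp_cutoff=COMP_CUTOFF, struct_cutoff=STRUCT_CUTOFF):
    """Return True when both Euclidean distances are below their cutoffs."""
    comp_dist = math.dist(comp_a, comp_b)        # compositional distance
    struct_dist = math.dist(struct_a, struct_b)  # structural distance
    return comp_dist < comp_cutoff and struct_dist < struct_cutoff


# Toy vectors: close in composition and structure, so this pair matches
print(is_match([1.0, 2.0], [1.5, 2.0], [0.1, 0.2], [0.1, 0.3]))
```

Swapping in ElMD or Earth Mover's Distance later would only change how `comp_dist` and `struct_dist` are computed; the two-threshold test stays the same.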
Planning to implement loading of precomputed compositional and structural fingerprints from FigShare (still need to calculate and upload) to save time when computing the metric, since the structural fingerprinting can take a while. The fingerprints for generated structures will still need to be computed by the user, but that should take only a few minutes for 1000 structures. Signing off for now, though.
Can always circle back to this or create a new issue, but a CDVAE-style implementation of a coverage metric seems to be functional now.
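For reference, one way a coverage metric of this kind could be computed from precomputed distance matrices is sketched below. It is an assumption-laden illustration, not the implementation in this repository or in CDVAE: the `coverage` function, its argument layout (rows are generated structures, columns are test structures), and the rule that both distances must be below their cutoffs for the same pair are all choices made here for clarity.

```python
def coverage(comp_dists, struct_dists, comp_cutoff, struct_cutoff):
    """Return (recall, precision) coverage from two distance matrices.

    comp_dists[i][j] and struct_dists[i][j] hold the distance between
    generated structure i and test structure j.
    """
    n_gen = len(comp_dists)
    n_test = len(comp_dists[0])
    # A pair matches only if both distances are below their cutoffs
    match = [
        [
            comp_dists[i][j] < comp_cutoff and struct_dists[i][j] < struct_cutoff
            for j in range(n_test)
        ]
        for i in range(n_gen)
    ]
    # Recall: share of test structures matched by at least one generated one
    recall = sum(
        any(match[i][j] for i in range(n_gen)) for j in range(n_test)
    ) / n_test
    # Precision: share of generated structures matching at least one test one
    precision = sum(any(row) for row in match) / n_gen
    return recall, precision


# Toy matrices: 2 generated structures versus 3 test structures
comp = [[1.0, 20.0, 30.0], [25.0, 2.0, 40.0]]
struct = [[0.1, 0.9, 0.9], [0.9, 0.5, 0.9]]
recall, precision = coverage(comp, struct, comp_cutoff=10.0, struct_cutoff=0.4)
print(recall, precision)
```

Working from distance matrices keeps the metric independent of how the fingerprints were produced, which fits the plan of loading precomputed fingerprints for the test set.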
From CDVAE paper: