FM-v4 branch into main #752
Conversation
Replace BalancedBatchSampler's `force_balancing` and `throw_on_error` parameters with `on_error`
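A hedged sketch of how the consolidated parameter could look; the enum name, its values, and their mapping to the old flags are assumptions, not the actual fairchem API:

```python
from enum import Enum


# Hypothetical consolidation of the two removed flags into one knob.
class OnErrorHandling(Enum):
    raise_error = "raise"            # roughly the old throw_on_error=True
    warn_and_balance = "balance"     # roughly the old force_balancing=True
    warn_and_no_balance = "no_balance"
```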
Codecov Report: Attention: Patch coverage is
src/fairchem/core/common/utils.py (Outdated)
```diff
@@ -393,7 +402,11 @@ def create_dict_from_args(args: list, sep: str = "."):
     return return_dict


-def load_config(path: str, previous_includes: list | None = None):
+def load_config(
+    path: str, previous_includes: list | None = None, include_paths: list | None = None
```
Maybe comment on the difference between `path`, `previous_includes`, and `include_paths`?
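A hedged sketch of what such a docstring could look like; the parameter descriptions are assumptions inferred from the names, not confirmed behavior:

```python
def load_config(
    path: str, previous_includes: list | None = None, include_paths: list | None = None
):
    """Load a YAML config file, resolving any configs it includes.

    Args:
        path: path of the config file to load.
        previous_includes: include files already visited higher up the
            include chain, used to detect circular includes (assumption).
        include_paths: extra directories to search when resolving
            relative include paths (assumption).
    """
    ...
```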
Also optional: add a test on a yml with include paths that don't exist (see the sketch below); can also make a task for BE :)
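A hypothetical pytest sketch of that test; the `includes` key syntax and the raised exception type are assumptions about `load_config`'s behavior:

```python
import pytest

from fairchem.core.common.utils import load_config


def test_load_config_with_missing_include(tmp_path):
    # A config that includes a file which does not exist (assumed syntax).
    cfg = tmp_path / "config.yml"
    cfg.write_text("includes: [does_not_exist.yml]\n")
    # The exact exception type is an assumption; pin it down to whatever
    # load_config actually raises for a missing include.
    with pytest.raises(Exception):
        load_config(str(cfg))
```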
src/fairchem/core/common/utils.py (Outdated)
```diff
@@ -999,34 +1024,70 @@ class _TrainingContext:
         task_name = "s2ef"
     elif trainer_name in ["energy", "equiformerv2_energy"]:
         task_name = "is2re"
+    elif "multitask" in trainer_name.lower():
```
Can we get rid of this `.lower()` and make it exact? The case-insensitive substring match makes it ambiguous how to specify the trainer name.
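A minimal sketch of the exact-match alternative; `"multitask"` as the canonical trainer name and its `task_name` value are assumptions:

```python
trainer_name = "multitask"  # example value

if trainer_name in ["energy", "equiformerv2_energy"]:
    task_name = "is2re"
elif trainer_name == "multitask":  # exact match, no .lower() substring test
    task_name = "multitask"        # assumed value
```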
src/fairchem/core/common/utils.py (Outdated)
```diff
             raise RuntimeError(
                 f"Required key missing from config: {missing_keys!s}"
             )
+            trainer = trainer_cls(
```
Can we move these into a dict and pass them in via `trainer_cls(**kwargs)`, adding the arguments that differ manually for multitask? Then we can specify the arguments common to the trainer and the multitask trainer once, without copying.
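A minimal sketch of the suggested pattern; the argument names are illustrative assumptions, not the actual trainer signatures:

```python
# Specify the arguments shared by all trainers exactly once.
common_kwargs = {
    "task": config["task"],
    "model": config["model"],
    "optimizer": config["optim"],
    "identifier": config["cmd"]["identifier"],
}

# Add only the arguments that differ for the multitask trainer.
if task_name == "multitask":
    common_kwargs["dataset_configs"] = config["datasets"]  # hypothetical

trainer = trainer_cls(**common_kwargs)
```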
```diff
@@ -13,7 +13,9 @@
 from torch_geometric.data import Data


-def rename_data_object_keys(data_object: Data, key_mapping: dict[str, str]) -> Data:
+def rename_data_object_keys(
```
Add a comment on what this does.
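A possible docstring, sketched from the function name alone; the behavior description is an assumption:

```python
from torch_geometric.data import Data


def rename_data_object_keys(data_object: Data, key_mapping: dict[str, str]) -> Data:
    """Rename keys of a torch_geometric Data object.

    Args:
        data_object: the Data object whose keys should be renamed.
        key_mapping: maps each existing key name to its new name.

    Returns:
        The Data object with its keys renamed (assumption).
    """
    ...
```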
```diff
@@ -571,6 +575,9 @@ def load_checkpoint(
         self.step = checkpoint.get("step", 0)
         self.best_val_metric = checkpoint.get("best_val_metric", None)
         self.primary_metric = checkpoint.get("primary_metric", None)
+        self.config["cmd"]["parent"] = checkpoint["config"]["cmd"].get(
```
Where is this `parent` actually used?
I don't think it's read anywhere; I believe it was intended to link fine-tuning runs to their parent runs. @mshuaibii is this right?
Let's remove it if it's not used.
Yeah, it was introduced to organize things in wandb by this entry. But you guys probably have a different solution at this point, so omitting it is fine.
```diff
+from contextlib import suppress
+
+with suppress(ImportError):
+    from fairchem.experimental.foundation_models.multi_task_dataloader.transforms.data_object import *  # noqa
```
Add a comment explaining this.
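A possible comment for the block, as requested; the stated intent is an assumption:

```python
from contextlib import suppress

# The experimental foundation-models package is not installed in every
# environment; skip registering its data-object transforms rather than
# failing at import time (assumed intent).
with suppress(ImportError):
    from fairchem.experimental.foundation_models.multi_task_dataloader.transforms.data_object import *  # noqa
```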
```diff
@@ -232,6 +234,8 @@ def load(self) -> None:
         self.load_loss()
         self.load_optimizer()
         self.load_extras()
+        if self.config["optim"].get("load_datasets_and_model_then_exit", False):
```
Why is this needed if this is the last line of the init anyway?
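For context, a hedged guess at what the guarded branch does; the `sys.exit(0)` body is an assumption, since the diff truncates before it:

```python
import sys


class Trainer:
    def __init__(self, config):
        self.config = config
        self.load()

    def load(self) -> None:
        # ... dataset/model loading elided ...
        # If this really is the last statement of load(), and load() is the
        # last call in __init__, exiting here only differs from returning
        # when the caller does more work after constructing the trainer.
        if self.config["optim"].get("load_datasets_and_model_then_exit", False):
            sys.exit(0)
```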
Merging a long-running branch called fm-v4 that was created to support quick changes to main for FM work.