Commit

Merge pull request #475 from ACCESS-NRI/456-manifest-exe-true-bug
Refactor manifest logic

- Removed the ScanInputs manifest option.
- Symlinks to the work directory are no longer generated through the manifests when reproduce is True.
- Fixed a bug so that reproduce True picks up on new files.
- Stored manifests are now the source of truth for MD5 hashes (previously this was the case only for input manifests). If a calculated fast hash matches the fast hash in the stored manifest, the full hash from the stored manifest is reused, which avoids recalculating MD5 hashes unnecessarily.
jo-basevi committed Aug 22, 2024
2 parents fcd414b + 8fae6f9 commit 793447d
Showing 7 changed files with 235 additions and 372 deletions.
16 changes: 3 additions & 13 deletions docs/source/config.rst
@@ -251,28 +251,18 @@ section for details.
 reproducible experiments. The default value is the value of the global
 ``reproduce`` flag, which is set using a command line argument and
 defaults to *False*. These options **override** the global ``reproduce``
-flag. If set to *True* payu will refuse to run if the hashes in the
+flag. If set to *True* payu will refuse to run if the MD5 hashes in the
 relevant manifest do not match.
 
 ``exe`` (*Default: global reproduce flag*)
-Enforce executable reproducibility. If set to *True* will refuse to
-run if hashes do not match.
+Enforce executable reproducibility.
 
 ``input`` (*Default: global reproduce flag*)
-Enforce input file reproducibility. If set to *True* will refuse to
-run if hashes do no match. Will not search for any new files.
+Enforce input file reproducibility.
 
 ``restart`` (*Default: global reproduce flag*)
 Enforce restart file reproducibility.
 
-``scaninputs`` (*Default: True*)
-Scan input directories for new files. Set to *False* when reproduce input
-is *True*.
-
-If a manifest file is complete and it is desirable to not add spurious
-files to the manifest but allow existing files to change, setting this
-option to *False* would allow that behaviour.
-
 ``ignore`` (*Default: .\**):
 List of ``glob`` patterns which match files to ignore when scanning input
 directories. This is an array, so multiple patterns can be specified on
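The override behaviour described in the config.rst hunk above can be illustrated with a small sketch. This is not payu's implementation: the function names `resolve_reproduce` and `check_reproduce`, and the representation of a manifest as a plain path-to-MD5 dictionary, are assumptions made only for illustration.

```python
MANIFEST_NAMES = ("exe", "input", "restart")


def resolve_reproduce(global_reproduce, overrides):
    """Return the effective reproduce flag for each manifest.

    Hypothetical helper: a per-manifest option, when present,
    overrides the global reproduce flag.
    """
    return {name: overrides.get(name, global_reproduce)
            for name in MANIFEST_NAMES}


def check_reproduce(name, stored_md5, current_md5, reproduce):
    """Refuse to continue if reproduce is set and the MD5 hashes differ.

    Both arguments are assumed to be dicts mapping file path -> MD5 hash,
    so a new or missing path also counts as a mismatch.
    """
    if reproduce and stored_md5 != current_md5:
        raise RuntimeError(
            f"{name} manifest does not match stored MD5 hashes; refusing to run"
        )


# Global flag is False, but executable reproducibility is enforced.
flags = resolve_reproduce(False, {"exe": True})
```

Comparing whole dictionaries is one simple way to treat a newly added file as a mismatch, in line with the commit message's note that reproduce now picks up on new files.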
24 changes: 6 additions & 18 deletions docs/source/manifests.rst
@@ -71,24 +71,12 @@ for each model run.
 Manifest updates
 ----------------
 
-Each of the manifests is updated in a slightly different way which reflects
-the way the files are expected to change during an experiment.
-
-The executable manifest is recalculated each time the model is run.
-Executables are generally fairly small in size and number, so there is very
-little overhead calculating full MD5 hashes. This also means there is no
-need to check that exectutable paths are still correct and also any
-changes to executables are automatically included in the manifest.
-
-The restart manifest is also recalculated for every run as there is no expectation
-that restart (or pickup) files are ever the same between normal model runs.
-
-The input manifest changes relatively rarely and can often contain a small
-number of very large files. It is this combination that can cause a significant
-time overhead if full MD5 hashes have to be computed for every run. By using
-binhash, a fast change-sensitive hash, these time consuming hashes only
-need be computed when a change has been detected. So the slow md5 hashes
-are recalculated as little as possible.
+Each time the model is run, binhash for each filepath is recalculated
+and compared with stored manifest values. If a new filepath has been added,
+or the binhash differs from the stored value, the full MD5 hash is
+recalculated. By using binhash, a fast change-sensitive hash,
+these time consuming MD5 hashes only need be computed when a change has
+been detected. So the slow md5 hashes are recalculated as little as possible.
 
 Manifest options
 ----------------
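The update rule in the new "Manifest updates" text (recompute the fast hash every run, and recompute the full MD5 only for new or changed paths) can be sketched roughly as follows. This is a minimal illustration, not payu's code: `fast_hash` is a simplified stand-in for binhash (it hashes the file size and the first block of data, which is an assumption about what a "fast change-sensitive hash" might look like), and `update_manifest` and the dictionary layout are hypothetical.

```python
import hashlib
import os


def fast_hash(path, nbytes=1024 * 1024):
    """Simplified stand-in for binhash (assumption, not the real algorithm):
    hash the file size together with the first nbytes of content."""
    h = hashlib.md5()
    h.update(str(os.path.getsize(path)).encode())
    with open(path, "rb") as f:
        h.update(f.read(nbytes))
    return h.hexdigest()


def full_md5(path, chunk_size=1024 * 1024):
    """Full MD5 of the file, read in chunks to bound memory use."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def update_manifest(stored, paths):
    """Return (updated_manifest, recalculated_paths).

    stored maps path -> {"binhash": ..., "md5": ...}. The stored MD5 is
    reused whenever the freshly computed fast hash matches the stored one.
    """
    updated = {}
    recalculated = []
    for path in paths:
        quick = fast_hash(path)
        entry = stored.get(path)
        if entry is not None and entry["binhash"] == quick:
            # Unchanged according to the fast hash: trust the stored MD5.
            updated[path] = dict(entry)
        else:
            # New path or changed fast hash: pay for the full MD5.
            updated[path] = {"binhash": quick, "md5": full_md5(path)}
            recalculated.append(path)
    return updated, recalculated
```

Returning the list of recalculated paths alongside the manifest makes it easy to see, run by run, how often the expensive hash was actually needed.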
