Get model executables from path #379

Closed
micaeljtoliveira opened this issue Nov 6, 2023 · 18 comments · Fixed by #439

@micaeljtoliveira
Contributor

Recently we added the option to install ACCESS-OM3 with Spack. Usually the location where Spack installs the packages includes a hash in the naming scheme, which makes it a bit tedious to keep the full executable path up to date in the payu config.yaml file. On the other hand, we are using Spack to automatically create environment modules for OM3 that add the corresponding path to the OM3 binaries to the PATH environment variable. Therefore it would be nice if payu could get the executable from the path.

Basically, instead of doing:

exe: /g/data/ik11/spack/0.20.1/opt/linux-rocky8-cascadelake/intel-2021.6.0/access-om3-main-7cryrsy/bin/access-om3-MOM6-CICE6

modules:
  use:
      - /g/data/ik11/spack/0.20.1/modules/access-om3/0.3.0/linux-rocky8-cascadelake
  load:
      - access-om3

we would like to simply write:

exe: access-om3-MOM6-CICE6

modules:
  use:
      - /g/data/ik11/spack/0.20.1/modules/access-om3/0.3.0/linux-rocky8-cascadelake
  load:
      - access-om3
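As a rough illustration of what "getting the executable from the path" could look like, here is a minimal Python sketch. The function name `resolve_exe` and its `search_path` argument are hypothetical and are not taken from payu; the sketch only assumes that the modules have already been loaded, so that `$PATH` contains the OM3 bin directory.

```python
import os
import shutil


def resolve_exe(exe, search_path=None):
    """Return an absolute path for `exe` (hypothetical helper, not payu API).

    Absolute paths are returned unchanged. Bare names are looked up on
    `search_path`, which defaults to the current $PATH.
    """
    if os.path.isabs(exe):
        return exe
    found = shutil.which(exe, path=search_path)
    if found is None:
        raise FileNotFoundError(f"Executable {exe!r} not found on the search path")
    return os.path.abspath(found)
```

Keeping absolute paths untouched means existing configurations that spell out the full Spack path would keep working as before.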
@aidanheerdegen
Collaborator

aidanheerdegen commented Nov 8, 2023

I like this idea as it makes it easier for users to switch model versions, since there is no need to determine the (rather convoluted) paths to Spack-built executables.

Just out of interest, what is the reason to encode the configuration version in the module use path rather than the usual convention of defining in the call to module load?

e.g.

exe: access-om3-MOM6-CICE6

modules:
  use:
      - /g/data/ik11/spack/0.20.1/modules/access-om3/linux-rocky8-cascadelake
  load:
      - access-om3/0.3.0

@micaeljtoliveira
Contributor Author

Just out of interest, what is the reason to encode the configuration version in the module use path rather than the usual convention of defining in the call to module load?

This is because of how our Spack instance is currently set up. We are installing each version of OM3 in a different Spack environment (there are a couple of reasons for this, but we might stop doing it in the future). Then, because some versions have the same dependencies and because I didn't want to have the Spack hash in the module name, each environment needs a unique module path to install its modules. The simplest way to achieve that was to add the OM3 version to the module path. That explains why the version appears in the use statement. Regarding the load statement, one could actually specify the version, but since there's only one version of OM3 in that set of modules, there's no need.

@micaeljtoliveira
Contributor Author

I discussed this with @harshula yesterday and he raised a few valid concerns about picking stuff from $PATH. I therefore suggested the following alternative:

exe: $SPACK_ACCESS_OM3_ROOT/bin/access-om3-MOM6-CICE6

modules:
  use:
      - /g/data/ik11/spack/0.20.1/modules/access-om3/0.3.0/linux-rocky8-cascadelake
  load:
      - access-om3

This would work for the COSIMA spack instance, because each module defines an environment variable SPACK_{name}_ROOT that contains the installation prefix of the package. A similar solution could be adopted by ACCESS-NRI.

Currently the above proposal does not work because payu loads the modules after handling the executable, so the variable is not defined when needed. Might not be too hard to change this.
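To make the ordering problem concrete, here is a small sketch of expanding the variable in the exe string. The helper name `expand_exe` is hypothetical; the only assumption is that `env` holds the environment as it is after the modules have been loaded.

```python
import os
from string import Template


def expand_exe(exe, env=None):
    """Expand $VAR / ${VAR} references in `exe` (hypothetical helper).

    `env` should be the environment *after* modules are loaded; if a
    variable is still undefined, fail loudly instead of keeping "$VAR".
    """
    env = os.environ if env is None else env
    try:
        return Template(exe).substitute(env)
    except KeyError as err:
        raise RuntimeError(
            f"Variable ${err.args[0]} is not set; has the module been loaded?"
        ) from err
```

Calling this before the module load would raise, which is essentially the failure mode described above; calling it afterwards would yield the full installation path.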

@harshula

harshula commented Nov 9, 2023

Is it possible to require a module version? e.g. access-om3/x.y.z. Not requiring a version can cause unexpected problems for users that we'll have to debug.

@micaeljtoliveira
Contributor Author

Just be aware that if you check that there's a version, the version naming scheme might be a lot more complicated than 'x.y.z'. You also cannot assume that the module will always have the form {name}/{version}, as one could have written the above example as:

modules:
  use:
      - /g/data/ik11/spack/0.20.1/modules/access-om3/0.3.0/
  load:
      - linux-rocky8-cascadelake/access-om3/0.3.0
   

@aidanheerdegen
Collaborator

@micaeljtoliveira you make a good point about what constitutes a valid version.

Not sure that is solvable in a general sense in a useful way.

@harshula

harshula commented Nov 9, 2023

Wouldn't module avail contain the list of valid modules and versions?

@micaeljtoliveira
Contributor Author

Wouldn't module avail contain the list of valid modules and versions?

Using that information should make it easier to correctly match the version (or lack of it).

This actually made me realise there's another possible issue: it's possible that there are two modules with the same name/version in two different paths. This is not trivial to solve, as we cannot restrict payu to only use the paths set in the config.yaml file.
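A sketch of how the `module avail` information could be used to catch both a missing module and the duplicate case. It assumes the listing has already been turned into a mapping from module path to full module names; building that mapping from `module avail` output is not shown, and both function names are hypothetical.

```python
def find_module_matches(requested, available):
    """Return (module_path, full_name) pairs matching `requested`.

    `available` maps each module path to the full module names found
    there, e.g. {"/some/modules": ["access-om3/0.3.0"]} (assumed layout).
    A request matches a name exactly, or as a leading component when the
    version part is omitted.
    """
    matches = []
    for module_path, names in available.items():
        for full_name in names:
            if full_name == requested or full_name.startswith(requested + "/"):
                matches.append((module_path, full_name))
    return matches


def require_unique_module(requested, available):
    """Raise unless exactly one available module matches `requested`."""
    matches = find_module_matches(requested, available)
    if not matches:
        raise ValueError(f"No module matching {requested!r} is available")
    if len(matches) > 1:
        raise ValueError(f"Module {requested!r} is ambiguous: {matches}")
    return matches[0]
```

This does not parse the version at all, so it sidesteps the question of what a valid version looks like; it only reports when a request is not uniquely resolvable.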

With so many possible ways for users to make mistakes, I really think that enforcing reproducibility using the manifest information should be made the default, regardless of any failsafes that are added to payu.

@aidanheerdegen
Collaborator

This actually made me realise there's another possible issue: it's possible that there are two modules with the same name/version in two different paths. This is not trivial to solve, as we cannot restrict payu to only use the paths set in the config.yaml file.

Well... we could, by purging all module information and then using and loading only the paths and modules defined in config.yaml.

I haven't had long enough to decide what the downsides of that might be, but TBH it is quite appealing.

With so many possible ways for users to make mistakes, I really think that enforcing reproducibility using the manifest information should be made the default, regardless of any failsafes that are added to payu.

There is nothing to stop anyone from adding

reproduce:
  exe: True

to the config.yaml of an experiment. Combining this with some CI checks on your config repository, to ensure it is set to True, would cover publishing configs that guarantee executables are unchanged, e.g. https://github.com/COSIMA/cleanconfig
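Such a CI check could be as small as the following sketch. It assumes config.yaml has already been parsed into a dictionary by whatever YAML parser the CI uses; the function name `exe_reproduce_enabled` is hypothetical.

```python
def exe_reproduce_enabled(config):
    """Return True only if the parsed config sets `reproduce: exe: True`.

    `config` is assumed to be the dictionary obtained from parsing
    config.yaml; a missing or malformed `reproduce` section counts as
    not enabled.
    """
    reproduce = config.get("reproduce")
    if not isinstance(reproduce, dict):
        return False
    return reproduce.get("exe") is True
```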

@micaeljtoliveira
Contributor Author

Well... we could, by purging all module information and then using and loading only the paths and modules defined in config.yaml.

That would mean forcing users to explicitly set "use /apps/Modules/modulefiles" to have access to the MPI modules, no?

Still doesn't prevent the users from explicitly "using" two different module paths that include the same version of a module.

There is nothing to stop anyone from adding

True, and I'm very tempted to start doing it systematically. It is still worth considering doing it by default, as not all users will be aware of these issues.

@aidanheerdegen
Collaborator

That would mean forcing users to explicitly set "use /apps/Modules/modulefiles" to have access to the MPI modules, no?

Maybe not (thinking of the RPATH goodies in spack built executables), but potentially even worse it would require module load PBS to be able to re-submit itself.

Ok, axe that as an idea! :)

Still worth considering doing it by default, as not all users will be aware of these issues.

I created this issue to discuss this in more detail:

#383

Your input is welcome!

@aidanheerdegen
Collaborator

True and I'm very tempted to start doing it systematically.

FYI ACCESS-NRI/build-ci#119

@aekiss
Contributor

aekiss commented Jan 25, 2024

Would be nice to have this feature. In the meantime we'll manually keep the exe and module in sync, which is error-prone: COSIMA/access-om3#93

@aidanheerdegen
Collaborator

we'll manually keep the exe and module in sync, which is error-prone

Looks like I've come around to your way of thinking @aekiss

@aidanheerdegen
Collaborator

This has become a blocker for ACCESS-NRI/build-cd#62

So I think we want to prioritise this @jo-basevi

@jo-basevi jo-basevi self-assigned this May 1, 2024
@aidanheerdegen
Collaborator

aidanheerdegen commented May 2, 2024

I don't think the suggested solution from @micaeljtoliveira would currently work for ACCESS-OM2 models, as each component is installed as its own package, in a separate hierarchy.

As it stands, module load access-om2/XXXX.XX.X populates $PATH with the locations of all the bin directories where components are installed. This is mighty convenient. We could change the module configuration to populate a different and specific environment variable, e.g. $SPACK_BIN_PATH or similar.

@micaeljtoliveira has suggested another change that allows for more generic path generation

ACCESS-NRI/model-config-tests#10

but I don't know how it works with multi-package models, and we've decided to use the spack build hash to connect executables used in experiments to the build provenance for that executable.

I do like the idea of changing information (model version) in a single place with the module load. It means we have an API (modules) that can deal with multi-package models, or single-package models like ACCESS-OM3.

We're also planning to add config QA checks to make sure the paths defined in the exe.yaml manifest match the versions in the spack.location.json artefact produced as part of the build process.

@micaeljtoliveira
Contributor Author

@micaeljtoliveira has suggested another change that allows for more generic path generation

ACCESS-NRI/model-config-tests#10

but I don't know how it works with multi-package models, and we've decided to use the spack build hash to connect executables used in experiments to the build provenance for that executable.

I think one could get all the paths and spack hashes by inspecting the symlinks spack creates in the view's bin directory.
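A rough sketch of that inspection. It assumes each entry in the view's bin directory is a symlink to `<prefix>/bin/<exe>` and that the prefix directory name ends in `-<hash>`, as in the `access-om3-main-7cryrsy` path shown at the top of this issue; other Spack naming schemes would need a different rule. The function name `view_executables` is hypothetical.

```python
import os
from pathlib import Path


def view_executables(view_bin):
    """Map each symlinked executable in `view_bin` to (real_path, hash).

    Assumption: the symlink target is <prefix>/bin/<exe> and the prefix
    directory name ends with "-<hash>".
    """
    result = {}
    for entry in sorted(Path(view_bin).iterdir()):
        if not entry.is_symlink():
            continue  # only symlinks carry information about the real prefix
        target = Path(os.path.realpath(entry))
        prefix = target.parent.parent
        spack_hash = prefix.name.rsplit("-", 1)[-1]
        result[entry.name] = (str(target), spack_hash)
    return result
```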

@aidanheerdegen
Collaborator

I think one could get all the paths and spack hashes by inspecting the symlinks spack creates in the view's bin directory.

Nice. It may be worth doing this anyway, as a belt and braces approach, to allow for different ways to do the same thing.

@jo-basevi had a neat idea of inspecting the modules that are loaded to see what they add to the $PATH variable, and searching only those paths for matching binaries. I think this has a number of benefits:

  1. It only happens when there is a module load, so is much less likely to randomly affect experiments by picking up executables from odd paths.
  2. It limits the search scope to only paths added by loaded modules, so again, it is much less likely to pick up the wrong executable.
  3. It is quite flexible, and allows users (and CI automation) to easily switch between model versions.
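The idea can be sketched as follows, under the assumption that the value of `$PATH` is recorded before and after the module load. Both function names are hypothetical and this is not necessarily how payu implements it.

```python
import os
import shutil


def added_path_entries(path_before, path_after):
    """Return the $PATH entries present after the module load but not before."""
    before = set(path_before.split(os.pathsep))
    return [p for p in path_after.split(os.pathsep) if p and p not in before]


def find_in_module_paths(exe, path_before, path_after):
    """Search for `exe` only in directories added by the loaded modules.

    Returns the full path, or None if the modules added no directories
    or none of them contains the executable.
    """
    added = added_path_entries(path_before, path_after)
    if not added:
        return None
    return shutil.which(exe, path=os.pathsep.join(added))
```

Because directories already on `$PATH` before the load are excluded, an executable with the same name elsewhere in the user's environment would not be picked up.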
