Promote PBS constants to optional parameters #121
Codecov Report
@@ Coverage Diff @@
## master #121 +/- ##
==========================================
- Coverage 88.44% 88.26% -0.19%
==========================================
Files 27 27
Lines 1385 1474 +89
==========================================
+ Hits 1225 1301 +76
- Misses 160 173 +13
08c04bb to 4d07515
Integration tests
#!/bin/bash
bench_example_dir='bench_example_test_multiprocessing_param'
rm -rf $bench_example_dir
git clone git@github.com:CABLE-LSM/bench_example.git $bench_example_dir
cd $bench_example_dir
git reset --hard 6287539e96fc8ef36dc578201fbf9847314147fb
sed -i 's/project:.*/project: tm70/g' config.yaml
sed -i 's/experiment:.*/experiment: AU-Tum/g' config.yaml
sed -i 's/ccc561\/v3.0-YP-changes/sb8430\/test-branch/g' config.yaml
echo "
fluxsite:
  multiprocess: False
" >> config.yaml
echo "
science_configurations:
  - cable:
      cable_user:
        CONSISTENCY_CHECK: False
" >> config.yaml
benchcab run -v
Job standard output:
#!/bin/bash
bench_example_dir='bench_example_test_pbs_params'
rm -rf $bench_example_dir
git clone git@github.com:CABLE-LSM/bench_example.git $bench_example_dir
cd $bench_example_dir
git reset --hard 6287539e96fc8ef36dc578201fbf9847314147fb
sed -i 's/project:.*/project: tm70/g' config.yaml
sed -i 's/ccc561\/v3.0-YP-changes/sb8430\/test-branch/g' config.yaml
echo "
fluxsite:
  pbs:
    ncpus: 20
    mem: 20GB
    walltime: 01:00:00
" >> config.yaml
benchcab run -v
Job summary:
Great work. Just thinking we might need options to differentiate between fluxsite and spatial.
benchcab/bench_config.py (outdated)
# the "pbs" key is optional
if "pbs" in config:
I think we will need 2 PBS keys: one for fluxsite and one for spatial as the resources are likely going to be quite different.
So should we rename "pbs" to "pbs_fluxsite"?
Yes, I agree. I'm thinking we go with this structure:
fluxsite:
  pbs:
    ncpus: 16
    ...
  multiprocessing: True
spatial:
  pbs:
    ncpus: 16
    ...
mainly because multiprocessing is probably specific to only fluxsite tests. My guess is we won't be running multiple cable MPI executables in parallel on a single PBS job like we do with the fluxsite case (e.g. if we use payu run to execute each spatial task, we would only be able to run one task per PBS job). Although it might be possible to run multiple MPI instances on a single PBS job (see here for example).
Yes, that looks good.
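A minimal sketch of how the nested structure agreed on in this thread might be validated, assuming the config file has already been loaded into a plain dict. The helper name `check_fluxsite_section`, the `PBS_KEY_TYPES` table, and the error messages are hypothetical and are not taken from benchcab's code. The boolean key is spelled `multiprocess`, as in the integration test config.

```python
from typing import Any

# Hypothetical table of expected types for the optional PBS keys.
PBS_KEY_TYPES = {"ncpus": int, "mem": str, "walltime": str}


def check_fluxsite_section(config: dict) -> None:
    """Raise TypeError if the optional 'fluxsite' section is malformed."""
    fluxsite: Any = config.get("fluxsite")
    if fluxsite is None:
        return  # the whole section is optional
    if not isinstance(fluxsite, dict):
        raise TypeError("The 'fluxsite' key must be a dictionary.")

    # the "multiprocess" key is optional
    if "multiprocess" in fluxsite and not isinstance(fluxsite["multiprocess"], bool):
        raise TypeError("The 'multiprocess' key must be a boolean.")

    # the "pbs" key is optional
    pbs = fluxsite.get("pbs", {})
    if not isinstance(pbs, dict):
        raise TypeError("The 'pbs' key must be a dictionary.")
    for key, expected in PBS_KEY_TYPES.items():
        if key not in pbs:
            continue
        value = pbs[key]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise TypeError(f"The '{key}' key must be of type {expected.__name__}.")
```

A `spatial` section could be checked the same way by passing the section name as a parameter; that is left out here to keep the sketch short.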
benchcab/bench_config.py (outdated)
# the "multiprocessing" key is optional
if "multiprocessing" in config and not isinstance(config["multiprocessing"], bool):
Again, do we need 2: one for fluxsite and one for spatial? Not sure here, so we could leave as is for now.
As per the reply above: multiprocessing is probably specific to only fluxsite tests. My guess is we won't be running multiple cable MPI executables in parallel on a single PBS job like we do with the fluxsite case (e.g. if we use payu run to execute each spatial task, we would only be able to run one task per PBS job). Although it might be possible to run multiple MPI instances on a single PBS job (see here for example).
benchcab/internal.py (outdated)
NCPUS = 18
MEM = "30GB"
WALL_TIME = "6:00:00"
DEFAULT_PBS: Any = {"ncpus": 18, "mem": "30GB", "walltime": "6:00:00", "storage": []}
Need 2, one for fluxsite and one for spatial.
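One way the "one for fluxsite and one for spatial" suggestion could look is sketched below, assuming per-test-type defaults that are overridden by whatever the user supplies under the matching `pbs` key. The nested `DEFAULT_PBS` layout and the `resolve_pbs` helper are hypothetical. The fluxsite values copy the `DEFAULT_PBS` line in the diff above, and the spatial values are placeholders only, since the thread does not settle on any.

```python
import copy

# Hypothetical per-test-type defaults. The "spatial" numbers are placeholders,
# not values chosen in this PR.
DEFAULT_PBS = {
    "fluxsite": {"ncpus": 18, "mem": "30GB", "walltime": "6:00:00", "storage": []},
    "spatial": {"ncpus": 18, "mem": "30GB", "walltime": "6:00:00", "storage": []},
}


def resolve_pbs(config: dict, test_type: str) -> dict:
    """Return the PBS settings for test_type, filling gaps from the defaults."""
    user_pbs = config.get(test_type, {}).get("pbs", {})
    # Deep-copy so callers cannot mutate the shared default "storage" list.
    defaults = copy.deepcopy(DEFAULT_PBS[test_type])
    # Later keys win, so user-supplied values override the defaults.
    return {**defaults, **user_pbs}
```

With the second integration test config, `resolve_pbs(config, "fluxsite")` would take `ncpus`, `mem` and `walltime` from the file and fall back to the default for `storage`.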
This change promotes PBS related constants to optional parameters in the configuration file so that PBS flags can be set at runtime.
This is useful in running multiple benchcab instances with different job parameters such as memory and the number of CPUs. This will also allow us to easily find an optimal number of CPUs to use to maximise performance.
This change also adds the ability to switch on and off multiprocessing at runtime via an optional parameter in the config file.
Fixes #104
Co-authored-by: Claire Carouge <[email protected]>
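To illustrate what "PBS flags set at runtime" could mean in practice, here is a rough sketch that turns a dict of job parameters into `#PBS` header lines for a job script. The function name `pbs_directives` is hypothetical, and the use of `-P` for the project and `-l storage=` with `+`-joined entries is an assumption about the target scheduler setup, not something confirmed by this PR.

```python
def pbs_directives(project: str, pbs: dict) -> str:
    """Build '#PBS' header lines from a dict of job parameters (sketch only)."""
    lines = [
        f"#PBS -P {project}",
        f"#PBS -l ncpus={pbs['ncpus']}",
        f"#PBS -l mem={pbs['mem']}",
        f"#PBS -l walltime={pbs['walltime']}",
    ]
    # Only emit a storage directive when at least one entry is given.
    if pbs.get("storage"):
        lines.append("#PBS -l storage=" + "+".join(pbs["storage"]))
    return "\n".join(lines)
```

Calling this with the values from the second integration test (`ncpus: 20`, `mem: 20GB`, `walltime: 01:00:00`) would produce one directive per parameter and no storage line.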
06a94be to d5ba907
d5ba907 to 03a724c
In config.yaml options, 'multiprocessing' should be 'multiprocess'.
Just a couple of suggestions for the documentation to specify that fluxsite and pbs are optional.