Merge pull request #12 from vub-hpc/2021b
update for Hortense and CESM-deps/2-foss-2021b
wpoely86 authored Jun 22, 2022
2 parents 1d4b0c8 + 4206485 commit c16c2e1
Showing 6 changed files with 246 additions and 26 deletions.
80 changes: 65 additions & 15 deletions README.md
@@ -63,23 +63,28 @@ update-cesm-machines /path/to/cesm-x.y.z/cime/config/cesm/machines/ /path/to/ces
```

All three machine files (*machines*, *compilers* and *batch*) have to be
updated to get a working installation of CESM/CIME in the VSC clusters.

### Hydra

* The only module needed is `CESM-deps`. It has to be loaded at all times, from
cloning of the sources to case submission.

* Using `--machine hydra` is optional as long as the user is on a compute node
or a login node in Hydra. CESM uses `NODENAME_REGEX` in `config_machines.xml`
to identify the host machine.

* Users have two options for `--compiler` on creation of new cases. One is
`gnu`, based on the GNU Compiler Collection. The other is `intel`, based on the
Intel compilers. The versions of each compiler are described in the
[easyconfigs](#easyconfigs) of each CESM-deps module (see the sketch after
this list).

* There is a single configuration for both compilers that is tailored to nodes
with Skylake CPUs, including the login nodes.

* CESM is not capable of detecting and automatically adding the required
libraries to its build process. The current specification of `SLIBS` contains
just what we found to be required (so far).

* By design, CESM sets a specific queue with `-q queue_name`, otherwise it
fails to even create the case. In Hydra we use the partition `skylake_mpi` as
@@ -108,6 +113,32 @@ updated to get a working installation of CESM/CIME in Hydra or Breniac.
will route the job to `q1h`, `q24h` or `q72h` depending on the walltime
requested.
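
A minimal sketch of case creation with either compiler on Hydra; the case
name, compset and resolution are placeholders, only the `--machine` and
`--compiler` flags and the queue variables come from this configuration:

```
# create a case with the GNU-based toolchain (compset/resolution are placeholders)
./create_newcase --case mycase --compset FHIST --res f09_f09_mg17 \
    --machine hydra --compiler gnu

# adjust the queue and walltime of the run job afterwards (values are illustrative)
cd mycase
./xmlchange JOB_QUEUE=skylake_mpi --subgroup case.run
./xmlchange JOB_WALLCLOCK_TIME=24:00:00 --subgroup case.run
```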

### Hortense

* The only module needed to create, set up and build cases is `CESM-deps`.
It has to be loaded at all times, from cloning of the sources to case
submission. CESM will also load `vsc-mympirun` at runtime to be able to
use MPI.

* Using `--machine hortense` is optional as long as the user is on a non-GPU
compute node or a login node in Hortense. CESM uses `NODENAME_REGEX` in
`config_machines.xml` to identify the host machine.

* Cases are submitted with Slurm's `sbatch`, as CESM is not compatible with
[jobcli](https://github.com/hpcugent/jobcli).

* By default all cases run in the `cpu_rome` partition. Optionally, cases can
also be submitted to `cpu_rome_512`, which has the high-memory nodes.

* Downloading input data from the default FTP servers of UCAR is not possible.
Input data has to be manually downloaded until an alternative method is set
up for Hortense.

* The recommended workflow is to create the case as usual, set up and build the
case on the compute nodes with the job script
[case.setupbuild.slurm](scripts/case.setupbuild.slurm), and then submit the
case as usual with `case.submit` (see the sketch after this list).
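
A minimal sketch of this workflow on Hortense; the compset, resolution,
project account and input-data source are placeholders:

```
# create the case (compset/resolution are placeholders)
cd cime/scripts
./create_newcase --case mycase --compset FHIST --res f09_f09_mg17 \
    --machine hortense --compiler gnu --project my_project

# stage manually downloaded input data under DIN_LOC_ROOT
rsync -av datahost:/path/to/inputdata/ $VSC_SCRATCH/cesm/inputdata/

# set up and build on a compute node, then submit as usual
cd mycase
sbatch -A my_project $EBROOTCESMMINDEPS/scripts/case.setupbuild.slurm
./case.submit
```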

## File structure

All paths defined in `config_machines.xml` are below `$VSC_SCRATCH`. These
@@ -218,32 +249,51 @@ each job. This can cause problems in a heterogeneous environment as not all
nodes might provide the same hardware features as the login nodes.

The example job scripts in [cesm-config/scripts](scripts) solve this problem
by executing these steps in the compute nodes of the cluster. In this way, the
compilation can be optimized to the host machine, simplifying the configuration,
and the user does not have to worry about where the case is built and where it
is executed.

* [case.slurm](scripts/case.slurm): performs setup, build and execution of the
case

* [case.setupbuild.slurm](scripts/case.setupbuild.slurm): performs setup and
build of the case, then the user can use `case.submit` as usual
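
For instance, a freshly created case can be set up, built and run with a
single job; a sketch, where the case path is a placeholder and the script
ships with the `CESM-deps` module in `$EBROOTCESMMINDEPS/scripts`:

```
cd /path/to/my_case
sbatch $EBROOTCESMMINDEPS/scripts/case.slurm
```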

## Easyconfigs

### CESM-deps

Loads all dependencies to build and run CESM cases.

* [CESM-deps-2-intel-2019b.eb](easyconfigs/CESM-deps/CESM-deps-2-intel-2019b.eb):

* only option in the two Breniac clusters

* used in Hydra with `--compiler=intel`

* [CESM-deps-2-foss-2021b.eb](easyconfigs/CESM-deps/CESM-deps-2-foss-2021b.eb):

* only option in Hortense

* used in Hydra with `--compiler=gnu`

Our easyconfigs of CESM-deps are based on those available in
[EasyBuild](https://github.com/easybuilders/easybuild-easyconfigs/tree/master/easybuild/easyconfigs/c/CESM-deps).
However, the CESM-deps module in Hydra and Breniac also contains the
configuration files and scripts from this repository, which are located in the
installation directory (`$EBROOTCESMMINDEPS`). Hence, our users have direct
access to these files once `CESM-deps` is loaded. The usage instructions of our
CESM-deps modules also provide a minimum set of instructions to create cases
with these configuration files.

### CESM-tools

Loads software commonly used to analyse the results of the simulations.

* [CESM-tools-2-foss-2019a.eb](easyconfigs/CESM-tools/CESM-tools-2-foss-2019a.eb):

* available in Hydra

## CPRNC

50 changes: 50 additions & 0 deletions easyconfigs/CESM-deps/CESM-deps-2-foss-2021b.eb
@@ -0,0 +1,50 @@
easyblock = 'Bundle'

name = 'CESM-deps'
version = '2'

homepage = 'https://www.cesm.ucar.edu/models/cesm2/'
description = """CESM is a fully-coupled, community, global climate model that
provides state-of-the-art computer simulations of the Earth's past, present,
and future climate states."""

toolchain = {'name': 'foss', 'version': '2021b'}

dependencies = [
    ('CMake', '3.22.1'),
    ('Python', '3.9.6'),
    ('lxml', '4.6.3'),
    ('Perl', '5.34.0'),
    ('XML-LibXML', '2.0207'),
    ('ESMF', '8.2.0'),
    ('netCDF', '4.8.1'),
    ('netCDF-Fortran', '4.5.3'),
    ('netCDF-C++4', '4.3.1'),
    ('PnetCDF', '1.12.3'),
    ('Subversion', '1.14.1'),
    ('git', '2.33.1', '-nodocs'),
    ('git-lfs', '3.2.0', '', True),
]

components = [
    # install extra configuration tools for CESM
    ('cesm-config', '1.6.0', {
        'easyblock': 'Tarball',
        'source_urls': ['https://github.com/vub-hpc/%(name)s/archive'],
        'sources': [{'download_filename': 'v%(version)s.tar.gz', 'filename': SOURCE_TAR_GZ}],
        'start_dir': '%(name)s-%(version)s',
    }),
]

sanity_check_paths = {
    'files': ['bin/update-cesm-machines', 'scripts/case.pbs', 'scripts/case.slurm'],
    'dirs': ['machines', 'irods'],
}

usage = """Environment to build and run CESM v2 simulations
1. Download a release of CESM v2: `git clone -b release-cesm2.2.0 https://github.com/ESCOMP/cesm.git cesm-2.2.0`
2. Add external programs for CESM: `cd cesm-2.2.0; ./manage_externals/checkout_externals`
3. Update config files: `update-cesm-machines cime/config/cesm/machines/ $EBROOTCESMMINDEPS/machines/`
4. Create case: `cd cime/scripts && ./create_newcase --machine ...`"""

moduleclass = 'geo'
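
Assuming a working EasyBuild setup, the bundle can be installed with
dependency resolution and loaded as a module; exact options depend on the
local EasyBuild configuration:

```
eb --robot CESM-deps-2-foss-2021b.eb
module load CESM-deps/2-foss-2021b
```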
14 changes: 14 additions & 0 deletions machines/config_batch.xml
@@ -49,6 +49,20 @@ Author: Alex Domingo (Vrije Universiteit Brussel)
    </queues>
  </batch_system>

  <batch_system MACH="hortense" type="slurm">
    <batch_submit>sbatch</batch_submit>
    <jobid_pattern>job (\d+)</jobid_pattern>
    <submit_args>
      <arg flag="-t" name="$JOB_WALLCLOCK_TIME"/>
      <arg flag="-p" name="$JOB_QUEUE"/>
      <arg flag="-A" name="$PROJECT"/>
    </submit_args>
    <queues>
      <queue walltimemin="00:15:00" walltimemax="72:00:00" nodemax="256" default="true">cpu_rome</queue>
      <queue walltimemin="00:15:00" walltimemax="72:00:00" nodemax="42">cpu_rome_512</queue>
    </queues>
  </batch_system>
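
With these `submit_args`, `case.submit` effectively issues an `sbatch` command
along these lines (a sketch; the account name is a placeholder and the actual
run script is generated by CIME):

```
sbatch -t 72:00:00 -p cpu_rome -A my_project case.run
```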

<batch_system MACH="breniac" type="pbs">
<directives>
<directive>-l nodes={{ num_nodes }}:ppn={{ tasks_per_node }}</directive>
51 changes: 49 additions & 2 deletions machines/config_compilers.xml
@@ -37,6 +37,32 @@ Author: Alex Domingo (Vrije Universiteit Brussel)

<config_compilers>

  <compiler MACH="hydra">
    <ESMF_LIBDIR>$ENV{EBROOTESMF}/lib</ESMF_LIBDIR>
    <NETCDF_PATH>$ENV{EBROOTNETCDF}</NETCDF_PATH>
    <NETCDF_C_PATH>$ENV{EBROOTNETCDFMINCPLUSPLUS4}</NETCDF_C_PATH>
    <NETCDF_FORTRAN_PATH>$ENV{EBROOTNETCDFMINFORTRAN}</NETCDF_FORTRAN_PATH>
    <PNETCDF_PATH>$ENV{EBROOTPNETCDF}</PNETCDF_PATH>
    <PIO_FILESYSTEM_HINTS>gpfs</PIO_FILESYSTEM_HINTS>
  </compiler>

  <compiler MACH="hydra" COMPILER="gnu">
    <MPICC> mpicc </MPICC>
    <MPICXX> mpicxx </MPICXX>
    <MPIFC> mpif90 </MPIFC>
    <SCC> gcc </SCC>
    <SCXX> g++ </SCXX>
    <SFC> gfortran </SFC>
    <CFLAGS>
      <append DEBUG="FALSE"> -O2 </append>
    </CFLAGS>
    <FFLAGS>
      <append DEBUG="FALSE"> -O2 -fallow-argument-mismatch -fallow-invalid-boz </append>
    </FFLAGS>
    <SLIBS>
      <append> -L$ENV{EBROOTSCALAPACK}/lib -lscalapack -L$ENV{EBROOTOPENBLAS}/lib -lopenblas </append>
      <append> -L$ENV{EBROOTNETCDFMINFORTRAN}/lib -lnetcdff -L$ENV{EBROOTNETCDF}/lib -lnetcdf </append>
    </SLIBS>
  </compiler>

  <compiler MACH="hydra" COMPILER="intel">
    <MPICC> mpiicc </MPICC>
    <MPICXX> mpiicpc </MPICXX>
@@ -51,14 +77,35 @@ Author: Alex Domingo (Vrije Universiteit Brussel)
<append DEBUG="FALSE"> -xCOMMON-AVX512 -no-fma </append>
</FFLAGS>
<SLIBS>
<append> -L$ENV{EBROOTNETCDFMINFORTRAN}/lib -lnetcdff -L$ENV{EBROOTNETCDF}/lib64 -lnetcdf </append>
<append> -L$ENV{EBROOTNETCDFMINFORTRAN}/lib -lnetcdff -L$ENV{EBROOTNETCDF}/lib -lnetcdf </append>
</SLIBS>
</compiler>

<compiler MACH="hortense">
<ESMF_LIBDIR>$ENV{EBROOTESMF}/lib</ESMF_LIBDIR>
<NETCDF_PATH>$ENV{EBROOTNETCDF}</NETCDF_PATH>
<NETCDF_C_PATH>$ENV{EBROOTNETCDFMINCPLUSPLUS4}</NETCDF_C_PATH>
<NETCDF_FORTRAN_PATH>$ENV{EBROOTNETCDFMINFORTRAN}</NETCDF_FORTRAN_PATH>
<PNETCDF_PATH>$ENV{EBROOTPNETCDF}</PNETCDF_PATH>
<PIO_FILESYSTEM_HINTS>gpfs</PIO_FILESYSTEM_HINTS>
<PIO_FILESYSTEM_HINTS>lustre</PIO_FILESYSTEM_HINTS>
</compiler>
<compiler MACH="hortense" COMPILER="gnu">
<MPICC> mpicc </MPICC>
<MPICXX> mpicxx </MPICXX>
<MPIFC> mpif90 </MPIFC>
<SCC> gcc </SCC>
<SCXX> g++ </SCXX>
<SFC> gfortran </SFC>
<CFLAGS>
<append DEBUG="FALSE"> -O2 </append>
</CFLAGS>
<FFLAGS>
<append DEBUG="FALSE"> -O2 -fallow-argument-mismatch -fallow-invalid-boz </append>
</FFLAGS>
<SLIBS>
<append> -L$ENV{EBROOTSCALAPACK}/lib -lscalapack -L$ENV{EBROOTOPENBLAS}/lib -lopenblas </append>
<append> -L$ENV{EBROOTNETCDFMINFORTRAN}/lib -lnetcdff -L$ENV{EBROOTNETCDF}/lib -lnetcdf </append>
</SLIBS>
</compiler>
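
The `SLIBS` entries above rely on `EBROOT*` environment variables set by the
EasyBuild modules. A quick sanity check that they resolve after loading the
module (a sketch, not part of this repository):

```
module load CESM-deps/2-foss-2021b
echo $EBROOTNETCDF $EBROOTNETCDFMINFORTRAN $EBROOTSCALAPACK $EBROOTOPENBLAS
```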

<compiler MACH="breniac" COMPILER="intel">
60 changes: 51 additions & 9 deletions machines/config_machines.xml
@@ -38,11 +38,12 @@ Author: Alex Domingo (Vrije Universiteit Brussel)
<config_machines>

  <machine MACH="hydra">
    <DESC> Hydra - heterogeneous cluster at VUB, batch system is Slurm</DESC>
    <NODENAME_REGEX>(node3.*\.hydra\.(os|brussel\.vsc)|login.*\.cerberus\.os)</NODENAME_REGEX>
    <OS>LINUX</OS>
    <COMPILERS>intel,gnu</COMPILERS>
    <MPILIBS compiler="intel">impi</MPILIBS>
    <MPILIBS compiler="gnu">openmpi</MPILIBS>
    <CIME_OUTPUT_ROOT>$ENV{VSC_SCRATCH}/cesm/output</CIME_OUTPUT_ROOT>
    <DIN_LOC_ROOT>$ENV{VSC_SCRATCH}/cesm/inputdata</DIN_LOC_ROOT>
    <DIN_LOC_ROOT_CLMFORC>$ENV{VSC_SCRATCH}/cesm/inputdata/atm/datm7</DIN_LOC_ROOT_CLMFORC>
@@ -55,11 +56,10 @@ Author: Alex Domingo (Vrije Universiteit Brussel)
    <MAX_TASKS_PER_NODE>40</MAX_TASKS_PER_NODE>
    <MAX_MPITASKS_PER_NODE>40</MAX_MPITASKS_PER_NODE>
    <PROJECT_REQUIRED>FALSE</PROJECT_REQUIRED>
    <mpirun mpilib="default">
      <executable>srun</executable>
      <arguments>
        <arg name="num_tasks">--ntasks={{ total_tasks }}</arg>
        <arg name="kill-on-bad-exit">--kill-on-bad-exit</arg>
      </arguments>
    </mpirun>
@@ -74,14 +74,56 @@ Author: Alex Domingo (Vrije Universiteit Brussel)
<cmd_path lang="csh">module -q</cmd_path>
<modules>
<command name="purge"/>
<command name="load">XML-LibXML/2.0201-GCCcore-8.3.0</command>
<command name="load">Python/2.7.16-GCCcore-8.3.0</command>
<command name="load">CMake/3.15.3-GCCcore-8.3.0</command>
<command name="load">git/2.23.0-GCCcore-8.3.0-nodocs</command>
</modules>
<modules compiler="intel">
<command name="load">CESM-deps/2-intel-2019b</command>
</modules>
<modules compiler="gnu">
<command name="load">CESM-deps/2-foss-2021b</command>
</modules>
</module_system>
<resource_limits>
<resource name="RLIMIT_STACK">-1</resource>
</resource_limits>
</machine>

<machine MACH="hortense">
<DESC> Hortense - Tier-1 cluster at UGent, batch system is Slurm</DESC>
<NODENAME_REGEX>(node|login)\d+\.dodrio\.os</NODENAME_REGEX>
<OS>LINUX</OS>
<COMPILERS>gnu</COMPILERS>
<MPILIBS compiler="gnu">openmpi</MPILIBS>
<CIME_OUTPUT_ROOT>$ENV{VSC_SCRATCH}/cesm/output</CIME_OUTPUT_ROOT>
<DIN_LOC_ROOT>$ENV{VSC_SCRATCH}/cesm/inputdata</DIN_LOC_ROOT>
<DIN_LOC_ROOT_CLMFORC>$ENV{VSC_SCRATCH}/cesm/inputdata/atm/datm7</DIN_LOC_ROOT_CLMFORC>
<DOUT_S_ROOT>$CIME_OUTPUT_ROOT/archive/$CASE</DOUT_S_ROOT>
<BASELINE_ROOT>$ENV{VSC_SCRATCH}/cesm/cesm_baselines</BASELINE_ROOT>
<CCSM_CPRNC>$ENV{VSC_SCRATCH}/cesm/tools/cprnc/cprnc</CCSM_CPRNC>
<GMAKE_J>16</GMAKE_J>
<BATCH_SYSTEM>slurm</BATCH_SYSTEM>
<SUPPORTED_BY>HPC-Ugent</SUPPORTED_BY>
<MAX_TASKS_PER_NODE>128</MAX_TASKS_PER_NODE>
<MAX_MPITASKS_PER_NODE>128</MAX_MPITASKS_PER_NODE>
<PROJECT_REQUIRED>TRUE</PROJECT_REQUIRED>
<mpirun mpilib="default">
<executable>mympirun</executable>
</mpirun>
<module_system type="module">
<init_path lang="perl">/usr/share/lmod/lmod/init/perl</init_path>
<init_path lang="python">/usr/share/lmod/lmod/init/env_modules_python.py</init_path>
<init_path lang="csh">/usr/share/lmod/lmod/init/csh</init_path>
<init_path lang="sh">/usr/share/lmod/lmod/init/sh</init_path>
<cmd_path lang="perl">/usr/share/lmod/lmod/libexec/lmod perl -q</cmd_path>
<cmd_path lang="python">/usr/share/lmod/lmod/libexec/lmod python -q</cmd_path>
<cmd_path lang="sh">module -q</cmd_path>
<cmd_path lang="csh">module -q</cmd_path>
<modules>
<command name="purge"/>
</modules>
<modules compiler="gnu">
<command name="load">CESM-deps/2-foss-2021b</command>
<command name="load">vsc-mympirun</command>
</modules>
</module_system>
<resource_limits>
<resource name="RLIMIT_STACK">-1</resource>
17 changes: 17 additions & 0 deletions scripts/case.setupbuild.slurm
@@ -0,0 +1,17 @@
#!/bin/bash
#SBATCH --job-name=CESM-case-setupbuild
#SBATCH --output="%x-%j.out"
#SBATCH --time=4:00:00
#SBATCH --nodes 1
#SBATCH --ntasks 16

module load CESM-deps/2-foss-2021b

# CESM case setup
echo
./case.setup --reset
./preview_run

# CESM case build
echo
./case.build
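
A minimal usage sketch, submitted from the case directory (the path and
account are placeholders; `-A` is required on Hortense):

```
cd /path/to/my_case
sbatch -A my_project $EBROOTCESMMINDEPS/scripts/case.setupbuild.slurm
./case.submit   # once the build job has finished
```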
