Replies: 6 comments 14 replies
-
When I did my tests I also ran them completely without writing output and restarts, so I commented out fesom_module.F90 Line 378 in 09de730 and Line 381 in 09de730, although there might still remain the problem of reading the forcing. Martin reported significant performance differences in MITgcm between Ollie and Albedo when reading the forcing.
-
Silly question: for some of the compile-time runs, you have both …

I also took the liberty of plotting your wall times so we get a nice overview, see attachment.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

file_contents = """\
intel          gcc            intel_AVX2     gcc_AVX2       intel_NEC      gcc_NEC
----------------------------------------------------------------------------------------------------------------
128            111m43.309s    115m31.859s    49m50.963s     104m33.633s    50m48.135s     97m6.001s
512            39m49.813s     39m27.205s     15m14.597s     35m54.552s     21m47.681s     30m26.279s
1024           30m46.163s     31m7.212s      13m27.178s     26m42.096s     19m57.458s     26m20.768s
"""

# columns are separated by runs of 2+ spaces; skiprows=[1] drops the dashed rule
df = pd.read_csv(io.StringIO(file_contents), sep=r"\s{2,}", engine="python",
                 skiprows=[1])
df.index.name = "CPU Partitioning"
df = df.transpose()  # compilers on the x-axis, one line per task count
# parse the "111m43.309s"-style strings into wall time in minutes
df = df.applymap(lambda s: pd.to_timedelta(s).total_seconds() / 60)
df.plot(ylabel="wall time [minutes]")
plt.show()
```

And also a ping to @sebhinck, who is our spack expert: this may be of interest for the overall software stack!
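Tangentially, the same numbers can be turned into a scaling statement: speedup relative to the 128-task runs and parallel efficiency (speedup divided by the ideal scaling factor). A minimal sketch, with the `intel_AVX2` and `gcc_AVX2` wall times transcribed by hand from the table:

```python
import pandas as pd

# wall times in minutes, transcribed from two columns of the table above
minutes = pd.DataFrame(
    {
        "intel_AVX2": [49 + 50.963 / 60, 15 + 14.597 / 60, 13 + 27.178 / 60],
        "gcc_AVX2": [104 + 33.633 / 60, 35 + 54.552 / 60, 26 + 42.096 / 60],
    },
    index=pd.Index([128, 512, 1024], name="tasks"),
)

# speedup relative to the smallest run, and efficiency vs. ideal scaling
speedup = minutes.loc[128] / minutes
efficiency = speedup.div(minutes.index.to_series() / 128, axis=0)
print(efficiency.round(2))
```

With these numbers the `intel_AVX2` build keeps roughly 82% parallel efficiency at 512 tasks but drops below 50% at 1024, which matches the flattening visible in the wall times.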
-
@mandresm When did you build the OpenMPI with spack? I would recommend only using the OpenMPI provided by us admins, because there might be some crucial flags that you were missing. However, I configured the OpenMPI Spack package last week so that users would automatically build their own "optimised" version, but I don't really see the point of doing so... What Martin Losch and I have seen is that, due to the heavily increased number of cores per node (compared to Ollie), the memory bandwidth might be a crucial factor. So you might want to decrease the number of threads per node. Thanks @mandresm for your tests!
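Acting on the bandwidth point usually means undersubscribing the nodes: keep the node count, lower the tasks per node. A hedged sketch of the Slurm header (node/task counts, time limit and binary name are placeholders, not taken from the runs above):

```shell
#!/bin/bash
#SBATCH --partition=mpp
#SBATCH --nodes=8               # same node count as before ...
#SBATCH --ntasks-per-node=64    # ... but only half of the cores per node in use
#SBATCH --time=01:00:00

# bind each rank to its cores so the spare cores stay idle
srun --cpu-bind=cores ./fesom.x
```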
-
OK, my silly question was indeed silly; I don't know how to read, apparently. That was part of Miguel's notes...
-
@mandresm Did you get the flags from the final make command, or are they already visible in the CMakeLists.txt?
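In case it helps: with the Makefile generator, CMake will echo every full compiler command line when asked, and the configured flag sets are also cached. A sketch (run inside the CMake build directory; the grep pattern is just illustrative):

```shell
# print every compiler invocation verbatim while building
make VERBOSE=1

# or inspect the flag sets CMake has cached for Fortran
grep 'CMAKE_Fortran_FLAGS' CMakeCache.txt
```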
-
EDIT: No simulations crashed anymore; the results in the table are the final ones.

Following the comments from @patrickscholz (commenting out the output parts of the code), Malte's (16, 32, 64, 127, 128 tasks per node) and @sebhinck's (setting …), I reran the tests. For consistency, I have not yet switched to the modules provided by the Albedo admins, but that's the next step. I have not yet tested Malte's suggestion on "how the nodes are spread all over the racks"; another thing to test soon. I also did not have the time yet to use …

This is the new table with the findings (nhrs = node hours):

The conclusions I draw from this set of tests: …
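For cross-checking the nhrs column: node hours are just (nodes used) × (wall time). A minimal sketch, assuming 128 cores per Albedo node (an assumption on my part) and fully packed nodes:

```python
CORES_PER_NODE = 128  # assumed Albedo node size

def node_hours(tasks: int, wall_minutes: float) -> float:
    """Node hours of a run, assuming fully packed nodes."""
    nodes = -(-tasks // CORES_PER_NODE)  # ceiling division
    return nodes * wall_minutes / 60

# intel_AVX2 wall times from the first table, rounded to minutes
for tasks, wall in [(128, 49.85), (512, 15.24), (1024, 13.45)]:
    print(tasks, round(node_hours(tasks, wall), 2))
```

At 1024 tasks the intel_AVX2 run costs about 1.8 node hours, i.e. more than double the 128-task run despite being roughly 3.7x faster.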
-
Hi everyone!
I've been running some year-long tests on Albedo with FESOM, with the aim of finding which compiler and compiler flags work best on Albedo right now. If you'd like me to run more tests or discuss the results, or if you have suggestions, please add a comment to this discussion. The results are summarized in the table below, where the rows indicate the number of MPI tasks (all tests without multithreading and without OpenMP) and the column names stand for:
… `intel_AVX2`, so the configuration that tries to imitate Levante's one.

Notes
I don't know where the `-O3 -DNDEBUG -O3` flags for the `gcc` compilations come from (@hegish any ideas?). The `gcc` tests were run with the pristine `CMakeLists.txt` of FESOM, while for `gcc_AVX2` and `gcc_NEC` I have modified this line of the same file accordingly: `fesom2/src/CMakeLists.txt`, Line 164 in 09de730 (`-march=core-avx2`). … probably means that I am not doing a good job with the gcc compiler flags.

Modules and libraries
I have tried to reproduce exactly the `intel`/`openmpi` configuration for which we obtained a good performance on Levante (`intel_AVX2`), and for that I have compiled locally (using `spack`) the libraries that FESOM needs, mimicking Levante's ones (`intel`). I have also done the equivalent for `gcc`.
Additional details

Partition: `mpp`
FESOM branch: `refactoring`
FESOM git SHA: `40fbbd6`
Libraries compilation with spack

Here is how I compiled the FESOM dependencies:

`intel`: …
`gcc`: …
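For readers unfamiliar with spack: the real specs used here are not reproduced above, but a hypothetical pair of invocations of this kind (package name and compilers are illustrative placeholders, not the actual list) would look like:

```shell
# hypothetical: build one dependency per compiler, on top of OpenMPI
spack install netcdf-fortran %intel ^openmpi
spack install netcdf-fortran %gcc ^openmpi

# make one variant visible to the FESOM build
spack load netcdf-fortran %intel
```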
FESOM sources and binaries
/albedo/work/user/mandresm/model_codes/fesom-2.1-refact_gcc
/albedo/work/user/mandresm/model_codes/fesom-2.1-refact_gcc_AV2
/albedo/work/user/mandresm/model_codes/fesom-2.1-refact_gcc_NEC
/albedo/work/user/mandresm/model_codes/fesom-2.1-refact_intel
/albedo/work/user/mandresm/model_codes/fesom-2.1-refact_intel_AV2
/albedo/work/user/mandresm/model_codes/fesom-2.1-refact_intel_NEC
Work directories for the simulations
/albedo/work/user/mandresm/fesom-refact_gcc_1024/run_20010101-20011231/work
/albedo/work/user/mandresm/fesom-refact_gcc_128/run_20010101-20011231/work
/albedo/work/user/mandresm/fesom-refact_gcc_512/run_20010101-20011231/work
/albedo/work/user/mandresm/fesom-refact_gcc_AV2_1024/run_20010101-20011231/work
/albedo/work/user/mandresm/fesom-refact_gcc_AV2_128/run_20010101-20011231/work
/albedo/work/user/mandresm/fesom-refact_gcc_AV2_512/run_20010101-20011231/work
/albedo/work/user/mandresm/fesom-refact_gcc_NEC_1024/run_20010101-20011231/work
/albedo/work/user/mandresm/fesom-refact_gcc_NEC_128/run_20010101-20011231/work
/albedo/work/user/mandresm/fesom-refact_gcc_NEC_512/run_20010101-20011231/work
/albedo/work/user/mandresm/fesom-refact_intel_1024/run_20010101-20011231/work
/albedo/work/user/mandresm/fesom-refact_intel_128/run_20010101-20011231/work
/albedo/work/user/mandresm/fesom-refact_intel_512/run_20010101-20011231/work
/albedo/work/user/mandresm/fesom-refact_intel_AV2_1024/run_20010101-20011231/work
/albedo/work/user/mandresm/fesom-refact_intel_AV2_128/run_20010101-20011231/work
/albedo/work/user/mandresm/fesom-refact_intel_AV2_512/run_20010101-20011231/work
/albedo/work/user/mandresm/fesom-refact_intel_NEC_1024/run_20010101-20011231/work
/albedo/work/user/mandresm/fesom-refact_intel_NEC_128/run_20010101-20011231/work
/albedo/work/user/mandresm/fesom-refact_intel_NEC_512/run_20010101-20011231/work
Slurm stdout/stderr in:
/albedo/work/user/mandresm/fesom-refact_gcc_1024/log/fesom-refact_gcc_1024_fesom_compute_20010101-20011231_*.log
/albedo/work/user/mandresm/fesom-refact_gcc_128/log/fesom-refact_gcc_128_fesom_compute_20010101-20011231_*.log
/albedo/work/user/mandresm/fesom-refact_gcc_512/log/fesom-refact_gcc_512_fesom_compute_20010101-20011231_*.log
/albedo/work/user/mandresm/fesom-refact_gcc_AV2_1024/log/fesom-refact_gcc_AV2_1024_fesom_compute_20010101-20011231_*.log
/albedo/work/user/mandresm/fesom-refact_gcc_AV2_128/log/fesom-refact_gcc_AV2_128_fesom_compute_20010101-20011231_*.log
/albedo/work/user/mandresm/fesom-refact_gcc_AV2_512/log/fesom-refact_gcc_AV2_512_fesom_compute_20010101-20011231_*.log
/albedo/work/user/mandresm/fesom-refact_gcc_NEC_1024/log/fesom-refact_gcc_NEC_1024_fesom_compute_20010101-20011231_*.log
/albedo/work/user/mandresm/fesom-refact_gcc_NEC_128/log/fesom-refact_gcc_NEC_128_fesom_compute_20010101-20011231_*.log
/albedo/work/user/mandresm/fesom-refact_gcc_NEC_512/log/fesom-refact_gcc_NEC_512_fesom_compute_20010101-20011231_*.log
/albedo/work/user/mandresm/fesom-refact_intel_1024/log/fesom-refact_intel_1024_fesom_compute_20010101-20011231_*.log
/albedo/work/user/mandresm/fesom-refact_intel_128/log/fesom-refact_intel_128_fesom_compute_20010101-20011231_*.log
/albedo/work/user/mandresm/fesom-refact_intel_512/log/fesom-refact_intel_512_fesom_compute_20010101-20011231_*.log
/albedo/work/user/mandresm/fesom-refact_intel_AV2_1024/log/fesom-refact_intel_AV2_1024_fesom_compute_20010101-20011231_*.log
/albedo/work/user/mandresm/fesom-refact_intel_AV2_128/log/fesom-refact_intel_AV2_128_fesom_compute_20010101-20011231_*.log
/albedo/work/user/mandresm/fesom-refact_intel_AV2_512/log/fesom-refact_intel_AV2_512_fesom_compute_20010101-20011231_*.log
/albedo/work/user/mandresm/fesom-refact_intel_NEC_1024/log/fesom-refact_intel_NEC_1024_fesom_compute_20010101-20011231_*.log
/albedo/work/user/mandresm/fesom-refact_intel_NEC_128/log/fesom-refact_intel_NEC_128_fesom_compute_20010101-20011231_*.log
/albedo/work/user/mandresm/fesom-refact_intel_NEC_512/log/fesom-refact_intel_NEC_512_fesom_compute_20010101-20011231_*.log
@patrickscholz @pgierz @koldunovn @dsidoren