UFS-WM cpld_debug_p8 and cpld_control_p8 gnu test case hangs on hera #2263
Comments
@uturuncoglu @RatkoVasic-NOAA This could be an issue with Open MPI (especially with the old version of GNU) on Hera. It is worth noting, though, that the issue became visible at the call ESMF_InfoBroadcast(info, rootPet=fcstPetList(1), rc=rc).
A ticket about this issue was created with ESMF support.
An update for Hera GNU: spack-stack 1.5.1 and 1.6.0, with packages for ufs-weather-model and ufs-srweather-app, have been built on Hera with the GNU/13.3.0 compiler. Spack-stack v1.6.0 was built with ESMF/8.6.1 and MAPL/2.46.0. A first run of the regression tests (RTs): some pass, some fail.
More testing is needed, possibly on the specific tests.
Locations of the spack-stacks (NB: packages for UFS-WM and UFS-SRW only!): /scratch2/NCEPDEV/stmp1/role.epic/spack-stack/spack-stack-1.6.0_gnu13.3/envs/ufs-wm-srw-rocky8
My WM tests with spack-stack-1.6.0 are in and with spack-stack-1.5.1 (run with -w option) are in
A modulefile for using spack-stack-1.6.0:
The ufs_common.lua for use with spack-stack1.6.0:
A modulefile for using spack-stack-1.5.1:
I tested @natalie-perlin's installation, and the tests that were failing on Hera with the GNU compiler now work. Many other tests still need to be run. @jkbk2004 I suggest that the weather-model group run the tests, because some of them fail only because the results are not bit-identical (which is expected).
All the regression tests with the gnu/13.3.0 compiler and spack-stack/1.6.0 have passed for the weather model.
Description
The tests passed on Hercules. The hang is possibly caused by either the version of the GNU compiler or the version of the MPI library. The line that causes the hang was identified as:
https://github.com/NOAA-EMC/fv3atm/pull/775/files#diff-dc3da9b9c37c068b769128e69328ab808bb6a17947cae75342a9a462cebf63ebR1187
The test also works with the default MPI tasks on Hera. This issue needs to be followed up with the ESMF team.
The test case was turned off on Hera in "allow diagnostic accumulation bucket to change in fv3atm integration" #2128.
To Reproduce:
Additional context
Failure message from error log for cpld_debug_p8 and cpld_control_p8 gnu.
The OSC pt2pt component does not support MPI_THREAD_MULTIPLE in this release.
Workarounds are to run on a single node, or to use a system with an RDMA
capable network such as Infiniband.
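When triaging many regression-test logs, it can help to check which ones contain the message above. The following is a minimal Python sketch, not part of the UFS-WM tooling: the function name find_osc_thread_errors and the sample log are hypothetical, and only the three-line message is taken from this issue. Finding the message in a log does not by itself confirm that it is the cause of the hang.

```python
import re

# Signature text copied from the failure message quoted in this issue.
OSC_SIGNATURE = re.compile(
    r"OSC pt2pt component does not support MPI_THREAD_MULTIPLE"
)


def find_osc_thread_errors(log_text):
    """Return (line_number, line) pairs for lines carrying the signature."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if OSC_SIGNATURE.search(line):
            hits.append((number, line.strip()))
    return hits


# Hypothetical log excerpt for illustration; only the three-line message
# comes from the issue, the first line is made up.
sample_log = """\
starting cpld_control_p8 (hypothetical line)
The OSC pt2pt component does not support MPI_THREAD_MULTIPLE in this release.
Workarounds are to run on a single node, or to use a system with an RDMA
capable network such as Infiniband.
"""

for number, line in find_osc_thread_errors(sample_log):
    print(f"line {number}: {line}")
```

To scan a real log, read the file contents into a string and pass it to the same function; the path to the error log depends on the local run directory and is not given in this issue.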
Output