UWM failed on HERA ROCKY #2211
Comments
I am repeating the test on Orion. Jobs are running now, and so far none have died. |
@jiandewang That error message is coming from WW3 I believe. Can you check what you have in log.ww3? |
@jiandewang Can you re-run? WW3_input_data_20220624 has been recovered on Hera. |
I also see this error when I run cpld_control_p8 with the current develop branch:
/scratch1/NCEPDEV/stmp2/Dusan.Jovic/FV3_RT/rt_1182930/cpld_control_p8_intel |
@jiandewang @DusanJovic-NOAA That is why the original WW3-input data needs to be retained. Input data should never be overwritten. Only adding is allowable. |
@jkbk2004 Please make sure that everyone on your team understands the importance of NOT overwriting input data. |
We always backed up. @zach1221 @FernandoAndrade-NOAA FYI |
@jiandewang just confirming that it is an input error. Was there a reason that the WW3 input data was over-written? We add a specific date/time stamp so that we can version control the input and not over-write it. |
My fault! The input directory names were switched back and forth. |
Jobs are running normally now. Closing this issue. |
The same problem happened on C5; the same fix is needed there. |
@jiandewang It looks to me like the files are OK on Gaea. Are you sure your RT didn't fail on Gaea because of #2198? The fix for that will be coming in with the WW3 PR today, but until then you need to modify this part of rt.sh: STMP=/gpfs/f5/epic/scratch |
@DeniseWorthen Yes, I changed those two lines; otherwise my job could not be submitted. My UWM is based on yesterday's commit. |
@jiandewang Sorry about the interruption. WW3_input_data_20220624 is restored, and it looks like jobs are running OK. Can you check again? |
Thanks for the quick action; let me re-launch my job. |
Works fine now. Closing. |
Description
I was testing updated MOM6 code in UWM and got an unexpected failure, so I switched back to the develop branch with no changes and got the same error:
EXTCDE MPI_ABORT, IEXIT= 52
See the error information at /scratch1/NCEPDEV/stmp2/Jiande.Wang/FV3_RT/rt_73092/cpld_control_p8_mixedmode_intel
To Reproduce:
Clone today's UWM (hash # c54e986).
Run one of the S2S jobs; in my case I ran "cpld_control_p8_mixedmode_inte".
Additional context
Output