arch task returns 0 when htar fails #1982

Closed
AndrewEichmann-NOAA opened this issue Oct 26, 2023 · 2 comments
Labels
bug Something isn't working

Comments

@AndrewEichmann-NOAA
Contributor

What is wrong?

The arch task returns 0 and is marked by rocoto as having completed successfully when htar fails and returns a non-zero status. Example from the log below:

+ exglobal_archive.sh[265]: htar -P -cvf /NCEPDEV/marineda/1year/Andrew.Eichmann/HERA/scratch/arch-test/2021032418/gdas_restartb.tar gdas.20210324/18//model_data/atmos/restart
[connecting to hpsscore1.fairmont.rdhpcs.noaa.gov/1217]
ERROR: htar_MakeAllSubdirs: Permission error creating intermediate HPSS subdirectory: /NCEPDEV/marineda/1year/Andrew.Eichmann
HTAR: HTAR FAILED
###WARNING  htar returned non-zero exit status.
            72 = /apps/hpss/bin/htar -P -cvf /NCEPDEV/marineda/1year/Andrew.Eichmann/HERA/scratch/arch-test/2021032418/gdas_restartb.tar gdas.20210324/18//model_data/atmos/restart
+ exglobal_archive.sh[266]: status=72
+ exglobal_archive.sh[267]: case ${targrp} in
+ exglobal_archive.sh[274]: '[' 72 -ne 0 ']'
+ exglobal_archive.sh[274]: '[' 2021032418 -ge 2021032512 ']'
+ exglobal_archive.sh[278]: set_strict
+ preamble.sh[41]: [[ YES == \Y\E\S ]]
+ preamble.sh[43]: set -eu
+ exglobal_archive.sh[281]: shopt -u extglob
+ exglobal_archive.sh[287]: exit 0
+ exglobal_archive.sh[1]: postamble exglobal_archive.sh 1698268242 0
+ preamble.sh[68]: set +x
End exglobal_archive.sh at 21:11:20 with error code 0 (time elapsed: 00:00:38)
+ JGLOBAL_ARCHIVE[32]: status=0
+ JGLOBAL_ARCHIVE[33]: [[ 0 -ne 0 ]]
+ JGLOBAL_ARCHIVE[42]: [[ -e OUTPUT.111163 ]]
+ JGLOBAL_ARCHIVE[50]: cd /scratch1/NCEPDEV/stmp2/Andrew.Eichmann/RUNDIRS/arch-test
+ JGLOBAL_ARCHIVE[51]: [[ NO = \N\O ]]
+ JGLOBAL_ARCHIVE[51]: rm -rf /scratch1/NCEPDEV/stmp2/Andrew.Eichmann/RUNDIRS/arch-test/arch.110645
+ JGLOBAL_ARCHIVE[53]: exit 0
+ JGLOBAL_ARCHIVE[1]: postamble JGLOBAL_ARCHIVE 1698268240 0
+ preamble.sh[68]: set +x
End JGLOBAL_ARCHIVE at 21:11:20 with error code 0 (time elapsed: 00:00:40)
+ arch.sh[17]: status=0
+ arch.sh[19]: exit 0
+ arch.sh[1]: postamble arch.sh 1698268238 0
+ preamble.sh[68]: set +x
End arch.sh at 21:11:21 with error code 0 (time elapsed: 00:00:43)

What should have happened?

I would expect the arch task to fail when any htar command within it fails.
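
In the trace above, htar's status of 72 is captured, but the check on line 274 of exglobal_archive.sh is followed by a date comparison that evaluates false, and the script then reaches `exit 0`. A minimal sketch of the expected behavior is below. It assumes a hypothetical helper named `run_archive`, which is not a function from the workflow, and it is not necessarily how #1967 addressed the problem.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the workflow): run an archive command
# and return its exit status unchanged so the caller can fail the task.
run_archive() {
  local status=0
  # The "|| status=$?" guard keeps "set -e" from aborting before we can report.
  "$@" || status=$?
  if (( status != 0 )); then
    echo "FATAL: '$*' failed with status ${status}" >&2
    return "${status}"
  fi
  return 0
}
```

A caller could then write something like `run_archive htar -P -cvf "${tarball}" "${files[@]}" || exit $?`, where `tarball` and `files` are placeholder names, so that any non-zero htar status ends the task with that same status.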

What machines are impacted?

Hera

Steps to reproduce

  1. Run a cycling experiment with HPSS_PROJECT in config.base set to a project the user has no access to
  2. Let the arch task run; the first one appears to suffice
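
To confirm the symptom after the steps above, the task log can be scanned for the two strings seen in the example: the `HTAR: HTAR FAILED` message and a final `End ... with error code 0` line. This is only a sketch; `htar_failed_silently` is a hypothetical name, and the log wording is assumed to match the excerpt in this issue.

```shell
#!/usr/bin/env bash

# Hypothetical check: succeeds (returns 0) when the log records an htar
# failure but the last "End ..." line still reports error code 0.
htar_failed_silently() {
  local log=$1
  # No htar failure recorded: nothing to report.
  grep -q 'HTAR: HTAR FAILED' "${log}" || return 1
  # Look only at the final "End ..." line, which belongs to the outermost script.
  grep '^End .* with error code' "${log}" | tail -n 1 | grep -q 'with error code 0 '
}
```

Usage would look like `htar_failed_silently /path/to/arch.log && echo "htar failed but task reported success"`, with the log path supplied by the user.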

Additional information

Is this behavior in fact expected?

Do you have a proposed solution?

No response

@AndrewEichmann-NOAA added the bug and triage labels Oct 26, 2023
@WalterKolczynski-NOAA
Contributor

@AndrewEichmann-NOAA can you test again? I think this may have been fixed as a side effect of #1967, which was merged this afternoon.

@WalterKolczynski-NOAA removed the triage label Oct 26, 2023
@AndrewEichmann-NOAA
Contributor Author

@WalterKolczynski-NOAA Yup it works now. I mean it doesn't work. In the right way.
