Add an archive task to GEFS system to archive files in HPSS #2895

Merged: 151 commits into NOAA-EMC:develop on Sep 25, 2024

AntonMFernando-NOAA (Contributor) commented Sep 7, 2024

Description

Type of change

  • new feature

Change characteristics

  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO
  • Does this change require an update to any of the following submodules? NO

How has this been tested?

  • CI tests on Hera and Hercules

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • I have made corresponding changes to the documentation if necessary

AntonMFernando-NOAA changed the title from "Add an archive task to GEFS system to archive files to HPSS" to "Add an archive task to GEFS system to archive files in HPSS" on Sep 7, 2024

EricSinsky-NOAA (Contributor) left a comment

Here are some initial suggestions.

Two review threads on parm/archive/gefs.yaml.j2 (both outdated and resolved)

emcbot commented Sep 24, 2024

Experiment C96C48_hybatmDA FAILED on Hercules in Build# 2 in
/work2/noaa/stmp/CI/HERCULES/2895/RUNTESTS/EXPDIR/C96C48_hybatmDA_ff38f832

emcbot added the CI-Orion-Running and CI-Orion-Failed labels and removed the CI-Orion-Building and CI-Orion-Running labels on Sep 24, 2024

emcbot commented Sep 24, 2024

Experiment C96_atm3DVar FAILED on Orion in Build# 3 in
/work2/noaa/stmp/CI/ORION/2895/RUNTESTS/EXPDIR/C96_atm3DVar_ff38f832


emcbot commented Sep 24, 2024

Experiment C96C48_hybatmDA FAILED on Orion in Build# 3 in
/work2/noaa/stmp/CI/ORION/2895/RUNTESTS/EXPDIR/C96C48_hybatmDA_ff38f832

emcbot added the CI-Hercules-Failed label and removed the CI-Hercules-Failed label on Sep 24, 2024

emcbot commented Sep 24, 2024

CI Failed on Hercules in Build# 2
Built and ran in directory /work2/noaa/stmp/CI/HERCULES/2895


Experiment C96_atm3DVar_ff38f832 Terminated with 0 tasks failed and 0 dead at Tue Sep 24 15:44:02 CDT 2024
Experiment C96_atm3DVar_ff38f832 Terminated: *STALLED*
Experiment C96C48_hybatmDA_ff38f832 Terminated with 0 tasks failed and 0 dead at Tue Sep 24 15:44:07 CDT 2024
Experiment C96C48_hybatmDA_ff38f832 Terminated: *STALLED*
Experiment C48_ATM_ff38f832 Completed 1 Cycles: *SUCCESS* at Tue Sep 24 17:32:47 CDT 2024
Experiment C48_S2SWA_gefs_ff38f832 Completed 1 Cycles: *SUCCESS* at Tue Sep 24 18:34:17 CDT 2024
Experiment C48_S2SW_ff38f832 Completed 1 Cycles: *SUCCESS* at Tue Sep 24 18:40:00 CDT 2024

emcbot added the CI-Orion-Failed label and removed the CI-Orion-Failed label on Sep 25, 2024

emcbot commented Sep 25, 2024

CI Failed on Orion in Build# 3
Built and ran in directory /work2/noaa/stmp/CI/ORION/2895


Experiment C96_atm3DVar_ff38f832 Terminated with 0 tasks failed and 0 dead at Tue Sep 24 04:24:21 PM CDT 2024
Experiment C96_atm3DVar_ff38f832 Terminated: *STALLED*
Experiment C96C48_hybatmDA_ff38f832 Terminated with 0 tasks failed and 0 dead at Tue Sep 24 04:24:23 PM CDT 2024
Experiment C96C48_hybatmDA_ff38f832 Terminated: *STALLED*
Experiment C48_ATM_ff38f832 Completed 1 Cycles: *SUCCESS* at Tue Sep 24 05:37:09 PM CDT 2024
Experiment C48_S2SWA_gefs_ff38f832 Completed 1 Cycles: *SUCCESS* at Tue Sep 24 07:22:46 PM CDT 2024
Experiment C48_S2SW_ff38f832 Completed 1 Cycles: *SUCCESS* at Tue Sep 24 07:45:22 PM CDT 2024

emcbot added the CI-Hera-Passed label and removed the CI-Hera-Running label on Sep 25, 2024

emcbot commented Sep 25, 2024

CI Passed on Hera in Build# 1
Built and ran in directory /scratch1/NCEPDEV/global/CI/2895


Experiment C48mx500_3DVarAOWCDA_fcbb6494 Completed 2 Cycles: *SUCCESS* at Wed Sep 25 00:41:40 UTC 2024
Experiment C48_ATM_fcbb6494 Completed 1 Cycles: *SUCCESS* at Wed Sep 25 01:54:05 UTC 2024
Experiment C48_S2SWA_gefs_fcbb6494 Completed 1 Cycles: *SUCCESS* at Wed Sep 25 02:09:14 UTC 2024
Experiment C48_S2SW_fcbb6494 Completed 1 Cycles: *SUCCESS* at Wed Sep 25 02:44:51 UTC 2024
Experiment C96_atm3DVar_fcbb6494 Completed 3 Cycles: *SUCCESS* at Wed Sep 25 03:39:58 UTC 2024
Experiment C96C48_hybatmDA_fcbb6494 Completed 3 Cycles: *SUCCESS* at Wed Sep 25 06:25:42 UTC 2024
Experiment C96C48_ufs_hybatmDA_fcbb6494 Completed 2 Cycles: *SUCCESS* at Wed Sep 25 07:26:39 UTC 2024
Experiment C96C48_hybatmaerosnowDA_fcbb6494 Completed 3 Cycles: *SUCCESS* at Wed Sep 25 07:26:41 UTC 2024

emcbot added the CI-Wcoss2-Passed label and removed the CI-Wcoss2-Running label on Sep 25, 2024

emcbot commented Sep 25, 2024

All CI Test Cases Passed on Wcoss2:

Experiment C48_ATM_ff38f832 *** SUCCESS *** at 09/24/24 09:42:13 PM
Experiment C48_S2SW_ff38f832 *** SUCCESS *** at 09/24/24 10:07:08 PM
Experiment C96C48_hybatmDA_ff38f832 *** SUCCESS *** at 09/24/24 10:42:29 PM
Experiment C96C48_hybatmaerosnowDA_ff38f832 *** SUCCESS *** at 09/25/24 12:07:26 AM
Experiment C96C48_ufs_hybatmDA_ff38f832 *** SUCCESS *** at 09/25/24 12:56:19 AM
Experiment C96_atm3DVar_extended_ff38f832 *** SUCCESS *** at 09/25/24 08:49:39 AM

DavidHuber-NOAA (Contributor) commented

Apparently I put in a bad memory spec for Orion and Hercules. Debugging.

DavidHuber-NOAA (Contributor) commented

Orion and Hercules are going down for maintenance, so I won't be able to test this. The mem_node_max values for the two systems are 192GB and 512GB, respectively, but it seems that Slurm does not allow requesting a full node's worth of memory. This is similar to Jet and WCOSS2, where some amount of memory appears to be reserved for the OS. This seems to be a relatively new configuration, as these values were tested as part of #2727.

I turned both of these down by 12GB and will retest when the systems come back up tomorrow. Since this doesn't affect this PR directly, and all tests passed on Hera and WCOSS2, I believe this PR is ready to be merged.
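
For readers following the memory discussion, here is a minimal sketch of the arithmetic described above: subtract a fixed reserve from each node's maximum so the request stays below the full node. The helper name, the dictionary, and the pairing of 192GB with Orion and 512GB with Hercules are assumptions for illustration (the pairing follows the order the systems are named in the comment); this is not the workflow's actual configuration code.

```python
# Hypothetical illustration only; none of these names come from the PR.
# Assumed pairing: Orion = 192GB, Hercules = 512GB (order given in the comment).
MEM_NODE_MAX_GB = {"orion": 192, "hercules": 512}

# The reduction mentioned in the comment.
OS_RESERVE_GB = 12


def requestable_memory_gb(system: str, reserve_gb: int = OS_RESERVE_GB) -> int:
    """Return a per-node memory request that leaves headroom for the OS."""
    node_max = MEM_NODE_MAX_GB[system]
    if reserve_gb < 0 or reserve_gb >= node_max:
        raise ValueError("reserve must be non-negative and below the node maximum")
    return node_max - reserve_gb


if __name__ == "__main__":
    for name in MEM_NODE_MAX_GB:
        print(f"{name}: request up to {requestable_memory_gb(name)}GB")
```

Whether 12GB is the right amount of headroom on either system is exactly what the planned retest would confirm.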

EricSinsky-NOAA (Contributor) left a comment

Tested the most recent additions to the GEFS waves products. Checked the archived products on HPSS and all looks good. Even though the archive job will only run when DO_EXTRACTVARS is YES, this should be ok for the reforecast since DO_EXTRACTVARS will be YES for the reforecast (which is currently the primary target for the GEFS archive task).
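
The gating described in this review can be pictured with a small sketch. It assumes the experiment configuration is available as a dictionary of string switches; the function name and that dictionary layout are hypothetical, and only the DO_EXTRACTVARS key and its YES value come from the comment.

```python
# Hypothetical sketch; the real workflow decides task membership elsewhere.
def archive_task_enabled(config: dict) -> bool:
    """Return True when DO_EXTRACTVARS is set to YES (case-insensitive)."""
    return str(config.get("DO_EXTRACTVARS", "NO")).strip().upper() == "YES"


if __name__ == "__main__":
    reforecast_config = {"DO_EXTRACTVARS": "YES"}  # assumed reforecast setting
    print(archive_task_enabled(reforecast_config))
```

Under this assumption, a configuration that omits DO_EXTRACTVARS would not get an archive job, which matches the caveat raised in the review.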

aerorahul dismissed WalterKolczynski-NOAA’s stale review on September 25, 2024 at 14:45

addressed changes. needs a re-review if necessary

aerorahul merged commit 7088a91 into NOAA-EMC:develop on Sep 25, 2024
5 checks passed
Labels

  • CI-Hera-Passed: **Bot use only** CI testing on Hera for this PR has completed successfully
  • CI-Hercules-Failed: **Bot use only** CI testing on Hercules for this PR has failed
  • CI-Orion-Failed: **Bot use only** CI testing on Orion for this PR has failed
  • CI-Wcoss2-Passed: **Bot use only** CI testing on WCOSS for this PR has completed successfully

Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add archive task to GEFS workflow
8 participants