Add capability to run forecast in segments #2795

Merged

Conversation

WalterKolczynski-NOAA
Contributor

@WalterKolczynski-NOAA WalterKolczynski-NOAA commented Jul 25, 2024

Description

Adds the ability to run a forecast in segments instead of all at once. To accomplish this, a new local variable, checkpnts, is introduced in config.base to hold a comma-separated list of intermediate stopping points for the forecast. This list is combined with FHMIN_GFS and FHMAX_GFS to create a comma-separated string, FCST_SEGMENTS, containing all the start and end points; the string is used by config.fcst and the rocoto workflow. The capability to parse these strings into Python lists was added to wxflow in an accompanying PR. If checkpnts is an empty string, the result is a single-segment forecast.
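
As a rough illustration of the combination described above (a sketch only, not the code from this PR): the helper name `build_segments` and the sample hours are hypothetical, and the real logic may live elsewhere in the workflow. It shows how a start hour, an optional checkpoint list, and an end hour could be joined into one comma-separated string.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not taken from the PR: join the start hour, any
# intermediate checkpoints, and the end hour into one comma-separated string.
build_segments() {
    local fhmin=$1 fhmax=$2 checkpnts=$3
    local segments="${fhmin}"
    # An empty checkpoint list leaves only the start and end hours,
    # which corresponds to a single-segment forecast.
    if [[ -n "${checkpnts}" ]]; then
        segments+=",${checkpnts}"
    fi
    segments+=",${fhmax}"
    echo "${segments}"
}

build_segments 0 120 "48,96"   # prints 0,48,96,120
build_segments 0 120 ""        # prints 0,120
```

With checkpoints at 48 and 96, the string describes three segments (0-48, 48-96, 96-120); with no checkpoints it describes one.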

To accommodate the new segment metatasks, which must run serially, create_task() was extended to accept a dictionary key, is_serial, that controls whether a metatask is parallel or serial using pre-existing capability in rocoto. When the key is not given, the default is parallel, as for most metatasks.

Resolves #2274
Refs NOAA-EMC/wxflow#39
Refs NOAA-EMC/wxflow#40

Type of change

  • New feature (adds functionality)

Change characteristics

  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO

How has this been tested?

  • Forecast-only on Hercules
  • GEFS on Hercules

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • I have made corresponding changes to the documentation if necessary

@WalterKolczynski-NOAA
Contributor Author

I still have some more testing to do (and I think documentation to update), but wanted to get a draft out since people are going on leave.

@WalterKolczynski-NOAA
Contributor Author

I will check on Friday whether the documentation needs to be updated, but I think the last few bugs from this update are gone.

@WalterKolczynski-NOAA WalterKolczynski-NOAA marked this pull request as ready for review July 25, 2024 22:43
@WalterKolczynski-NOAA WalterKolczynski-NOAA added the CI-Hercules-Ready **CM use only** PR is ready for CI testing on Hercules label Jul 25, 2024
@emcbot emcbot added CI-Hercules-Building **Bot use only** CI testing is cloning/building on Hercules CI-Hercules-Running **Bot use only** CI testing on Hercules for this PR is in-progress CI-Hercules-Passed **Bot use only** CI testing on Hercules for this PR has completed successfully and removed CI-Hercules-Ready **CM use only** PR is ready for CI testing on Hercules CI-Hercules-Building **Bot use only** CI testing is cloning/building on Hercules labels Jul 25, 2024
@emcbot

emcbot commented Jul 26, 2024

CI Passed Hercules at
Built and ran in directory /work2/noaa/stmp/CI/HERCULES/2795


Experiment C48_ATM_cd3d7c8b Completed 1 Cycles: *SUCCESS* at Thu Jul 25 19:58:13 CDT 2024
Experiment C96_atm3DVar_cd3d7c8b Completed 3 Cycles: *SUCCESS* at Thu Jul 25 21:16:52 CDT 2024
Experiment C96C48_hybatmDA_cd3d7c8b Completed 3 Cycles: *SUCCESS* at Thu Jul 25 21:17:05 CDT 2024
Experiment C48_S2SW_cd3d7c8b Completed 1 Cycles: *SUCCESS* at Thu Jul 25 21:41:48 CDT 2024
Experiment C48_S2SWA_gefs_cd3d7c8b Completed 1 Cycles: *SUCCESS* at Thu Jul 25 22:22:53 CDT 2024

@DavidHuber-NOAA
Contributor

Hercules testing completed successfully, resetting label.

@DavidHuber-NOAA DavidHuber-NOAA removed the CI-Hercules-Running **Bot use only** CI testing on Hercules for this PR is in-progress label Jul 26, 2024
@WalterKolczynski-NOAA
Contributor Author

Appears no documentation updates are needed at this time after all.

Comment on lines 294 to 304
export FCST_SEGMENTS_STR_GFS="@FCST_SEGMENTS_GFS@"
IFS=', ' read -ra FCST_SEGMENTS_GFS <<< "${FCST_SEGMENTS_STR_GFS}"
if (( ${FCST_SEGMENT:- -1} < 0 )); then
    # Jobs other than the forecast don't care about segments, only the
    # absolute start and end
    declare -x FHMIN_GFS=${FCST_SEGMENTS_GFS[0]}
    declare -x FHMAX_GFS=${FCST_SEGMENTS_GFS[-1]}
else
    declare -x FHMIN_GFS=${FCST_SEGMENTS_GFS[${FCST_SEGMENT}]}
    declare -x FHMAX_GFS=${FCST_SEGMENTS_GFS[${FCST_SEGMENT}+1]}
fi
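
To make the selection logic in the snippet above concrete, here is a small standalone sketch. The segment string `0,48,96,120` and the function name `segment_bounds` are made up for illustration; in the workflow the string is filled in from a template. Negative array subscripts assume bash 4.2 or newer.

```shell
#!/usr/bin/env bash
# Hypothetical value for illustration; the real string comes from the template.
FCST_SEGMENTS_STR_GFS="0,48,96,120"
IFS=', ' read -ra FCST_SEGMENTS_GFS <<< "${FCST_SEGMENTS_STR_GFS}"

# Print "<start> <end>" for a segment index. With no index (or a negative
# one), print the absolute start and end of the whole forecast.
segment_bounds() {
    local seg="${1:--1}"
    if (( seg < 0 )); then
        echo "${FCST_SEGMENTS_GFS[0]} ${FCST_SEGMENTS_GFS[-1]}"
    else
        echo "${FCST_SEGMENTS_GFS[seg]} ${FCST_SEGMENTS_GFS[seg+1]}"
    fi
}

segment_bounds      # prints 0 120
segment_bounds 0    # prints 0 48
segment_bounds 2    # prints 96 120
```

Each segment ends at the hour where the next one begins, so N+1 entries in the string describe N segments.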
Contributor

Is there a way to not do any of this in a config file and calculate this in a j-job or exscript?

Contributor Author

Not without massive additional changes. FHMAX_GFS especially gets used later in this config, and then also in the job-specific configs that would be sourced immediately afterwards.

Contributor Author

Re-done how we discussed Mon afternoon.

@WalterKolczynski-NOAA WalterKolczynski-NOAA added CI-Hercules-Ready **CM use only** PR is ready for CI testing on Hercules and removed CI-Hercules-Passed **Bot use only** CI testing on Hercules for this PR has completed successfully labels Jul 29, 2024
@emcbot emcbot added CI-Hercules-Building **Bot use only** CI testing is cloning/building on Hercules CI-Hercules-Running **Bot use only** CI testing on Hercules for this PR is in-progress and removed CI-Hercules-Ready **CM use only** PR is ready for CI testing on Hercules CI-Hercules-Building **Bot use only** CI testing is cloning/building on Hercules labels Jul 29, 2024
@emcbot emcbot added the CI-Hercules-Passed **Bot use only** CI testing on Hercules for this PR has completed successfully label Jul 30, 2024
@WalterKolczynski-NOAA WalterKolczynski-NOAA added CI-Wcoss2-Ready **CM use only** PR is ready for CI testing on WCOSS and removed CI-Wcoss2-Failed **Bot use only** CI testing on WCOSS for this PR has failed labels Aug 10, 2024
@emcbot emcbot added CI-Wcoss2-Building **Bot use only** CI testing is cloning/building on WCOSS and removed CI-Wcoss2-Ready **CM use only** PR is ready for CI testing on WCOSS labels Aug 10, 2024
@emcbot

emcbot commented Aug 10, 2024

CI Update on Wcoss2 at 08/10/24 06:08:53 AM
============================================
Cloning and Building global-workflow PR: 2795
with PID: 174460 on host: clogin03

@emcbot emcbot added CI-Wcoss2-Running **Bot use only** CI testing on WCOSS for this PR is in-progress and removed CI-Wcoss2-Building **Bot use only** CI testing is cloning/building on WCOSS labels Aug 10, 2024
@emcbot

emcbot commented Aug 10, 2024

Automated global-workflow Testing Results:

Machine: Wcoss2
Start: Sat Aug 10 06:15:51 UTC 2024 on clogin03
---------------------------------------------------
Build: Completed at 08/10/24 06:51:39 AM
Case setup: Completed for experiment C48_ATM_c73eecd2
Case setup: Skipped for experiment C48mx500_3DVarAOWCDA_c73eecd2
Case setup: Skipped for experiment C48_S2SWA_gefs_c73eecd2
Case setup: Completed for experiment C48_S2SW_c73eecd2
Case setup: Completed for experiment C96_atm3DVar_extended_c73eecd2
Case setup: Skipped for experiment C96_atm3DVar_c73eecd2
Case setup: Completed for experiment C96_atmaerosnowDA_c73eecd2
Case setup: Completed for experiment C96C48_hybatmDA_c73eecd2
Case setup: Completed for experiment C96C48_ufs_hybatmDA_c73eecd2

@emcbot emcbot added CI-Wcoss2-Failed **Bot use only** CI testing on WCOSS for this PR has failed and removed CI-Wcoss2-Running **Bot use only** CI testing on WCOSS for this PR is in-progress labels Aug 10, 2024
@emcbot

emcbot commented Aug 10, 2024

Experiment C96C48_hybatmDA_c73eecd2 FAIL on Wcoss2 at 08/10/24 07:06:33 AM

Error logs:

/lfs/h2/emc/global/noscrub/globalworkflow.ci/GFS_CI_ROOT/PR/2795/RUNTESTS/COMROOT/C96C48_hybatmDA_c73eecd2/logs/2021122018/enkfgdasfcst_mem001.log
/lfs/h2/emc/global/noscrub/globalworkflow.ci/GFS_CI_ROOT/PR/2795/RUNTESTS/COMROOT/C96C48_hybatmDA_c73eecd2/logs/2021122018/enkfgdasfcst_mem002.log

Follow link here to view the contents of the above file(s): (link)

@WalterKolczynski-NOAA WalterKolczynski-NOAA added CI-Wcoss2-Ready **CM use only** PR is ready for CI testing on WCOSS and removed CI-Wcoss2-Failed **Bot use only** CI testing on WCOSS for this PR has failed labels Aug 10, 2024
@emcbot emcbot added CI-Wcoss2-Building **Bot use only** CI testing is cloning/building on WCOSS and removed CI-Wcoss2-Ready **CM use only** PR is ready for CI testing on WCOSS labels Aug 10, 2024
@emcbot

emcbot commented Aug 10, 2024

CI Update on Wcoss2 at 08/10/24 07:40:51 AM
============================================
Cloning and Building global-workflow PR: 2795
with PID: 168059 on host: clogin03

@emcbot emcbot added CI-Wcoss2-Running **Bot use only** CI testing on WCOSS for this PR is in-progress and removed CI-Wcoss2-Building **Bot use only** CI testing is cloning/building on WCOSS labels Aug 10, 2024
@emcbot

emcbot commented Aug 10, 2024

Automated global-workflow Testing Results:

Machine: Wcoss2
Start: Sat Aug 10 07:46:57 UTC 2024 on clogin03
---------------------------------------------------
Build: Completed at 08/10/24 08:22:39 AM
Case setup: Completed for experiment C48_ATM_d0cd5295
Case setup: Skipped for experiment C48mx500_3DVarAOWCDA_d0cd5295
Case setup: Skipped for experiment C48_S2SWA_gefs_d0cd5295
Case setup: Completed for experiment C48_S2SW_d0cd5295
Case setup: Completed for experiment C96_atm3DVar_extended_d0cd5295
Case setup: Skipped for experiment C96_atm3DVar_d0cd5295
Case setup: Completed for experiment C96_atmaerosnowDA_d0cd5295
Case setup: Completed for experiment C96C48_hybatmDA_d0cd5295
Case setup: Completed for experiment C96C48_ufs_hybatmDA_d0cd5295

@emcbot emcbot added CI-Wcoss2-Passed **Bot use only** CI testing on WCOSS for this PR has completed successfully and removed CI-Wcoss2-Running **Bot use only** CI testing on WCOSS for this PR is in-progress labels Aug 10, 2024
@emcbot

emcbot commented Aug 10, 2024

All CI Test Cases Passed on Wcoss2:

Experiment C48_ATM_d0cd5295 *** SUCCESS *** at 08/10/24 09:36:13 AM
Experiment C48_S2SW_d0cd5295 *** SUCCESS *** at 08/10/24 09:48:16 AM
Experiment C96C48_hybatmDA_d0cd5295 *** SUCCESS *** at 08/10/24 10:42:29 AM
Experiment C96_atmaerosnowDA_d0cd5295 *** SUCCESS *** at 08/10/24 11:33:18 AM
Experiment C96C48_ufs_hybatmDA_d0cd5295 *** SUCCESS *** at 08/10/24 12:03:21 PM
Experiment C96_atm3DVar_extended_d0cd5295 *** SUCCESS *** at 08/10/24 09:57:35 PM

@@ -24,7 +24,7 @@ if [[ "${DOIAU}" == "YES" ]]; then
     export aero_bkg_times="3,6,9"
     export JEDIYAML="${PARMgfs}/gdas/aero/variational/3dvar_fgat_gfs_aero.yaml.j2"
 else
-    export aero_bkg_times="6"
+    export aero_bkg_times="6," # Trailing comma is necessary so this is treated as a list
Contributor

Very clever!

Contributor

@DavidHuber-NOAA DavidHuber-NOAA left a comment

Looks good to me.

@DavidHuber-NOAA DavidHuber-NOAA merged commit 1d53953 into NOAA-EMC:develop Aug 12, 2024
5 checks passed
DavidHuber-NOAA added a commit to DavidHuber-NOAA/global-workflow that referenced this pull request Aug 13, 2024
…e_rocoto

* origin/develop:
  Jenkins Pipeline Updates (NOAA-EMC#2815)
  Add Gaea C5 to CI (NOAA-EMC#2814)
  Add support for forecast-only runs on AWS (NOAA-EMC#2711)
  Add fixes to products for when REPLAY IC's are used  (NOAA-EMC#2755)
  Add capability to run forecast in segments (NOAA-EMC#2795)
@WalterKolczynski-NOAA WalterKolczynski-NOAA deleted the feature/fcst_segments branch August 21, 2024 21:28
Labels
CI-Hera-Passed **Bot use only** CI testing on Hera for this PR has completed successfully CI-Hercules-Passed **Bot use only** CI testing on Hercules for this PR has completed successfully CI-Wcoss2-Passed **Bot use only** CI testing on WCOSS for this PR has completed successfully
Development

Successfully merging this pull request may close these issues.

Need capability to run multiple forecast segments
7 participants