Stat Analysis Implementation #6

Draft: wants to merge 39 commits into base: feature/anlstat

Conversation

kevindougherty-noaa
Collaborator

First pass at adding stat analysis script to copy appropriate stat files.

kevindougherty-noaa and others added 20 commits August 20, 2024 13:42
Update the test for UNAVAILABLE in the `rocotocheck.py` script from
inclusion to count: `if ['UNAVAILABLE'] > 0`.
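
The difference between the two tests can be sketched as follows. This is a minimal illustration, assuming the status counts live in a dictionary where every state is pre-initialized to zero; the names `status_counts`, `unavailable_by_inclusion`, and `unavailable_by_count` are hypothetical and not taken from `rocotocheck.py`.

```python
# Hypothetical status tally: every state is present, even when its count is zero.
status_counts = {"SUCCEEDED": 12, "RUNNING": 3, "UNAVAILABLE": 0}


def unavailable_by_inclusion(counts):
    """Inclusion test: true whenever the key exists, regardless of its value."""
    return "UNAVAILABLE" in counts


def unavailable_by_count(counts):
    """Count test: true only when at least one task is UNAVAILABLE."""
    return counts.get("UNAVAILABLE", 0) > 0


print(unavailable_by_inclusion(status_counts))  # True
print(unavailable_by_count(status_counts))      # False
```

Under that assumption, the inclusion test fires even when no task is unavailable, which is the situation a count-based test avoids.
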
This PR converts the staging job from shell to python and introduces the
use of yaml.

Changes in this PR:

1. Rename `scripts/exglobal_stage_ic.sh` to
`scripts/exglobal_stage_ic.py`.
2. Update `jobs/JGLOBAL_STAGE_IC` to use `.py` script extension. Move
`COM*` variable declarations and member loop down into yaml and python
respectively. Move `GDATE/gPDY/gcyc` settings up to JJOB from ex-script
and replace with newer cycle variables (as done in forecast job).
3. Create `parm/stage` folder to hold newly created `stage.yaml.j2`,
which both mimics forecast-only functionality in existing
`scripts/exglobal_stage_ic.sh` and adds functionality for cycled mode.
4. Create `ush/python/pygfs/task/stage.py` to house staging job python
functions for call from `scripts/exglobal_stage_ic.py`.
5. Remove `stage_ic` job rocoto dependencies from xml. They are not needed, and
removing them eliminates an area of duplicate maintenance.
6. Add cycled staging jobs for gdas and enkf suites.
7. Rename `model_data` to `model` for issue NOAA-EMC#2686

There will now be distinct `stage_ic` jobs for each `RUN`:
`gdasstage_ic`, `gfsstage_ic`, `enkfgdasstage_ic`, `stage_ic` (for
gefs).

Related work was done to set up new symlink folder structure under
supported platform `ICSDIR` folder for use by updated staging job.

Resolves NOAA-EMC#2475
Resolves NOAA-EMC#2650
Resolves NOAA-EMC#2686

---------

Co-authored-by: Rahul Mahajan <[email protected]>
Co-authored-by: Walter Kolczynski - NOAA <[email protected]>
Co-authored-by: David Huber <[email protected]>
This PR adds support for ATM forecast-only runs on Azure.

This PR adds the capability to update the ensemble of snow states 
by recentering the ensemble mean to the deterministic snow analysis
and applying increments as appropriate.

Resolves NOAA-EMC#2585

---------

Co-authored-by: David Huber <[email protected]>
Co-authored-by: Rahul Mahajan <[email protected]>
Co-authored-by: Guillaume Vernieres <[email protected]>
Co-authored-by: AntonMFernando-NOAA <[email protected]>
Co-authored-by: Anil Kumar <[email protected]>
Co-authored-by: TerrenceMcGuinness-NOAA <[email protected]>
Add a parameter "pass_full_omega_to_physics_in_non_hydrostatic_mode" to the
FV3 namelist. It is set to "true" to use a new method to diagnose
omega. This PR is based on ufs-community/ufs-weather-model#2327.

The corresponding parameter was changed in the GFSv17-related regression tests
(ufs-community/ufs-weather-model#2373).
Changes to make the GEFS C48 case run on AWS.

Now that C48 ATM forecast-only runs work on AWS, the next step is to make GEFS
C48 run on AWS.
This PR changes the AWS env and yaml files.

Resolves NOAA-EMC#2817
Refs NOAA-EMC#2711
Support global-workflow ATM forecast-only runs on Google Cloud.

Add and modify env, yaml, and Python scripts so that global-workflow
ATM forecast-only runs work on Google Cloud.

  Resolves NOAA-EMC#2831
  Refs NOAA-EMC#2826
  Refs NOAA-EMC#2711
The git-lfs module is required to run CI tests on Gaea. This PR adds the
module and adds Gaea to the Jenkinsfile.
- Module (git-lfs) added in modulefiles/module_gwsetup.gaea.lua
- Gaea added to the Jenkinsfile
This PR adds JEDI ATM lgetkf observer and solver jobs to
global-workflow. This approach is akin to the GSI-based eobs and eupd jobs.
Splitting the single JEDI ATM lgetkf job into separate observer and
solver jobs improves memory and computational efficiency.

Resolves NOAA-EMC#2415
Make ATM-OCN-ICE coupling model run on AWS.

This adds capability to run UFS atm-ocn-ice coupling on AWS.

Resolves NOAA-EMC#2858
This PR corrects a bug in the staging job for ocean `MOM.res_#` IC
files. The `OCNRES` value was coming in as an integer (e.g. `25`), but
the `ocean.yaml.j2` file was checking for `"025"`. The staging script now
sets `OCNRES` to three digits, and the for loop range is corrected to
include the third file.

Resolves NOAA-EMC#2864
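
The type mismatch described in the staging fix above can be illustrated with a minimal Python sketch. The function name `format_ocnres` and the loop bounds are hypothetical and not taken from the staging script; the sketch only shows how an integer resolution can be zero-padded to three digits so that a comparison against `"025"` succeeds, and how an exclusive upper bound can drop the last file.

```python
def format_ocnres(ocnres):
    """Return the ocean resolution as a three-digit, zero-padded string."""
    return f"{int(ocnres):03d}"


# The integer 25 never equals the string "025" ...
print(25 == "025")                 # False
# ... but the zero-padded form does.
print(format_ocnres(25) == "025")  # True

# Hypothetical file loop: range(1, 3) stops at 2, so the upper bound
# has to be 4 for a third file to be included.
restart_names = [f"MOM.res_{i}" for i in range(1, 4)]
print(restart_names)  # ['MOM.res_1', 'MOM.res_2', 'MOM.res_3']
```
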
…AA-EMC#2738)

This PR adds in support for computing files needed for the aerosol
analysis **B**. This includes a new task, `aeroanlgenb`. This work was
performed by both me and @andytangborn

Resolves NOAA-EMC#2501
Resolves NOAA-EMC#2737

---------

Co-authored-by: Andrew.Tangborn <[email protected]>
Co-authored-by: Walter Kolczynski - NOAA <[email protected]>
Adds files `atmi009.nc`, `atmi003.nc`, `ratmi009.nc`, and `ratmi003.nc`
to list of files to be staged for ICs, if available. These are necessary
for starting an IAU run, and are currently missing.

Resolves NOAA-EMC#2874
# Description

Support global-workflow GEFS C48 on Google Cloud.

Make env. var. and yaml file changes, so global-workflow GEFS C48 case
can run properly on Google Cloud.

Resolves NOAA-EMC#2860
AnningCheng-NOAA and others added 9 commits September 6, 2024 12:02
Use the updated 2013 to 2024 mean MERRA2 climatology instead of 2003 to
2014 mean

Depends on NOAA-EMC#2887 
Refs: ufs-community/ufs-weather-model#2272
Refs: ufs-community/ufs-weather-model#2273
…MC#2893)

This changes the order of the cleanup job so that the working directory
is deleted at the end. It also adds the `-ignore_readdir_race` flag to
`find` to prevent errors if a file was deleted after the list of files
was collected. This can happen if two consecutive cycles run the cleanup
job at the same time.
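
For readers more familiar with Python than with `find`, the race being guarded against can be sketched as below. This is an analogue of the behavior, not the workflow's cleanup code: the function name `remove_old_files` and its parameters are hypothetical, and it uses a `try`/`except` around each file instead of the `-ignore_readdir_race` flag.

```python
import os
import time


def remove_old_files(root, max_age_seconds, now=None):
    """Delete files under root older than max_age_seconds.

    A file that disappears between listing and deletion (for example because
    another cleanup job removed it first) is skipped instead of raising.
    Returns the number of files this call removed.
    """
    now = time.time() if now is None else now
    removed = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if now - os.stat(path).st_mtime > max_age_seconds:
                    os.remove(path)
                    removed += 1
            except FileNotFoundError:
                # Another process deleted the file after it was listed.
                continue
    return removed
```
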
This updates the model hash to include the UPP update needed to be able
to run the post processor on Orion, thus reenabling support on that
system.

A note on the UPP: it is using a newer version of g2tmpl that requires a
separate spack-stack 1.6.0 installation. This version of g2tmpl will be
standard in spack-stack 1.8.0, but for now requires loading separate
modules for the UPP.

A note on running analyses on Orion: due to a yet-unknown issue causing
the BUFR library to run much slower on Orion when compared with Rocky 8,
the GSI and GDASApp are expected to run significantly slower than on any
other platform (on the order of an hour longer).

Lastly, I made adjustments to the build_all.sh script to send more cores
to compiling the UFS and GDASApp. Under this configuration, the GSI,
UPP, UFS_Utils, and WW3 pre/post executables finish compiling before the
UFS when run with 20 cores.

Resolves NOAA-EMC#2694 
Resolves NOAA-EMC#2851 

---------

Co-authored-by: Rahul Mahajan <[email protected]>
Co-authored-by: Walter.Kolczynski <[email protected]>
…#2816)

- This task is an extension of the empty arch job previously merged.
- This feature adds an archive task to the GEFS system to archive files
locally.
- This feature archives files in the ensstat directory.

Resolves NOAA-EMC#2698
Refs NOAA-EMC#832 NOAA-EMC#2772
The current operational BUFR job begins concurrently with the GFS model
run. This PR updates the script and ush to process all forecast-hour
data simultaneously, then combine the temporary outputs to create BUFR
sounding products for each station. The updated job starts
processing data only after the GFS model completes its 180-hour run and
handles all forecast files from 000hr to 180hr at once. The new
version of the job needs 7 nodes instead of the 4 nodes used
operationally.

This PR depends on the GFS bufr code update NOAA-EMC/gfs-utils#75

With the updates of bufr codes and scripts, there is no need to add
restart capability to GFS post-process job JGFS_ATMOS_POSTSND.

This PR also includes the following changes:

Rename the following table files:

parm/product/bufr_ij13km.txt to parm/product/bufr_ij_gfs_C768.txt
parm/product/bufr_ij9km.txt to parm/product/bufr_ij_gfs_C1152.txt

Add a new table file: parm/product/bufr_ij_gfs_C96.txt for GFSv17 C96
testing.

Add a new capability to the BUFR package. The job first tries to read
bufr_ij_gfs_${CASE}.txt. If the table file is not available, the code
automatically finds the nearest neighbor grid point (i, j).
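
The table-first, nearest-neighbor-fallback behavior can be sketched as follows. This is only an illustration under stated assumptions: a regular latitude/longitude grid given as two 1-D axes, a great-circle distance metric, and 0-based indices. The names `haversine`, `nearest_grid_point`, and `lookup_ij` are hypothetical and do not come from the BUFR code.

```python
import math


def haversine(lat1, lon1, lat2, lon2):
    """Great-circle distance in radians between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * math.asin(min(1.0, math.sqrt(a)))


def nearest_grid_point(lat, lon, grid_lats, grid_lons):
    """Return 0-based (i, j) of the closest point; i indexes longitude, j latitude."""
    best = None
    for j, glat in enumerate(grid_lats):
        for i, glon in enumerate(grid_lons):
            dist = haversine(lat, lon, glat, glon)
            if best is None or dist < best[0]:
                best = (dist, i, j)
    return best[1], best[2]


def lookup_ij(station, table, lat, lon, grid_lats, grid_lons):
    """Prefer the precomputed table entry; otherwise search for the nearest point."""
    if table is not None and station in table:
        return table[station]
    return nearest_grid_point(lat, lon, grid_lats, grid_lons)
```

The brute-force search is quadratic in grid size and is chosen here for clarity; a production implementation would likely use a faster lookup.
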

Refs NOAA-EMC#1257
Refs NOAA-EMC/gfs-utils#75
This PR creates a PyGFS class called JEDI, which is to be instantiated
every time a JEDI application is run. The AtmAnalysis and AtmEnsAnalysis
classes are no longer children of the Analysis class, but rather direct
children of the Task class. They each have a JEDI object as an
attribute, which is used to run either the variational/ensemble DA JEDI
applications or the FV3 increment converter JEDI application, depending
on which job they are created for (e.g. atmanlvar vs. atmanlfv3inc). The
intention is that a later PR will apply this framework to all analysis
tasks, and the PyGFS Analysis class will be removed.
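
The move from inheritance to composition described above can be sketched roughly as follows. Every class body here is a simplified stand-in, and the executable names and the `job` config key are placeholders invented for illustration; none of this reflects the actual PyGFS signatures.

```python
class Task:
    """Minimal stand-in for a workflow task base class (hypothetical)."""

    def __init__(self, config):
        self.config = dict(config)


class Jedi:
    """Hypothetical helper that wraps one JEDI application run."""

    def __init__(self, executable, yaml_name):
        self.executable = executable
        self.yaml_name = yaml_name

    def command(self):
        """Return the command line that would launch the application."""
        return [self.executable, f"{self.yaml_name}.yaml"]


# Placeholder executable names keyed by job; the real names are not given here.
JOB_TO_EXECUTABLE = {
    "atmanlvar": "variational_app.x",
    "atmanlfv3inc": "fv3_increment_app.x",
}


class AtmAnalysis(Task):
    """Direct child of Task that owns a Jedi object as an attribute."""

    def __init__(self, config):
        super().__init__(config)
        job = self.config["job"]
        self.jedi = Jedi(JOB_TO_EXECUTABLE[job], yaml_name=job)

    def execute(self):
        """Delegate the run to the owned Jedi object."""
        return self.jedi.command()


print(AtmAnalysis({"job": "atmanlvar"}).execute())
# ['variational_app.x', 'atmanlvar.yaml']
```
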
This PR:
- Creates a standalone page for FAQ and Common issues
- Adds a block of caution on using variables in a user's `bashrc`

Fixes: NOAA-EMC#2850
Owner

@CoryMartin-NOAA CoryMartin-NOAA left a comment

Some comments, let's discuss this AM

@@ -1,6 +1,7 @@
#! /usr/bin/env bash

source "${HOMEgfs}/ush/preamble.sh"
export DATA=${DATA:-${DATAROOT}/${RUN}statanl_${cyc}}
Owner

This is only needed if we intend on changing the path for DATA from the default

# Get task specific resources
source "${EXPDIR}/config.resources" anlstat

echo "END: config.anlstat"
Owner

nearly duplicate file? probably needs removed

@@ -5,7 +5,7 @@
import os

from wxflow import Logger, cast_strdict_as_dtypedict
from pygfs.task.atm_analysis import AtmAnalysis
from pygfs.task.stat_analysis import StatAnalysis
Owner

We should change to using the new Jedi class

@@ -252,7 +252,7 @@ fi
#------------------------------
if [[ -d "${HOMEgfs}/sorc/gdas.cd" ]]; then
cd "${HOMEgfs}/parm/gdas" || exit 1
-declare -a gdasapp_comps=("aero" "atm" "io" "ioda" "snow" "soca" "jcb-gdas" "jcb-algorithms")
+declare -a gdasapp_comps=("aero" "atm" "io" "ioda" "snow" "soca" "stats" "jcb-gdas" "jcb-algorithms")
Owner

thinking about this more, I think we just need snow/stats, soca/stats, aero/stats, atm/stats, rather than a stats dir

logger = getLogger(__name__.split('.')[-1])


class StatAnalysis(Analysis):
Owner

see comment above about that we should migrate to the new Jedi class


DavidHuber-NOAA and others added 9 commits September 10, 2024 11:48
This modifies the way the `config` dictionary is constructed and
referenced. Rather than updating a single configuration dictionary with
each `RUN`, a `RUN`-based dictionary of `config` dictionaries is created
and referenced by the appropriate `RUN` when calculating resources.

This also makes the methods that were hidden before NOAA-EMC#2727 hidden again.

Resolves NOAA-EMC#2783
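
The `RUN`-keyed layout described above can be sketched as follows. This is a minimal illustration with made-up keys and values (`walltime`, `ntasks`) and a hypothetical helper `build_run_configs`; it is not the workflow's actual configuration code.

```python
import copy


def build_run_configs(base_config, run_overrides):
    """Return {RUN: config}, giving each RUN its own independent dictionary."""
    configs = {}
    for run, overrides in run_overrides.items():
        cfg = copy.deepcopy(base_config)  # never share state between RUNs
        cfg.update(overrides)
        configs[run] = cfg
    return configs


# Illustrative values only.
base = {"walltime": "00:30:00", "ntasks": 40}
overrides = {"gdas": {"ntasks": 80}, "gfs": {"walltime": "02:00:00"}}

configs = build_run_configs(base, overrides)
print(configs["gdas"]["ntasks"])  # 80
print(configs["gfs"]["ntasks"])   # 40
```

Because each `RUN` gets its own copy, settings applied for one `RUN` cannot leak into the resource calculation for another, which is the risk when a single dictionary is updated in place.
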
This replaces `APRUN` with `APRUN_default` in all of the `.env` files.

Resolves NOAA-EMC#2870
This adds 3 missing links from the UPP into parm/ufs to .gitignore.

Resolves NOAA-EMC#2901
This reverts commit 7893aa1.