Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Copy land increments and land DA confs to COM #1797

Merged
Merged
Show file tree
Hide file tree
Changes from 9 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion Externals.cfg
Original file line number Diff line number Diff line change
Expand Up @@ -57,7 +57,7 @@ protocol = git
required = False

[GDASApp]
hash = 09757ce
hash = 2774a10
local_path = sorc/gdas.cd
repo_url = https://github.com/NOAA-EMC/GDASApp.git
protocol = git
Expand Down
2 changes: 1 addition & 1 deletion sorc/checkout.sh
Original file line number Diff line number Diff line change
Expand Up @@ -160,7 +160,7 @@ if [[ ${checkout_gsi} == "YES" ]]; then
fi

if [[ ${checkout_gdas} == "YES" ]]; then
checkout "gdas.cd" "https://github.com/NOAA-EMC/GDASApp.git" "09757ce"; errs=$((errs + $?))
checkout "gdas.cd" "https://github.com/NOAA-EMC/GDASApp.git" "2774a10"; errs=$((errs + $?))
fi

if [[ ${checkout_gsi} == "YES" || ${checkout_gdas} == "YES" ]]; then
Expand Down
28 changes: 25 additions & 3 deletions ush/python/pygfs/task/land_analysis.py
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,7 @@ def __init__(self, config):

_res = int(self.config['CASE'][1:])
_window_begin = add_to_datetime(self.runtime_config.current_cycle, -to_timedelta(f"{self.config['assim_freq']}H") / 2)
_window_begin = add_to_datetime(_window_begin, -to_timedelta(f"1S"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to do this "correctly, now don't we need to make the window_length 6hours + 1 second?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CoryMartin-NOAA I conducted and compared two runs: one with window length 6 hours and the other with 6 hours + 1 second. They do show some difference from the land increment files as below:

Tile 1: 
Variable Group Count        Sum     AbsSum          Min        Max       Range        Mean      StdDev
snodl    /       432 0.00534215 0.00534219 -4.60635e-09 0.00056797 0.000567975 1.23661e-05 6.58902e-05

Tile 2: 
Variable Group Count       Sum  AbsSum      Min     Max   Range         Mean   StdDev
snodl    /      1213 -0.570553 65.0728 -3.67096 3.36201 7.03297 -0.000470365 0.306991

Tile 3: 
Variable Group Count Sum AbsSum     Min     Max   Range Mean StdDev
snodl    /       982 nan    nan -94.825 98.1387 192.964  nan    nan
vtype    /       105 nan    nan     inf    -inf    -inf  nan    nan
slmsk    /       105 nan    nan     inf    -inf    -inf  nan    nan

Tile 4, 5, 6 are identical. 

Let's talk at tomorrow's meeting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CoryMartin-NOAA @barlage When I conducted the snow DA cycling run with window length 6 hours + 2 seconds which make the DA window centering at the cycle time, all the nan values were gone. Therefore, the DA window centering at the cycle time is important.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CORRECTIONS:

My above statements are incorrect, because there are still nan values when the above setting with the DA window beginning at 1 second back and window length 6 hours + 2 seconds.

The above with no nan values came from the experiment with the DA window beginning at 2 second back and window length 6 hours + 1 seconds.

Copy link

@barlage barlage Aug 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since @CoryMartin-NOAA questioned in the meeting earlier, we looked at an increment file and not all of the grids are NaN, only about 100-200 and they look to be in locations where there are snow observations, which can seen from the above nccmp. Also, there must be some increment limits in the add_increment code since these do not affect the analysis.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@CoryMartin-NOAA Okay, I will change the window length to 6hr 1sec.

Copy link
Contributor

@aerorahul aerorahul Aug 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand, if the results are the same for 6hr, 6hr-1s, 6hr+1s, then why make the change?
Also, shouldn't this be handled in the observation processing part rather than in the workflow?
Consider the case where the observations are in a single file for 24h (for e.g.), shouldn't this be teased out in the yaml?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aerorahul this is because the 'JEDI' window convention is the opposite of the 'NCEP' window convention, and we lost that battle with Yannick a while ago. While it may be identical in this case, it shouldn't be. Observations at exactly 03z would not make it into the 06z analysis without this change. I don't quite understand you comment in the second sentence, this is what defines the window variable that goes into the templated YAML.

Copy link
Contributor

@aerorahul aerorahul Aug 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand. Lets discuss offline.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @CoryMartin-NOAA for the detailed explanations.
@aerorahul Without this change, about several hundreds of observations will be lost in each cycle. For example, the observations increased from 1718 to 2161 with this changes.

_letkfoi_yaml = os.path.join(self.runtime_config.DATA, f"{self.runtime_config.RUN}.t{self.runtime_config['cyc']:02d}z.letkfoi.yaml")

# Create a local dictionary that is repeatedly used across this class
Expand All @@ -44,7 +45,7 @@ def __init__(self, config):
'npz_ges': self.config.LEVS - 1,
'npz': self.config.LEVS - 1,
'LAND_WINDOW_BEGIN': _window_begin,
'LAND_WINDOW_LENGTH': f"PT{self.config['assim_freq']}H",
'LAND_WINDOW_LENGTH': f"PT{self.config['assim_freq']}H1S",
'OPREFIX': f"{self.runtime_config.RUN}.t{self.runtime_config.cyc:02d}z.",
'APREFIX': f"{self.runtime_config.RUN}.t{self.runtime_config.cyc:02d}z.",
'jedi_yaml': _letkfoi_yaml
Expand Down Expand Up @@ -323,8 +324,10 @@ def execute(self) -> None:
def finalize(self) -> None:
"""Performs closing actions of the Land analysis task
This method:
- copies analysis back to COM/ from DATA/
- tar and gzips the JEDI diagnostic files
- tar and gzip the output diag files and place in COM/
- copy the generated YAML file from initialize to the COM/
- copy the analysis files to the COM/
- copy the increment files to the COM/

Parameters
----------
Expand All @@ -336,6 +339,15 @@ def finalize(self) -> None:
statfile = os.path.join(self.task_config.COM_LAND_ANALYSIS, f"{self.task_config.APREFIX}landstat.tgz")
self.tgz_diags(statfile, self.task_config.DATA)

logger.info("Copy full YAML to COM")
src = os.path.join(self.task_config['DATA'], f"{self.task_config['CDUMP']}.t{self.runtime_config['cyc']:02d}z.letkfoi.yaml")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
src = os.path.join(self.task_config['DATA'], f"{self.task_config['CDUMP']}.t{self.runtime_config['cyc']:02d}z.letkfoi.yaml")
src = os.path.join(self.task_config['DATA'], f"{self.task_config['RUN']}.t{self.runtime_config['cyc']:02d}z.letkfoi.yaml")

Please don't use CDUMP anymore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, isn't this already defined as self.task_config.APREFIX? Why do we need to reconstruct that again here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @aerorahul for quick review. Good catch, and I made the changes as suggested.

dest = os.path.join(self.task_config.COM_LAND_ANALYSIS, f"{self.task_config['CDUMP']}.t{self.runtime_config['cyc']:02d}z.letkfoi.yaml")
yaml_copy = {
'mkdir': [self.task_config.COM_LAND_ANALYSIS],
'copy': [[src, dest]]
}
FileHandler(yaml_copy).sync()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?
COM is for data not configuration. The UFS-weather-model has set a precedent for copying namelists and configuration stuff into COM. If there is a need to copy this data for every cycle, we need a proper solution with a well defined set of requirements.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not? This is a few kb file that is a record of the exact configuration needed to reproduce results. Especially in the land case, this is different at 18z than it is at other cycles. When observations and QC have to be turned on/off these YAMLs will not always be identical and should be retained.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @CoryMartin-NOAA. In addition, I include copying the YAML files for archive to keep consistent with other DA component such as aero DA.

Copy link
Contributor

@aerorahul aerorahul Aug 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An update to this comment:

After discussions w/ NCO, if we wish to retain cycle specific configuration artifacts such as namelists, etc. that are not already part of HOMEgfs, we can do so in a single location in COM. @CoryMartin-NOAA and @aerorahul discussed a good place for said configuration artifacts would be ROTDIR/RUN.PDY/cyc/conf.
The jobs will need to ensure that they identify the artifacts uniquely; for e.g. input.nml from the ufs-weather-model should be saved as ufs_input.nml

This will require creating a COM_CONF_TMPL in config.com and then filling the template with appropriate substitutions in the j-jobs that wish to save the artifacts.

Please let me know if you have any questions.
@WalterKolczynski-NOAA FYI

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @aerorahul for the updates.
@CoryMartin-NOAA @aerorahul Do you want me to include this change in this PR by putting the yaml file into the ROTDIR/RUN.PDY/cyc/conf?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As suggested, I create COM_CONF_TMPL in config.com and copy the YAML file into ROTDIR/RUN.PDY/cyc/conf. I still keep using the previous name (e.g., gdas.t18z.letkfoi.yaml). What do you suggest the change for the file name?


logger.info("Copy analysis to COM")
template = f'{to_fv3time(self.task_config.current_cycle)}.sfc_data.tile{{tilenum}}.nc'
anllist = []
Expand All @@ -346,6 +358,16 @@ def finalize(self) -> None:
anllist.append([src, dest])
FileHandler({'copy': anllist}).sync()

logger.info('Copy increments to COM')
template = f'landinc.{to_fv3time(self.task_config.current_cycle)}.sfc_data.tile{{tilenum}}.nc'
aerorahul marked this conversation as resolved.
Show resolved Hide resolved
inclist = []
for itile in range(1, self.task_config.ntiles + 1):
filename = template.format(tilenum=itile)
src = os.path.join(self.task_config.DATA, 'anl', filename)
dest = os.path.join(self.task_config.COM_LAND_ANALYSIS, filename)
inclist.append([src, dest])
FileHandler({'copy': inclist}).sync()

@staticmethod
@logit(logger)
def get_bkg_dict(config: Dict) -> Dict[str, List[str]]:
Expand Down