-
Notifications
You must be signed in to change notification settings - Fork 168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CDO based post-processing application #1871
CDO based post-processing application #1871
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shellcheck found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.
Please use the PR template (check your comment delineators). |
ush/cdo_post.sh
Outdated
while IFS= read -r line; do | ||
|
||
# Get the attributes for the respective variable(s). | ||
varname=$(echo "${line}" | $(command -v awk) "{print $1}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The shellnorms
linter insists on double quotes around the awk
command; however this will cause awk
to fail within this bash environment. Single quotes are required.
Is there an issue with the bash script linter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error is SC2016 (info): Expressions don't expand in single quotes, use double quotes for that.
which is incorrect and will cause the script to break.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a shellcheck directive the line above to mute the warning. We've had to do this in other places where single-quotes are necessary (like config.com).
…ed or a related application these files can be parsed to build the correct runtime inputs.
@HenryWinterbottom-NOAA Can you modify |
@EricSinsky-NOAA Yes, I will do that in a separate PR. We will need an additional script beneath |
@HenryWinterbottom-NOAA Ok. When you modify |
@EricSinsky-NOAA Can you please provide me documentation as to what exactly those scripts are doing (e.g., what are the respective environment variables and what are they used for)? The scripts are not documented and I am not familiar enough with NCL to guess what is happening in the respective NCL scripts. |
@HenryWinterbottom-NOAA Sure, I can provide you more information on these scripts. |
@HenryWinterbottom-NOAA Since the rotation angles in MOM6 are located on cell centers and the ocean velocities are on the cell edges, you must first regrid the ocean velocities to the center cell location before rotation. Is that what your CDO script does? |
@DeniseWorthen Yes. I think the issue was the wrong rotation (i.e., earth relative versus grid relative). |
@HenryWinterbottom-NOAA What issue? The NCL scripts and the python scripts that @EricSinsky-NOAA built from them both regridded the velocities to the center, then rotated. |
# $1 - The path to the variable-interpolated netCDF file. | ||
# | ||
# Global Variables: | ||
# output_path - The path to the output netCDF file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make output_path an argument. Don't rely on globals.
ush/remap.sh
Outdated
echo "netCDF-formatted file path ${output_path} exists; merging ${var_interp_path}" | ||
cdo merge "${output_path}" "${var_interp_path}" "${tmp_nc_file}" | ||
mv "${tmp_nc_file}" "${output_path}" | ||
rm "${var_interp_path}" >> /dev/null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How damaging is it if the file isn't removed, and why is STDOUT being trashed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is to remove files that are no longer needed. Once the interpolated file, for the respective variable, is merged (i.e., appended) to the output netCDF file path it is no longer needed. If the files remain they will contribute to the disk footprint without reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, what I was getting at was: should this be rm -f
to make sure it gets removed? And still not sure why the STDOUT is being redirected to /dev/null
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I piped the stdout/stderr to a file in in the runtime directory. See REMAP_LOG
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still not sure why this is even necessary. It's not like rm
prints out a bunch of garbage output.
# output_path - The path to the output netCDF file. | ||
# dstgrid_config - The configuration for destination grid specifications. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, make these arguments.
I think Jiande's point is that near the equator rotation is not in play |
@LydiaStefanova-NOAA pointed me to these three files: original tripole: /scratch1/NCEPDEV/climate/Jiande.Wang/For-others/For-Eric/ocn2013100106.01.2013100100.nc I'm looking at a point which is 5N, 224W (136E). Tripole grid, ssu and ssv using ferret's internal regrid-to-tgrid:
Python script
CDO
|
Please try the following file: |
@HenryWinterbottom-NOAA After comparing the CDO-generated results ( |
My apologies, I accidentally compared @HenryWinterbottom-NOAA's current CDO-generated result to his previous CDO-generated result. Please disregard. |
Thank you for checking those differences, @LydiaStefanova-NOAA. Also, after further checking, it turns out that my results actually are correctly comparing the CDO-generated results (vector_reorder.nc) to the Python-generated results. |
@LydiaStefanova-NOAA Thank you for taking a look. The issue with a tripolar projection is that different interpolation schemes will handle it differently. It appears that ESMF and CDO are a case of that. |
@LydiaStefanova-NOAA The Please see RDHPCS Hera file |
@LydiaStefanova-NOAA OK, the difference was what I expected. I was previously trying to resolve the discontinuity. Please check the updated RDHPCS Hera file |
@LydiaStefanova-NOAA The "gap" is due to how the mask is treated for ESMF versus CDO. All of the above being noted, how do you suggest that we proceed? |
parm/post/mom6_interp.csv
Outdated
SST remapbil MOM6_INTERP_NC 0 | ||
SSS remapbil MOM6_INTERP_NC 0 | ||
SSH remapbil MOM6_INTERP_NC 0 | ||
speed remapbil MOM6_INTERP_NC 0 | ||
MLD_003 remapbil MOM6_INTERP_NC 0 | ||
so remapbil MOM6_INTERP_NC 0 | ||
temp remapbil MOM6_INTERP_NC 0 | ||
latent remapbil MOM6_INTERP_NC 0 | ||
sensible remapbil MOM6_INTERP_NC 0 | ||
SW remapbil MOM6_INTERP_NC 0 | ||
LW remapbil MOM6_INTERP_NC 0 | ||
evap remapbil MOM6_INTERP_NC 0 | ||
lprec remapbil MOM6_INTERP_NC 0 | ||
LwLatSens remapbil MOM6_INTERP_NC 0 | ||
Heat_PmE remapbil MOM6_INTERP_NC 0 | ||
SSU,SSV remapbil MOM6_INTERP_NC 1 cos_rot,sin_rot | ||
uo,vo remapbil MOM6_INTERP_NC 1 cos_rot,sin_rot | ||
taux,tauy remapbil MOM6_INTERP_NC 1 cos_rot,sin_rot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be consistent with the NCL/Python remapping programs, The following variables should use conservative remapping (referred to as remapcon in CDO):
- latent
- sensible
- SW
- LW
- evap
- lprec
- fprec
- LwLatSens
- Heat_PmE
- taux,tauy
…to prepare the input files, specifically those for MOM6, such that they can be properly formatted using .
strip_whitespace "${angle_array[0]}" | ||
cosang="${out_string}" | ||
cdo_remap "${cosang}" "${varfile}" "${interp_type}" "${nc_output_path}" | ||
strip_whitespace "${angle_array[1]}" | ||
sinang="${out_string}" | ||
cdo_remap "${sinang}" "${varfile}" "${interp_type}" "${nc_output_path}" | ||
cdo -expr,"xr=${cosang}*${xvar}+${sinang}*${yvar}; yr=${cosang}*${yvar}-${sinang}*${xvar}" -selname,"${xvar}","${yvar}","${cosang}","${sinang}" "${nc_output_path}" "${var_rotate_path}" >> "${REMAP_LOG}" 2>&1 | ||
varname_update "xr" "${xvar}" "${var_rotate_path}" | ||
cdo_remap "${xvar}" "${var_rotate_path}" "${interp_type}" "${output_path}" | ||
varname_update "yr" "${yvar}" "${var_rotate_path}" | ||
cdo_remap "${yvar}" "${var_rotate_path}" "${interp_type}" "${output_path}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the vector remapping, it is suggested to do the following so that there is more consistency with the NCL/Python remapping programs:
- Replace missing values with 0.0 before any interpolation or vector rotation.
- On the interpolated u and v fields, for grid points where the interpolated mask equals 1 (non-ocean points), set those values on the interpolated u and v fields to a netcdf FillValue. This mask can be derived from a 2D and 3D scalar variable (e.g. SST for the 2D variable and Temperature for the 3D variable).
I think once the above 2 items are added, the CDO remapping will be in great shape.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace missing values with 0.0 before any interpolation or vector rotation.
This was explored and produced a discontinuity due to the tripolar remapping as noted by @LydiaStefanova-NOAA . Therefore this does not seem like a usable solution.
The CDO application does not provide a way, at least that I am aware of, to handle the mask as you are suggesting. If there is a feature that I am unaware of I would ask that you fork my branch and attempt to add it. Otherwise this issue and PR need to be closed ASAP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no CDO feature that I am currently aware of that can handle masks in such a way. It may require some more time and attention to add these features. However, since these items are only suggestions and not required for the CDO-based remapping to fit in the global-workflow, these items can be addressed in a separate issue and future PR only if needed, and if there are no objections from anyone. As remap.sh
currently stands, I still think the results are considerably close to the original NCL remapping program ush/ocnpost.ncl
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not follow this repo, so I apologize if I'm out-of sync. A basic question: How (or where) does CDO obtain the land mask for the destination grid?
In the NCL/Python versions, the destination land mask can be thought of as an "output", meaning the destination land mask is given by the points where a source grid ocean point cannot be mapped fully to a destination grid point (ie, the destination point would be represented by a combination of a land and ocean point on the source grid). This is more and more important as the bathymetry closes in since the ocean model bathymetry will not in general match something like etopo5.
This is an entirely separate issue than the polar seam. That is the result of the fact that the last latitude on the T-cell tripole grid is at 89.64 and so no T-cell grid point can be mapped to 90N.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DeniseWorthen It is unclear what CDO uses as a land mask for the destination grid by default. It does look like there is a way to create a mask file and apply it to the interpolated field such as in this example on the CDO forum: https://code.mpimet.mpg.de/boards/53/topics/10933
Here are the basics of what I gathered from this example so far:
- Interpolate field:
cdo -remapbil,r360x180 infile.nc infile_r360x180.nc
- create mask file with the same resolution as the destination grid:
cdo -f nc -remapbil,r360x180 -topo topo_r360x180.nc
- mask the land areas using the mask file:
cdo -expr,'topo = ((topo<0.0)) ? 1.0 : topo/0.0' topo_r360x180.nc mask_land.nc
- mask interpolated field with the mask file:
cdo -mul mask_land.nc infile_r360x180.nc infile_r360x180_mask_land.nc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @EricSinsky-NOAA I took some time and rewrote the entire original ocnpost.ncl routine in Fortran. It reproduces the fields created by the original NCL to O~-8 in all fields: https://github.com/DeniseWorthen/ocnpost90
I wasn't asked to do this, but a fortran code is a) portable b) fast and c) debuggable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DeniseWorthen Thanks for that. I suspect this will be what we ultimately go with. Once Rahul is back on Weds we will discuss the best way to proceed.
Thanks to everyone who's been working on this, particularly those who wrote stuff that may not be used, which I know can be frustrating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DeniseWorthen Thank you for writing your NCL code in FORTRAN! I do agree that FORTRAN is a great option too for this sort of application, particularly because of its efficiency and flexibility. At first glance, your FORTRAN ocnpost program looks very close to your original NCL ocnpost program.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
speed
should not be directly interpolated using remapbil. It should be recalculated using the remappedSSU
andSSV
.speed
= sqrt(SSU
^2 +SSV
^2)evap
is not required to output in operational forecast files and can be removed.lprec
is not required to output in operational forecast files and can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@GwenChen-NOAA Thank you for your feedback. These values were provided to me by @jiandewang.
Those three variables can be removed from the comma-delimited list. Since speed can be (re)calculated it should be the job of a script/application beyond that of remap.sh
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
speed is outputted internally from model so there is no need to recalculate it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jiandewang, speed
in the tripolar netCDF files is outputted from model and no need to change. speed
in the latlon netCDF or GRIB2 files needs to be recalculated, because SSU
and SSV
have been remapped to a new grid. This is how wind speed and direction are handled in UPP. Directly interpolating speed
to a new grid treats speed
as a scalar variable, which is incorrect.
@HenryWinterbottom-NOAA, speed calculation is a simple function. Should be easy to add it into remap.sh
and append the output into the latlon netCDF files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jiandewang,
speed
in the tripolar netCDF files is outputted from model and no need to change.speed
in the latlon netCDF or GRIB2 files needs to be recalculated, becauseSSU
andSSV
have been remapped to a new grid. This is how wind speed and direction are handled in UPP. Directly interpolatedspeed
to a new grid treatsspeed
as a scalar variable, which is incorrect.@HenryWinterbottom-NOAA, speed calculation is a simple function. Should be easy to add it into
remap.sh
and append the output into the latlon netCDF files.
Thank you for following up @GwenChen-NOAA. The remap.sh
script is not intended for a specific application (i.e., it is agnostic to the respective component model output to be remapped). The purpose is to generate netCDF files containing the specified/respective variables to be defined on a new destination grid. It does not care whether it is an ice or an ocean or any other component model variable fields that are being remapped. Therefore, the speed
in this case should be computed elsewhere, not within remap.sh
.
Please see my branch here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aerorahul and I talked a little about this before his A/L. I'll comment once we have a more detail conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@HenryWinterbottom-NOAA, can you point me to a sample latlon netCDF file generated using the latest code? I need it for testing. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@GwenChen-NOAA The following contains a sample file that was used for debugging:
/scratch1/NCEPDEV/da/Henry.Winterbottom/scratch/Neil.Barton/vector_reorder_nonzero.nc
It contains the analysis/forecast variables SSH
, SSU
, and SSV
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@HenryWinterbottom-NOAA, I looked at your sample data. The remapped SSH looks very good to me. The remapped SSU and SSV have more grid points along the coast with missing data, which may be caused by the rotation and land-sea mask handling. Since most variables are scalar, I think this PR is sufficient. Thanks for your good work!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@GwenChen-NOAA Thank you for the feedback.
I will begin the PR and send to @aerorahul and @WalterKolczynski-NOAA for final approval.
Description
This PR addresses issue #923. A
bash
script is provided that leverages theCDO
andNCO
packages to remap specified variable fields using the CDO supported remapping types. This PR is a necessary upgrade to the existing NCL based scripts due to NCL reaching EOL (i.e., it is no-longer supported). No special Python libraries or utilities are required. Leveraging the capabilities of CDO and NCO allow support across all development and production platforms.Resolves #923
Type of change
Change characteristics
How has this been tested?
Tested using ice and ocean forecast files provided by @NeilBarton-NOAA and @jiandewang and executed on RDHPCS Hera, NOAA CSP AWS, and OSX.
Checklist