Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GFDL to main (2024-05-31) #1631

Merged
merged 139 commits into from
Jul 29, 2024
Merged

Conversation

marshallward
Copy link
Collaborator

Redo of #1627, I will try to keep it brief. This should finalize most of the necessary Nonboussinesq support. It should also include an important update to enable the latest FMS library (2024+) but YMMV.

Many other new features and bugfixes, please peruse the PR list below.

Features

Bugfixes

Non-Boussinesq fixes (+Boussinesq refactoring)

Diagnostics

Input parameter handling

Refactoring

Variable documentation

Build system

Code maintenance

Contributors

adcroft and others added 30 commits November 22, 2023 17:41
- Added a base class in MOM_EOS_base_type.F90
- All EOS modules now extend this base class
  - This reduces replicated code between the EOS modules
- All existing APIs in MOM_EOS now avoid branching for the type of
  EOS and ultimately pass through to a low-level elemental function
  implementation of the actual EOS
- Added a new elemental function exposed by MOM_EOS
  (currently not used in the main model)
- There is a speed up over the previous form of EOS due to the
  reduced branching
  - For some functions, a local implementation of the base class member is
    needed to gain performance. I deliberately did not implement this
    optimization for UNESCO or Jackett06 so that the generic implementation
    of the base class is utilized and we have code coverage.
- Added rules to .testing/Makefile to invoke build.timing, run.timing for the
  "target" code checked out for regression tests
- Appended to existing GH "perfmon" workflow
  Added or corrected the units in comments describing about 200 real variables
in MOM_hor_bnd_diffusion and MOM_neutral_diffusion, and corrected spelling
errors or other issues in about another 20 comments.  Only comments are changed
and all answers are bitwise identical.
  Fix unpaired parentheses in the format statement used in an error message
about OBC segment data not being on the supergrid.  All answers are bitwise
identical, but there is one less compile-time warning.
  Added the new public interfaces continuity_fluxes and continuity_adjust_vel to
make use of the continuity code without actually updating the layer thicknesses,
and made the existing routines zonal_mass_flux and meridional_mass_flux in the
continuity_PPM module public.   Continuity_fluxes is overloaded to provide
either the layer thickness fluxes or the vertically summed barotropic fluxes.
As a part of this change, the LB arguments to meridional_mass_flux and
zonal_mass_flux were made optional and moved toward the end of the list of
arguments.  The intent of the ocean_grid_type argument to 11 routines in the
continuity_PPM module was changed from inout to in.

  Missing factors of por_face_area[UV] were also added to merid_face_thickness
and zonal_face_thickness when the visc_rem arguments are not present or where
OBCs are in use.  This could change answers, but it seems very unlikely that any
impacted cases are in use yet.

  Answers are bitwise identical all known cases, but there are 4 new public
interfaces, and some bugs were fixed in cases that are not likely in use yet.
  In recognition of the fact that there is only a single continuity scheme that
is accessed via the MOM_continuity module, this commit refactors MOM_continuity
to use 3 simple pass-through interfaces (continuity, continuity_init and
continuity_stencil) to the corresponding routines from MOM_continuity_PPM, while
the MOM_continuity control structure is now alias for continuity_PPM_CS and is
opaque in the MOM_continuity module.  As a part of these changes, the runtime
parameter CONTINUITY_SCHEME was obsoleted with a warning value of "PPM".  All
answers are bitwise identical, and all public interfaces look the same from the
outside, but there is one fewer entry in the MOM_parameter_doc.all files.
  Refactored MOM_continuity_PPM to create the separate publicly visible routines
zonal_edge_thickness and meridional_edge_thickness, and also renamed the
internal routines zonal_face_thickness and merid_face_thickness to
zonal_flux_thickness, meridional_flux_thickness and made them publicly visible
as well.

  This commit also renames a number of internal edge thickness variables from
h_L and h_R to h_S and h_N or h_W and h_E for greater clarity, since left and
right are not so well defined on the grid as north, south, east and west.

  All answers are bitwise identical, but there are 4 new publicly visible
interfaces.
  Moved the calls to zonal_edge_thickness and meridional_edge_thickness out of
zonal_mass_flux and meridional_mass_flux to facilitate the reuse at some later
date of the PPM thickness reconstructions.  As a part of this, there are new
edge thickness arguments to zonal_mass_flux and meridional_mass_flux.  The
interfaces to zonal_edge_thickness and meridional_edge_thickness are new
publicly visible and are used in MOM_continuity.

  This commits also changes the name of the loop_bounds_type to
cont_loop_bounds_type and makes it public but opaque adds the publicly visible
function set_continuity_loop_bounds to enable the continuity loop bounds to be
set from outside of the continuity_PPM module.

  Reflecting these changes there are new calls to zonal_edge_thickness and
meridional_edge_thickness in the 3 routines in MOM_continuity and in
continuity_PPM, and new arrays for holding the edge thicknesses in these
routines.

  All answers are bitwise identical, but there are new publicly visible
interfaces and types and changes to other publicly visible interfaces.  However,
no changes are required outside of MOM_continuity and MOM_continuity_PPM.
  Added the new publicly visible routines zonal_BT_mass_flux and
meridional_BT_mass_flux to return the vertically summed transports that the
continuity solver would generate.  Also revised the routine continuity_2d_fluxes
in MOM_continuity to make use of these new routines.  Because these new routines
are not yet being used, all answers are bitwise identical, but there are new
public interfaces.
  Added the new publicly visible routines continuity_zonal_convergence and
continuity_meridional_convergence to increment layer thicknesses using the
continuity loop bounds type to specify extents.  Also revised continuity_PPM
to use these new routines.  These changes will allow for the reuse of some
of the reconstructions in calls that replace calls to continuity with the
unwrapped contents.  All answers are bitwise identical, but there are two new
public interfaces.
  Move continuity_fluxes and continuity_adjust_vel from MOM_continuity.F90 to
MOM_continuity_PPM.F90, but with these interfaces also offered via the
MOM_continuity module so that no changes are required outside of these two
files.  In addtion, 11 of the recently added public interfaces from
MOM_continuity_PPM are also made available as pass-through interfaces from
MOM_continuity.  All answers are bitwise identical.
  Added the new optional arguments du_cor and dv_cor to continuity_PPM and
zonal_mass_flux or meridional_mass_flux to return the barotropic velocity
increments that make the summed barotropic transports match uhbt or vhbt.  Also
cleaned up and simplified the logic of some of the flags used to apply specified
open boundary conditions, adding a new 1-d logical array thereby avoiding
working unnecessarily on some loops or repeatedly checking for specified open
boundary condition points.  Two openMP directives were also simplified.  All
answers are bitwise identical, but there are new optional arguments to three
publicly visible routines.
This patch adds support to makedep for handling most #ifdef-like
condition blocks.  The following preprocessing commands are handled:

* #define / #undef
* #ifdef / #else / #endif

A new flag is added to provide defined macros (-D), and is chosen to
match the `cpp` flag, so that flags can be shared across programs.

Macros are tracked in a new internal variable and are used for #ifdef
testing.  Nested condition blocks are supported by using an internal
stack of the exclusion state.

Certain cases are still not handled.

* #if blocks containing logical expressions are not parsed.
* CPP content inside of #include is ignored.

No doubt many other cases are still unconsidered, such as exotic macro
names.

The autoconf builds use this feature by passing the generated $(DEFS)
argument to makedep.  This is suitable for now, but we may need to
consider two cases in the future:

* Macros defined in $CPPFLAGS are currently ignored, and perhaps they
  should be included here.  The risk is that it may contain non-macro
  flags.

* At some point, DEFS could be moved to a config.h file, and DEFS would
  no longer contain these macros.  (Note that this is very unlikely at
  the moment, since this feature only works with C.)
  Standardized the syntax for the units of salinities that are read via
get_param calls to be uniformly 'units="ppt"' or similar units for derivatives
with salinity.  The only exceptions are places where practical salinity is used
specifically, which occurs for several arguments in the MOM_EOS code.  All
answers are bitwise identical, but there are changes to a number of
MOM_parameter_doc files.
 - Without this, if part of your OBC is filled with land mask and
   if that land mask contains a masked out tile, you will generate
   a NaN from the phase speed calculation where h is negative in the
   halo neighbor of that masked tile.
  Renamed the runtime parameter FIX_USTAR_GUSTLESS_BUG to USTAR_GUSTLESS_BUG
(with a switch between the meanings of true and false for the two parameters)
for consistency with the syntax of other bug-fix flags in MOM6 and to partially
address dev/gfdl MOM6 issue #237.  Input parameter files need not be changed
right away because MOM6 will still work if FIX_USTAR_GUSTLESS_BUG is specified
instead of USTAR_GUSTLESS_BUG, but USTAR_GUSTLESS_BUG will be logged, so there
are changes to the MOM_parameter_doc files.   By default or with existing input
parameter files, all answers are bitwise identical, and there is error handling
if inconsistent settings of FIX_USTAR_GUSTLESS_BUG and USTAR_GUSTLESS_BUG are
both specified.
…d/flipped input and grids, and under dimensional rescaling to the +/-140 power. Bitwise identical results were achieved under rotation/rescaling using standard 2D test cases such as MISMIP+. Major changes include the following:

- Specified order of operations throughout, and modified the implementation of FEM integration,  h-point-to-node operations, etc for consistency under rotation (see MOM_ice_shelf_dynamics.F90 subroutines calc_shelf_driving_stress, CG_action, CG_action_subgrid_basal, matrix_diagonal, shelf_advance_front, and interpolate_h_to_b).
- Added an ice-shelf version of 'first_direction' and 'alterate_first_direction', allowing users to control the order in which the x- and y-direction ice-shelf advection calls are made. Required for consistency under rotation.
- Dimensional rescaling fixes throughout MOM_ice_shelf_initialize.F90, and within MOM_ice_shelf_dynamics.F90 (see subroutines initialize_ice_shelf_dyn, ice_shelf_solve_outer, and calc_shelf_taub)
- Rotation/rescaling testing revealed two additional bugs that were subsequently fixed:
(1) An index error in the conjugate gradient algorithm (subroutine CG_action)
(2) Discretization error for the east/north fluxes in subroutines ice_shelf_advect_thickness_x and ice_shelf_advect_thickness_y.

The commit also includes several simple changes for code readability and computational efficiency:

- Saved Phi, PhiC, and Phisub in the ice shelf dynamics control structure at the start of each run, rather than reallocating and recalculating these static fields each time they are used. Reshaped the Phisub array for computational efficiency over loops.
- Modified the sub-cell grounding scheme so that it is only called for partially-grounded cells that contain the grounding line (i.e. where float_cond(i,j)==1). Added float_cond to the ice dynamics structure for output.
- Simplified the option to calculate ice viscosity at 4 quadrature points per cell rather than at cell centers, by adding parameter NUMBER_OF_ICE_VISCOSITY_QUADRATURE_POINTS and reshaping CS%ice_visc accordingly.
- Style changes and removal of unused code (e.g. removed subroutine apply_boundary_values and field thickness_bdry_val)
  Renamed the arguments u and v to step_MOM_dyn_split_RK2 as u_inst and v_inst
to more clearly differentiate between the instantaneous velocities (u_inst and
v_inst) and the velocities with a time-averaged phase in the barotropic mode
(u_av and v_av).  A comment is also added at one point where the wrong
velocities are being used to calculate and apply the Orlanski-style radiation
open boundary conditions, with the intention of adding the option to correct
this in a subsequent commit.  This commit only changes the name of a pair of
internal variables in one routine, and all answers are bitwise identical.
  Renamed the runtime parameter FIX_UNSPLIT_DT_VISC_BUG to UNSPLIT_DT_VISC_BUG
(with a switch between the meanings of true and false for the two parameters)
for consistency with the syntax of other bug-fix flags in MOM6 and to partially
address dev/gfdl MOM6 issue #237.  Input parameter files need not be changed
right away because MOM6 will still work if FIX_UNSPLIT_DT_VISC_BUG is specified
instead of UNSPLIT_DT_VISC_BUG, but UNSPLIT_DT_VISC_BUG will be logged, so there
are changes to the MOM_parameter_doc files.   By default or with existing input
parameter files, all answers are bitwise identical, and there is error handling
if inconsistent settings of FIX_UNSPLIT_DT_VISC_BUG and UNSPLIT_DT_VISC_BUG are
both specified.
  Add the new module MOM_dynamics_split_RK2b and calls to the routines in this
module to MOM.F90.  These calls are exercised when the run time parameter
SPLIT_RK2B is true.  For now, all answers are bitwise identical when this new
code is being exercised, but there is a new module with multiple public
interfaces and a new entry in many MOM_parameter_doc files.
  Revised step_MOM_dyn_split_RK2b to use the time-filtered velocities as
arguments and reconstruct the instantaneous velocities with the proper phase
from the barotropic solver from the saved increments du_av_inst and dv_av_inst.
As a part of this change, the continuity solver is used to find the thickness
fluxes used in the predictor step Coriolis terms, and the horizontal viscous
accelerations are calculated for both the predictor and corrector steps, thereby
avoiding the need to vertically remap any 3-d fields apart from a single pair of
velocity components.

  The run-time option STORE_CORIOLIS_ACCEL is no longer used when SPLIT_RK2B =
True.  FPMIX was also disabled in this mode due to unresolved dimensional
inconsistency errors in that code.

  Additionally, the time-filtered velocities are now used instead of the
instantaneous velocities in the second call to radiation_open_bdry_conds(),
which should improve the performance of several of the Orlanski-type open
boundary conditions.

  The 2-d fields du_av_inst and dv_av_inst were added to the restart file, while
the 3-d fields u2, v2, diffu and diffv and either CAu and CAv or uh, vh and h2
are all removed from the restart files.  Remap_dyn_split_RK2b_aux_vars() now has
nothing to do, and it should probably be eliminated altogether in a subsequent
commit.

  All answers are changed when SPLIT_RK2B = True, and there are changes to the
contents of the restart files.
  This commit includes further revisions to MOM_dynamics_split_RK2b that avoid
some unnecessary calculations and group some of the halo updates into fewer
group passes.  All answers are bitwise identical.
  Corrected the description of the runtime parameter SPLIT_RK2B that will appear
in the MOM_parameter_doc files and in the doxygen descriptions of the MOM module
to better reflect what was ultimately being done with this new scheme.  All
answers are bitwise identical, but there are changes to the (newly added)
contents of some MOM_parameter_doc files.
  Added the new functions cuberoot and intrinsic_functions_unit_tests to the
MOM_intrinsic_functions module, and call intrinsic_functions_unit_tests from
unit_tests to confirm that this new function works as intended.  Separately,
cuberoot was tested by replacing expressions like A**(1./3.) with cuberoot(A) in
MOM_energetic_PBL and verifying that the answers only change at roundoff, but
that it can give bitwise identical results when the argument is scaled by an
integer power of 8 and then unscaled by the corresponding integer power of 2,
but that change will occur in a subsequent commit as it can change answers
depending on an ANSWER_DATE flag.  With this commit, cuberoot is not yet being
used so all answers are bitwise identical, although there are new publicly
visible routines.
JOB_DIR in the Gaea-specific .gitlab-ci.yml configuration file is
updated to use its F5 filesystem.
  Modified the cuberoot function to do 3 iterations with Halley's method
starting with a first guess that balances the errors at the two ends of the
range of the iterations, before a final iteration with Newton's method that
polishes the root and gives a solution that is accurate to machine precision.
Following on performance testing of the previous version, all convergence
testing has been removed and the same number of iterations are applied
regardless of the input value.  This changes answers at roundoff for code that
uses the cuberoot function, so ideally this PR would be dealt with before the
cuberoot becomes widely used.
@Hallberg-NOAA
Copy link
Collaborator

I think that we now know why the answers were changing for the NCAR regression testing (in addition to the other issues already addressed above).

When we refactored the equations of state to make them elemental, we did not manage to carry forward the actions resulting from the flag USE_WRIGHT_2ND_DERIV_BUG that determines whether to correct an error in the expressions for two of the second derivative calculations when EQN_OF_STATE = "WRIGHT". The run-time parameter is still there and it is being logged, but it does nothing; worse yet, the revised code behaves as though USE_WRIGHT_2ND_DERIV_BUG = True although the default is False.

This impacts solutions that use one of the parameterizations enabled with USE_STANLEY_... = True, have EQN_OF_STATE = "WRIGHT" and do not explicitly set USE_WRIGHT_2ND_DERIV_BUG = True. This slipped through our testing because we do not have any such configurations in the GFDL regression test suite. We are encouraging people to use EQN_OF_STATE = "WRIGHT_REDUCED" if you want something that is mathematically equivalent to EQN_OF_STATE = "WRIGHT", but fixes the 2nd derivative bugs and is refactored with added parentheses so that answers are invariant to levels of optimization or choice of compiler. There are other equation of state options that also are demonstrably bug-free. It is only "WRIGHT" that has these problems because we are retaining the ability to generate all of its bugs to be able to recreate older solutions.

I do have a fix to this that will once again make configurations with USE_WRIGHT_2ND_DERIV_BUG = False work as intended, and in tests with a version our our global_ALE test case using the same parameterizations as NCAR's proto-CESM test, it is now giving bitwise identical answers to those that I get with the current dev/ncar fork. I will be putting this in shortly as a PR directly to NOAA-GFDL:gfdl-to-main-2024-05-31.

We have not gone through the same intensive diagnosis of why EMC's 5x5 test cases are showing answer changes, but hopefully it was tripping up on similar issues to the ones we have debugged with dev/ncar.

@jiandewang
Copy link
Collaborator

@Hallberg-NOAA for 5x5 we have
USE_WRIGHT_2ND_DERIV_BUG = False
EQN_OF_STATE = "WRIGHT"

see MOM_parameter_doc.all on GAEA: /gpfs/f5/nggps_emc/scratch/Jiande.Wang/For-Marshall/PR0608/5x5

@Hallberg-NOAA
Copy link
Collaborator

Hallberg-NOAA commented Jul 15, 2024

After running a version of our Global-ALE test case with all of the parameter settings from EMC's 5x5 test case, I think that the only thing that we need to do for this test case to reproduce the previous answers with the new code is to set EPBL_ANSWER_DATE = 20231231.

@Hallberg-NOAA
Copy link
Collaborator

The PR fixing the Wright equation of state problems that were changing answers in the NCAR regression suite can be found at NOAA-GFDL#686.

  This commit restores the effectiveness of the runtime parameter
USE_WRIGHT_2ND_DERIV_BUG in determining whether a bug is corrected in the
calculation of two of the second derivative terms returned by
calculate_density_second_derivs_elem() with the "WRIGHT" equation of state,
recreating the behavior (and answers) that are currently on the main branch of
MOM6.  To do this, it adds and calls the new routine set_params_buggy_Wright()
when appropriate, and adds the new element "three" to the buggy_Wright_EOS type.
When the bug is fixed, buggy_Wright_EOS%three = 3, but ...%three = 2 to
recreate the bug.  This commit does change answers for cases using the "WRIGHT"
equation of state and one of the "USE_STANLEY_..." parameterizations from those
on the dev/gfdl branch of MOM6, but in so doing it restores the answers on the
main branch of MOM6.  There is also a new publicly visible subroutine.
@Hallberg-NOAA
Copy link
Collaborator

Thanks to some persistent work by @jiandewang and others at EMC to track down where the answers diverged, we have a pending update to this PR that finally has all of the EMC regression test cases reproducing their previous answers, along with a second update that would have trapped the problem more quickly. I believe that there will be no other (known) problems once NOAA-GFDL#690 and NOAA-GFDL#687 are merged into the branch for this PR.

@jiandewang
Copy link
Collaborator

special thanks for @DeniseWorthen who provided a key clue (melt_potential) for me and Bob. Otherwise we might still be at struggling stage.

  Moved a check for whether to initialize an uninitialized visc%h_ML field from
visc%MLD outside of the test for the value of CS%mixedlayer_restart.  This
change will prevent answer changes between code versions for certain cases that
initialize from a restart file, use the melt_potential field and have
MLE_USE_PBL_MLD = True but also have MIXEDLAYER_RESTRAT = False.  This commit
corrects a very specific oversight that was introduced when the roles of
visc%h_ML and visc%MLD were separated, and it can change answers (reverting them
to answers from older versions of the main branch of MOM6) in very specific
cases where MOM6 is initialized from a restart file from an older versions of
the code, including the 5x5-degree test case in the dev/emc regression suite.
@jiandewang
Copy link
Collaborator

@Hallberg-NOAA do you plan to merge "checksum melt_potential" in this candidate ? I will do a final full testing (required by UFS and EIB group) after you finalize this.

  Added checksum calls for the melt_potential, ocean_mass, ocean_heat and
ocean_salt elements of the surface state in MOM_surface_chksum if these fields
are allocated for more comprehensive debugging.  Also added the symmetric
optional argument to the call to MOM_surface_chksum form extract_surface_state
so that all of the surface velocity values that could contribute to the ocean
surface velocities that are seen by the coupler are checksummed.  All solutions
are bitwise identical, but there are enhancements to the MOM6 debugging
capabilities.
@jiandewang jiandewang self-requested a review July 23, 2024 02:57
Copy link
Collaborator

@jiandewang jiandewang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

works as expected in UFS now.
adding USE_HUYNH_STENCIL_BUG = True, plus EPBL_ANSWER_DATE=20231231 in 5x5 setting kept current answers

Copy link
Collaborator

@alperaltuntas alperaltuntas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NCAR tests pass.

Copy link
Collaborator

@abozec abozec left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

COAPS approves.

@mathomp4
Copy link
Collaborator

Much like @jiandewang, in my testing, GEOSgcm is zero-diff at C12 if I add:

USE_HUYNH_STENCIL_BUG = True
EPBL_ANSWER_DATE=20231231

I don't have approval rights here it seems, so I guess this is just a 👍🏼

@sanAkel sanAkel self-assigned this Jul 26, 2024
Copy link
Collaborator

@sanAkel sanAkel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Hallberg-NOAA

Santha, if you still have the permissions to click the "Approve" button on MOM-ocean, I think it would help if you were to do so now that we have Matt's written approval.

Approving it following @mathomp4's test results.

@marshallward marshallward merged commit aacb909 into mom-ocean:main Jul 29, 2024
8 of 10 checks passed
@marshallward marshallward deleted the gfdl-to-main-2024-05-31 branch July 29, 2024 13:34
mathomp4 added a commit to GEOS-ESM/GEOS_OceanGridComp that referenced this pull request Aug 20, 2024
With the merge of mom-ocean/MOM6#1631 into
our MOM6 fork in [MOM6 geos/v3.2](https://github.com/GEOS-ESM/MOM6/releases/tag/geos%2Fv3.2)
without any changes, the answers will change.

So, until these can be tested, we set:
```
USE_HUYNH_STENCIL_BUG = True
EPBL_ANSWER_DATE = 20231231
```

in the `MOM_override` files.
mathomp4 added a commit to GEOS-ESM/GEOS_OceanGridComp that referenced this pull request Aug 20, 2024
With the merge of mom-ocean/MOM6#1631 into
our MOM6 fork in [MOM6 geos/v3.2](https://github.com/GEOS-ESM/MOM6/releases/tag/geos%2Fv3.2)
without any changes, the answers will change.

So, until these can be tested, we set:
```
USE_HUYNH_STENCIL_BUG = True
EPBL_ANSWER_DATE = 20231231
```

in the `MOM_override` files.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.