Update submodules to ones that are based on cesm3_0_alpha04a #2853

Open
wants to merge 15 commits into
base: cesm3_0_beta04_changes

ekluzek
Collaborator

@ekluzek ekluzek commented Oct 30, 2024

Description of changes

Update the submodules to something close to cesm3_0_alpha04a. I needed to update cime and ccs_config beyond that to get the PF unit testing working.

Specific notes

Remove mct from submodules
Add mpi-serial to submodules
Update the PF unit testing to use the full ESMF library (which will enable wider testing). This also required bringing in the NetCDF and PIO libraries, which we shouldn't actively use but which may let us write fewer stub modules for I/O.

Contributors other than yourself, if any: @jedwards4b

CTSM Issues Fixed (include github issue #):
Fixes #2640
Fixes #2375
Finishes resolving #2294

Are answers expected to change (and if so in what way)?
I'm actually not sure yet; I think compsets with active CISM possibly might.

Any User Interface Changes (namelist or namelist defaults changes)? No

Does this create a need to change or add documentation? Did you do so? No No

Testing performed, if any: will do regular and ctsm_sci
So far I have done PF unit testing and tested two simple cases
I haven't tested this for LILAC, and I wonder if it will fail

@ekluzek ekluzek added labels Oct 30, 2024: enhancement (new capability or improved behavior of existing capability); priority: high (High priority to fix/merge soon, e.g., because it is a problem in important configurations); code health (improving internal code structure to make easier to maintain (sustainability))
@ekluzek ekluzek added this to the cesm3_0_beta05 milestone Oct 30, 2024
@ekluzek ekluzek self-assigned this Oct 30, 2024
@ekluzek
Collaborator Author

ekluzek commented Oct 30, 2024

@billsacks and @jedwards4b could you review this for the cmake changes I made for the PF unit testing? I learned more about cmake as a result of getting this to work, but I'd like to have it reviewed by the two of you with more knowledge/skill in using cmake.

Contributor

@jedwards4b jedwards4b left a comment

I would remove the lines in CMakeLists.txt that you have commented out, but otherwise LGTM.

Member

@billsacks billsacks left a comment

Thanks for getting this working @ekluzek! A couple of questions here....

src/CMakeLists.txt (outdated, resolved)
src/CMakeLists.txt (resolved)
@ekluzek
Collaborator Author

ekluzek commented Oct 30, 2024

OK, my suspicion about LILAC was correct: running the LILAC test, I get a failure:

    Case dir: /glade/work/erik/ctsm_worktrees/external_updates/cime/scripts/LILACSMOKE_D_Ld2.f10_f10_mg37.I2000Ctsm50NwpSpAsRs.derecho_intel.clm-lilac.20241030_152038_a3kydm
    Errors were:
        Building test for LILACSMOKE in directory /glade/work/erik/ctsm_worktrees/external_updates/cime/scripts/LILACSMOKE_D_Ld2.f10_f10_mg37.I2000Ctsm50NwpSpAsRs.derecho_intel.clm-lilac.20241030_152038_a3kydm
        Traceback (most recent call last):
          File "/glade/work/erik/ctsm_worktrees/external_updates/cime/scripts/LILACSMOKE_D_Ld2.f10_f10_mg37.I2000Ctsm50NwpSpAsRs.derecho_intel.clm-lilac.20241030_152038_a3kydm/./case.build", line 267, in <module>
            _main_func(__doc__)
          File "/glade/work/erik/ctsm_worktrees/external_updates/cime/scripts/LILACSMOKE_D_Ld2.f10_f10_mg37.I2000Ctsm50NwpSpAsRs.derecho_intel.clm-lilac.20241030_152038_a3kydm/./case.build", line 226, in _main_func
            test = find_system_test(testname, case)(case)
          File "/glade/work/erik/ctsm_worktrees/external_updates/cime/CIME/utils.py", line 2272, in find_system_test
            mod = import_module(path)
          File "/glade/u/apps/derecho/23.09/opt/._view/yazo4iwystz7p2hxu5ukzrw3xa24ksen/lib/python3.10/importlib/__init__.py", line 126, in import_module
            return _bootstrap._gcd_import(name[level:], package, level)
          File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
          File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
          File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
          File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
          File "<frozen importlib._bootstrap_external>", line 883, in exec_module
          File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
          File "/glade/work/erik/ctsm_worktrees/external_updates/cime_config/SystemTests/lilacsmoke.py", line 28, in <module>
            from CIME.utils import run_cmd, run_cmd_no_fail, symlink_force, new_lid, safe_copy, append_testlog
        ImportError: cannot import name 'append_testlog' from 'CIME.utils' (/glade/work/erik/ctsm_worktrees/external_updates/cime/CIME/utils.py)

Waiting for tests to finish
FAIL LILACSMOKE_D_Ld2.f10_f10_mg37.I2000Ctsm50NwpSpAsRs.derecho_intel.clm-lilac (phase SHAREDLIB_BUILD)
    Case dir: /glade/work/erik/ctsm_worktrees/external_updates/cime/scripts/LILACSMOKE_D_Ld2.f10_f10_mg37.I2000Ctsm50NwpSpAsRs.derecho_intel.clm-lilac.20241030_152038_a3kydm
Due to presence of batch system, create_test will exit before tests are complete.
To force create_test to wait for full completion, use --wait
test-scheduler took 7.439677476882935 seconds
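
The traceback above boils down to a name that one module no longer exports. As a minimal sketch of one way a script can tolerate that kind of move, the helper below tries a list of candidate modules in order. The helper name `import_first_available` is hypothetical, and `CIME.status` in the comment is only an assumption about where `append_testlog` might live; this is not necessarily the fix that was applied in this PR. The demonstration uses standard-library modules so it runs without CIME.

```python
import importlib


def import_first_available(name, module_paths):
    """Return attribute `name` from the first module in `module_paths` that has it."""
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            continue  # this module is not present in the current installation
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"cannot import name {name!r} from any of {module_paths}")


# Stand-in demonstration using only the standard library:
# "no_such_module_xyz" does not exist, so the lookup falls through to os.path.
join = import_first_available("join", ["no_such_module_xyz", "os.path"])
print(join("a", "b"))

# In a system test the call might look like this (module names are assumptions):
# append_testlog = import_first_available(
#     "append_testlog", ["CIME.utils", "CIME.status"]
# )
```

A fallback like this trades a hard failure at import time for a slightly less obvious import path, so it is mostly useful while a script has to work against more than one version of a dependency.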

@ekluzek
Collaborator Author

ekluzek commented Oct 31, 2024

I've fixed the LILAC problem with a minor update here plus a change I need to add to cime:

ESMCI/cime#4703

Now, I'm wondering what will happen with run_neon?

Member

@billsacks billsacks left a comment

Thanks for your explanations to my questions, @ekluzek ! I'm satisfied with this now.

@ekluzek
Collaborator Author

ekluzek commented Nov 4, 2024

Answers for aux_clm were identical on Derecho, but 36 tests on Izumi show answer changes that appear to be roundoff-level yet propagate in some fields:

ERI_D_Ld9_P48x1.f10_f10_mg37.I2000Clm50BgcCru.izumi_nag.clm-reduceOutput
ERI_D_Ld9_P48x1.f10_f10_mg37.I2000Clm50Sp.izumi_nag.clm-SNICARFRC
ERI_D_Ld9_P48x1.f10_f10_mg37.I2000Clm50Sp.izumi_nag.clm-reduceOutput
ERP_D_Ld5_P48x1.f10_f10_mg37.I1850Clm50Bgc.izumi_nag.clm-ciso
ERP_D_Ld5_P48x1.f10_f10_mg37.I1850Clm60Bgc.izumi_nag.clm-ciso
ERP_D_Ld5_P48x1.f10_f10_mg37.I1850Clm60Bgc.izumi_nag.clm-ciso--clm-matrixcnOn
ERP_D_Ld5_P48x1.f10_f10_mg37.I2000Clm50BgcCru.izumi_nag.clm-flexCN_FUN
ERP_D_Ld5_P48x1.f10_f10_mg37.I2000Clm50BgcCru.izumi_nag.clm-flexCN_FUN--clm-matrixcnOn
ERP_D_Ld5_P48x1.f10_f10_mg37.I2000Clm50BgcCru.izumi_nag.clm-luna
ERP_D_Ld5_P48x1.f10_f10_mg37.I2000Clm50BgcCru.izumi_nag.clm-noFUN_flexCN
ERP_D_Ld5_P48x1.f10_f10_mg37.I2000Clm50BgcCru.izumi_nag.clm-noFUN_flexCN--clm-matrixcnOn
ERP_D_Ld5_P48x1.f10_f10_mg37.I2000Clm50BgcCru.izumi_nag.clm-reduceOutput
ERP_D_Ld5_P48x1.f10_f10_mg37.I2000Clm50Sp.izumi_nag.clm-o3lombardozzi2015
ERP_D_Ld9.f10_f10_mg37.I1850Clm60BgcCrop.izumi_nag.clm-clm60cam7LndTuningModeLDust
ERP_D_P48x1.f10_f10_mg37.IHistClm60Bgc.izumi_nag.clm-decStart
ERP_D_P48x1.f10_f10_mg37.IHistClm60Bgc.izumi_nag.clm-decStart--clm-matrixcnOn_ignore_warnings
ERS_D.f10_f10_mg37.I1850Clm50BgcCrop.izumi_nag.clm-ciso_monthly_matrixcn_spinup
ERS_D.f10_f10_mg37.I1850Clm60Sp.izumi_nag.clm-ExcessIceStreams
ERS_D_Ld5.f10_f10_mg37.I2000Clm50Fates.izumi_nag.clm-FatesCold
SMS.f10_f10_mg37.I2000Clm50BgcCrop.izumi_gnu.clm-crop
SMS_D.f10_f10_mg37.I1850Clm60BgcCrop.izumi_nag.clm-ciso_soil_matrixcn_only
SMS_D.f10_f10_mg37.I2000Clm60BgcCrop.izumi_gnu.clm-crop
SMS_D.f10_f10_mg37.I2000Clm60BgcCrop.izumi_nag.clm-crop
SMS_D_Ld1_P48x1.f10_f10_mg37.I2000Clm45BgcCrop.izumi_nag.clm-oldhyd
SMS_D_Ld1_P48x1.f10_f10_mg37.I2000Clm50BgcCru.izumi_nag.clm-datm_bias_correct_cruv7
SMS_D_Ld3.f10_f10_mg37.I2000Clm60Bgc.izumi_nag.clm-HillslopeD
SMS_D_Ld5.f10_f10_mg37.I1850Clm45BgcCrop.izumi_nag.clm-crop
SMS_D_Ld5.f10_f10_mg37.I2000Clm50BgcCrop.izumi_nag.clm-irrig_alternate
SMS_D_Ld5.f10_f10_mg37.I2000Clm50FatesRs.izumi_nag.clm-FatesCold
SMS_D_Ld5.f45_f45_mg37.I2000Clm60Fates.izumi_nag.clm-FatesCold
SMS_D_Ld65.f10_f10_mg37.I2000Clm60BgcCrop.izumi_nag.clm-FireLi2024GSWP
SMS_D_Ld65.f10_f10_mg37.IHistClm60BgcCrop.izumi_nag.clm-cropMonthOutput--clm-RxCropCalsAdaptGGCMI
SMS_D_P48x1_Ld5.f10_f10_mg37.I2000Clm50BgcCrop.izumi_nag.clm-irrig_spunup
SMS_Ld5_D_P48x1.f10_f10_mg37.IHistClm50Bgc.izumi_nag.clm-monthly
SMS_Ld5_D_P48x1.f10_f10_mg37.IHistClm60Bgc.izumi_nag.clm-decStart
SMS_Ln9.f10_f10_mg37.I1850Clm45Bgc.izumi_gnu.clm-clm45cam4LndTuningModeZDustSoilErod
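
To see whether the differing tests cluster by compiler or compset, the names in the list above can be split into fields. This is a rough sketch that assumes the dot-separated layout visible in the list (`TYPE[_MODIFIERS].GRID.COMPSET.MACHINE_COMPILER[.TESTMODS]`); `ParsedTest` and `parse_test_name` are hypothetical helper names, not part of CIME.

```python
from collections import Counter
from typing import List, NamedTuple


class ParsedTest(NamedTuple):
    test_type: str
    modifiers: List[str]
    grid: str
    compset: str
    machine: str
    compiler: str
    testmods: List[str]


def parse_test_name(name: str) -> ParsedTest:
    """Split a test name into its dot-separated fields (layout is an assumption)."""
    parts = name.strip().split(".", 4)
    if len(parts) < 4:
        raise ValueError(f"unexpected test name: {name!r}")
    type_and_mods, grid, compset, machine_compiler = parts[:4]
    test_type, *modifiers = type_and_mods.split("_")
    # The compiler is taken to be the text after the last underscore.
    machine, compiler = machine_compiler.rsplit("_", 1)
    # Multiple testmods directories are joined with "--" in the names above.
    testmods = parts[4].split("--") if len(parts) == 5 else []
    return ParsedTest(test_type, modifiers, grid, compset, machine, compiler, testmods)


# A few names copied from the list above.
sample = [
    "ERI_D_Ld9_P48x1.f10_f10_mg37.I2000Clm50BgcCru.izumi_nag.clm-reduceOutput",
    "ERP_D_Ld5_P48x1.f10_f10_mg37.I1850Clm60Bgc.izumi_nag.clm-ciso--clm-matrixcnOn",
    "SMS.f10_f10_mg37.I2000Clm50BgcCrop.izumi_gnu.clm-crop",
    "SMS_Ln9.f10_f10_mg37.I1850Clm45Bgc.izumi_gnu.clm-clm45cam4LndTuningModeZDustSoilErod",
]

by_compiler = Counter(parse_test_name(n).compiler for n in sample)
for compiler, count in sorted(by_compiler.items()):
    print(compiler, count)
```

Feeding the full list through the same function would give the per-compiler and per-compset breakdown directly.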

The mpi-serial tests are failing at the build step as well:

ERS_D_Ld5_Mmpi-serial.1x1_vancouverCAN.I1PtClm50SpRs.izumi_nag.clm-CLM1PTStartDate (SHAREDLIB_BUILD NLCOMP)
ERS_D_Ld7_Mmpi-serial.1x1_smallvilleIA.IHistClm50BgcCropRs.izumi_intel.clm-decStart1851_noinitial (SHAREDLIB_BUILD NLCOMP)
ERS_D_Mmpi-serial_Ld5.1x1_brazil.I2000Clm50FatesRs.izumi_nag.clm-FatesCold (SHAREDLIB_BUILD NLCOMP)
ERS_Lm13.f10_f10_mg37.I1850Clm60Bgc.izumi_intel.clm-monthly_matrixcn_fast_spinup (NLCOMP RUN)
ERS_Lm20_Mmpi-serial.1x1_smallvilleIA.I1850Clm50BgcCrop.izumi_gnu.clm-cropMonthlyNoinitial (SHAREDLIB_BUILD NLCOMP)
ERS_Lm40_Mmpi-serial.1x1_numaIA.I2000Clm50BgcCropQianRs.izumi_gnu.clm-cropMonthlyNoinitial (SHAREDLIB_BUILD NLCOMP)
ERS_Lm54_Mmpi-serial.1x1_numaIA.I2000Clm50BgcCropQianRs.izumi_intel.clm-cropMonthlyNoinitial (SHAREDLIB_BUILD NLCOMP)
ERS_Ly20_Mmpi-serial.1x1_numaIA.I2000Clm50BgcCropQianRs.izumi_intel.clm-cropMonthlyNoinitial (SHAREDLIB_BUILD NLCOMP)
ERS_Ly20_Mmpi-serial.1x1_numaIA.I2000Clm50BgcCropQianRs.izumi_intel.clm-cropMonthlyNoinitial--clm-matrixcnOn (SHAREDLIB_BUILD NLCOMP)
ERS_Ly3_Mmpi-serial.1x1_smallvilleIA.IHistClm50BgcCropQianRs.izumi_gnu.clm-cropMonthOutput (SHAREDLIB_BUILD NLCOMP)
ERS_Ly5_Mmpi-serial.1x1_smallvilleIA.I1850Clm50BgcCrop.izumi_gnu.clm-ciso_monthly (SHAREDLIB_BUILD NLCOMP)
ERS_Ly5_Mmpi-serial.1x1_smallvilleIA.I1850Clm50BgcCrop.izumi_gnu.clm-ciso_monthly--clm-matrixcnOn (SHAREDLIB_BUILD NLCOMP)
ERS_Ly6_Mmpi-serial.1x1_smallvilleIA.IHistClm50BgcCropQianRs.izumi_intel.clm-cropMonthOutput (SHAREDLIB_BUILD NLCOMP)
ERS_Ly6_Mmpi-serial.1x1_smallvilleIA.IHistClm50BgcCropQianRs.izumi_intel.clm-cropMonthOutput--clm-matrixcnOn_ignore_warnings (SHAREDLIB_BUILD NLCOMP)
SMS_D_Ld1_Mmpi-serial.f45_f45_mg37.I2000Clm50SpRs.izumi_gnu.clm-ptsRLA (SHAREDLIB_BUILD NLCOMP)
SMS_D_Ld1_Mmpi-serial.f45_f45_mg37.I2000Clm50SpRs.izumi_gnu.clm-ptsROA (SHAREDLIB_BUILD NLCOMP)
SMS_D_Ld1_Mmpi-serial.f45_f45_mg37.I2000Clm50SpRs.izumi_nag.clm-ptsRLA (SHAREDLIB_BUILD NLCOMP)
SMS_D_Ly6_Mmpi-serial.1x1_smallvilleIA.IHistClm45BgcCropQianRs.izumi_intel.clm-cropMonthOutput (SHAREDLIB_BUILD NLCOMP)
SMS_D_Mmpi-serial_Ld5.5x5_amazon.I2000Clm60FatesRs.izumi_nag.clm-FatesCold (SHAREDLIB_BUILD NLCOMP)
SMS_Ld10_D_Mmpi-serial.CLM_USRDAT.I1PtClm60SpRs.izumi_nag.clm-default--clm-NEON-TOOL (SHAREDLIB_BUILD NLCOMP)
SMS_Ld5_Mmpi-serial.1x1_brazil.IHistClm60Bgc.izumi_gnu.clm-mimics (SHAREDLIB_BUILD NLCOMP)
SMS_Ly3_Mmpi-serial.1x1_numaIA.I2000Clm50BgcDvCropQianRs.izumi_gnu.clm-ignor_warn_cropMonthOutputColdStart (SHAREDLIB_BUILD NLCOMP)
SMS_Ly5_Mmpi-serial.1x1_brazil.IHistClm50BgcQianRs.izumi_intel.clm-newton_krylov_spinup (SHAREDLIB_BUILD NLCOMP)
SMS_Ly5_Mmpi-serial.1x1_smallvilleIA.IHistClm60BgcCropQianRs.izumi_gnu.clm-gregorian_cropMonthOutput (SHAREDLIB_BUILD NLCOMP)
SSPMATRIXCN_Ly5_Mmpi-serial.1x1_numaIA.I2000Clm50BgcCropQianRs.izumi_intel.clm-ciso_monthly (SHAREDLIB_BUILD NLCOMP)
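
The failure list above mixes the test name with the failing phases in parentheses. A small sketch for separating the two and checking which entries actually carry the `Mmpi-serial` modifier follows; the function names are hypothetical and the line layout is assumed to be exactly as pasted above.

```python
def parse_failure_line(line):
    """Split 'TESTNAME (PHASE PHASE ...)' into (name, [phases])."""
    name, _, rest = line.strip().partition(" (")
    phases = rest.rstrip(")").split() if rest else []
    return name, phases


def uses_mpi_serial(name):
    """True if the test-type field (before the first dot) has the Mmpi-serial modifier."""
    type_and_mods = name.split(".", 1)[0]
    return "Mmpi-serial" in type_and_mods.split("_")


# Three lines copied from the list above.
lines = [
    "ERS_D_Ld5_Mmpi-serial.1x1_vancouverCAN.I1PtClm50SpRs.izumi_nag.clm-CLM1PTStartDate (SHAREDLIB_BUILD NLCOMP)",
    "ERS_Lm13.f10_f10_mg37.I1850Clm60Bgc.izumi_intel.clm-monthly_matrixcn_fast_spinup (NLCOMP RUN)",
    "SMS_D_Ld1_Mmpi-serial.f45_f45_mg37.I2000Clm50SpRs.izumi_gnu.clm-ptsRLA (SHAREDLIB_BUILD NLCOMP)",
]

for line in lines:
    name, phases = parse_failure_line(line)
    kind = "mpi-serial" if uses_mpi_serial(name) else "other"
    print(f"{kind:10s} {','.join(phases):24s} {name}")
```

Run on the sample, the second entry (`ERS_Lm13...`) comes out as "other" with phases `NLCOMP RUN`, so it may be worth tracking separately from the mpi-serial build failures.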

@ekluzek
Collaborator Author

ekluzek commented Nov 5, 2024

It looks like mpi-serial works up to the following set of submodules:

ccs_config_cesm1.0.0
cime6.0.249
share1.0.19
cmeps0.14.17
(Note, cdeps and pio are the latest versions used in this PR, not the previous ctsm5.3.009 ones)

Beyond that it starts breaking. The set I should use together to get things working is the cesm3_0_alpha02a externals:

ccs_config_cesm1.0.0
cime6.1.0
share1.1.2
cmeps1.0.2
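
For reference, the two tag sets listed above can be compared mechanically. The sketch below just diffs two dictionaries whose values are copied from the lists above; `changed_tags` is a hypothetical helper and nothing here reads the actual submodule configuration.

```python
def changed_tags(old, new):
    """Return {component: (old_tag, new_tag)} for every component whose tag differs."""
    components = sorted(set(old) | set(new))
    return {c: (old.get(c), new.get(c)) for c in components if old.get(c) != new.get(c)}


# Last set where mpi-serial still works (from the first list above).
last_working = {
    "ccs_config": "ccs_config_cesm1.0.0",
    "cime": "cime6.0.249",
    "share": "share1.0.19",
    "cmeps": "cmeps0.14.17",
}

# cesm3_0_alpha02a externals (from the second list above).
alpha02a = {
    "ccs_config": "ccs_config_cesm1.0.0",
    "cime": "cime6.1.0",
    "share": "share1.1.2",
    "cmeps": "cmeps1.0.2",
}

for component, (old_tag, new_tag) in changed_tags(last_working, alpha02a).items():
    print(f"{component}: {old_tag} -> {new_tag}")
```

Only cime, share, and cmeps differ between the two sets; ccs_config is at the same tag in both.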

cesm3_0_alpha03b has the next set of important updates for ccs_config and cime. The next set of updates are the recent ones that were important in getting PFunit tests to work.

NOTE: The top set is still showing changes to answers on Izumi, so the change in answers must be between the ctsm5.3.009 externals and the ones listed above.

@jedwards4b
Contributor

Hi Erik,

I found that the configure script was not included in the mpi-serial distribution; I have added it in ESMCI/mpi-serial#30. I think this will allow you to update to the latest tags.

@ekluzek
Collaborator Author

ekluzek commented Nov 5, 2024

Excellent, thanks @jedwards4b!

Yeah, that is one of the things I saw; I thought it might be generated as part of the build process, but obviously not. Thanks for figuring that out. I'll try with that branch and make sure it works.

Labels
code health improving internal code structure to make easier to maintain (sustainability) enhancement new capability or improved behavior of existing capability priority: high High priority to fix/merge soon, e.g., because it is a problem in important configurations
Projects
Status: In Progress
3 participants