Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initialize revp stresses to previous time step #331

Merged
merged 4 commits into from
Aug 14, 2019

Conversation

abouchat
Copy link
Contributor

@abouchat abouchat commented Jul 9, 2019

For detailed information about submitting Pull Requests (PRs) to the CICE-Consortium,
please refer to: https://github.com/CICE-Consortium/About-Us/wiki/Resource-Index#information-for-developers

PR checklist

  • Short (1 sentence) summary of your PR:
    Change stress initialization to previous time step for revp, to be up-to-date with litterature.

  • Developer(s):
    Amélie Bouchat

  • Suggest PR reviewers from list in the column to the right. @eclare108213

  • Please copy the PR test results link or provide a summary of testing completed below.
    base_suite:
    210 of 214 tests PASSED
    4 of 214 tests FAILED
    0 of 214 tests PENDING
    The tests that failed are 'alt02' and 'boxrestore' since revp is activated.
    We are working on running the quality control suite, but still having the same issues reported by @phil-blain in PR Add maximum depth for grounding scheme #325 .

  • How much do the PR code changes differ from the unmodified code?

    • bit for bit
    • different at roundoff level
    • more substantial
  • Does this PR create or have dependencies on Icepack or any other models?

    • Yes
    • No
  • Does this PR add any new test cases?

    • Yes
    • No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/.)

    • Yes
    • [] No, does the documentation need to be updated at a later time?
      • [] Yes
      • No
  • Please provide any additional information or relevant details below:
    The stresses for revp were currently initialized to zero at the beginning of the subcycling, however in the latest litterature they are initialized to the previous time step (see references below). Changing initialization to the previous time step makes more sense for the evolution of the stresses with subcycling:
    ellipse_compare_stress_init

I could not perform the quality control suite since we are having issues with running the test on our environment. @phil-blain and I will keep working on this.

Kimmritz et al., 2017, https://doi.org/10.1016/j.ocemod.2017.05.006
Koldunov et al., 2019, https://doi.org/10.1029/2018MS001485

Copy link
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with this change. It will be interesting to see whether the QC tests flag it as significant -- I don't think it should, but I don't have a lot of experience with revp. BTW, do the QC tests run cases with revp on? That probably will need to be changed manually, just for these tests.

@abouchat
Copy link
Contributor Author

abouchat commented Jul 9, 2019

I don't think revp is on with the qc option. I will change it manually.

@phil-blain
Copy link
Member

@eclare108213 No, the qc test do not include the revp option. In general, the QC test could be more automated; at the moment if any test fail in the base_suite and we want to run the QC test to check the impact of the code change, one has to manually check which options are 'on' for the failing test(s) and run a 'base' QC test and a 'test' QC test with these options added (possibly running several QC tests since some set of options are incompatible). So I think that in general the QC could be more automated @mattdturner @apcraig

@mattdturner
Copy link
Contributor

The problem with having a failed test in base_suite automatically spawn the QC test is that simulations done for base_suite do not output the data files necessary to run the QC test. Also, the base_suite simulations are generally for shorter time periods than are required for the QC test. No matter what, a new base QC test case and a test QC test case would need to be generated.

Perhaps a short-term solution could be to modify the results.csh script to, if failures are encountered, auto-generate a new script that would create the QC test cases. I hesitate to have it automatically create the cases and submit the jobs since the simulations are rather lengthy. However, I'd be happy to get input from others.

@phil-blain
Copy link
Member

phil-blain commented Jul 10, 2019

I think that having an automatically generated script to generate the QC test cases would be a great addition, or maybe just add outpout to results.csh that indicate which set options to use when creating the QC test cases. You are right that automating the case creation and job submission is not necessarily the best idea.

@apcraig
Copy link
Contributor

apcraig commented Aug 5, 2019

Just pinging, not clear what the status of this PR is. We should definitely add an revp test to the suite. Has a qc test been done and is it still needed?

I think qc tests cannot really be completely automated. When answers change for specific namelist that require qc testing, it makes sense to me that one would want to think carefully about the best set of namelist options in order to isolate the change and demonstrate "same climate". I'm not sure automation fits there.

@phil-blain
Copy link
Member

We were waiting for the QC script problem to be fixed. We will run that this week and update the PR with the results.

@phil-blain phil-blain mentioned this pull request Aug 6, 2019
@phil-blain
Copy link
Member

Amélie still has the same error as I had with the updated python script ;#333 (comment)

@abouchat
Copy link
Contributor Author

abouchat commented Aug 7, 2019

Here are some updates about the QC tests:

  1. I am not able to run the QC test with the alt02 option. It gives me the same error as mentioned by @phil-blain in qc environment setup #333 .

  2. However, I was able to run the QC test with the boxrestore option, where the revised EVP is also activated. It fails:

INFO:__main__:Running QC test on the following directories:
INFO:__main__:  ../data/eccc-ppp2/runs/brooks_intel_smoke_gx1_44x1_boxrestore_medium_qc.qc_base_original_boxrestore_no_leap_years
INFO:__main__:  ../data/eccc-ppp2/runs/brooks_intel_smoke_gx1_44x1_boxrestore_medium_qc.qc_test_boxrestore_no_leap_year
INFO:__main__:Number of files: 1825
INFO:__main__:2 Stage Test Passed
INFO:__main__:Quadratic Skill Test Failed for Northern Hemisphere
INFO:__main__:Quadratic Skill Test Failed for Southern Hemisphere
INFO:__main__:
ERROR:__main__:Quality Control Test FAILED

Given that the final stress states after few iterations are quite different when they are initialized with the previous time level rather than at zero, it makes sense that the solution is different enough to fail the QC test.

  1. When running with the boxrestore option, there are problems with the option 'use_leap_years' that is enabled in the namelist. I therefore had to turn it off manually for this case to run. I will add this as a new issue (probably related to time manager concerns #162 ?)

@apcraig
Copy link
Contributor

apcraig commented Aug 8, 2019

An update to master, #346, should fix the qc testing. Can someone pull the master and try again (sorry about all the redos). thanks!

@phil-blain
Copy link
Member

phil-blain commented Aug 9, 2019

Acutally Matt did it, for the alt02 option the QC test also fails :

INFO:__main__:Running QC test on the following directories:
INFO:__main__:  /p/work1/turner/cice_qc_debug/brooks_intel_smoke_gx1_44x1_alt02_medium_qc.qc_base_original_alt02/
INFO:__main__:  /p/work1/turner/cice_qc_debug/brooks_intel_smoke_gx1_44x1_alt02_medium_qc.qc_test_alt02/
INFO:__main__:Number of files: 1825
INFO:__main__:2 Stage Test Passed
INFO:__main__:Quadratic Skill Test Failed for Northern Hemisphere
INFO:__main__:Quadratic Skill Test Failed for Southern Hemisphere
INFO:__main__:
ERROR:__main__:Quality Control Test FAILED

DEBUG:__main__:Northern Hemisphere skill score = 0.979277
DEBUG:__main__:Southern Hemisphere skill score = 0.912965

@apcraig
Copy link
Contributor

apcraig commented Aug 9, 2019

OK, just trying to understand. Is this correct? We were able to run the revp qc tests with the box grid all along. We were unable to run the revp tests with gx1 until the latest fixes to the qc scripts. The above skill scores (.97, .91) are for gx1 with alt02 turned on, comparing master with this branch. These values are expected and so we can now merge.

I assume the alt02 is a reasonable test case? We could also do an out-of-the-box run just turning on revp, is that valuable too? I guess I wonder whether some of the other settings in alt02 (ncat=1, kitd=0, etc) is giving us the best qc test configuration for testing revp qc. I'm just asking, not saying what has been done isn't adequate. As a general rule, I don't think we need to match the qc testing with the test suite results. The test suite exercises various parts of the code. I think the qc test should be a "best" test that exercises the new non-bit-for-bit changes.

@eclare108213
Copy link
Contributor

I don't think the QC tests are designed to be run with the box configuration. They rely on (somewhat) realistic simulation of the sea ice thickness, and so I believe they should only be run on the global grids. Moreover, I believe they should only be run for gx1 configurations, to get proper statistics, although I might not be remembering that correctly. @proteanplanet will most definitely have more to say...

@apcraig
Copy link
Contributor

apcraig commented Aug 9, 2019

I agree we should not use the box grid for the qc tests. But I think that is what was done until the gx1 test was working? I guess that's why I'm asking for clarification, I'm a little confused about what was done, whether the qc test we want was completed, and whether this is ready to merge.

@eclare108213
Copy link
Contributor

Ah, I see. The gx1 QC tests need to be run for this PR.

@apcraig
Copy link
Contributor

apcraig commented Aug 9, 2019

So, looking again at the results above, they note the case is

brooks_intel_smoke_gx1_44x1_alt02_medium_qc

so this was a gx1 test with alt02. Is that a reasonable test for the revp qc? And what was compared? Is this master with alt02 vs change-stress-init-revp with alt02? master does not have the brooks mods yet, how was master run on brooks? was some of it done manually?

@abouchat
Copy link
Contributor Author

abouchat commented Aug 9, 2019

Yes, we locally added machine files for brooks. I also added these files in this PR for future use at ECCC.

The QC tests were indeed done with gx1. We compare
i) master with alt02 VS change-stress-init-revp with alt02
ii) master with boxrestore vs change-stress-init-revp with boxrestore

These two QC tests were done because they were the two options that failed the base suite, since they are the only two with revised EVP activated.

I can run another test with a different set of options if something else would be better to test the revised EVP here.

@phil-blain
Copy link
Member

I confirm the alt02 test was run on the gx1 grid (we followed the instructions at https://cice-consortium-cice.readthedocs.io/en/master/user_guide/ug_testing.html#end-to-end-testing-procedure) to set up the cases, and added "alt02" and "boxrestore" separately (we ran 2 QC tests : 1 with alt02 (base+test) and one with boxrestore (base+test) both on the gx1 grid)

We compared the up-to-date master with the changes from this branch. (master with alt02 vs change-stress-init-revp with alt02, that's right). (and the same thing with the boxrestore option)
For the master runs the brooks mods were cherry-picked on top of the latest master (just the changes to Macros.brooks, env.brooks, cice.batch and cice.run in the commit d0246cc).

Tony, you say we were not able to run the QC test with revp all along on gx1. I think maybe this is the first time that it was tried, so we don't know. The fact that for the outputs from machine brooks the python script would fail because the values arrived exactly in the middle of 2 values of the student test look-up table does not mean that it would be the same for other machines/compilers (as you note in #247, we do not do testing across machines/compilers.)

Regarding the QC test with alt02:

Is that a reasonable test for the revp qc?

@eclare108213 @abouchat what do you think ? do you think we should isolate the change and run a QC case "brooks_intel_smoke_gx1_44x1_medium_qc" (default options) and manually turn on revp ?

@eclare108213
Copy link
Contributor

I'm starting to understand what is happening now. The boxrestore option is intended only for use with one of the box grids, and in particular the restoring option is designed (if you can call it that) for regional grids, not global grids. Since restoring hasn't been tested on global grids, it's best not to test using that! This might trip up other users, so maybe we can think about how to prevent these kinds of mistakes. Likewise for alt02, there might be something in there that doesn't make sense, although this brings up a larger question about cross-testing the many different options available in CICE. At some point in the past, we decided that was too much to try to do, but we could gradually add more tests as needed. My preference, for PRs like this, is to just test the default configuration with the settings needed to test the modification, which requires some manual intervention. I don't think the QC can ever be completely automated.

@abouchat
Copy link
Contributor Author

abouchat commented Aug 9, 2019

Ok, I see. I am restarting a QC test that will compare master VS change-stress-init-revp, both with the default configuration + revp manually activated, and on gx1 grid.

@abouchat
Copy link
Contributor Author

Here are the new results for the QC test, comparing master with revp VS change-stress-init-revp with revp, on gx1 grid and all other options set to default:

(cice-qc) amb001@eccc-ppp2:>./cice.t-test.py -v ../data/eccc-ppp2/runs/brooks_intel_smoke_gx1_44x1_medium_qc.qc_base_revp ../data/eccc-ppp2/runs/brooks_intel_smoke_gx1_44x1_medium_qc.qc_test_revp
INFO:__main__:Running QC test on the following directories:
INFO:__main__:  ../data/eccc-ppp2/runs/brooks_intel_smoke_gx1_44x1_medium_qc.qc_base_revp
INFO:__main__:  ../data/eccc-ppp2/runs/brooks_intel_smoke_gx1_44x1_medium_qc.qc_test_revp
INFO:__main__:Number of files: 1825
INFO:__main__:2 Stage Test Passed
INFO:__main__:Quadratic Skill Test Failed for Northern Hemisphere
INFO:__main__:Quadratic Skill Test Failed for Southern Hemisphere
INFO:__main__:
ERROR:__main__:Quality Control Test FAILED

and skill scores are:

DEBUG:__main__:Northern Hemisphere skill score = 0.969654
DEBUG:__main__:Southern Hemisphere skill score = 0.918635

@eclare108213
Copy link
Contributor

Thank you @abouchat. I'm ready to merge this one -- it's an important change. One final request: could you make maps of ice thickness from the control and the modified runs, from the same point at/near the end of your runs, and post them in this PR, please? I'd like to make sure the new run isn't doing something totally unexpected, and the maps also will document the essential difference for future reference.

@abouchat
Copy link
Contributor Author

I have the maps for ice thickness at the end of the run, for the Base case, Test case, and the difference between the two. The pattern is the same overall but the ice thickness is smaller when initializing with the previous time step. I think that what is happening is that, when running with a small number of subcycles and with the stresses initialized at zero, the ice is weaker (since the normalized value of sigma_I is smaller than -1, for all points, as we see from the plots of the stress states I posted with the PR checklist) and therefore it gets thicker by ridging. On the other hand, if the ice is stronger when initializing with the previous time step solution, then it resists more convergence, and does not thicken as much. Does that make sense?

h_base
h_test
h_diff

@eclare108213
Copy link
Contributor

That does make sense. Thank you Amélie ! I hope we can continue working with you in the future.

@apcraig apcraig merged commit 9c5e907 into CICE-Consortium:master Aug 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants