Initialize revp stresses to previous time step #331
I'm fine with this change. It will be interesting to see whether the QC tests flag it as significant -- I don't think it should, but I don't have a lot of experience with revp. BTW, do the QC tests run cases with revp on? That probably will need to be changed manually, just for these tests.
I don't think revp is on with the qc option. I will change it manually. |
@eclare108213 No, the QC tests do not include the revp option. In general, the QC tests could be more automated; at the moment, if any test fails in the |
The problem with having a failed test in Perhaps a short-term solution could be to modify the |
I think that having an automatically generated script to generate the QC test cases would be a great addition, or maybe just add output to |
…revp specially if it merges an updated upstream into a topic branch.
Just pinging, not clear what the status of this PR is. We should definitely add a revp test to the suite. Has a QC test been done, and is it still needed? I think QC tests cannot really be completely automated. When answers change for specific namelists that require QC testing, it makes sense to me that one would want to think carefully about the best set of namelist options in order to isolate the change and demonstrate "same climate". I'm not sure automation fits there. |
We were waiting for the QC script problem to be fixed. We will run that this week and update the PR with the results. |
Amélie still has the same error as I had with the updated Python script; #333 (comment) |
Here are some updates about the QC tests:
Given that the final stress states after a few iterations are quite different when they are initialized with the previous time level rather than at zero, it makes sense that the solution is different enough to fail the QC test.
|
An update to master, #346, should fix the QC testing. Can someone pull master and try again? (Sorry about all the redos.) Thanks! |
Actually, Matt did it; for the alt02 option the QC test also fails:
INFO:__main__:Running QC test on the following directories:
INFO:__main__: /p/work1/turner/cice_qc_debug/brooks_intel_smoke_gx1_44x1_alt02_medium_qc.qc_base_original_alt02/
INFO:__main__: /p/work1/turner/cice_qc_debug/brooks_intel_smoke_gx1_44x1_alt02_medium_qc.qc_test_alt02/
INFO:__main__:Number of files: 1825
INFO:__main__:2 Stage Test Passed
INFO:__main__:Quadratic Skill Test Failed for Northern Hemisphere
INFO:__main__:Quadratic Skill Test Failed for Southern Hemisphere
INFO:__main__:
ERROR:__main__:Quality Control Test FAILED
DEBUG:__main__:Northern Hemisphere skill score = 0.979277
DEBUG:__main__:Southern Hemisphere skill score = 0.912965 |
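For anyone collecting results like the log above across several cases, here is a minimal sketch of a parser that pulls the pass/fail lines and the hemispheric skill scores out of the log text. The function name `summarize_qc_log` is hypothetical (it is not part of the QC script), and the sketch assumes the log lines have exactly the form shown in this thread.

```python
import re

# Sample copied from the log lines posted in this thread.
SAMPLE_LOG = """\
INFO:__main__:Number of files: 1825
INFO:__main__:2 Stage Test Passed
INFO:__main__:Quadratic Skill Test Failed for Northern Hemisphere
INFO:__main__:Quadratic Skill Test Failed for Southern Hemisphere
INFO:__main__:
ERROR:__main__:Quality Control Test FAILED
DEBUG:__main__:Northern Hemisphere skill score = 0.979277
DEBUG:__main__:Southern Hemisphere skill score = 0.912965
"""


def summarize_qc_log(log_text):
    """Collect pass/fail test names and skill scores from QC log text.

    Hypothetical helper: assumes the line formats seen in SAMPLE_LOG.
    """
    summary = {"passed": [], "failed": [], "skill": {}, "overall_failed": False}
    for raw in log_text.splitlines():
        line = raw.strip()
        # Skill score lines, e.g. "... Northern Hemisphere skill score = 0.979277"
        m = re.search(r"(Northern|Southern) Hemisphere skill score = ([0-9.]+)", line)
        if m:
            summary["skill"][m.group(1)] = float(m.group(2))
            continue
        # Overall verdict line.
        if "Quality Control Test FAILED" in line:
            summary["overall_failed"] = True
            continue
        # Individual test lines, e.g. "2 Stage Test Passed" or
        # "Quadratic Skill Test Failed for Northern Hemisphere".
        m = re.search(r"INFO:__main__:(.+?) (Passed|Failed)(?: for (.+))?$", line)
        if m:
            name = m.group(1) + (" (" + m.group(3) + ")" if m.group(3) else "")
            key = "passed" if m.group(2) == "Passed" else "failed"
            summary[key].append(name)
    return summary


if __name__ == "__main__":
    print(summarize_qc_log(SAMPLE_LOG))
```

Keeping the parser separate from the QC script means the script itself stays untouched; the sketch only reads text that the script already prints.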
OK, just trying to understand. Is this correct? We were able to run the revp QC tests with the box grid all along. We were unable to run the revp tests with gx1 until the latest fixes to the QC scripts. The above skill scores (.97, .91) are for gx1 with alt02 turned on, comparing master with this branch. These values are expected and so we can now merge. I assume alt02 is a reasonable test case? We could also do an out-of-the-box run just turning on revp; is that valuable too? I guess I wonder whether some of the other settings in alt02 (ncat=1, kitd=0, etc.) are giving us the best QC test configuration for testing revp. I'm just asking, not saying what has been done isn't adequate. As a general rule, I don't think we need to match the QC testing with the test suite results. The test suite exercises various parts of the code. I think the QC test should be a "best" test that exercises the new non-bit-for-bit changes. |
I don't think the QC tests are designed to be run with the box configuration. They rely on (somewhat) realistic simulation of the sea ice thickness, and so I believe they should only be run on the global grids. Moreover, I believe they should only be run for gx1 configurations, to get proper statistics, although I might not be remembering that correctly. @proteanplanet will most definitely have more to say... |
I agree we should not use the box grid for the qc tests. But I think that is what was done until the gx1 test was working? I guess that's why I'm asking for clarification, I'm a little confused about what was done, whether the qc test we want was completed, and whether this is ready to merge. |
Ah, I see. The gx1 QC tests need to be run for this PR. |
So, looking again at the results above, they note the case is brooks_intel_smoke_gx1_44x1_alt02_medium_qc so this was a gx1 test with alt02. Is that a reasonable test for the revp qc? And what was compared? Is this master with alt02 vs change-stress-init-revp with alt02? master does not have the brooks mods yet, how was master run on brooks? was some of it done manually? |
Yes, we locally added machine files for brooks. I also added these files in this PR for future use at ECCC. The QC tests were indeed done with gx1. We compare These two QC tests were done because they were the two options that failed the base suite, since they are the only two with revised EVP activated. I can run another test with a different set of options if something else would be better to test the revised EVP here. |
I confirm the alt02 test was run on the gx1 grid. We followed the instructions at https://cice-consortium-cice.readthedocs.io/en/master/user_guide/ug_testing.html#end-to-end-testing-procedure to set up the cases, and added "alt02" and "boxrestore" separately: we ran 2 QC tests, one with alt02 (base+test) and one with boxrestore (base+test), both on the gx1 grid. We compared the up-to-date master with the changes from this branch (master with alt02 vs change-stress-init-revp with alt02, that's right), and did the same thing with the boxrestore option. Tony, you say we were not able to run the QC test with revp all along on gx1. I think maybe this is the first time that it was tried, so we don't know. The fact that, for the outputs from machine brooks, the Python script would fail because the values landed exactly in the middle of 2 values of the Student's t-test look-up table does not mean that it would be the same for other machines/compilers (as you note in #247, we do not do testing across machines/compilers). Regarding the QC test with alt02:
@eclare108213 @abouchat what do you think? Do you think we should isolate the change and run a QC case "brooks_intel_smoke_gx1_44x1_medium_qc" (default options) and manually turn on revp? |
I'm starting to understand what is happening now. The boxrestore option is intended only for use with one of the box grids, and in particular the restoring option is designed (if you can call it that) for regional grids, not global grids. Since restoring hasn't been tested on global grids, it's best not to test using that! This might trip up other users, so maybe we can think about how to prevent these kinds of mistakes. Likewise for alt02, there might be something in there that doesn't make sense, although this brings up a larger question about cross-testing the many different options available in CICE. At some point in the past, we decided that was too much to try to do, but we could gradually add more tests as needed. My preference, for PRs like this, is to just test the default configuration with the settings needed to test the modification, which requires some manual intervention. I don't think the QC can ever be completely automated. |
Ok, I see. I am restarting a QC test that will compare master VS change-stress-init-revp, both with the default configuration + revp manually activated, and on gx1 grid. |
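Since "manually activated" has to be repeated identically for the base and test cases, a small script can make the edit reproducible. The sketch below rewrites one logical flag in namelist text. The flag name `revised_evp` and the group name `dynamics_nml` are assumptions based on recollection of the CICE namelist, not something stated in this thread, so check them against the `ice_in` of your case; the helper `set_namelist_flag` is hypothetical.

```python
import re


def set_namelist_flag(text, name, value):
    """Return namelist text with an existing `name = ...` assignment set to `value`.

    Hypothetical helper: handles simple one-line assignments only and raises
    KeyError if the variable is not present in the text.
    """
    # Capture indentation, name and '=' so only the right-hand side is replaced;
    # stop at end of line or at a '!' comment.
    pattern = re.compile(
        r"^([ \t]*" + re.escape(name) + r"[ \t]*=[ \t]*)[^\n!]*",
        re.MULTILINE,
    )
    new_text, count = pattern.subn(lambda m: m.group(1) + value, text)
    if count == 0:
        raise KeyError(name)
    return new_text


if __name__ == "__main__":
    # Assumed fragment; the real ice_in contains many more variables.
    sample = "&dynamics_nml\n    revised_evp = .false.\n/\n"
    print(set_namelist_flag(sample, "revised_evp", ".true."))
```

Applying the same function to the namelist of both the master case and the change-stress-init-revp case keeps the two runs identical apart from the code change under test.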
Here are the new results for the QC test, comparing master with revp VS change-stress-init-revp with revp, on gx1 grid and all other options set to default:
and skill scores are:
|
Thank you @abouchat. I'm ready to merge this one -- it's an important change. One final request: could you make maps of ice thickness from the control and the modified runs, from the same point at/near the end of your runs, and post them in this PR, please? I'd like to make sure the new run isn't doing something totally unexpected, and the maps also will document the essential difference for future reference. |
I have the maps for ice thickness at the end of the run, for the Base case, Test case, and the difference between the two. The pattern is the same overall but the ice thickness is smaller when initializing with the previous time step. I think that what is happening is that, when running with a small number of subcycles and with the stresses initialized at zero, the ice is weaker (since the normalized value of sigma_I is smaller than -1, for all points, as we see from the plots of the stress states I posted with the PR checklist) and therefore it gets thicker by ridging. On the other hand, if the ice is stronger when initializing with the previous time step solution, then it resists more convergence, and does not thicken as much. Does that make sense? |
That does make sense. Thank you, Amélie! I hope we can continue working with you in the future. |
For detailed information about submitting Pull Requests (PRs) to the CICE-Consortium,
please refer to: https://github.com/CICE-Consortium/About-Us/wiki/Resource-Index#information-for-developers
PR checklist
Short (1 sentence) summary of your PR:
Change stress initialization to previous time step for revp, to be up-to-date with the literature.
Developer(s):
Amélie Bouchat
Suggest PR reviewers from list in the column to the right. @eclare108213
Please copy the PR test results link or provide a summary of testing completed below.
base_suite:
210 of 214 tests PASSED
4 of 214 tests FAILED
0 of 214 tests PENDING
The tests that failed are 'alt02' and 'boxrestore' since revp is activated.
We are working on running the quality control suite, but are still having the same issues reported by @phil-blain in PR Add maximum depth for grounding scheme #325.
How much do the PR code changes differ from the unmodified code?
Does this PR create or have dependencies on Icepack or any other models?
Does this PR add any new test cases?
Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/.)
Please provide any additional information or relevant details below:
The stresses for revp are currently initialized to zero at the beginning of the subcycling; however, in the latest literature they are initialized to the previous time step (see references below). Changing the initialization to the previous time step makes more sense for the evolution of the stresses with subcycling:
I could not perform the quality control suite since we are having issues with running the test in our environment. @phil-blain and I will keep working on this.
Kimmritz et al., 2017, https://doi.org/10.1016/j.ocemod.2017.05.006
Koldunov et al., 2019, https://doi.org/10.1029/2018MS001485
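To illustrate the point made in the PR description about the starting value of the subcycling, here is a toy sketch, assuming a single scalar "stress" that is relaxed toward a fixed target by a damped pseudo-time iteration. This is not the CICE revised EVP solver: the names `relax` and `run`, the damping parameter `alpha`, the target value, and the iteration counts are all hypothetical, and the sketch says nothing about the sign or magnitude of the stresses reported in this PR. It only shows that, when few subcycles are used, the end state depends strongly on whether each time step restarts from zero or from the previous time step.

```python
def relax(sigma_start, sigma_target, alpha, n_sub):
    """Damped iteration: each subcycle moves sigma a fraction 1/alpha toward the target."""
    sigma = sigma_start
    for _ in range(n_sub):
        sigma += (sigma_target - sigma) / alpha
    return sigma


def run(n_steps, n_sub, alpha, sigma_target, warm_start):
    """Run n_steps time steps; warm_start reuses the previous step's sigma."""
    sigma = 0.0
    for _ in range(n_steps):
        start = sigma if warm_start else 0.0  # previous time step vs. zero
        sigma = relax(start, sigma_target, alpha, n_sub)
    return sigma


if __name__ == "__main__":
    # Hypothetical numbers: 20 time steps, 5 subcycles each, alpha = 10.
    cold = run(n_steps=20, n_sub=5, alpha=10.0, sigma_target=1.0, warm_start=False)
    warm = run(n_steps=20, n_sub=5, alpha=10.0, sigma_target=1.0, warm_start=True)
    print("restart from zero each step:   ", cold)
    print("restart from previous step:    ", warm)
```

In this toy setting the zero-initialized run never gets past 1 - 0.9^5 of the target, no matter how many time steps are taken, while the run that starts from the previous time step keeps accumulating progress toward it.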