Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add JRA55do to forcing #831

Closed
wants to merge 8 commits into from
Closed

Conversation

daveh150
Copy link
Contributor

@daveh150 daveh150 commented May 25, 2023

For detailed information about submitting Pull Requests (PRs) to the CICE-Consortium,
please refer to: https://github.com/CICE-Consortium/About-Us/wiki/Resource-Index#information-for-developers

PR checklist

  • Short (1 sentence) summary of your PR:
    Add JRA55do forcing

  • Developer(s):
    David Hebert

  • Suggest PR reviewers from list in the column to the right.

  • Please copy the PR test results link or provide a summary of testing completed below.
    base_suite results
    419 measured results of 419 total results
    419 of 419 tests PASSED
    0 of 419 tests PENDING
    0 of 419 tests MISSING data
    0 of 419 tests FAILED

  • How much do the PR code changes differ from the unmodified

  • code?

    • bit for bit
    • different at roundoff level
    • more substantial
  • Does this PR create or have dependencies on Icepack or any other models?

    • Yes
    • No
  • Does this PR update the Icepack submodule? If so, the Icepack submodule must point to a hash on Icepack's main branch.

    • Yes
    • No
  • Does this PR add any new test cases?

    • Yes
    • No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/. A test build of the technical docs will be performed as part of the PR testing.)

    • Yes
    • No, does the documentation need to be updated at a later time?
      • Yes
      • No
  • Please provide any additional information or relevant details below:

Added JRA55do option to ice_forcing.F90. Started from the JRA55 forcing and modified.

In this JRA55do case I kept the JRA55do variable names, and also used individual variables for each variable instead of one large file for each year. They are defined in the JRA55do subroutine.

@daveh150 daveh150 marked this pull request as ready for review May 25, 2023 20:09
@apcraig
Copy link
Contributor

apcraig commented May 30, 2023

Several questions/comments.

  • Is the plan to keep JRA55 and JRA55do longer term? Which would be the recommended JRA55 forcing dataset?
  • We should add some test cases for JRA55do.
  • I think we need to update the input datasets at zenodo, propagate those datasets, and do some testing before merging this.

@eclare108213
Copy link
Contributor

  • Is the plan to keep JRA55 and JRA55do longer term? Which would be the recommended JRA55 forcing dataset?

I think these JRA datasets are fairly large, and so it would be better to have only one of them. Are any of the files common between them?

  • We should add some test cases for JRA55do.

I suggest copying the existing JRA55 tests, for now. The PR info above says that this is a BFB change. Is that true, comparing old and new JRA datasets? If so, what is changing here?

@daveh150
Copy link
Contributor Author

daveh150 commented May 30, 2023 via email

@eclare108213
Copy link
Contributor

For the JRA55do set I wrote individual NetCDF files for each variable instead of one large NetCDF. The variable names are different and there are no common files between this and the JRA55 files. Also my thought in this case was having 7x 1.4GB(ish) files might be more manageable than 1x 10 GB (ish) file, particularly to download from Zenodo. Though I can quickly make one file with NetCDF operators.

I see. Would most of the data itself be the same (although with different variable names), if the files were formatted the same way? I'm not suggesting you do that, just trying to understand what's different and where the overlaps are.

Sorry if I confused the BFB, but I don’t think that will hold using JRA55do vs. ‘old’ JRA55 forcing. The BFB was with the base suite, with intent the addition ‘did no harm’ to the rest of the CICE code.

I see, thanks!

@daveh150
Copy link
Contributor Author

daveh150 commented May 30, 2023 via email

@apcraig
Copy link
Contributor

apcraig commented May 30, 2023

Or have a set of inputs that specify the forcing filename and names of variables on the file. The thing that does stand out in these changes is how much is hard-wired. However, I don't know that we can actually build something that is truly general and if we can't, it's probably better to just do it as we have than try to build something complicated and general that doesn't work very well. If we generalize, we might have to get into time and space interpolation too. There are tools out there like the CESM data streams that might be useful. Prescribed CICE mode already uses that, so there is some precedent in CICE, but it would require more CESM infrastructure to be migrated into CICE (prescribed ice mode is only run within the CESM coupled framework so the infrastructure is available outside CICE).

@daveh150
Copy link
Contributor Author

daveh150 commented May 30, 2023 via email

@daveh150
Copy link
Contributor Author

daveh150 commented Jun 2, 2023

I consolidated the JRA55_files and JRA55_data subroutines in an attempt to standardize. In each subroutine it will search 'atm_data_type' to determine the grid and JRA55 or JRA55do. The downside of this is we will need to standardize our current JRA55 files. Currently they are (clearly I was not thinking about standard file naming!)

gx1: JRA55_03hr_forcing_2005.nc
gx3: JRA55_gx3_03hr_forcing_2005.nc
tx1: JRA55_03hr_forcing_tx1_2005.nc

I updated code to search for JRA55_grd_03hr_forcing_yyyy.nc

Here are some quick results from the log, they are similar which is encouraging. Dave Bailey is on leave until the 10th he will probably have more tests for comparison.

total_snw_volume_2005

total_ice_extent_2005

total_ice_volume_2005

@apcraig
Copy link
Contributor

apcraig commented Jun 7, 2023

The code mods look good. Have you checked everything is still bit-for-bit for the old option? The documentation needs to be updated, and we need to decide if we're going to support old and new JRA55 data or just one. If just new, we should change the default, deprecate the old option, and run a full test suite since all answers will change. We may also want to do a QC test. If we want to support both datasets, then we should decide which is default and then add some tests for the other. The new datasets also need to be migrated to zenodo in any case. It may be OK to support both options in terms of managing the dataset. Any site will probably just have one set of data. Zenodo already has the older JRA55 dataset, and we have to add the new one anyway. We can have two sets on some test machines, at least for gx3.

@daveh150
Copy link
Contributor Author

daveh150 commented Jun 9, 2023

Running base_suite test locally passed all tests (I only have gnu compiler locally). Two tests initially failed because they could not find the JRA55 forcing with 'new' naming conventions. Making symlinks and re-running solved that issue.

344 measured results of 344 total results
344 of 344 tests PASSED
0 of 344 tests PENDING
0 of 344 tests MISSING data
0 of 344 tests FAILED

For reference the tests that initially failed are:
restart_tx1_40x4_dsectrobin_medium
smoke_gx1_24x1_medium_run90day_yi2008

@dabail10
Copy link
Contributor

Did a run myself and here are some contour plots. This looks good to me overall. I think it would be good if we could compare 5 year runs using the QC control test.

https://webext.cgd.ucar.edu/CICE/jra55dotest-jra55test/yrs2005-2005/

@daveh150
Copy link
Contributor Author

Thanks Dave! I'll pull data and make forcing through 2011 for us to make longer tests.

@dabail10
Copy link
Contributor

dabail10 commented Jul 3, 2023

I am currently running two 5 year tests. The code here is fine. I will have to work up some new distribution for the files. Note that the names of the gx1 files have to change to include the 'gx1' grid name.

Copy link
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I approved this earlier but now realize that a test has not been added. Shouldn't there be one?

@daveh150
Copy link
Contributor Author

daveh150 commented Jul 6, 2023

I can add JRA55do tests. As it stands now looks like we will need to add items like set_nml.jra55do_gx1, set_nml.jra55do_gx3, set_nml.jra55do.tx1. Then it will be one of the 'sets' in a test suite.

@daveh150
Copy link
Contributor Author

daveh150 commented Jul 7, 2023

I added a few jra55do test cases to the base_suite and quick_suite. in base_suite I added jra55do to each of our grids (gx3, gx1, tx1). There are corresponding setup_nml.jra55do_XXX, where XXX is grid version. I tested them on our local system and they ran successfully.

@apcraig
Copy link
Contributor

apcraig commented Jul 7, 2023

Can someone summarize the current strategy. Is the old JRA55 still the default? Is it now using the individual yearly files? Do we need to download new datasets for both JRA55 and JRA55do on all our test platforms? Is the old JRA55 forcing going to stay the default for a while?

@apcraig
Copy link
Contributor

apcraig commented Jul 7, 2023

I would like to run a full test suite on cheyenne before merging. But don't want to do that until the PR is ready.

@dabail10
Copy link
Contributor

dabail10 commented Jul 7, 2023

In case it is helpful, the run with the JRA55do passed the QC test versus the standard JRA55 case.

Dave

ice_thickness_jra55dote_minus_jra55te
ice_thickness_jra55dote
ice_thickness_jra55te

@daveh150
Copy link
Contributor Author

daveh150 commented Jul 7, 2023

Thanks Tony for asking about the strategy. As it stands now, JRA55 is the default, but needs to have new forcing files downloaded to work with the updated forcing code.

Quite honestly I'm not sure what the default should be. It wouldn't be too hard to make JRA55do the default if that is the better dataset, should be just a matter of updating the forcing definitions in the test suites. Any opinions?

I am struggling with if we actually need both JRA55 and JRA55do? For our testing purposes can (or should) we simply support the JRA55do? Maybe that is for future release/testing.

Also, a note I just went to the JRA55do website, and they say JRA55 is going to terminate sometime this year. Doesn't mean we can't use it for our testing years 2005-2011, just that it is going to go away.

@apcraig
Copy link
Contributor

apcraig commented Jul 7, 2023

It may be a little late, but I think it's a bummer to have to download the old JRA55 data split into files in order to user the newer version of the code. Every test machine and every user will have to do this as soon as they update CICE with this PR. And it doesn't change anything in terms of forcing data or results (hopefully). But it's kind of a big job to ask of the community. One way around this is to deprecate the JRA55 dataset in favor of the JRA55do and say that you have to download the new dataset. But that could be problematic for folks that have established testing and development with JRA55 and want to stick with that. From the user perspective, the best thing would be to preserve the current JRA55 forcing option with the current files and to add the JRA55do files (in whatever format we prefer) and make the JRA55do the default if we think it's better. And yes, the JRA55 is being discontinued. If the QC is passes with JRA55 and JRA55do, and JRA55 is being discontinued, remind me why we're doing this now. Are simulations improved somehow?

What we may need to do is think about what comes after JRA55. Or maybe it's fine to have 10 years of JRA55 forcing for the forseeable future since it's not supposed to be used for science anyway.

@daveh150
Copy link
Contributor Author

daveh150 commented Jul 7, 2023

I think one could get by with making symlinks instead of re-downloading the old JRA55 forcing. The only difference is the filename having the grid resolution in a consistent location. This avoids having hard-coded filenames in the code, or an archaic if-else tree.

Your bigger question is a good one... why are we doing this? I think the intent was to give option to use JRA55do as better for forcing ocean-ice models by making some corrections to JRA55. If the benefit really isn't worth the effort, maybe as you suggest perhaps it is better if we spend effort on a supported forcing set. We typically use CFSR when we need climate extended data, but we don't do a whole lot with that.

@apcraig
Copy link
Contributor

apcraig commented Jul 7, 2023

See also #840.

What's happening with the old JRA55 datasets? The files just need to be renamed or linked? OK, that's less of an issue then, I thought there was some files were split into individual years or something. That's true for all the different resolutions for the old JRA55 dataset?

I think we need to decide what we're doing here and how we're doing it, what's going to be the default, are we supporting both, and how to let the community know what's happening.

If the JRA55do provides some advantages, then we should make it the default. And we should support the old datasets at all resolutions and with a namelist switch, with at most a requirement to add soft links in the local inputdata area. And maybe the CICE scripts can automatically create those links for users if the new files/links don't exist, I could help with that. If the JRA55do does not provide benefits, I would close this PR. The only caveat might be if we think the JRA55do dataset could be our standard forcing dataset for many years to come, then maybe it's a good idea to update to JRA55do now (while still providing support for the prior JRA55 version with relative ease of use).

@dabail10
Copy link
Contributor

Basically the JRA55do is corrected version of the JRA55 that is specially designed for ice-ocean simulations. The JRA55do is what is used in CESM and other modeling centers. The Tsujino et al. article that was on our inputdata page specifically refers to the JRA55do. While that change in forcing passes the QC test (above), I do think we should change to JRA55do as the default. This is more consistent with other uses. I agree we will have to support the original JRA55 forcing for older versions of the code. I have created Zenodo entries for the JRA55do by year only, except for the gx3. These code mods also require that the string 'gx1_' be added to the gx1 forcing. I have an updated version (by year only) on Zenodo and created symbolic links on cheyenne.

@apcraig
Copy link
Contributor

apcraig commented Jul 10, 2023

I'm OK with making the JRA55do the default. Just to be clear, that will change ALL test results and we will have to download the new data to all test platforms.

How about if we support the older JRA55 dataset without the links, using the old filenames directly in the code. I think this should be easy to do without having to create a bunch of redundant subroutines. Also, why don't we remove the grid name from the "atm_data_type" while we're doing this. What I envision is

atm_data_type = [JRA55,JRA55do,ncar,default,box2001,...]
atm_data_dir = ICE_MACHINE_INPUTDATA/CICE_data/forcing/gx1

and then the code would construct the filenames. So much is already hardcoded this should not be difficult. We could do the same for the NCAR_bulk option too. Let me checkout this code and prototype that implementation and see how it looks. Then we can discuss. I much prefer to make the code work with the current datasets than have to create links for the existing datasets.

@dabail10
Copy link
Contributor

I like this idea. However, the old JRA55 data should be using the gx1 in the filenames. The gx3 and tx1 have the grid name in the file names and it would be more consistent.

@apcraig
Copy link
Contributor

apcraig commented Jul 10, 2023

I totally understand that it would be more consistent to rename the old gx1 JRA55 dataset to have gx1 in the name. But that's not what we have now and I really don't like the idea of the community struggling when the JRA55 gx1 case suddenly doesn't work just because we changed the names, especially if it's not the default anymore. Why add that complication? I think we want things to be backwards compatible when it's possible.

@dabail10
Copy link
Contributor

Could we support both names?

@apcraig
Copy link
Contributor

apcraig commented Jul 10, 2023

Could we support both names?

I think we could. Let me see if I can make this work.

@apcraig
Copy link
Contributor

apcraig commented Jul 12, 2023

I have created a PR to the daveh150 jra55do branch to generalize the implementation a bit, daveh150#1. The implementation is now flexible to find JRA55 or JRA55do filenames of various formats and with fewer redundant subroutines. That PR also updates this PR to a more recent version of Consortium main. But there is still going to be a conflicting file based on the PR merge this morning. I can help fix that if needed. Or we can cancel this PR and open a new one as needed and I can help with that too if it's better. I suggest we get the changes in for jra55do then have a separate PR to change the default to jra55do.

@daveh150
Copy link
Contributor Author

@apcraig if your branch implements jra55do that does not have a merge conflict it sounds like it will be easier to just work from your repo for a PR, and we can delete this one. That seems like the cleanest route unless I'm missing something. Thank you Tony!

@apcraig
Copy link
Contributor

apcraig commented Jul 12, 2023

OK, will setup a clean PR and then close this one. Testing that now.

Several other issues. I think it's important to keep the inputdata usage as simple as possible and backwards compatible. To me, that means living with the current forcing filenames. They are not totally self consistent, but they are all unique and that's the main thing. And the new implementation will be fully compatible with all the filenames as they exist today. So, we should not worry about new file names for current files, creating links to rename filenames, or anything else like that. It just confuses usage. To that end,

  • I have updated the directory structure information on this page, https://github.com/CICE-Consortium/CICE/wiki/CICE-Input-Data to add the JRA55do files, that hadn't been done yet. I also reordered the information slightly so it's consistent between grids.
  • I have removed the JRA55 gx1 version 2 zenodo link from that same page. These files are the same files as the original versions, just different names. The new implementation will deal with the various flavors of filenames automatically. Unfortunately, zenodo still shows there are two versions, https://zenodo.org/record/8118239. Both should work for users and both are the same data (so that's slightly confusing). Could we delete the zenodo version 2 page?
  • I have removed some of the links that were added on cheyenne at /glade/p/cesm/pcwg_dev/CICE_data/forcing. OK, sort of makes the files look a little nicer, but this is not how the files are provided to the community. We should be duplicating what we provide to the community on cheyenne to make sure it works. @dabail10, I don't have permissions to remove the JRA55_tx1 links, can you do that? Also maybe check permissions in the pcwg_dev/CICE_data and Icepack_data directories on cheyenne. Group is "ncar" and some of it is group write-able. But there is a mix of permissions, just think about what it should be and maybe check that it's OK.

@apcraig apcraig mentioned this pull request Jul 13, 2023
18 tasks
@apcraig
Copy link
Contributor

apcraig commented Jul 13, 2023

Closing, see #843

@apcraig apcraig closed this Jul 13, 2023
@eclare108213
Copy link
Contributor

@daveh150 thanks for all your work getting this started, and for comparing the old and new JRA55 data sets (@dabail10 too!). It would be good to include links to those results in #843. I'll work on that bit -

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants