Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Netcdf chunking and compression #115

Merged
merged 2 commits into from
Jan 31, 2024

Conversation

anton-seaice
Copy link

Hi Tony

Let me know what you think

I didn't update the text in the docs, or the tests

Cheers

subname//' ERROR: deflating coord '//coord%short_name,file=__FILE__,line=__LINE__)
endif

if (history_format=='hdf5' .and. dimids(1)==imtid .and. dimids(2)==jmtid) then
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to revisit this for 1-d coords

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chunksize doesn't change the file, right? It's just about "how" to write the date, not what. chunking may not be important for 1d arrays for current implementation of CICE with 2d grids. All CICE 1d arrays are basically axes which are relatively small overall. Should we just ignore 1d chunking for now?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - the advice seems to be don't chunk the 1d vars. Most coords use i as the first var in the array and j as the second, but there are a couple of exceptions (which we can basically ignore / not chunk).

Copy link
Owner

@apcraig apcraig left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @anton-seaice, Lots of good clean up here. Had several minor comments in code. I guess my biggest question is why is deflate reset to 0 or 1 in places, why aren't we using the 0-9 thing. That integer plays a role in the amount of compression.

cicecore/cicedyn/analysis/ice_history_shared.F90 Outdated Show resolved Hide resolved
cicecore/cicedyn/general/ice_init.F90 Outdated Show resolved Hide resolved
cicecore/cicedyn/general/ice_init.F90 Outdated Show resolved Hide resolved
cicecore/cicedyn/general/ice_init.F90 Outdated Show resolved Hide resolved
cicecore/cicedyn/general/ice_init.F90 Outdated Show resolved Hide resolved
!-----------------------------------------------------------------
! 3D (fsd)
!-----------------------------------------------------------------
dimid3(1) = imtid
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation here looks off. Also, kind of liked the old "comment blocks". Also would consider keeping definition of dimidz for each group as noted above.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I thought they were kind of confusing - because they were all under the comment "! define attributes for time-variant variables" but then used the horizontal lines in the dame way that comment does.

subname//' ERROR: deflating coord '//coord%short_name,file=__FILE__,line=__LINE__)
endif

if (history_format=='hdf5' .and. dimids(1)==imtid .and. dimids(2)==jmtid) then
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chunksize doesn't change the file, right? It's just about "how" to write the date, not what. chunking may not be important for 1d arrays for current implementation of CICE with 2d grids. All CICE 1d arrays are basically axes which are relatively small overall. Should we just ignore 1d chunking for now?

cicecore/cicedyn/infrastructure/io/io_pio2/ice_restart.F90 Outdated Show resolved Hide resolved
@apcraig
Copy link
Owner

apcraig commented Jan 31, 2024

Just to coordinate a bit. My plan is to let you revise this a bit then I'll test and merge it onto my branch. Then I'll update the testing, documentation, and address your other comments in my PR. I'll test some more and then we can discuss if there are an other final issues we need to address. Thanks @anton-seaice, this is great.

@anton-seaice
Copy link
Author

Just to coordinate a bit. My plan is to let you revise this a bit then I'll test and merge it onto my branch. Then I'll update the testing, documentation, and address your other comments in my PR. I'll test some more and then we can discuss if there are an other final issues we need to address. Thanks @anton-seaice, this is great.

Great thankyou. I think I made all the changes - I am just retesting now :)

@apcraig
Copy link
Owner

apcraig commented Jan 31, 2024

Looks good, I'll run a quick set of tests, then merge.

@apcraig
Copy link
Owner

apcraig commented Jan 31, 2024

OK, I'm merging now. Found a couple minor things as part of initial testing (intent different in pio1 vs pio2, know you're not testing pio1) and you didn't add the new namelist to the ice_in file. I will fix those as part of my next update to the branch, but this is all looking OK. @anton-seaice, if you find any issues during your testing, let me know.

@apcraig apcraig merged commit e2297a3 into apcraig:ioformat Jan 31, 2024
1 check passed
@apcraig
Copy link
Owner

apcraig commented Feb 1, 2024

Just FYI. Did more testing, the deflate "flag" in pio restart was set to 0. So restarts weren't compressing. I'm fixing that, will push the changes to my branch in the next day. In case you noticed the same, didn't want you chasing it down too,

--- a/cicecore/cicedyn/infrastructure/io/io_pio2/ice_restart.F90
+++ b/cicecore/cicedyn/infrastructure/io/io_pio2/ice_restart.F90
@@ -931,7 +931,7 @@ subroutine define_rest_field(File, vname, dims)
 
 #ifndef USE_PIO1            
       if (restart_format=='hdf5' .and. restart_deflate/=0) then 
-         status = pio_def_var_deflate(File, vardesc, shuffle=0, deflate=0,deflate_level=restart_deflate)
+         status = pio_def_var_deflate(File, vardesc, shuffle=0, deflate=1, deflate_level=restart_deflate)
          call ice_pio_check(status, &
             subname//' ERROR: deflating restart field '//trim(vname),file=__FILE__,line=__LINE__)
       endif

@anton-seaice
Copy link
Author

OK, I'm merging now. Found a couple minor things as part of initial testing (intent different in pio1 vs pio2, know you're not testing pio1) and you didn't add the new namelist to the ice_in file. I will fix those as part of my next update to the branch, but this is all looking OK. @anton-seaice, if you find any issues during your testing, let me know.

Is configuration/scripts/ice_ic supposed to have every configuration var in it?

There is also an issue with the chunking still in history files. It works for 2d coords, but not for 3d/4d variables. And i need to check the nvertices case (which is a bit of an edge case).

Good pickup on the deflate :)

@apcraig
Copy link
Owner

apcraig commented Feb 1, 2024

Is configuration/scripts/ice_ic supposed to have every configuration var in it?

ice_in does have (more-or-less) all the namelist in it. Partly, it provides exposure to the user. But more importantly, if they are not there, they cannot be tested via the test suite. The test suite modifies namelist automatically, but can only do it for namelist that are in the default ice_in.

Generally, the default set in the namelist file is the same as the default in the model, just to make namelist documentation clearer. But that's not a formal requirement.

I'm testing my changes now, will merge those to the branch. Then I'll merge #116.

@anton-seaice
Copy link
Author

Great - thanks :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants