allow overwrite of model output #272

Merged
merged 1 commit into from
Mar 8, 2024

rtodling
Collaborator

@rtodling rtodling commented Mar 8, 2024

Yesterday I ran into an issue w/ MAPL that I find quite unpleasant in many respects. I stumbled on this in the past and had an exchange w/ the SI group; they reset the default, but apparently more recent versions of MAPL have gone back to the undesirable default.

This is related to a knob placed in MAPL to prevent CFIO from overwriting any nc4 output. I stumbled on this yesterday while doing some tests w/ the model: I was running on an interactive queue and re-running the model over a segment that had already been run, and the GCM crashed on me saying that it could not overwrite its output.

In general this is an annoyance for debugging such as I was doing, but it hit me this morning that this is more than just an annoyance. It would make restarting the ensemble, in case of sporadic random crashes, really painful; essentially, it breaks the auto-restart of the ensembles.

I checked the M21C source code, and from what I can tell it too has the inconvenient default.

I checked w/ the SI group; they agree w/ me that this is inconvenient, and told me there is a global flag that can be put in the history that tells CFIO not to bother with this check.

I am putting this here - you can decide if you want it.

BTW: I was looking at the @GEOSgcm_App as I would expect to find an M21C or R21C history there ... but I don't see one! Did we miss something in the commit?

@rtodling rtodling added the 0 diff The changes in this pull request have verified to be zero-diff with the target branch. label Mar 8, 2024
@rtodling rtodling requested a review from a team as a code owner March 8, 2024 18:12
@elakkraoui

@rtodling, the M21C history is here:
https://github.com/GEOS-ESM/GEOSgcm_App/blob/R21C/HISTORY_R21C.rc.tmpl

As for the proposed change, yes of course, we need to be able to restart the ensemble. I haven't had any issue restarting the ensemble in my tests, but that might be just luck.

@rtodling
Collaborator Author

rtodling commented Mar 8, 2024

Typically when the ensemble crashes it is because a node has gone sour. Most of the time this happens even before the model writes out anything. In such cases we'd not have a problem restarting the ensemble.

It's easy to test my hypothesis: simply run the ensemble, kill one of the members after some output has been written, then launch the workflow again ...
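
For what it's worth, the same test can be mimicked outside the model. Below is a minimal Python sketch, not MAPL or CFIO code: `write_output`, `run_segment`, `restart_succeeds`, and the `allow_overwrite` argument are hypothetical names made up for illustration, and plain text files stand in for nc4 output. It only shows why a no-overwrite default makes a rerun of a partially completed segment fail, and why allowing overwrite lets it go through.

```python
import os
import tempfile


def write_output(path, data, allow_overwrite=False):
    """Write one output file; refuse to clobber unless allow_overwrite is set."""
    # Mode "x" raises FileExistsError if the file already exists;
    # mode "w" truncates and rewrites it.
    mode = "w" if allow_overwrite else "x"
    with open(path, mode) as f:
        f.write(data)


def run_segment(outdir, n_files, allow_overwrite=False, crash_after=None):
    """Hypothetical model segment: writes n_files outputs, optionally 'crashing' midway."""
    for i in range(n_files):
        if crash_after is not None and i == crash_after:
            raise RuntimeError("simulated node failure")
        path = os.path.join(outdir, f"out_{i}.txt")
        write_output(path, f"step {i}\n", allow_overwrite)


def restart_succeeds(allow_overwrite):
    """Crash a segment after partial output, then rerun the same segment from the start."""
    with tempfile.TemporaryDirectory() as outdir:
        try:
            run_segment(outdir, 3, allow_overwrite, crash_after=2)
        except RuntimeError:
            pass  # first attempt died after writing out_0 and out_1
        try:
            run_segment(outdir, 3, allow_overwrite)
        except FileExistsError:
            return False  # rerun refused to overwrite the earlier output
        return True


if __name__ == "__main__":
    print("restart w/ overwrite allowed:", restart_succeeds(True))
    print("restart w/ overwrite refused:", restart_succeeds(False))
```

In this toy setup, a crash that happens before anything is written (e.g. `crash_after=0`) restarts fine either way, which matches the point above about most node failures happening before the model writes output.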

@rtodling
Collaborator Author

rtodling commented Mar 8, 2024

> @rtodling, the M21C history is here: https://github.com/GEOS-ESM/GEOSgcm_App/blob/R21C/HISTORY_R21C.rc.tmpl

The above is nowhere to be seen in what you and Scott placed in the repo ... or if it is somewhere it is not in the tag I released!

> As for the proposed change, yes of course, we need to be able to restart the ensemble. I haven't had any issue restarting the ensemble in my tests, but that might be just luck.

@rtodling rtodling merged commit 5bdfbad into R21C Mar 8, 2024
6 checks passed
@elakkraoui

> > @rtodling, the M21C history is here: https://github.com/GEOS-ESM/GEOSgcm_App/blob/R21C/HISTORY_R21C.rc.tmpl
>
> The above is nowhere to be seen in what you and Scott placed in the repo ... or if it is somewhere it is not in the tag I released!

I checked out the tag you released yesterday; the HISTORY_R21C is there.

If there's anything to clear up about the tagging process itself, let's wait until Scott gets back. So far, the R21C and your tag are both identical to my local build. Nothing is missing from what I can tell.
