Make syncing of restart directories to /g/data standard #275

Open
rmholmes opened this issue Jul 14, 2023 · 5 comments

@rmholmes
Collaborator

Given the new scratch file expiry limits, I suggest that the sync_data.sh script be altered to automatically sync restarts, as well as outputs, to /g/data to avoid losing them all if you forget to do it by hand.

@aekiss
Contributor

aekiss commented Jul 14, 2023

Agreed - this has been on my back burner for a while. Would need to specify two sync paths so we can store restarts somewhere different from the outputs that the cookbook syncs.

@aekiss
Contributor

aekiss commented Jul 22, 2023

This would also require a rethink of how the payu driver, collation and sync script interoperate.

  • At present the sync script is run in parallel with model runs and collation, so it only syncs collated .nc files.
  • When collate: restart: true is set in config.yaml, payu doesn't automatically collate the most recent restart, only the one prior to that (this is so the model can run immediately without waiting for restarts to collate, though that wait is less likely to be a problem now that we're using mppnccombine-fast).
  • Therefore the final restart of an experiment doesn't get automatically collated, so it would never be auto-synced, despite being the most important one to save.
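
To make the gap described in the points above concrete, here is a minimal Python sketch. It assumes restart directories are named restart000, restart001, ... (matching the archive/restart<num> form used in this thread) and that only the most recent one is skipped by automatic collation. The function name split_restarts is hypothetical and is not part of payu.

```python
import re

RESTART = re.compile(r"^restart(\d+)$")  # assumed directory naming


def split_restarts(names):
    """Split directory names into (auto_collated, left_uncollated).

    Only names of the form restartNNN are considered. The highest-numbered
    restart is returned separately, since (per the discussion) it is the
    one that automatic collation skips.
    """
    numbered = []
    for name in names:
        match = RESTART.match(name)
        if match:
            numbered.append((int(match.group(1)), name))
    ordered = [name for _, name in sorted(numbered)]
    if not ordered:
        return [], None
    return ordered[:-1], ordered[-1]


collated, final = split_restarts(
    ["restart001", "output000", "restart000", "restart002"]
)
print("auto-collated:", collated)
print("needs manual collation before syncing:", final)
```

Any automatic restart sync would need to handle that final directory explicitly, for example by collating it before the last sync job runs.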

In practice I've manually collated restarts at the conclusion of an experiment (using payu collate -d archive/restart<num>) and then submitted the sync_data.sh sync script (with SYNCDIR set to the restart destination).

It is not uncommon to have collation failures in restarts and outputs, so at the end of an experiment it's a good idea to collate any uncollated files and re-sync.
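
One way to check for leftovers at the end of an experiment is sketched below in Python. It rests on an assumption not stated in this thread: that uncollated tiles are recognisable by a four-digit suffix after .nc (for example ocean.nc.0003). The helper name dirs_needing_collation is hypothetical, and the demo builds a throwaway directory tree rather than touching a real archive.

```python
import re
import tempfile
from pathlib import Path

# Assumed naming of uncollated tiles, e.g. "ocean.nc.0003".
TILE = re.compile(r"\.nc\.\d{4}$")


def dirs_needing_collation(archive):
    """Return sorted top-level subdirectories of `archive` that still
    contain files matching the assumed uncollated-tile pattern."""
    archive = Path(archive)
    found = set()
    for path in archive.rglob("*"):
        if path.is_file() and TILE.search(path.name):
            parts = path.relative_to(archive).parts
            if len(parts) > 1:  # ignore stray files directly in archive/
                found.add(parts[0])
    return sorted(found)


# Demo on a temporary tree standing in for archive/.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for rel in [
        "output000/ocean/ocean.nc",                      # already collated
        "output001/ocean/ocean.nc.0000",                 # uncollated tile
        "restart002/ocean/ocean_temp_salt.res.nc.0001",  # uncollated tile
    ]:
        target = root / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    demo = dirs_needing_collation(root)

print(demo)
```

Restart directories reported this way could then be collated with payu collate -d archive/restart<num> as described above, followed by a re-sync.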

@aidanheerdegen
Contributor

I've been meaning to make a payu issue for this, because it should be included in payu.

I think it would also be easier to do it this way, with access to all the important run information.

@rmholmes
Collaborator Author

> Would need to specify two sync paths so we can store restarts somewhere different from the outputs that the cookbook syncs.

I feel that it would be much easier if outputs and restarts were in the same folder, as they are already in the job directory archive. I've been syncing my restarts for the ERA-5 runs across to the same directory in ik11. To avoid this causing problems with the cookbook - can't we just tell the cookbook not to look at anything within a folder starting with restart*/?
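
A rough Python sketch of that kind of exclusion is below. It is not the cookbook's actual indexing code or API: the function skip_for_indexing and the example paths are hypothetical, and it assumes the indexer sees paths relative to the experiment directory.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def skip_for_indexing(relative_path, pattern="restart*"):
    """Return True if any directory component of the path matches `pattern`.

    The file name itself is not tested, so an output file whose name
    happens to start with "restart" is still indexed.
    """
    directories = PurePosixPath(relative_path).parts[:-1]
    return any(fnmatchcase(part, pattern) for part in directories)


paths = [
    "output000/ocean/ocean.nc",
    "restart000/ocean/ocean_temp_salt.res.nc",
]
to_index = [p for p in paths if not skip_for_indexing(p)]
print(to_index)
```

With a filter like this, outputs and restarts could share one destination while the indexer only ever looks inside the output directories.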

@aidanheerdegen
Contributor

One advantage of there being so many open issues on the payu repo is that there is often already an issue covering your use case, and voilà!

payu-org/payu#200
