Recipe yml structure #3
Comments
@BSchilperoort Thanks, it looks good, and it has the minimum required information. I think the sections can be reorganized. For example, I suggest using the same structure as springtime recipes for the 'datasets' section. Let's have a separate section for configurations:
run_directory: /home/bart/Data/lsmdata/test
download: True  # /home/bart/Data/lsmdata/test/download_dir will be created
documentation:
  description:
    Example recipe that downloads two variables from era5_land data and converts
    them to ALMA format.
datasets:
  test:
    dataset: era5-land
    frequency: hourly
    years: [1980, 2020]
    area:
      name: test
      bbox: [3, 50, 6, 54]
    variables:
      - air_temperature:  # will map to 2m_temperature...
          height_m: 2  # optional extra argument
      - dewpoint_temperature:
          height_m: 2
converter:  # /home/bart/Data/lsmdata/test/processed will be created
  convention: ALMA
  flavor: PLUMBER2  # More specified than ALMA.
  frequency: 1H  # outputs at 1 hour frequency. Pandas-like freq-keyword.
  resolution: 0.01  # output resolution in degrees.
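To make the directory comments in the recipe above concrete, here is a minimal Python sketch of how a run could derive its output directories from `run_directory`. The function name `output_directories` and the in-memory `recipe` dict are assumptions for illustration only; nothing here is an existing API of this project.

```python
from pathlib import PurePosixPath

# Hypothetical in-memory form of the recipe, as if already parsed from YAML.
recipe = {
    "run_directory": "/home/bart/Data/lsmdata/test",
    "download": True,
    "converter": {
        "convention": "ALMA",
        "flavor": "PLUMBER2",
        "frequency": "1H",
        "resolution": 0.01,
    },
}


def output_directories(recipe: dict) -> dict:
    """Return the directories a run would create under run_directory."""
    run_dir = PurePosixPath(recipe["run_directory"])
    dirs = {}
    # A download directory is only needed when downloading is switched on.
    if recipe.get("download", False):
        dirs["download"] = run_dir / "download_dir"
    # A processed directory is only needed when a converter is configured.
    if "converter" in recipe:
        dirs["processed"] = run_dir / "processed"
    return dirs


directories = output_directories(recipe)
for name, path in directories.items():
    print(f"{name}: {path}")
```

Keeping the directory names derived from a single `run_directory` means the recipe only has to state one path.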
Thanks for the ideas. I do like the documentation part. I find having "datasets" and "dataset" a bit confusing. How about calling the first one a collection? Additionally, as the goal is to prepare input data for land surface models, the area and years will be the same for most datasets. So by default the collections:
stemmus_scope_NL:
  years: [1980, 2020]
  area: [3, 50, 8, 54]

  dataset: era5-land
  frequency: hourly
  variables:
    - air_temperature

  dataset: CAMS
  years: [2004, 2020]  # overrides 'years' from collection level
  variables:
    - co2

  dataset: dummy_data
  years: [1980, 2003]  # No data available for these years.
  variables:
    co2:
      unit: ppm
      value: 350
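The override idea in the collection above (collection-level `years` and `area` as defaults, with per-dataset values taking precedence) could be resolved roughly as in the Python sketch below. The `sources` list key, the `INHERITED_KEYS` tuple and the `resolve_sources` function are hypothetical names chosen for this example; the proposal itself does not fix how the datasets are listed.

```python
# Hypothetical in-memory form of the "stemmus_scope_NL" collection.
# The "sources" key is an assumption; it only makes the entries a valid list.
collection = {
    "years": [1980, 2020],
    "area": [3, 50, 8, 54],
    "sources": [
        {
            "dataset": "era5-land",
            "frequency": "hourly",
            "variables": ["air_temperature"],
        },
        {
            "dataset": "CAMS",
            "years": [2004, 2020],  # overrides the collection-level years
            "variables": ["co2"],
        },
        {
            "dataset": "dummy_data",
            "years": [1980, 2003],  # fills the years without real data
            "variables": {"co2": {"unit": "ppm", "value": 350}},
        },
    ],
}

# Keys that a source inherits from the collection unless it sets its own.
INHERITED_KEYS = ("years", "area")


def resolve_sources(collection: dict) -> list:
    """Merge collection-level defaults into every source entry."""
    defaults = {k: collection[k] for k in INHERITED_KEYS if k in collection}
    # Later keys win in a dict merge, so source values override the defaults.
    return [{**defaults, **source} for source in collection["sources"]]


resolved = resolve_sources(collection)
for source in resolved:
    print(source["dataset"], source["years"], source["area"])
```

With this approach era5-land inherits both `years` and `area`, while CAMS and dummy_data keep their own `years` and only inherit `area`.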
Or maybe more specific, "datasets" and "source"?
Well, they are also source datasets. As we're making a superset of those, the term "collection" feels most apt to me. Or only "collections" and "sources" to avoid the word altogether. But it would probably be best to avoid calling the result a new dataset, as the result should not be shared. Redistribution will probably violate some of the license agreements etc. We should be careful with licenses and properly attributing the sources, see also #4
What about
I have added an example recipe structure to the repository:
Any thoughts, @SarahAlidoost, @geek-yang ?