Add soil temperature and soil moisture to ERA5 data, STEMMUS_SCOPE recipe #53

Merged on Aug 19, 2024 (22 commits)

Contributor

@BSchilperoort BSchilperoort commented Feb 8, 2024

To add these variables I had to merge the source files and add a "depth" dimension, since the original files are split by soil layer.
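As an illustration of the merge described above, here is a minimal sketch that stacks per-layer arrays along a new "depth" dimension with xarray. It is not the code from this PR: `merge_layers`, `LAYER_DEPTHS_CM`, and the synthetic input arrays are hypothetical, and the depth values are assumed mid-points of the four ERA5-Land soil layers (0-7, 7-28, 28-100, 100-289 cm).

```python
import numpy as np
import pandas as pd
import xarray as xr

# Assumed layer mid-depths in cm, derived from the ERA5-Land layer bounds
# (0-7, 7-28, 28-100, 100-289 cm). The values zampy uses may differ.
LAYER_DEPTHS_CM = {1: 3.5, 2: 17.5, 3: 64.0, 4: 194.5}


def merge_layers(layers: dict[int, xr.DataArray], name: str) -> xr.DataArray:
    """Stack per-layer arrays along a new "depth" dimension (hypothetical helper)."""
    layer_ids = sorted(layers)
    depth = pd.Index([LAYER_DEPTHS_CM[i] for i in layer_ids], name="depth")
    # Passing a named index to concat creates the new dimension and its coordinate.
    merged = xr.concat([layers[i] for i in layer_ids], dim=depth)
    return merged.rename(name)


# Synthetic stand-ins for the four per-layer files.
time = pd.date_range("2019-01-01", periods=3, freq="h")
coords = {"time": time, "latitude": [53.0, 54.0], "longitude": [5.0, 6.0]}
layers = {
    i: xr.DataArray(
        np.full((3, 2, 2), 270.0 + i),
        coords=coords,
        dims=("time", "latitude", "longitude"),
    )
    for i in LAYER_DEPTHS_CM
}

soil_temperature = merge_layers(layers, "soil_temperature")
```

Using a physical depth as the coordinate (rather than a layer number) lets downstream code select or interpolate by depth, at the cost of having to agree on what depth represents each layer.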

Closes #47

Note that the CDS is slow, so running the recipe might take a while. Running it overnight is probably best.

Example recipe:

# config (folder, login info etc goes to a ~/.zampy/config file)
name: "soil temperature test"

download:
  time: ["2019-01-01", "2019-01-31"]
  bbox: [54, 6, 53, 5] # NESW
  datasets:
    era5_land:
      variables:
        - soil_temperature
        - soil_moisture

convert:
  convention: ALMA
  frequency: 1H  # outputs at 1 hour frequency. Pandas-like freq-keyword.
  resolution: 0.25  # output resolution in degrees.
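
The `bbox` entry in the recipe is ordered north, east, south, west, which is easy to get wrong. A small check like the one below could catch swapped values before a slow download starts. This is only a sketch: `validate_bbox` is a hypothetical helper, not part of zampy, and it assumes longitudes in the range -180 to 180 with no box crossing the antimeridian.

```python
def validate_bbox(bbox):
    """Check a [north, east, south, west] bounding box given in degrees.

    Hypothetical helper; assumes longitudes in [-180, 180] and no
    antimeridian crossing.
    """
    if len(bbox) != 4:
        raise ValueError("bbox needs four values: north, east, south, west")
    north, east, south, west = (float(v) for v in bbox)
    if not -90 <= south < north <= 90:
        raise ValueError(f"need -90 <= south < north <= 90, got {south}, {north}")
    if not -180 <= west < east <= 180:
        raise ValueError(f"need -180 <= west < east <= 180, got {west}, {east}")
    return north, east, south, west


# The bbox from the example recipe passes the check.
checked = validate_bbox([54, 6, 53, 5])
```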

@BSchilperoort
Contributor Author

@SarahAlidoost you should be able to run the following recipe now, which will download all input data for STEMMUS_SCOPE for half a year.
To make this recipe work for longer time periods, we'll have to add a feature that fills in NaNs where the requested start/end time falls outside the period a dataset covers.

# config (folder, login info etc goes to a ~/.zampy/config file)
name: "STEMMUS_SCOPE_input"

download:
  time: ["2020-01-01", "2020-06-30"]
  bbox: [60, 10, 50, 0] # NESW
  datasets:
    era5_land:
      variables:
        - air_temperature
        - dewpoint_temperature
        - soil_temperature
        - soil_moisture
    era5:
      variables:
        - total_precipitation
        - surface_thermal_radiation_downwards
        - surface_solar_radiation_downwards
        - surface_pressure
        - eastward_component_of_wind
        - northward_component_of_wind
    eth_canopy_height:
      variables:
        - height_of_vegetation
    fapar_lai:
      variables:
        - leaf_area_index
    land_cover:
      variables:
        - land_cover
    prism_dem_90:
      variables:
        - elevation
    cams:
      variables:
        - co2_concentration

convert:
  convention: ALMA
  frequency: 1H  # outputs at 1 hour frequency. Pandas-like freq-keyword.
  resolution: 0.25  # output resolution in degrees.
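
The NaN-filling feature mentioned before the recipe is not implemented in this PR. As a rough sketch of the idea, using only the standard library and a daily series stored in a dict, the requested range can be walked day by day and any missing day filled with NaN. `pad_with_nan` and the sample data are hypothetical; with xarray data, reindexing on the time coordinate would likely be the more natural route.

```python
import math
from datetime import date, timedelta


def pad_with_nan(series: dict[date, float], start: date, end: date) -> dict[date, float]:
    """Return one value per day in [start, end], NaN where the data has no value."""
    n_days = (end - start).days + 1
    days = (start + timedelta(days=i) for i in range(n_days))
    return {day: series.get(day, math.nan) for day in days}


# Hypothetical dataset that only covers 2 and 3 January 2020.
available = {date(2020, 1, 2): 1.5, date(2020, 1, 3): 2.0}

# The request is wider than the coverage, so the edges become NaN.
padded = pad_with_nan(available, date(2020, 1, 1), date(2020, 1, 4))
```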

@BSchilperoort
Contributor Author

Not sure why the Windows tests are failing. I am able to reproduce it on my machine, but it seems like the netCDF4 library or xarray is not releasing the file lock on the netCDF files. This makes the temp dir clean up fail because it cannot unlink the files.

I did not change that part of the code either, so it probably has something to do with a new version somewhere. Or with Dask because I did make the CI use Dask distributed (to avoid memory issues as the default scheduler is bad).

@SarahAlidoost
Member

> Not sure why the Windows tests are failing. I am able to reproduce it on my machine, but it seems like the netCDF4 library or xarray is not releasing the file lock on the netCDF files. This makes the temp dir clean up fail because it cannot unlink the files.
>
> I did not change that part of the code either, so it probably has something to do with a new version somewhere. Or with Dask because I did make the CI use Dask distributed (to avoid memory issues as the default scheduler is bad).

Looking at the GitHub Actions log, there seem to be different errors on Windows:

E                   PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'D:\\a\\zampy\\zampy\\tests\\test_data\\fapar-lai\\tmp\\tmpvxggwz_9\\c3s_LAI_20190110000000_GLOBE_PROBAV_V3.0.1.nc'
E           NotADirectoryError: [WinError 267] The directory name is invalid: 'D:\\a\\zampy\\zampy\\tests\\test_data\\fapar-lai\\tmp\\tmpvxggwz_9\\c3s_LAI_20190110000000_GLOBE_PROBAV_V3.0.1.nc'
FAILED tests/test_datasets/test_fapar_lai.py::TestFaparLAI::test_ingest - NotADirectoryError: [WinError 267] The directory name is invalid: 'D:\\a\\zampy\\zampy\\tests\\test_data\\fapar-lai\\tmp\\tmpvxggwz_9\\c3s_LAI_20190110000000_GLOBE_PROBAV_V3.0.1.nc'

I think the Dask workers cause these errors. Could you refactor the use of dask.distributed.Client() in test_fapar_lai.py::TestFaparLAI::test_ingest to use the client's submit and result methods, and check whether that fixes the errors?
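
For reference, the submit/result pattern suggested here could look roughly like the sketch below. It is not the actual test: `write_and_read` is a hypothetical stand-in for the ingest step, and the in-process client settings are assumptions chosen to keep the example small. The idea is that the task closes its own file handle and the client is shut down before the temporary directory is removed, so nothing should still hold the file open at cleanup time.

```python
import tempfile
from pathlib import Path

from dask.distributed import Client


def write_and_read(path: str) -> int:
    """Hypothetical stand-in for the ingest step: write a file, read it back."""
    file = Path(path)
    file.write_text("42")
    with file.open() as handle:  # handle is closed when the block exits
        return int(handle.read())


def run_on_client(tmp_dir: str) -> int:
    # In-process client with one worker thread; the context manager closes the
    # client (and its workers) before this function returns.
    with Client(
        processes=False,
        n_workers=1,
        threads_per_worker=1,
        dashboard_address=None,
    ) as client:
        future = client.submit(write_and_read, str(Path(tmp_dir) / "value.txt"))
        return future.result()  # blocks until the task has finished


with tempfile.TemporaryDirectory() as tmp_dir:
    value = run_on_client(tmp_dir)
```

Because `future.result()` blocks until the task is done and the client is closed inside `run_on_client`, the temporary directory is only cleaned up after the worker has finished with the file.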

Member

@SarahAlidoost SarahAlidoost left a comment

@BSchilperoort thanks 👍 I am running the STEMMUS_SCOPE_input recipe. Downloading from CDS is not very fast, but the changes in this PR look good. Can you please add the STEMMUS_SCOPE_input recipe as a YAML file in a subfolder (e.g. recipes) in the repository root, and also add it to the documentation? Also, see my comment above about dask.distributed.Client() in the test.

@BSchilperoort
Contributor Author

BSchilperoort commented Jul 26, 2024

FAPAR dataset test fixed; now a different test fails with a segfault due to rasterio...

The cause must be a dependency that has changed: my old environment still passes all tests.

@BSchilperoort BSchilperoort changed the title from "Add soil temperature and soil moisture to ERA5 data" to "Add soil temperature and soil moisture to ERA5 data, STEMMUS_SCOPE recipe" on Jul 29, 2024

sonarcloud bot commented Jul 29, 2024

Quality Gate failed

Failed conditions
55.8% Coverage on New Code (required ≥ 80%)

See analysis details on SonarCloud

Member

@SarahAlidoost SarahAlidoost left a comment

@BSchilperoort well done, thanks.

@SarahAlidoost SarahAlidoost merged commit 67d2ea9 into main Aug 19, 2024
9 of 10 checks passed
Successfully merging this pull request may close these issues.

Missing Registration for Soil Moisture Data in 'src/zampy/reference/variables.py'
2 participants