area across 0/360 degree longitude #23
Comments
Hi @QianqianHan96, my reasoning behind keeping the ERA5 longitude in the [0;360] range was that ERA5 was by far the largest dataset, so I thought it made more sense to adapt the other datasets to it. When I selected the [0;125] degree range in the inference notebook, I just wanted to test whether I could handle that spatial extent; I did not really care about the specific region selected.
Concerning the problems you are observing at points 1., 2. and 4. above, I think they are all related to the same issue: the longitude coordinates of your final DataArray are not continuous (they have a "gap"). When you bring a dataset that extends only over Europe and is in the [-180;180] range to the [0;360] range, you end up with a longitude coordinate array that looks something like:
A possible solution could be to fill the gap in the x coordinates with NaNs; you should be able to do it with reindex or reindex_like. Both should lead to a DataArray with a "continuous" longitude coordinate (all steps from 0 to 360) and lots of NaNs for the values for which you had no coordinates.
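To illustrate the gap and the reindex fix described above, here is a minimal sketch. The 1-degree grid, the Europe bounds (-30 to 67 in longitude, 30 to 71 in latitude) and the variable names are assumptions made for the example, not values taken from the notebooks.

```python
import numpy as np
import xarray as xr

# Hypothetical regional dataset: Europe on a 1-degree grid, longitude in [-180;180]
lon = np.arange(-30, 68)   # -30 .. 67
lat = np.arange(30, 72)    # 30 .. 71
europe = xr.DataArray(
    np.random.random((lat.size, lon.size)),
    coords={"lat": lat, "lon": lon},
    dims=("lat", "lon"),
)

# Bring longitude to the [0;360) convention and sort it:
# the coordinate now runs 0..67 and then jumps to 330..359
europe_360 = europe.assign_coords(lon=europe.lon % 360).sortby("lon")
steps = np.diff(europe_360.lon.values)
print("largest longitude step:", steps.max())

# Fill the gap: reindex onto a full 0..359 axis, missing columns become NaN
full_lon = np.arange(360)
filled = europe_360.reindex(lon=full_lon)
print(filled.sizes, int(filled.isnull().sum()))
```

The reindexed array has a regular longitude axis, at the cost of storing NaNs for every column outside Europe.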
Hi Francesco, thanks a lot for your detailed explanation, it helps a lot. I understand now: when I convert the [-180;180] range to the [0;360] range for a map that is not global, I need to make sure the converted result has a continuous longitude. If the map is global, it is of course continuous.
Hi Francesco, I was just wondering whether it is possible to select a non-continuous longitude range in one go, e.g. [0,67] and [330,360]? I also mentioned this problem to Sarah during today's group meeting.
Hi @QianqianHan96, I am not sure I understand what you mean. Do you mean whether it is possible to select the longitude ranges [0;67] and [330;360] without dropping the coordinates in between? Does the following do what you want?
import dask.array as da
import numpy as np
import xarray as xr

# data array with longitude in the [0;360) range (1-degree steps, 0..359)
tmp = xr.DataArray(
    data=da.random.random((180, 360), chunks=(45, 45)),
    coords={
        'lon': np.arange(360),
        'lat': np.arange(-90, 90),
    },
    dims=('lat', 'lon'),
)

# keep longitude ranges [0;67] and [330;360] (bounds included), the rest is set to NaN
tmp_crop = tmp.where((tmp.lon <= 67) | (tmp.lon >= 330))

# plot
tmp_crop.plot.imshow()
Hi Francesco, this is what I meant. But I just checked: this way, the pixels that are not selected become NaN, and .nbytes/2**30 is the same as for the original global image. So is the computation then equivalent to running on global data instead of only the Europe area? If I use .sel, I get a non-continuous longitude with a gap between 67 and 330, and then I see the weird plot again (cell 127 in https://github.com/EcoExtreML/Emulator/blob/main/0preprocessing/4-SSM360.ipynb). But this way, the data size is reduced. The data size is shown below (small test on CRIB); one timestep of an era5land image is tested.
Correct, it's quite wasteful. You could check whether the results make sense when you run the resampling of SSM using the "non-continuous" coordinates (thus leaving a gap between 67 and 330). If the results make sense, you can then fill in the gap in the longitude values using
Thanks for your answer. Another thing I have to be careful about is the size of the output data, because of the storage on Snellius and because I have to publish the data later. So in this case, filling the gap when I save the data would make the output data bigger.
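To make the size trade-off discussed above concrete, the sketch below compares masking with `where` against `where(..., drop=True)`, which removes the longitude labels outside the mask. The global 1-degree grid and the bounds 67 and 330 are illustrative assumptions, and plain NumPy data is used instead of dask to keep the example small.

```python
import numpy as np
import xarray as xr

# Hypothetical global field on a 1-degree grid, longitude in [0;360)
tmp = xr.DataArray(
    np.random.random((180, 360)),
    coords={"lat": np.arange(-90, 90), "lon": np.arange(360)},
    dims=("lat", "lon"),
)

# True for the two longitude ranges of interest
mask = (tmp.lon <= 67) | (tmp.lon >= 330)

# Same shape as the input, NaN outside the mask: no memory is saved
masked = tmp.where(mask)

# Longitude labels outside the mask are dropped: smaller, but the axis has a gap
dropped = tmp.where(mask, drop=True)

print("masked :", masked.shape, masked.nbytes)
print("dropped:", dropped.shape, dropped.nbytes)
```

The dropped version is the compact one to store, and the gap can be filled again with `reindex` only at the step that actually needs a regular axis.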
In branch |
Hi Sarah, thanks for your help. Indeed, if we convert [0,360] to [-180,180], then Europe will have a continuous longitude. Is there another way to select the whole of Europe in the [0,360] range? I am curious how the ERA5-Land people do this. In any case, only Europe and Africa cross the 0° longitude line, so it is not a big problem.
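The conversion from [0,360] to [-180,180] mentioned above can be sketched as follows. The global 1-degree grid and the Europe bounds are assumptions for the example; the real ERA5-Land files have a finer grid and different variable names.

```python
import numpy as np
import xarray as xr

# Hypothetical global field, longitude in [0;360)
glob = xr.DataArray(
    np.random.random((180, 360)),
    coords={"lat": np.arange(-90, 90), "lon": np.arange(360)},
    dims=("lat", "lon"),
)

# Map 180..359 to -180..-1, leave 0..179 unchanged, then sort the axis
lon_180 = (glob.lon + 180) % 360 - 180
glob_180 = glob.assign_coords(lon=lon_180).sortby("lon")

# Europe is now one contiguous slice (label-based slices include both ends)
europe = glob_180.sel(lon=slice(-30, 67), lat=slice(30, 71))
print(europe.lon.values[[0, -1]], europe.sizes)
```

Sorting after reassigning the coordinate is what makes the slice-based selection valid, since `sel` with a slice expects a monotonic index.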
Please note that the implementation in the notebook 1-ERA5-land.ipynb only saves one year of era5land data. However, see the notebook Cut_xarray_dataset.ipynb: I updated it to use dask, but on my local computer and with only 5 cores. I used two files of era5_land data in this example. The notebook converts the longitude values, cuts the data and saves the data in Two comments about
Thanks for your help, Sarah. Actually https://github.com/EcoExtreML/Emulator/blob/main/0preprocessing/1-ERA5-land.ipynb used 64 cores, but it saves one year of era5land data with 8 variables, which means 96 data files. It also saves the global area (around 15 times larger than Europe). We want to run for the global area in the end. But I am not sure whether it is better to rerun the preprocessing script for all 10 years of input data, because I kept [0,360] for all the input data (ERA5Land, LAI, SSM, CO2, landcover, canopyheight, vcmax) during preprocessing, or to just split Europe into two parts during inference.
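The second option, splitting Europe into two parts during inference while keeping the [0,360] convention, could look roughly like the sketch below. The function `run_model` is a hypothetical placeholder for the real inference step, and the grid and bounds are again illustrative assumptions.

```python
import numpy as np
import xarray as xr

# Hypothetical global input, longitude in [0;360)
glob = xr.DataArray(
    np.random.random((180, 360)),
    coords={"lat": np.arange(-90, 90), "lon": np.arange(360)},
    dims=("lat", "lon"),
)


def run_model(block):
    """Hypothetical stand-in for the inference step; it only doubles the values."""
    return block * 2.0


# Two contiguous blocks on either side of the 0/360 line
west = glob.sel(lon=slice(330, 360))   # 330 .. 359
east = glob.sel(lon=slice(0, 67))      # 0 .. 67
res_west, res_east = run_model(west), run_model(east)

# Optional: stitch the results into one array with a continuous axis
# by shifting the western block to negative longitudes
res_west = res_west.assign_coords(lon=res_west.lon - 360)
europe = xr.concat([res_west, res_east], dim="lon")
print(europe.lon.values[[0, -1]], europe.sizes)
```

Each block has a continuous longitude axis on its own, so no preprocessing has to be rerun; only the stitched output switches to negative longitudes for the western part.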
Hi Francesco, @fnattino
I saw that you keep [0,360] as the longitude standard during preprocessing and inference. I tried to do the same for the other variables (SSM, CO2, IGBP, hc, Vcmax) when I preprocess them. The only problematic one is SSM, because it is not global, it only covers Europe. I tried two ways to resample it.
The resampled SSM from the two ways displays differently. But when I export the results to GeoTIFF and open them in ArcGIS, they are both wrong (they look like the third figure in the figures below). Do you have any experience with this?
Then I tried a third way, the same as the first one but without clipping era5land.
Another reason I am worried about this: when we do inference on a bigger scale and need to select an area, should we avoid the 0/360 longitude line? I also saw that you selected [0,125] degrees in the inference notebook.