-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change units caravan forcing, added support for data_sources, changed version #433
base: main
Are you sure you want to change the base?
Conversation
RolfHut
commented
Jun 18, 2024
- changed the units to be consistent with lumpedmakking
- added support for the new 'data_source' option. Reverts to ERA5_land (Standard for caravan) if not specified.
- changed hardcoded version in URL to global and bumped to v2
I'm running into some naming problems: ERA5 variables are re-named to standards, but those variables already exist in the other datasets. |
Would you not first apply the selection of which dataset to use, therefore avoid any variable duplicates? |
Tried that: doesn't work. Empty parameter remains part of the xarray
object. I'm working on it.
|
Not really related to units, but would be nice to squeeze in here. OSError: [Errno -90] NetCDF: file not found: 'https://opendap.4tu.nl/thredds/dodsC/data2/djht/ca13056c-c347-4a27-b320-930c2a4dd207/1/01439500.nc' maybe catch this first, then actually try retrieve the dataset |
I thought I checked that, but in that case I made some mistakes with merging. Looks like it was appended rather than merged... The code I used is here, could you take a look if you spot an error Bart? I still have the (unmerged) files locally so if it's a quick fix I can run it again. |
Oh, I think I see the issue. Dimensions: (basin_id: 671, time: 12784, data_source: 3)
Coordinates:
* basin_id (basin_id) |S15 b'camels_01013500' ... b'camels_144...
* time (time) datetime64[ns] 1980-01-01 ... 2014-12-31
* data_source (data_source) |S64 b'nldas' b'maurer' b'daymet' When using that to combine this with the exisiting caravan dataset, it doesn't merge on the other dimensions (as they are dissimilar?). I should've checked this before uploading, sorry. Dimensions: (time: 14975, basin_id: 1153,
data_source: 4) I probably need to use Can give that a try sometime next week at the earliest. |
Yeah if the dimension size is not the same it's more challenging to merge the data. Perhaps the first step would be to make the basin count + labels the same for the new data, before merging the new variables into the original data. |
Hmm, I'm hesitant to change the original data: we are trying to give access to the original datasets. (not helping, just giving the broader context) |
I found that the first entry gives era5 and the last the other three (as it appended). We can use it this to Neater would be to update the dataset. @RolfHut @BSchilperoort what are your views on this? Im happy to do either whilst I still have a TUD account to update the dataset easily (not sure what happens after). |
I prefer the option with the different files based on which dataset you request. This does mean a new upload indeed. I have just fixed the naming issue: generating now works, but you do still have the problem with the double datasets. This also makes 'to_xarray()' fail. If you can give a headsup when the source data is fixed, I can see if 'to_xarray' works again and move on with the climate change analyses. |
@RolfHut I was looking into this again. Modifying the original datasets was too much work/data for me at the moment (encountered some issues with Caravan too). Writing code to work around it is a lot easier now. 482/1153 camels (USA) basins are missing the daymet data (not sure why: it's just missing 🤷♂️ ). Otherwise massaging the data into a useful format goes OK now, opening a PR soon. |
So ive managed to find some time to dig into the issue. There is 671 in the original camels USA dataset. This is used for my thesis work. In version 1 of the opendap, I just used the camels as present in the caravan datset, which is a subset of the orignal data set at 482 basins. This difference was unkown to me at the time, the forcing of this is all era5. I then attempted to add the original camels dataset which has 671 basin and 3 forms of forcing. 671 + 482 = 1153: I appended rather than merged the datasets ... I'll look into fixing this. As a continuation of my thesis I aim to publish a paper on it, with some time provided by HKV and some in my own time (considering I still have a bit to do). |