Differences between NUG/CF _FillValue and Zarr fill_value (and how Xarray handles them)
#377
Replies: 6 comments 26 replies
-
I see that they are technically different, but in practice, are they? CF has missing_value, which is what should be used if you want to be explicit. I know that despite that, people treat _FillValue as missing values -- but what else can you do? And what else could you do with Zarr's fill_value? Not set is not set, regardless of why. Any idea what xarray does if the three have different values?
-
Yes, that's what we've done in CMIP for 20 years (and for the reason you state). missing=fill=same constant. Maybe a bit sloppy, but practical.
-
CF Appendix A describes
-
I haven't looked at the others, but f8 sure as heck looks like a bug. And why not NaN? I'll try to go post on the project to ask ... -CHB
-
Well, yes, very likely true -- but that does apply to the largest integer exactly represented :-) -- though yes, maybe numbers that large aren't common. But anyway, I don't think that value is the largest integer that can be represented by float32 -- that would be, well, the same as the largest number a float can represent -- and the top of the range of integers that can all be exactly represented is far smaller -- on the order of 10^7 (2^24 = 16,777,216), I think. We'll see what the netcdf-C devs say. I suppose the real lesson here is that you shouldn't use the defaults :-)
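The distinction drawn above, between the largest integer a float32 can hold and the top of the range in which every integer is exact, can be checked with a small standard-library sketch. The helper name `to_float32` and the two constants are illustrative only; the sketch assumes IEEE 754 single precision with the usual round-to-nearest behaviour.

```python
import struct


def to_float32(x):
    """Round a Python float (float64) to the nearest float32 and back."""
    return struct.unpack("f", struct.pack("f", x))[0]


# Top of the contiguous range of exactly representable integers in float32
# (24 significant bits, counting the implicit leading bit).
EXACT_LIMIT = 2 ** 24  # 16777216

# Largest finite float32 value; it is itself a (very large) integer.
FLOAT32_MAX = (2 - 2 ** -23) * 2.0 ** 127

print(to_float32(float(EXACT_LIMIT)) == EXACT_LIMIT)          # 2**24 survives the round trip
print(to_float32(float(EXACT_LIMIT + 1)) == EXACT_LIMIT + 1)  # 2**24 + 1 does not
print(FLOAT32_MAX.is_integer())                               # the maximum is an integer
```

So "largest integer representable" and "largest integer below which nothing is lost" differ by about 31 orders of magnitude for float32.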
-
which are: If neither valid_min, valid_max nor valid_range is defined then generic applications should define a valid range as follows:
I think the idea with 3 and 4 here is that But yes, not for byte, where the _FillValue is 0, but they want 0 to be valid. And this does point to NaN as being the best option for float types if you don't want to restrict the range.
-
Question
Hi all,
I just wanted to point out an ongoing xarray (edit: Zarr) discussion (issue #5475) about NUG/CF _FillValue and Zarr fill_value. It appears that they mean (are used for) very different things, but Xarray is currently treating them the same.

I haven't made it all the way through the conversation and I'm not sure when I will be able to get back to it. So if anyone else is interested in taking a look and commenting, that would be great.
Quick comparison (my understanding after a quick read):
Zarr fill_value is the value to use for undefined sections of an array, e.g., chunks that are not written. When the array is read, data from missing chunks is set to the value of fill_value. It does not necessarily imply missing data.

NUG/CF _FillValue is the value used to initialize variables when allocating memory or disk space before data was written. When _FillValue is found when reading a variable, it indicates that the data was never written for that index location in the variable. So it is always considered missing data.
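To make the difference concrete, here is a minimal pure-Python sketch of the two behaviours. `ToyChunkedArray` and `mask_cf_fill` are hypothetical stand-ins invented for illustration; they are not the Zarr or Xarray APIs, and the values are made up. The point is only what happens when a reader treats the storage-level fill value as if it were a CF-style missing-data marker.

```python
import math


class ToyChunkedArray:
    """Hypothetical 1-D chunked array (not the Zarr API).

    Chunks that were never written are filled with `fill_value` on read.
    """

    def __init__(self, length, chunk_len, fill_value):
        self.length = length
        self.chunk_len = chunk_len
        self.fill_value = fill_value
        self.chunks = {}  # chunk index -> list of stored values

    def _chunk_size(self, index):
        # The last chunk may be shorter than chunk_len.
        return min(self.chunk_len, self.length - index * self.chunk_len)

    def write_chunk(self, index, values):
        if len(values) != self._chunk_size(index):
            raise ValueError("values do not match the chunk size")
        self.chunks[index] = list(values)

    def read(self):
        out = []
        for i in range(math.ceil(self.length / self.chunk_len)):
            default = [self.fill_value] * self._chunk_size(i)
            out.extend(self.chunks.get(i, default))
        return out


def mask_cf_fill(values, cf_fill_value):
    """CF/NUG-style decoding: anything equal to _FillValue is missing (None)."""
    return [None if v == cf_fill_value else v for v in values]


arr = ToyChunkedArray(length=6, chunk_len=3, fill_value=0)
arr.write_chunk(0, [5, 0, 7])  # the 0 here is a real, written value

raw = arr.read()
# Treating the storage fill value as the missing-data marker also masks the real 0.
conflated = mask_cf_fill(raw, cf_fill_value=arr.fill_value)
# A separate, explicit marker (assumed here to be -9999) leaves the data intact.
separate = mask_cf_fill(raw, cf_fill_value=-9999)

print(raw)        # [5, 0, 7, 0, 0, 0]
print(conflated)  # [5, None, 7, None, None, None]
print(separate)   # [5, 0, 7, 0, 0, 0]
```

In the conflated case the legitimately written 0 becomes indistinguishable from the unwritten chunk, which is the concern raised in the comparison above.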