Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent treatment of fill_value and missing_value during cubewrite #116

Open
blimlim opened this issue Sep 27, 2024 · 0 comments
Open
Labels
question Further information is requested Release Required for next release

Comments

@blimlim
Copy link
Collaborator

blimlim commented Sep 27, 2024

This issue is related to cubewrite refactoring work in #99, where we are extracting the missing_value/fill value modifications into a separate function.

The code is not currently consistent in the way it treats the fill value's type. It first sets a fill value depending on the type of the cube's data, and then writes this to the cube's missing_data attribute:

# Set the missing_value attribute. Use an array to force the type to match
# the data type
if cube.data.dtype.kind == 'f':
fill_value = 1.e20
else:
# Use netCDF defaults
fill_value = default_fillvals['%s%1d' % (cube.data.dtype.kind, cube.data.dtype.itemsize)]
cube.attributes['missing_value'] = np.array([fill_value], cube.data.dtype)

where the missing_data attribute is forced to match the cube data's type.

When the fill_value is used later as an argument to to the sman.write function, the same type conversion is not applied, e.g:

sman.write(cube,
zlib=True,
complevel=compression,
unlimited_dimensions=['time'],
fill_value=fill_value)

This means there can be a mismatch between the missing_value and fill_values types. E.g. if the cube.data.dtype is np.float32, then the missing_value will have type float32, while the fill_value supplied to sman.write will use the default python float64 type.

@MartinDix do you know if there are reasons that the type needs to be treated differently in these two places? In terms of turning this part of the code into a separate function and adding unit tests, I think it becomes simpler if we are able to convert both the missing_value and fill_value to match the cube.data.dtype.

@truth-quark truth-quark added question Further information is requested Release Required for next release labels Sep 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested Release Required for next release
Projects
None yet
Development

No branches or pull requests

2 participants