Inconsistent treatment of `fill_value` and `missing_value` during `cubewrite` #116

blimlim · 2024-09-27T04:33:45Z

This issue is related to the cubewrite refactoring work in #99, where we are extracting the `missing_value`/`fill_value` modifications into a separate function.

The code is not currently consistent in the way it treats the fill value's type. It first sets a fill value depending on the type of the cube's data, and then writes this to the cube's `missing_value` attribute:

um2nc-standalone/umpost/um2netcdf.py

Lines 331 to 339 in c1a4a54

    
    # Set the missing_value attribute. Use an array to force the type to match
    # the data type
    if cube.data.dtype.kind == 'f':
        fill_value = 1.e20
    else:
        # Use netCDF defaults
        fill_value = default_fillvals['%s%1d' % (cube.data.dtype.kind, cube.data.dtype.itemsize)]
    cube.attributes['missing_value'] = np.array([fill_value], cube.data.dtype)

where the `missing_value` attribute is forced to match the cube data's type.

When `fill_value` is later passed as an argument to the `sman.write` function, the same type conversion is not applied, e.g.:

um2nc-standalone/umpost/um2netcdf.py

Lines 391 to 395 in c1a4a54

    
    sman.write(cube,
               zlib=True,
               complevel=compression,
               unlimited_dimensions=['time'],
               fill_value=fill_value)

This means there can be a mismatch between the types of `missing_value` and `fill_value`. E.g. if `cube.data.dtype` is `np.float32`, then `missing_value` will have type float32, while the `fill_value` supplied to `sman.write` will be a plain Python float, which is 64-bit.
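To make the mismatch concrete, here is a minimal sketch using only NumPy. It mirrors the float branch of the snippet above without a real cube; the variable names are just for illustration.

```python
import numpy as np

# Float branch of the snippet above, with cube.data.dtype assumed to be float32.
data_dtype = np.dtype(np.float32)

fill_value = 1.e20                                   # plain Python float (64-bit)
missing_value = np.array([fill_value], data_dtype)   # forced to the data type

print(type(fill_value))       # built-in float
print(missing_value.dtype)    # float32

# 1e20 is not exactly representable in float32, so the two values differ
# once the float32 one is widened back to 64 bits.
values_match = float(missing_value[0]) == fill_value
print(values_match)
```

So the two attributes can differ in value as well as in type, since the float32 copy is rounded.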

@MartinDix do you know if there are reasons that the type needs to be treated differently in these two places? In terms of turning this part of the code into a separate function and adding unit tests, I think it becomes simpler if we are able to convert both the missing_value and fill_value to match the cube.data.dtype.
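As a rough sketch of what the extracted function could look like if both values are converted to the data type: the function name `get_fill_values` and the small `DEFAULT_FILLVALS` table are hypothetical, with the table standing in for the `default_fillvals` dict used in the existing code so that the example runs with NumPy alone. Whether `sman.write` is happy to receive a NumPy scalar is exactly the open question above, so this is not a tested fix.

```python
import numpy as np

# Hypothetical stand-in for the default_fillvals lookup used in um2netcdf.py.
# Only two integer entries are included here, for illustration.
DEFAULT_FILLVALS = {"i2": -32767, "i4": -2147483647}


def get_fill_values(dtype, default_fillvals=DEFAULT_FILLVALS):
    """Return (fill_value, missing_value), both typed to match `dtype`."""
    dtype = np.dtype(dtype)
    if dtype.kind == "f":
        raw = 1.e20
    else:
        # Same key format as the existing code, e.g. "i4" for int32.
        raw = default_fillvals["%s%1d" % (dtype.kind, dtype.itemsize)]

    fill_value = dtype.type(raw)                      # NumPy scalar of the data type
    missing_value = np.array([fill_value], dtype)     # 1-element array, same type
    return fill_value, missing_value


fv32, mv32 = get_fill_values(np.float32)
fvi4, mvi4 = get_fill_values(np.int32)
print(fv32.dtype, mv32.dtype)
print(fvi4.dtype, mvi4.dtype)
```

Because it depends only on a dtype and a lookup table, a function like this can be unit tested without constructing a cube.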


truth-quark added the question (Further information is requested) and Release (Required for next release) labels on Sep 27, 2024
