Rainbow reader: _FillValue and _Undetect #103

Open
egouden opened this issue Apr 3, 2023 · 13 comments
Labels
data-standards Data standard related

Comments

@egouden
Contributor

egouden commented Apr 3, 2023

Description

In the Rainbow reader, _FillValue is set to 0 and _Undetect is not set. This is causing a loss of information.

Normally _FillValue should be set to 255 (or the maximum value for the byte encoding) and _Undetect to 0.

WMO CF Extensions:

@egouden egouden changed the title Rainbow: _FillValue and _Undetect Rainbow reader: _FillValue and _Undetect Apr 4, 2023
@mgrover1
Collaborator

mgrover1 commented Apr 5, 2023

Thanks for letting us know! Contributions are always welcome :)

@mgrover1 mgrover1 added the bug Something isn't working label Apr 5, 2023
@kmuehlbauer
Collaborator

Sorry for the late response, this somehow slipped under the radar.

Rainbow5 (at least Release 5.57.0) only has a notion of "no data", which is best translated to _FillValue (it is zero for all radar moments on disk). There is no value comparable to _Undetect in Rainbow5.

For instance a one byte reflectivity looks like this:

0x00 (0) no data
0x01 (1) -31.5
0x02 (2) -31.0
0x40 (64) 0.0
0x80 (128) 32.0
0xC0 (192) 64.0
0xFE (254) 95.0
0xFF (255) 95.5

We can't do much about the source format.
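
The table above is consistent with a linear mapping of 0.5 dB per raw step with an offset of -32 dB. As a minimal sketch, assuming that inferred mapping holds for the whole 8-bit range (the function name `decode_rainbow_dbz_8bit` is hypothetical and not part of any reader), decoding could look like this:

```python
import numpy as np

def decode_rainbow_dbz_8bit(raw):
    """Decode 8-bit Rainbow5 reflectivity bytes to dBZ.

    The scale (0.5) and offset (-32.0) are inferred from the table above,
    not taken from the Rainbow5 documentation.
    """
    raw = np.asarray(raw, dtype=np.uint8)
    # Linear mapping implied by the table: 1 -> -31.5, 64 -> 0.0, 255 -> 95.5
    dbz = raw.astype(np.float32) * 0.5 - 32.0
    # Raw 0 is the reserved "no data" code, so it carries no physical value
    return np.where(raw == 0, np.nan, dbz)

decoded = decode_rainbow_dbz_8bit([0, 1, 2, 64, 128, 192, 254, 255])
```

Every non-zero byte maps to a physical value here, which is why the format leaves no room for a second special code.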

@kmuehlbauer kmuehlbauer removed the bug Something isn't working label Apr 5, 2023
@kmuehlbauer
Collaborator

And I don't consider that a bug, as specifying the _FillValue is sufficient for the CfRadial2 standard. ODIM_H5 mentions nodata as well as undetect, but if I interpret the documentation correctly, neither is mandatory.

@mgrover1
Collaborator

mgrover1 commented Apr 5, 2023

Ahh okay! Sorry for the mislabelling here @kmuehlbauer , I appreciate the insight!

@kmuehlbauer
Collaborator

No worries @mgrover1, we should invent a label which can be put there instead. How about data-format-issue?

@mgrover1
Collaborator

mgrover1 commented Apr 5, 2023

how about data-standards?

@kmuehlbauer kmuehlbauer added the data-standards Data standard related label Apr 5, 2023
@egouden
Contributor Author

egouden commented Apr 6, 2023

This is a known tricky issue.

  1. _Undetect

It is an optional attribute in FM301. It is defined as follows: "Indicates an area (range bin) that has been radiated but has not produced a valid echo".

It is mandatory metadata in ODIM. It is defined as "Raw value used to denote areas below the measurement detection threshold (radiated but nothing detected)".

By convention, this special kind of value is often coded as zero.

In the documentation of Rainbow it is described as "no data". But this is misleading. It actually corresponds to _Undetect. It should not be considered as "missing data" using _FillValue.

  2. _FillValue

This is defined in CFRADIAL as "Indicates data are missing at this range bin". This has apparently been dropped in FM301 in favor of flag_values.

In ODIM 2.4 you have the similar mandatory (if applicable) NODATA, defined as "raw value used to denote areas void of data (never radiated)". In OPERA, it is also used when bad data have been removed.

There is actually no "missing value" in the Rainbow format. I was wrong here. So _FillValue should simply not be set.

I hope this clarifies my point of view.
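
Under the interpretation described in this comment (raw 0 means "undetect" and Rainbow has no missing-value code), the special-value attributes for an 8-bit moment could be sketched roughly as below. Both helper names are hypothetical, and the attribute name `_Undetect` simply follows the usage in this thread.

```python
import numpy as np

def rainbow_special_value_attrs():
    """Special-value attributes for an 8-bit Rainbow moment (hypothetical helper)."""
    # Raw 0 means "radiated but nothing detected". No _FillValue is set,
    # because the format has no code for bins that were never radiated.
    return {"_Undetect": np.uint8(0)}

def split_undetect(raw, attrs):
    """Return (is_valid, is_undetect) boolean masks for raw byte data."""
    raw = np.asarray(raw, dtype=np.uint8)
    is_undetect = raw == attrs["_Undetect"]
    return ~is_undetect, is_undetect

attrs = rainbow_special_value_attrs()
is_valid, is_undetect = split_undetect([0, 1, 128, 255], attrs)
```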

@kmuehlbauer
Collaborator

@egouden Great explanation. It's definitely confusing. The major pain point here is the netcdf standard which only knows _FillValue but no nodata or undetect whatsoever. It has a notion of missing_value, though.

By default, when creating a netcdf variable you can specify a fill_value. If you leave it out, netcdf will use its default value for that datatype as fill_value, but it will not be attached to the variable.

It looks like we have to make sure we apply the correct metadata when writing the different standards.

@egouden
Contributor Author

egouden commented Apr 7, 2023

For UBYTE, the default _FillValue is 255. If this is a valid maximum value for a given format (e.g. Rainbow), one could replace 255 by 254 while setting 255 as _FillValue for clarity.

Note that a 255 valid value is most probably not meteorological. And it might be reset to 255 later by post processing.
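
A minimal sketch of the remapping suggested above, assuming 8-bit raw data in which 255 is otherwise a valid maximum; the function name `reserve_fill_value` is hypothetical:

```python
import numpy as np

UBYTE_FILL = np.uint8(255)  # default fill value for unsigned bytes, as noted above

def reserve_fill_value(raw):
    """Clamp a valid raw 255 to 254 so that 255 can serve as _FillValue."""
    raw = np.asarray(raw, dtype=np.uint8)
    # This costs one quantization step at the very top of the value range
    remapped = np.minimum(raw, np.uint8(254))
    return remapped, {"_FillValue": UBYTE_FILL}

remapped, attrs = reserve_fill_value([0, 1, 254, 255])
```

The trade-off is that the two highest raw values become indistinguishable after the remap.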

@egouden
Contributor Author

egouden commented Apr 17, 2023

This is actually a more general issue about the definition of special values in our data model. Is a missing value "undetect", "no valid data", or both? I think this is not clear in FM301. I will ask the OPERA ODIM expert for clarification.

@kmuehlbauer
Collaborator

We will need to be careful here at implementation time, as xarray maps everything (_FillValue, missing_value) to np.nan and then cannot tell them apart when writing.

I've also added links to the relevant WMO CF Extensions in the OP above.
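
To illustrate the concern, here is a plain NumPy sketch (not xarray's actual decoding code) of what happens once two special codes are both masked to NaN, together with one possible way to keep them apart using a separate boolean mask. The raw codes and function names are assumptions for illustration, and no scale or offset is applied.

```python
import numpy as np

UNDETECT, NODATA = 0, 255  # hypothetical raw codes for an 8-bit moment

def decode_with_flags(raw):
    """Mask both special codes to NaN, keeping a mask that remembers the undetect bins."""
    raw = np.asarray(raw, dtype=np.uint8)
    is_undetect = raw == UNDETECT
    is_nodata = raw == NODATA
    values = np.where(is_undetect | is_nodata, np.nan, raw.astype(np.float32))
    return values, is_undetect

def encode_with_flags(values, is_undetect):
    """Restore the raw codes; NaN alone could not tell the two cases apart."""
    filled = np.where(np.isnan(values), NODATA, values)
    filled = np.where(is_undetect, UNDETECT, filled)
    return filled.astype(np.uint8)

raw = np.array([0, 10, 255, 200], dtype=np.uint8)
values, is_undetect = decode_with_flags(raw)
roundtrip = encode_with_flags(values, is_undetect)
```

Without the extra mask, both NaN bins would have to be written back with the same code.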

@kmuehlbauer
Collaborator

WMO_EXT_FillValue

This is an excerpt from the WMO CF Extensions. The only mandatory attribute in our context is valid_range, providing the smallest and largest valid values.

_FillValue denotes missing or undefined data and is not mandatory (but it might be good practice to use the netcdf default fill_value here if no other value is provided).
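
As a rough sketch of how valid_range could be derived once the special codes are fixed, assuming it is expressed in raw (packed) values and that the reserved codes sit at the ends of the integer range; the helper `raw_valid_range` is hypothetical:

```python
import numpy as np

def raw_valid_range(dtype, reserved):
    """Smallest and largest raw values of an integer dtype not used as special codes."""
    info = np.iinfo(dtype)
    lo, hi = int(info.min), int(info.max)
    reserved = set(reserved)
    # Step inwards past any reserved codes at either end of the range
    while lo in reserved:
        lo += 1
    while hi in reserved:
        hi -= 1
    return lo, hi

# Raw 0 reserved only (Rainbow as stored on disk)
range_zero_reserved = raw_valid_range(np.uint8, {0})
# Raw 0 and 255 both reserved (after remapping 255 to a fill value)
range_both_reserved = raw_valid_range(np.uint8, {0, 255})
```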

@syedhamidali
Contributor

Based on this Open Radar Discourse thread, it seems that we need to add more metadata to the Rainbow reader, something pretty similar to what this function does:

import numpy as np
import xarray as xr

# Option lists use implicit string concatenation so that no stray
# indentation ends up inside the attribute values.
SWEEP_MODE_OPTIONS = (
    'sector, coplane, rhi, vertical_pointing, idle, azimuth_surveillance, '
    'elevation_surveillance, sunscan, pointing, calibration, manual_ppi, manual_rhi'
)
PLATFORM_TYPE_OPTIONS = (
    'fixed, vehicle, ship, aircraft_fore, aircraft_aft, aircraft_tail, aircraft_belly, '
    'aircraft_roof, aircraft_nose, satellite_orbit, satellite_geostat'
)


def add_metadata_to_rainbow(dtree):
    # Root-level primary axis of rotation
    dtree['primary_axis'] = xr.DataArray(
        np.array('axis_z', dtype='|S32'),
        attrs={'standard_name': 'primary_axis_of_rotation',
               'options': 'axis_z, axis_y, axis_x'},
        dims=(),
    )

    # Collect the fixed angle of every sweep group and annotate its sweep_mode
    fixed_angles = []
    for grp in dtree.groups:
        if "sweep" in grp:
            fixed_angles.append(dtree[grp]['sweep_fixed_angle'].values)
            dtree[grp]['sweep_mode'] = xr.DataArray(
                np.array(dtree[grp]['sweep_mode'], dtype='|S32'),
                attrs={'standard_name': 'scan_mode_for_sweep',
                       'options': SWEEP_MODE_OPTIONS})

    # Root-level variables describing the whole volume
    fixed_angles = np.array(fixed_angles, dtype=np.float32)
    sweep_groups = np.array([f'sweep_{i}' for i in range(fixed_angles.size)])
    dtree['sweep_fixed_angle'] = xr.DataArray(fixed_angles, dims=('sweep',))
    dtree['sweep_group_name'] = xr.DataArray(sweep_groups, dims=('sweep',))
    dtree['volume_number'] = xr.DataArray(
        np.array(0, dtype=np.int16),
        attrs={'standard_name': 'data_volume_index_number'})
    dtree['instrument_type'] = xr.DataArray(
        np.array('radar', dtype='|S32'),
        attrs={'standard_name': 'type_of_instrument',
               'options': 'radar, lidar',
               'meta_group': 'instrument_parameters'})
    dtree['platform_type'] = xr.DataArray(
        np.array('fixed', dtype='|S32'),
        attrs={'standard_name': 'platform_type',
               'options': PLATFORM_TYPE_OPTIONS})

    return dtree
