Migrating mpl reader from ACT to xradar #159

Open · zssherman opened this issue Mar 8, 2024 · 3 comments
Labels: enhancement (New feature or request)

zssherman commented Mar 8, 2024

After the ACT dev call, we discussed how moving the MPL reader to xradar would be a better fit:
ARM-DOE/ACT#806

I can give this a shot. I will just need to learn the backends of Xarray first.
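
For orientation, the Xarray side boils down to subclassing BackendEntrypoint. Below is a minimal sketch; all MPL-specific names (MplBackendEntrypoint, read_mpl_file) are placeholders, only BackendEntrypoint, open_dataset and guess_can_open are actual Xarray API:

import xarray as xr
from xarray.backends import BackendEntrypoint


def read_mpl_file(filename_or_obj):
    # placeholder: the real reader would decode the MPL binary structures
    # and build DataArrays from them
    return xr.Dataset()


class MplBackendEntrypoint(BackendEntrypoint):
    """Hypothetical entry point xarray would dispatch to via engine="mpl"."""

    def open_dataset(self, filename_or_obj, *, drop_variables=None):
        ds = read_mpl_file(filename_or_obj)
        if drop_variables is not None:
            ds = ds.drop_vars(drop_variables)
        return ds

    def guess_can_open(self, filename_or_obj):
        # very rough check; real logic would inspect the binary header
        return str(filename_or_obj).endswith(".mpl")

Such a class would then be registered in the package metadata under the xarray.backends entry-point group so that xr.open_dataset(path, engine="mpl") can find it.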

zssherman added the enhancement label Mar 8, 2024
zssherman self-assigned this Mar 8, 2024
kmuehlbauer (Collaborator) commented

@zssherman Great initiative! It looks like MPL is also a binary format with neat header structures.

The sigmet/iris reader heavily uses this kind of structured decoding. It is also what #158 is trying to achieve for NEXRAD Level 2.
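
Just to illustrate what that kind of structured decoding looks like (the field names and sizes below are invented, not the actual MPL or Sigmet layout), a fixed-size binary header can be unpacked with the standard struct module:

import struct

# invented example layout: 4-byte magic, uint32 version, uint32 record count,
# little-endian; the real MPL header will differ
HEADER_FMT = "<4sII"
HEADER_SIZE = struct.calcsize(HEADER_FMT)


def decode_header(buf):
    magic, version, nrecords = struct.unpack(HEADER_FMT, buf[:HEADER_SIZE])
    return {"magic": magic, "version": version, "nrecords": nrecords}


# usage: decode_header(open("some_file.bin", "rb").read(HEADER_SIZE))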

Maybe we can discuss at the next Open Radar meeting which steps are necessary to get a prototype reader ready.

zssherman (Author) commented

@kmuehlbauer That sounds good to me!

kmuehlbauer (Collaborator) commented

@zssherman Since there wasn't much time yesterday, I'll follow up with some ideas/pointers here.

I'm not really sure how to handle the sidecar files, but we might just search for/recognize them and read/decode them directly as binary blobs (when they have no header).
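
If they really are headerless blobs, something as simple as this might do as a starting point (file name and dtype are placeholders):

import numpy as np

# read a headerless sidecar file as a flat byte array; dtype/reshape would
# depend on what the sidecar actually contains
blob = np.fromfile("example.sidecar", dtype="uint8")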

For the main file the idea would be to use np.memmap to make reading large data easy. See:

def __init__(self, filename, mode="r", loaddata=False):
    """Initialize the object."""
    self._fp = None
    self._filename = filename
    # read in the volume header and compression_record
    if hasattr(filename, "read"):
        self._fh = filename
    else:
        self._fp = open(filename, "rb")
        self._fh = np.memmap(self._fp, mode=mode)
    self._filepos = 0
    self._rawdata = False
    self._loaddata = loaddata
    self._bz2_indices = None
    self.volume_header = self.get_header(VOLUME_HEADER)
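
For reference, the memory map behaves like a flat uint8 array backed by the file, so header bytes can be sliced out without loading everything (file name is a placeholder):

import numpy as np

with open("example.mpl", "rb") as fp:
    fh = np.memmap(fp, mode="r")   # flat uint8 view of the whole file
    header_bytes = bytes(fh[:16])  # pull the first 16 bytes for the header
    print(fh.size, header_bytes[:4])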

Then the header could be directly extracted using the machinery from the iris/sigmet reader:

def get_header(self, header):
    len = struct.calcsize(_get_fmt_string(header))
    head = _unpack_dictionary(self.read_from_file(len), header, self._rawdata)
    return head

For this, the header structure needs a special layout where decoding information can be attached to the OrderedDict entries:

VOLUME_HEADER = OrderedDict(
    [
        ("tape", {"fmt": "9s"}),
        ("extension", {"fmt": "3s"}),
        ("date", UINT4),
        ("time", UINT4),
        ("icao", {"fmt": "4s"}),
    ]
)
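
A first MPL header definition could follow the same pattern. The field names below are purely illustrative (the real layout has to come from the MPL format documentation or the existing ACT reader), and the UINT2/UINT4 stand-ins only carry the struct format character, whereas xradar's helpers carry additional decoding metadata:

from collections import OrderedDict

UINT2 = {"fmt": "H"}  # stand-in for xradar's uint16 helper
UINT4 = {"fmt": "I"}  # stand-in for xradar's uint32 helper

MPL_HEADER = OrderedDict(
    [
        ("unit", UINT2),
        ("version", UINT2),
        ("year", UINT2),
        ("month", UINT2),
        ("day", UINT2),
        ("num_bins", UINT4),
    ]
)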

The actual data might be read with dedicated functions (e.g. named get_data or similar), which use header information about file offset, size and dtype. See the following (not so nice) example:

def get_data(self, sweep_number, moment=None):
    """Load sweep data from file."""
    sweep = self.data[sweep_number]
    start = sweep["record_number"]
    stop = sweep["record_end"]
    intermediate_records = [
        rec["record_number"] for rec in sweep["intermediate_records"]
    ]
    filepos = sweep["filepos"]
    moments = sweep["sweep_data"]
    if moment is None:
        moment = moments
    elif isinstance(moment, str):
        moment = [moment]
    for name in moment:
        if self.is_compressed:
            self.init_record(start)
        else:
            self.init_record_by_filepos(start, filepos)
        ngates = moments[name]["ngates"]
        word_size = moments[name]["word_size"]
        data_offset = moments[name]["data_offset"]
        ws = {8: 1, 16: 2}
        width = ws[word_size]
        data = []
        self.rh.pos += data_offset
        data.append(self._rh.read(ngates, width=width).view(f"uint{word_size}"))
        while self.init_next_record() and self.record_number <= stop:
            if self.record_number in intermediate_records:
                continue
            self.rh.pos += data_offset
            data.append(self._rh.read(ngates, width=width).view(f"uint{word_size}"))
        moments[name].update(data=data)

This get_data function is used in the ArrayWrapper to retrieve the data in a lazy manner, whereas header data is used to provide the information to create the DataArrays/Dataset.

class NexradLevel2ArrayWrapper(BackendArray):

This is then used in the XarrayStore to provide Variables/Coordinates:

def open_store_variable(self, name, var):

def open_store_coordinates(self):
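
For context, xarray's documented pattern for such a lazy wrapper looks roughly like the sketch below; the MPL-specific names (MplArrayWrapper, datastore.root.get_data) are hypothetical, only BackendArray and the indexing helpers are actual xarray API:

import numpy as np
from xarray.backends import BackendArray
from xarray.core import indexing


class MplArrayWrapper(BackendArray):
    """Hypothetical lazy wrapper around a reader's get_data-style method."""

    def __init__(self, datastore, name, shape, dtype):
        self.datastore = datastore
        self.name = name
        self.shape = shape
        self.dtype = np.dtype(dtype)

    def __getitem__(self, key):
        # translate xarray's indexer into plain numpy indexing
        return indexing.explicit_indexing_adapter(
            key, self.shape, indexing.IndexingSupport.BASIC, self._getitem
        )

    def _getitem(self, key):
        # only the requested slice is pulled from the underlying file
        return self.datastore.root.get_data(self.name)[key]

Such a wrapper is typically wrapped in indexing.LazilyIndexedArray when the Variable is built, so nothing is read until the data is actually accessed.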

I hope this at least makes some sense and that you can give it a try.
