
Add mask type control #119

Merged · 14 commits · Sep 16, 2022
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -42,6 +42,8 @@
* Propagate `output_struct_name` argument to `ExtractSegmentationInterface` to match its extractor arguments. [PR #128](https://github.com/catalystneuro/neuroconv/pull/128)
* Added compression and iteration (with options control) to all Fluorescence traces in `write_segmentation`. [PR #120](https://github.com/catalystneuro/neuroconv/pull/120)
* For irregular recordings, timestamps can now be saved along with all traces in `write_segmentation`. [PR #130](https://github.com/catalystneuro/neuroconv/pull/130)
* Added `mask_type` argument to `tools.roiextractors.add_plane_segmentation` function and all upstream calls. This allows users to request writing not only the image masks (still the default) but also pixel or voxel masks, or `None` to skip writing masks entirely. [PR #119](https://github.com/catalystneuro/neuroconv/pull/119)
* `utils.json_schema.get_schema_from_method_signature` now allows `Optional[...]` annotation typing and subsequent `None` values during validation as long as it is still only applied to a simple non-conflicting type (no `Optional[Union[..., ...]]`). [PR #119](https://github.com/catalystneuro/neuroconv/pull/119)


### Documentation and tutorial enhancements:
@@ -67,6 +67,7 @@ def run_conversion(
stub_test: bool = False,
stub_frames: int = 100,
include_roi_centroids: bool = True,
mask_type: Optional[str] = "image", # Optional[Literal["image", "pixel"]]
iterator_options: Optional[dict] = None,
compression_options: Optional[dict] = None,
):
@@ -86,6 +87,7 @@
overwrite=overwrite,
verbose=self.verbose,
include_roi_centroids=include_roi_centroids,
mask_type=mask_type,
iterator_options=iterator_options,
compression_options=compression_options,
)
145 changes: 91 additions & 54 deletions src/neuroconv/tools/roiextractors/roiextractors.py
@@ -549,6 +549,7 @@ def add_plane_segmentation(
metadata: Optional[dict],
plane_segmentation_index: int = 0,
include_roi_centroids: bool = True,
mask_type: Optional[str] = "image", # Optional[Literal["image", "pixel"]]
h-mayorquin marked this conversation as resolved.
iterator_options: Optional[dict] = None,
compression_options: Optional[dict] = None,
) -> NWBFile:
@@ -572,6 +573,17 @@
If there are a very large number of ROIs (such as in whole-brain recordings), you may wish to disable this for
faster write speeds.
Defaults to True.
mask_type : str, optional
There are three types of ROI masks in NWB: image masks, pixel masks, and voxel masks.
Image masks have the same shape as the reference images the segmentation was applied to, and weight each pixel
by its contribution to the ROI (typically binary, with 0 meaning 'not in the ROI').
Pixel masks are instead indexed by ROI, with the data at each index being the list of (x, y, weight) triplets
for each pixel in that ROI.
Voxel masks are analogous, with the data at each index being the list of (x, y, z, weight) quadruplets for
each voxel in that ROI.
Specify your choice among these as mask_type='image', 'pixel', 'voxel', or None.
If None, no mask information is written to the NWB file.
Defaults to 'image'.
iterator_options : dict, optional
The options to use when iterating over the image masks of the segmentation extractor.
compression_options : dict, optional
@@ -582,14 +594,14 @@
NWBFile
The nwbfile passed as an input with the plane segmentation added.
"""
assert mask_type in ["image", "pixel", "voxel", None], (
"Keyword argument 'mask_type' must be one of 'image', 'pixel', 'voxel', "
f"or None (to not write any masks)! Received '{mask_type}'."
)

iterator_options = iterator_options or dict()
compression_options = compression_options or dict(compression="gzip")

def image_mask_iterator():
for roi_id in segmentation_extractor.get_roi_ids():
image_masks = segmentation_extractor.get_roi_image_masks(roi_ids=[roi_id]).T.squeeze()
yield image_masks

# Set the defaults and required infrastructure
metadata_copy = deepcopy(metadata)
default_metadata = get_default_ophys_metadata()
@@ -599,16 +611,8 @@ def image_mask_iterator():
plane_segmentation_metadata = image_segmentation_metadata["plane_segmentations"][plane_segmentation_index]
plane_segmentation_name = plane_segmentation_metadata["name"]

add_imaging_plane(
nwbfile=nwbfile,
metadata=metadata_copy,
imaging_plane_index=plane_segmentation_index,
)

add_image_segmentation(
nwbfile=nwbfile,
metadata=metadata_copy,
)
add_imaging_plane(nwbfile=nwbfile, metadata=metadata_copy, imaging_plane_index=plane_segmentation_index)
add_image_segmentation(nwbfile=nwbfile, metadata=metadata_copy)

ophys = get_module(nwbfile, "ophys")
image_segmentation_name = image_segmentation_metadata["name"]
@@ -624,39 +628,63 @@
imaging_plane_name = imaging_plane_metadata["name"]
imaging_plane = nwbfile.imaging_planes[imaging_plane_name]

plane_segmentation_kwargs = dict(
**plane_segmentation_metadata,
imaging_plane=imaging_plane,
columns=[
VectorData(
data=H5DataIO(
DataChunkIterator(image_mask_iterator(), **iterator_options),
**compression_options,
),
name="image_mask",
description="image masks",
),
VectorData(
data=accepted_ids,
name="Accepted",
description="1 if ROI was accepted or 0 if rejected as a cell during segmentation operation",
),
VectorData(
data=rejected_ids,
name="Rejected",
description="1 if ROI was rejected or 0 if accepted as a cell during segmentation operation",
),
],
id=roi_ids,
)
plane_segmentation = PlaneSegmentation(**plane_segmentation_kwargs)
plane_segmentation_kwargs = dict(**plane_segmentation_metadata, imaging_plane=imaging_plane)
if mask_type is None:
plane_segmentation = PlaneSegmentation(id=roi_ids, **plane_segmentation_kwargs)
elif mask_type == "image":
plane_segmentation = PlaneSegmentation(id=roi_ids, **plane_segmentation_kwargs)
plane_segmentation.add_column(
name="image_mask",
description="Image masks for each ROI.",
data=H5DataIO(segmentation_extractor.get_roi_image_masks().T, **compression_options),
)
elif mask_type == "pixel" or mask_type == "voxel":
pixel_masks = segmentation_extractor.get_roi_pixel_masks()
num_pixel_dims = pixel_masks[0].shape[1]

assert num_pixel_dims in [3, 4], (
"The segmentation extractor returned a pixel mask that is not 3- or 4-dimensional! "
"Please open a ticket at https://github.com/catalystneuro/roiextractors/issues"
)
if mask_type == "pixel" and num_pixel_dims == 4:
warn(
"Specified mask_type='pixel', but ROIExtractors returned 4-dimensional masks. "
"Using mask_type='voxel' instead."
)
mask_type = "voxel"
if mask_type == "voxel" and num_pixel_dims == 3:
warn(
"Specified mask_type='voxel', but ROIExtractors returned 3-dimensional masks. "
"Using mask_type='pixel' instead."
)
mask_type = "pixel"

mask_type_kwarg = f"{mask_type}_mask"
plane_segmentation = PlaneSegmentation(**plane_segmentation_kwargs)
for roi_id, pixel_mask in zip(roi_ids, pixel_masks):
plane_segmentation.add_roi(**{"id": roi_id, mask_type_kwarg: [tuple(x) for x in pixel_mask]})
Comment on lines +664 to +665
Member Author:
See the other comment for how the column-wise approach was not working, but this is the only way I was able to successfully do this.

Note that the returned structure from ROIExtractors (the pixel_masks object) is a np.ndarray of shape (num_rois, num_dimensions) - attempting to add this without the list-comprehension tuple wrapping results in shape errors. The list-comprehension tuple wrapping was taken from the working example in the PyNWB documentation: https://pynwb.readthedocs.io/en/stable/tutorials/domain/ophys.html#pixel-masks
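The wrapping described above can be reproduced without PyNWB: given one ROI's mask as a (num_pixels, 3) ndarray, each row must become a tuple before row-wise insertion. A sketch of the wrapping only, not the PR's code:

```python
import numpy as np

# One ROI's pixel mask in ROIExtractors form: one row of (x, y, weight) per pixel
pixel_mask = np.array([[0.0, 1.0, 1.0],
                       [2.0, 2.0, 0.5]])

# The list-comprehension tuple wrapping from the PyNWB ophys tutorial:
# each ndarray row becomes a tuple suitable for add_roi(pixel_mask=...)
wrapped = [tuple(row) for row in pixel_mask]
```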

Collaborator:

Concerning the format, yeah, it looks strange, but I remember having problems adding ragged data as a column as well when working on the units table. That's why spikes over there are always added as rows.

You are already adding id above in line 663; as we don't have tests for this part yet, I don't know if this will cause trouble or whether it just works as a reference here?

Member Author:

So that bit is important, actually - if you know that the type of mask being written is one of the two types that currently has to be added by row (since the column approach of #157 has yet to work), then you have to perform that row writing after the plane segmentation object is in memory but before adding other column data.

But also, when adding the data by row you have to set the ID there instead of as a separate column.

But if you're not writing any data row-wise, then you can pass the column of IDs directly into the initialization of the table as we have always done.

Collaborator (h-mayorquin, Sep 15, 2022):

Yep, same problem as with the other dynamic tables. I remember facing issues with this in the electrodes and units tables that made the writing trickier. The problem, as you mentioned, is that id cannot be passed as a column itself. It needs to be passed at initialization or built row-wise.

Strangely, I just tested this on a plane segmentation object initialized without ids, and it asks for a mask as a required argument of add_roi:

for roi in roi_ids:
    plane_segmentation.add_roi(id=roi)

Traceback (most recent call last):
  File "<string>", line 2, in <module>
  File "/home/heberto/miniconda3/envs/neuroconv_environment/lib/python3.9/site-packages/hdmf/utils.py", line 645, in func_call
    return func(args[0], **pargs)
  File "/home/heberto/miniconda3/envs/neuroconv_environment/lib/python3.9/site-packages/pynwb/ophys.py", line 255, in add_roi
    raise ValueError("Must provide 'image_mask' and/or 'pixel_mask'")
ValueError: Must provide 'image_mask' and/or 'pixel_mask'

I find it strange that you can omit the mask if you build everything by columns, but you cannot build the table by rows without providing some type of mask.

Member Author:
Might want to raise an issue in PyNWB for that lol
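The pixel/voxel reconciliation in the diff above can be distilled into a standalone helper, under the same assumption the code makes: pixel masks have 3 columns (x, y, weight) and voxel masks have 4 (x, y, z, weight). A behavioral sketch only; `resolve_mask_type` is a hypothetical name, not neuroconv API:

```python
import warnings


def resolve_mask_type(mask_type: str, num_pixel_dims: int) -> str:
    """Reconcile the requested mask type with the mask dimensionality returned
    by the extractor, warning and falling back when they disagree."""
    assert num_pixel_dims in (3, 4), "Pixel masks must have 3 or 4 columns!"
    if mask_type == "pixel" and num_pixel_dims == 4:
        warnings.warn("Specified mask_type='pixel', but masks are 4-dimensional; using 'voxel'.")
        return "voxel"
    if mask_type == "voxel" and num_pixel_dims == 3:
        warnings.warn("Specified mask_type='voxel', but masks are 3-dimensional; using 'pixel'.")
        return "pixel"
    return mask_type
```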


if include_roi_centroids:
# ROIExtractors uses height x width x (depth), but NWB uses width x height x depth
tranpose_image_convention = (1, 0) if len(segmentation_extractor.get_image_size()) == 2 else (1, 0, 2)
roi_locations = segmentation_extractor.get_roi_locations()[tranpose_image_convention, :]
roi_locations = segmentation_extractor.get_roi_locations()[tranpose_image_convention, :].T
plane_segmentation.add_column(
name="ROICentroids", description="The x, y, (z) centroids of each ROI.", data=roi_locations.T
name="ROICentroids",
description="The x, y, (z) centroids of each ROI.",
data=H5DataIO(roi_locations, **compression_options),
)

plane_segmentation.add_column(
name="Accepted",
description="1 if ROI was accepted or 0 if rejected as a cell during segmentation operation.",
data=H5DataIO(accepted_ids, **compression_options),
)
plane_segmentation.add_column(
name="Rejected",
description="1 if ROI was rejected or 0 if accepted as a cell during segmentation operation.",
data=H5DataIO(rejected_ids, **compression_options),
)

image_segmentation.add_plane_segmentation(plane_segmentations=[plane_segmentation])
return nwbfile
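The Accepted/Rejected columns added above are complementary 0/1 indicator vectors over the ROI ids; their derivation can be sketched as follows (`roi_indicator_vectors` is a hypothetical helper, not the PR's code):

```python
def roi_indicator_vectors(roi_ids, accepted_roi_ids):
    """Build parallel 0/1 Accepted/Rejected vectors, one entry per ROI id."""
    accepted = [1 if roi_id in accepted_roi_ids else 0 for roi_id in roi_ids]
    # Every ROI is either accepted or rejected, so the vectors are complements
    rejected = [1 - flag for flag in accepted]
    return accepted, rejected


accepted_ids, rejected_ids = roi_indicator_vectors(roi_ids=[0, 1, 2, 3], accepted_roi_ids=[0, 2])
```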

@@ -728,10 +756,7 @@ def add_fluorescence_traces(
plane_index=plane_index,
)

roi_response_series_kwargs = dict(
rois=roi_table_region,
unit="n.a.",
)
roi_response_series_kwargs = dict(rois=roi_table_region, unit="n.a.")

# Add timestamps or rate
timestamps = segmentation_extractor.frame_to_time(np.arange(segmentation_extractor.get_num_frames()))
@@ -772,7 +797,6 @@
trace_metadata = next(
trace_metadata for trace_metadata in response_series_metadata if trace_name == trace_metadata["name"]
)

# Build the roi response series
roi_response_series_kwargs.update(
data=H5DataIO(SliceableDataChunkIterator(trace, **iterator_options), **compression_options),
@@ -883,6 +907,7 @@ def write_segmentation(
buffer_size: int = 10,
plane_num: int = 0,
include_roi_centroids: bool = True,
mask_type: Optional[str] = "image", # Optional[Literal["image", "pixel"]]
iterator_options: Optional[dict] = None,
compression_options: Optional[dict] = None,
):
@@ -920,6 +945,17 @@
If there are a very large number of ROIs (such as in whole-brain recordings), you may wish to disable this for
faster write speeds.
Defaults to True.
mask_type : str, optional
There are three types of ROI masks in NWB: image masks, pixel masks, and voxel masks.
Image masks have the same shape as the reference images the segmentation was applied to, and weight each pixel
by its contribution to the ROI (typically binary, with 0 meaning 'not in the ROI').
Pixel masks are instead indexed by ROI, with the data at each index being the list of (x, y, weight) triplets
for each pixel in that ROI.
Voxel masks are analogous, with the data at each index being the list of (x, y, z, weight) quadruplets for
each voxel in that ROI.
Specify your choice among these as mask_type='image', 'pixel', 'voxel', or None.
If None, no mask information is written to the NWB file.
Defaults to 'image'.
"""
assert (
nwbfile_path is None or nwbfile is None
@@ -958,7 +994,7 @@
nwbfile_path=nwbfile_path, nwbfile=nwbfile, metadata=metadata_base_common, overwrite=overwrite, verbose=verbose
) as nwbfile_out:

ophys = get_module(nwbfile=nwbfile_out, name="ophys", description="contains optical physiology processed data")
_ = get_module(nwbfile=nwbfile_out, name="ophys", description="contains optical physiology processed data")
Collaborator:

Why is this here, and the lines below (1091-1023)?

Should we remove them in another PR?

Member Author:
Just linter cleanup; they're not doing anything. Technically the call to get_module might be initializing the processing module, but the ophys return value is unused, and the code below is unused as well.

for plane_no_loop, (segmentation_extractor, metadata) in enumerate(
zip(segmentation_extractors, metadata_base_list)
):
@@ -967,11 +1003,11 @@
add_devices(nwbfile=nwbfile_out, metadata=metadata)

# ImageSegmentation:
image_segmentation_name = (
"ImageSegmentation" if plane_no_loop == 0 else f"ImageSegmentation_Plane{plane_no_loop}"
)
add_image_segmentation(nwbfile=nwbfile_out, metadata=metadata)
image_segmentation = ophys.data_interfaces.get(image_segmentation_name)
# image_segmentation_name = (
# "ImageSegmentation" if plane_no_loop == 0 else f"ImageSegmentation_Plane{plane_no_loop}"
# )
# add_image_segmentation(nwbfile=nwbfile_out, metadata=metadata)
# image_segmentation = ophys.data_interfaces.get(image_segmentation_name)

# Add imaging plane
add_imaging_plane(nwbfile=nwbfile_out, metadata=metadata)
@@ -982,6 +1018,7 @@
nwbfile=nwbfile_out,
metadata=metadata,
include_roi_centroids=include_roi_centroids,
mask_type=mask_type,
iterator_options=iterator_options,
compression_options=compression_options,
)
16 changes: 11 additions & 5 deletions src/neuroconv/utils/json_schema.py
@@ -78,11 +78,17 @@ def get_schema_from_method_signature(class_method: classmethod, exclude: list =
param_types = [annotation_json_type_map[x.__name__] for x in np.array(args)[valid_args]]
else:
raise ValueError("No valid arguments were found in the json type mapping!")
if len(set(param_types)) > 1:
raise ValueError(
"Conflicting json parameter types were detected from the annotation! "
f"{param.annotation.__args__} found."
)
num_params = len(set(param_types))
conflict_message = (
"Conflicting json parameter types were detected from the annotation! "
f"{param.annotation.__args__} found."
)
# Normally cannot support Union[...] of multiple annotation types
if num_params > 2:
raise ValueError(conflict_message)
# Special condition for Optional[...]
if num_params == 2 and not args[1] is type(None): # noqa: E721
Collaborator:
Suggested change:
- if num_params == 2 and not args[1] is type(None):  # noqa: E721
+ if num_params == 2 and args[1] is not type(None):  # noqa: E721

Collaborator (h-mayorquin, Sep 15, 2022):
Can you add this to the changelog?

Member Author:
Done

raise ValueError(conflict_message)
param_type = param_types[0]
else:
arg = param.annotation
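The Optional[...] validation added in this file hinges on how the typing module flattens Optional into a Union with NoneType; a minimal stdlib-only illustration, separate from neuroconv's helper (`is_simple_optional` is a name invented here):

```python
from typing import Optional, Union, get_args


def is_simple_optional(annotation) -> bool:
    """True only for Optional[X] with a single non-None type X.

    typing flattens Optional[X] into Union[X, None], so get_args returns
    (X, NoneType); Optional[Union[X, Y]] flattens to three args and fails,
    matching the 'no Optional[Union[..., ...]]' rule in the changelog.
    """
    args = get_args(annotation)
    return len(args) == 2 and args[1] is type(None)
```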