-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add mask type control #119
Conversation
for roi_id, pixel_mask in zip(roi_ids, pixel_masks): | ||
plane_segmentation.add_roi(**{"id": roi_id, mask_type_kwarg: [tuple(x) for x in pixel_mask]}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See other comment for how the column-wise was not working, but this is the only way I was able to successfully do this.
Note that the returned structure from ROIExtractors (the pixel_masks
object) is a np.ndarray
of shape (num_rois, num_dimensions)
- attempting to add this without the list-compression tuple wrapping results in shape errors. The list-compression tuple wrapping was taken from the working example of the PyNWB documentation: https://pynwb.readthedocs.io/en/stable/tutorials/domain/ophys.html#pixel-masks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concerning the format, yeah, it looks strange but I remember having problems adding ragged data as column as well when working in the units table. That's why spikes over there are always added as rows.
You are already adding id
above in line 663, as we don't have test for this part yet I don't know if this will cause troubles or does it just work as a reference here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that bit is important actually - if you know that the type of mask being written is one of the two types that currently has to be added by-row (since the column approach of #157 has yet to work), then you had to perform that row writing after the plane segmentation object is in memory but before adding other column data.
But also, when adding the data by row you have to set the ID there instead of as a separate column.
But if you're not writing any data row-wise, then you can pass the column of IDs directly into the initialization of the table as we have always done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, same problem as with the other dynamic tables. I remember facing issues with this in the electrodes and units table that made the writing trickier. The problem as you mentioned is that id can't not be passed as a column itself. It needs to be passed at initialization or built row wise.
Strangely, I just tested this in a planet segmentation object intilized without ints and is asking for mask_type as a required argument of add_roi:
for roi in roi_ids:
plane_segmentation.add_roi(id=roi)
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "/home/heberto/miniconda3/envs/neuroconv_environment/lib/python3.9/site-packages/hdmf/utils.py", line 645, in func_call
return func(args[0], **pargs)
File "/home/heberto/miniconda3/envs/neuroconv_environment/lib/python3.9/site-packages/pynwb/ophys.py", line 255, in add_roi
raise ValueError("Must provide 'image_mask' and/or 'pixel_mask'")
ValueError: Must provide 'image_mask' and/or 'pixel_mask'
Which I find it strange that you can omit if you build everything by columns but you can't not build a table by rows without a type of mask.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might want to raise an issue in PyNWB for that lol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, do you have a segmentation_extractor that returns volumetric pixel_masks? Can we add some tests for this? Also, the CHANGELOG.
for roi_id, pixel_mask in zip(roi_ids, pixel_masks): | ||
plane_segmentation.add_roi(**{"id": roi_id, mask_type_kwarg: [tuple(x) for x in pixel_mask]}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concerning the format, yeah, it looks strange but I remember having problems adding ragged data as column as well when working in the units table. That's why spikes over there are always added as rows.
You are already adding id
above in line 663, as we don't have test for this part yet I don't know if this will cause troubles or does it just work as a reference here?
This PR isn't quite ready to go, there's a lot of artifacts not working that I wanted your take on like the column-wise approach |
Yep, that's what I imagined -it is a draft after all- Sorry for not being able to help you there about that specific issue other than to share your perplexity. |
I'm simplifying this PR just to get the basic functionality in. Splintering column-wise stuff into #157. Last thing this should need is the debug to the validation to allow |
Yep. In NWB they're actually called
Done
Done |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me.Three requests:
- Adding the other modifications done in this PR to the changelog. mask_type=None and the schema modifications.
- Adding a test for the case when mask_type = None
- Adding a test for the case of mask_type =="pixel"
plane_segmentation = PlaneSegmentation(**plane_segmentation_kwargs) | ||
|
||
if mask_type is not None and mask_type == "image": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think of the following logic. You simplify the conditions and conditional structure at the price of initializing plane segmentation in two different ways:
if mask_type == "image" or mask_type is None: # Initialize with ids already to add columns later
plane_segmentation_kwargs.update(id=roi_ids)
plane_segmentation = PlaneSegmentation(**plane_segmentation_kwargs)
if mask_type == "image":
plane_segmentation.add_column(
name="image_mask",
description="Image masks for each ROI.",
data=H5DataIO(segmentation_extractor.get_roi_image_masks().T, **compression_options),
)
else:
plane_segmentation = PlaneSegmentation(**plane_segmentation_kwargs)
pixel_masks = segmentation_extractor.get_roi_pixel_masks()
num_pixel_dims = pixel_masks[0].shape[1]
# Some more code
mask_type_kwarg = f"{mask_type}_mask"
for roi_id, pixel_mask in zip(roi_ids, pixel_masks):
plane_segmentation.add_roi(**{"id": roi_id, mask_type_kwarg: [tuple(x) for x in pixel_mask]})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logic simplified to avoid additional cyclomatic complexity and mimic the switch statement that we'll eventually be able to use for this type of logic
for roi_id, pixel_mask in zip(roi_ids, pixel_masks): | ||
plane_segmentation.add_roi(**{"id": roi_id, mask_type_kwarg: [tuple(x) for x in pixel_mask]}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, same problem as with the other dynamic tables. I remember facing issues with this in the electrodes and units table that made the writing trickier. The problem as you mentioned is that id can't not be passed as a column itself. It needs to be passed at initialization or built row wise.
Strangely, I just tested this in a planet segmentation object intilized without ints and is asking for mask_type as a required argument of add_roi:
for roi in roi_ids:
plane_segmentation.add_roi(id=roi)
Traceback (most recent call last):
File "<string>", line 2, in <module>
File "/home/heberto/miniconda3/envs/neuroconv_environment/lib/python3.9/site-packages/hdmf/utils.py", line 645, in func_call
return func(args[0], **pargs)
File "/home/heberto/miniconda3/envs/neuroconv_environment/lib/python3.9/site-packages/pynwb/ophys.py", line 255, in add_roi
raise ValueError("Must provide 'image_mask' and/or 'pixel_mask'")
ValueError: Must provide 'image_mask' and/or 'pixel_mask'
Which I find it strange that you can omit if you build everything by columns but you can't not build a table by rows without a type of mask.
if num_params > 2: | ||
raise ValueError(conflict_message) | ||
# Special condition for Optional[...] | ||
if num_params == 2 and not args[1] is type(None): # noqa: E721 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if num_params == 2 and not args[1] is type(None): # noqa: E721 | |
if num_params == 2 and args[1] is not type(None): # noqa: E721 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add this to the changelog?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -958,7 +1007,7 @@ def write_segmentation( | |||
nwbfile_path=nwbfile_path, nwbfile=nwbfile, metadata=metadata_base_common, overwrite=overwrite, verbose=verbose | |||
) as nwbfile_out: | |||
|
|||
ophys = get_module(nwbfile=nwbfile_out, name="ophys", description="contains optical physiology processed data") | |||
_ = get_module(nwbfile=nwbfile_out, name="ophys", description="contains optical physiology processed data") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this and the lines below 1091-1023?
Should we remove them in another PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just linter cleanup, they're not doing anything. Technically the call to get_module
might be initializing the processing module, but the ophys
return is unused and the stuff below is unused
@CodyCBakerPhD I don't see why create an option to control something (pixel vs voxel) that they in fact do not control as the segmentation extractor controls this. The choice should be between "store_dense_mask" or "store_sparse_mask" and the code should take care of choosing between pixel or voxel automatically. Maybe there is something I am missing out? |
@CodyCBakerPhD Also, it could be that plane segmentation is different and a second pass does not make sense here. Plane segmentation refers to an imaging plane and I don't see why further rois segmented by another extractor could not be added to the same imaging plane. We can always add new plane segmentation and then link them to the same plane which could make things simpler for us but if that the case, maybe we could do the same for the units table and simplify the code there. What do you think? |
Hence the warnings that decide that based on the shape and issue a simple warning if the person didn't specify the right one.
Sounds like a use case to explore in the future
Hmm that is interesting... You cannot use a |
Done
Done
Done |
Yeah, but why give them an non-working option in the first place though? You are kind of semi patching a mistake of our making with the warning. How are you thinking about it? What is the purpose of having that option? The image_mask spatially dense case can be either 2 or 3 dimensional and we don't bother the user with specifying the dimension there as that is determined with the extractor. Why have a different logic for the sparse case? I really think that we should simplify the user life here, if I am missing something and this is truly the best way, then we should test those warnings. |
Because those are how they are written and accessed in the NWB file itself. Going with your suggested option wouldn't be clear to them how the final mask field would show up as a field of the neurodata type. Also not entirely true to call one dense and other sparse since the storage on disk in HDF5 will be sparse either way, it's just the structure returned in memory (which you can convert one to the other in-memory with a single line anyway). I'd call it an 'educational' warning to help teach people in that situation the difference between the two masks. It's subtle enough that we can easily correct it in the event they knew they wanted a non-image mask but weren't sure about the nomenclature for their data. TBH though, this will probably be a very rare use case since volumetric data is not the norm. Also most people might not even think about this option at all unless it affects performance, which is only in the cases like I'm seeing for whole-brain imaging. Added tests for the warnings and this functionality. |
Codecov Report
@@ Coverage Diff @@
## main #119 +/- ##
==========================================
+ Coverage 91.52% 91.53% +0.01%
==========================================
Files 65 65
Lines 3374 3392 +18
==========================================
+ Hits 3088 3105 +17
- Misses 286 287 +1
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Continuation of #117 - adds
mask_type
specification toadd_plane_segmentation
.This one in particular has some difficulties to discuss as shown below