Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug in BrukerTiffSinglePlaneImagingExtractor #343

Merged
merged 15 commits into from
Jun 12, 2024
Merged

Conversation

h-mayorquin
Copy link
Collaborator

The current code assumes that the channel name (the stream) should be in the filename. This is not the case for some data that I have. Here is a frame element in the correspoding xml

    <Frame relativeTime="0.00949076" absoluteTime="41.2114907600011" index="3" parameterSet="CurrentSettings">
      <File channel="1" channelName="Red" filename="TSeries-20240527-001_Cycle00001_Ch1_000003.ome.tif" />
      <File channel="2" channelName="Green" filename="TSeries-20240527-001_Cycle00001_Ch2_000003.ome.tif" />

As you can see the current asumption is not valid and therefore the code is a bug. The fix proposed in this PR yses the channelName in the xml element to filter the file paths that correspond to the channel.

@h-mayorquin h-mayorquin self-assigned this Jun 10, 2024
@h-mayorquin
Copy link
Collaborator Author

Ah, OK, this is a case for a problem when using a single argument to specify plane and channel.

Here:

imaging_extractors = []
for stream_name in plane_stream_names:
extractor = BrukerTiffSinglePlaneImagingExtractor(folder_path=folder_path, stream_name=stream_name)
imaging_extractors.append(extractor)

We pass plane_stream (a string identifying the plane) as a stream to BrukerTiffSinglePlaneImagingExtractor. But in some other places we pass a channel. The current implementation is relying on the fact that both are included in the filename and that this code will work. However, as I argued above the channelName is not always in the filename.

I think at least at this level we should seprate them, it would make the implementation easier.

@weiglszonja
Copy link
Contributor

weiglszonja commented Jun 11, 2024

Thank you @h-mayorquin for looking into this issue. I still think that (following this issue):

We refactor the code so the classes are clearly separated as you guys propose.

would be the way to go, it would follow how @pauladkisson refactored the extractor for ScanImage, I believe we don't have a routing class there either. We didn't know it was a bug because the examples that we've seen for Bruker always followed the convention we built on, I think it would be reasonable to have designated classes instead of a router.

@h-mayorquin
Copy link
Collaborator Author

Ok, @weiglszonja I found a way of solving this without requiring more changes. I think that the changes we have discussed are more robust and would make the code clear but this should be a intermediate fix. Note that this requires #344 before merging.

Comment on lines +395 to +401
else: # This is the case for when stream_name is a plane_stream
file_names = [file.attrib["filename"] for file in file_elements]
file_names_for_stream = [file for file in file_names if self.stream_name in file]
if file_names_for_stream == []:
raise ValueError(
f"The selected stream '{self.stream_name}' is not in the available channel_streams '{streams['channel_streams']}'!"
)
Copy link
Contributor

@weiglszonja weiglszonja Jun 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, @weiglszonja I found a way of solving this without requiring more changes. I think that the changes we have discussed are more robust and would make the code clear but this should be a intermediate fix. Note that this requires #344 before merging.

Ah I see, this can work. thanks for fixing these issues @h-mayorquin ! I think this is already a great improvement, also with lxml.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is the string ducktyping (changing the workflow depending on the value/shape/content of the string) that I did not want to do but at least it solves the problem. I think it would be better if we had another argument plane_stream or just pass file_paths somewhere as discussed. But I think that discussion is more dififcult and this already solves the bug so...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with you, we can revisit this once we have the bandwidth to do the file_paths approach.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to merge #346 into this once finished and let me know when this is ready for final review.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is ready for final review.

#346 is an optimization on top of this but this PR is the bug fix.

I would prefer to merge this first and then add the optimization as another PR if you are OK with that.

@h-mayorquin h-mayorquin merged commit fee7dd3 into main Jun 12, 2024
29 checks passed
@h-mayorquin h-mayorquin deleted the file_names branch June 12, 2024 17:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants