
Unify Volumetric/MultiPlane Imaging API #245

Closed

pauladkisson opened this issue Sep 18, 2023 · 6 comments · Fixed by #248
Comments


pauladkisson commented Sep 18, 2023

Current Behavior

Right now, the only imaging extractor that supports 3D imaging is BrukerTiff. It does so by taking a list of SinglePlaneImagingExtractors and combining them a la MultiImagingExtractor. Note that each SinglePlaneImagingExtractor is itself a MultiImagingExtractor that combines multiple 'chunked' files corresponding to a single depth plane. Each SinglePlaneImagingExtractor instance supports only 1 channel and 1 plane, which are passed via stream_name during __init__. Each MultiPlaneImagingExtractor instance supports 1 channel (but all planes), which is passed via stream_name during __init__.

Issues

  • 'streams' are a low-cohesion concept to represent optical channels and depth planes.
    • They are referred to by string names, which can be different in the MultiPlaneImagingExtractor (which only needs to specify the channel) and the SinglePlaneImagingExtractor (which needs to specify some formatted combination of channel and plane).
    • They are supposed to be analogous to 'AP', 'LF', etc. streams in spikeinterface, but those streams refer to collections of channels that undergo different types of pre-processing. In contrast, each stream as implemented in BrukerTiff refers to just 1 channel and 1 plane. There may be a relevant parallel in ophys (ex. functional vs anatomical streams), but specifying a plane/channel combo is not that parallel.
    • optical channels and depth planes are fully independent concepts (1 plane + many channels, 1 channel + many planes, many channels + many planes are all possible) with rich associated metadata that can guide the API: channel name, channel wavelength, plane depth, etc. By lumping these two axes together as 'streams' we lose the ability to interact with them separately.
  • Combining multiple SinglePlaneImagingExtractors with MultiImagingExtractor is a bit hacky and confusing.
    • MultiImagingExtractor was designed to concatenate ImagingExtractors along the frames axis, but here we want to combine multiple IEs along a separate depth axis.
    • Using MultiImagingExtractor for this purpose leads to wonky stuff like start_frames = [0, 0, ..., 0].
  • BrukerTiff specifies channel during the __init__ but all other IE's specify channel during the call to get_video.
    • Not sure which is better, but we should be consistent.

Proposed Solutions

  • Specify channel and plane separately by index (or ID?) and then the extractor handles the logic of how to load that channel/plane combo in their file structure (Bruker, ScanImage, etc.)
  • Add a MultiPlaneImagingExtractor base class to handle the combination of IE's along the depth axis. Could come with extra metadata properties like get_plane_depth.
  • We need to specify plane at __init__ time so that SinglePlaneImagingExtractors can be appropriately combined into MultiPlaneImagingExtractors. So, to keep it consistent we should specify channel also at __init__ time. This will require refactoring of all the existing IE's but I think it is the most convenient implementation agnostic to file structure. It also simplifies get_video to only take start_frame and end_frame, which I think is a good thing.
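A minimal sketch of what this proposal could look like. All class names, signatures, and the depth-stacking behavior here are hypothetical illustrations of the idea (channel and plane fixed at __init__, get_video reduced to frame bounds, a base class combining planes along depth), not the final design:

```python
import numpy as np


class SinglePlaneImagingExtractor:
    """Hypothetical extractor fixing channel and plane at __init__ time."""

    def __init__(self, data: np.ndarray, channel_name: str, plane_name: str, plane_depth_um: float):
        self._data = data  # shape (num_frames, rows, columns)
        self.channel_name = channel_name
        self.plane_name = plane_name
        self.plane_depth_um = plane_depth_um

    def get_video(self, start_frame=None, end_frame=None):
        # Simplified signature: no channel argument needed anymore.
        return self._data[start_frame:end_frame]


class MultiPlaneImagingExtractor:
    """Hypothetical base class combining single-plane extractors along depth."""

    def __init__(self, imaging_extractors):
        channels = {ie.channel_name for ie in imaging_extractors}
        assert len(channels) == 1, "All planes must share one optical channel"
        # Sort by depth so the volume axis is ordered.
        self._extractors = sorted(imaging_extractors, key=lambda ie: ie.plane_depth_um)

    def get_plane_depths(self):
        # Example of an extra metadata property like get_plane_depth.
        return [ie.plane_depth_um for ie in self._extractors]

    def get_video(self, start_frame=None, end_frame=None):
        # Stack planes along a new last axis: (frames, rows, columns, planes).
        return np.stack([ie.get_video(start_frame, end_frame) for ie in self._extractors], axis=-1)


rng = np.random.default_rng(0)
planes = [
    SinglePlaneImagingExtractor(rng.random((10, 4, 4)), "Ch1", f"plane{i}", 50.0 * i)
    for i in range(3)
]
volume = MultiPlaneImagingExtractor(planes)
volume.get_video(0, 5).shape  # (5, 4, 4, 3)
```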

Do you have any interest in helping implement the feature?

Yes.


@pauladkisson (Member Author)

@CodyCBakerPhD @h-mayorquin @weiglszonja @alessandratrapani Please let me know what you all think!

@CodyCBakerPhD (Member)

@pauladkisson Thanks for writing this all up, I'll look deeper and leave more comments over the next couple of days

BrukerTiff specifies channel during the init but all other IE's specify channel during the call to get_video.
Not sure which is better, but we should be consistent.

Strongly against having it in the get_video call because it leads to the 4D ambiguity; much clearer to have separate objects for each channel if you want to do anything with them side-by-side, which makes it a field specified at __init__

So totally on board with that solution

This will require refactoring of all the existing IE's

Yes and no - their code will have to be changed, but it was also code that was never tested or used, perhaps outside of the .sbx case, which we've also not seen used in quite a while

It also simplifies get_video to only take start_frame and end_frame, which I think is a good thing.

👍

'streams' are a low-cohesion concept to represent optical channels and depth planes.

Yeah, been talking with @h-mayorquin about this lately. It's purposely ambiguous to also allow any other type of future formatting distinction we might need to add while remaining universal across the API

It's also flatter and simpler; what it really does is shift the complexity of structure into the string itself, but as long as we have a convenient fetcher get_stream_names, that is less complicated than breaking the approach into separate calls per plane/channel option.

Multiple optic channels found? It shows up as a series of structured strings. Multiple planes found? They show up as another part of the string structure. Want to initialize an extractor? Just specify a single string that matches one of the selections.
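The flat-string approach described above can be sketched in a few lines. The stream names and the `get_stream_names` filter below are illustrative assumptions, not the real BrukerTiff naming scheme:

```python
# Hypothetical structured stream names encoding channel + plane in one flat
# string, in the spirit of the current single-argument approach.
stream_names = ["Ch1_plane0", "Ch1_plane1", "Ch2_plane0", "Ch2_plane1"]


def get_stream_names(channel=None):
    """Convenient fetcher: filter the flat string space by channel prefix."""
    if channel is None:
        return stream_names
    return [name for name in stream_names if name.startswith(channel + "_")]


# Multiple channels and planes all show up as parts of the string structure;
# initializing an extractor would just mean picking one matching string.
get_stream_names("Ch2")  # ['Ch2_plane0', 'Ch2_plane1']
```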

Specify channel and plane separately by index (or ID?)

Strongly against the ID/index route. Neo supports both at the same time, and the ID is the annoying one, the bane of many a stream index issue. Strongly recommend using a human-readable identifier. Less strongly against splitting stream_name into channel_name + plane_name if others think that is worth it compared to my 'single argument is simpler' point above


h-mayorquin commented Sep 19, 2023

I agree with @pauladkisson that we should break up streams. As a note, streams are not about different types of preprocessing; they are basically stuff that can be processed together in the context of spikeinterface and raw.io in neo (same sampling_rate, start, and duration). The concept is not documented, but you can find some details here.

How do we break up depth, channel, and whatever else might come?

  • I think that depth (moving across planes) is a natural category. Space is there and is not moving anywhere. This should be a thing of its own and be addressed as an index/number. Moreover, this is a dimension that the PhotonSeries in pynwb accepts and other related modalities take for granted (e.g. brain imaging). I think it is a solid, well-defined concept, so it gets a specification.
  • Channel is probably better addressed with a string. Sometimes it is going to be a number, but I also think that the concept of a channel is less defined in ophys. In the current conversion that @alessandratrapani and I are working on, this is used to describe two different filters (and therefore two molecules), but this does feel less stable and way more ad hoc than space. Therefore, I think it should get a very (the most?) general type.
  • Whatever else might come? I feel that the instinct of @CodyCBakerPhD is right in trying to future-proof here. We don't know how ophys will change, and we might not want to commit ourselves to some specific categories. In that light, creating a common API, which is now called streams to reflect this diversity (channel, modality, preprocessing, or whatever else), is a good way of building a resilient structure that we can move forward as technology changes. With the exception of depth, for the reasons mentioned above, I would be fine with having optical channel dumped in streams along with anything else that might come in the future, specified as a keyword argument with type string. This could be stream as we currently have it, or could be modality, mode, or any other word that we find and agree is better.
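The split proposed in these bullets (depth as a well-specified integer index, channel as a loosely typed string) could look roughly like this. The class and parameter names are hypothetical illustrations:

```python
# Sketch of the proposed signature split: depth addressed by an integer
# index, channel (and anything future) by a free-form string keyword.
class HypotheticalImagingExtractor:
    def __init__(self, file_path: str, channel: str = "Ch1", plane_index: int = 0):
        self.file_path = file_path
        self.channel = channel          # loosely defined, string-typed concept
        self.plane_index = plane_index  # well-defined spatial index


ie = HypotheticalImagingExtractor("data.tif", channel="GCaMP", plane_index=2)
```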

As for this:

Combining multiple SinglePlaneImagingExtractors with MultiImagingExtractor is a bit hacky and confusing.
MultiImagingExtractor was designed to concatenate ImagingExtractors along the frames axis, but here we want to combine multiple IEs along a separate depth axis.

This is not the original intention:
#11
But I think it was changed here:
#127

I agree that we are conflating a tool to concatenate data extractors in time with some internal machinery to assemble extractors from channels, planes, or sub-extractors in time. We came to need the latter because some formats provide multiple files, and we judged (correctly?) that composing in this way is easier than ad-hoc parsing. I think it makes sense to separate assembling in time, channels, and depth.

The analogous concepts in spikeinterface are concatenation and channel aggregation.

Using MultiImagingExtractor for this purpose leads to wonky stuff like start_frames = [0, 0, ..., 0].

Can you point out where this shows up, and also add other wonky stuff / difficulties? I feel this point should be more descriptive of the existing problems.

I will add something about the third point later.

@weiglszonja (Contributor)

@h-mayorquin

Using MultiImagingExtractor for this purpose leads to wonky stuff like start_frames = [0, 0, ..., 0].

Can you point out where this shows up, and also add other wonky stuff / difficulties? I feel this point should be more descriptive of the existing problems.

This is from the BrukerTiffMultiPlaneImagingExtractor, which uses MultiImagingExtractor to concatenate multiple BrukerTiffSinglePlaneImagingExtractors over depth:

self._start_frames = [0] * len(stream_names)
self._end_frames = [self._num_frames] * len(stream_names)

@h-mayorquin (Collaborator)

Thanks for the pointer @weiglszonja. I would still like to see more of the confusions that arise from this use. This is the only case that we have for Paul's second point.

@pauladkisson (Member Author)

Notes from Barcelona 2023:

  • Consensus that optical channels should be separate Extractors --> need to refactor old stuff to fit
  • 2 Main goals for the repo:
    • Make conversion easier
    • Define unified API for users like SpikeExtractors in e-phys
  • Guiding Principle: Focus on how it looks for the user (usually conversions) and then design the API around that
  • specify channel_name and plane_name via string at init
  • look out for segments in the future if/when we get data
  • numpydoc_validation needs some kind of automatic docstring fixer before it's worth adding
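The consensus from these notes could be sketched as follows. The class name and the format-specific details are hypothetical; the point shown is the agreed shape of the API (one extractor per optical channel, channel_name and plane_name as strings fixed at init, and get_video taking only frame bounds):

```python
# Hypothetical illustration of the Barcelona consensus.
class ConsensusImagingExtractor:
    def __init__(self, folder_path: str, channel_name: str, plane_name: str):
        self.folder_path = folder_path
        self.channel_name = channel_name  # one extractor per optical channel
        self.plane_name = plane_name      # specified via string at init

    def get_video(self, start_frame=None, end_frame=None):
        # Only frame bounds remain; channel/plane selection happened at init.
        raise NotImplementedError("format-specific loading goes here")


ie = ConsensusImagingExtractor("session_folder", channel_name="Ch2", plane_name="plane0")
```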
