Inconsistency in CnmfeSegmentationExtractor sampling frequency data. #153

Open
h-mayorquin opened this issue Jun 20, 2022 · 5 comments

@h-mayorquin
Collaborator

While working on #138, I noticed that the gin data has a field that should contain the sampling frequency, but the extractor does not use it to set its own sampling frequency. Moreover, for the data currently on gin, the value in that field differs from the one assigned to the extractor:

from pathlib import Path
from roiextractors import CnmfeSegmentationExtractor

OPHYS_DATA_PATH = Path("/home/heberto/ophys_testing_data/")

file_path = str(OPHYS_DATA_PATH / "segmentation_datasets" / "cnmfe" / "2014_04_01_p203_m19_check01_cnmfeAnalysis.mat")

imaging_extractor = CnmfeSegmentationExtractor(file_path=file_path)
sampling_frequency_file = imaging_extractor._dataset_file[imaging_extractor._group0[0]]["inputOptions"]["Fs"][...][0][0]

print(f"extractor_sampling_frequency = {imaging_extractor.get_sampling_frequency()}")
print(f"sampling_frequency_from_data = {sampling_frequency_file}")

Output:
extractor_sampling_frequency = 1.8554811303400338
sampling_frequency_from_data = 10.0

These values should be the same; this library even writes the sampling frequency to this Fs field:

if segmentation_object.get_sampling_frequency() is not None:
    inputoptions.create_dataset("Fs", data=segmentation_object.get_sampling_frequency())

For some reason, the sampling frequency in this extractor is instead determined by dividing the number of frames by the total experiment time read from the file:

self._sampling_frequency = self.get_num_frames() / self._tot_exptime_extractor_read()

This makes sense as well, so it seems to me that there is some inconsistency in the data that we have on gin.
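
A minimal sketch of the comparison described above, in plain Python and using the two values printed earlier. The helper name `frequencies_agree` and the relative tolerance are assumptions for illustration; neither is part of roiextractors.

```python
import math


def frequencies_agree(derived_hz: float, stored_hz: float, rel_tol: float = 1e-6) -> bool:
    """Return True when the derived and stored sampling frequencies match within rel_tol."""
    return math.isclose(derived_hz, stored_hz, rel_tol=rel_tol)


# Values reported above for the gin test file.
extractor_sampling_frequency = 1.8554811303400338  # num_frames / total experiment time
sampling_frequency_from_data = 10.0  # value stored in inputOptions/Fs

# Prints False: the two sources disagree for this file.
print(frequencies_agree(extractor_sampling_frequency, sampling_frequency_from_data))
```

A check along these lines could flag files where the stored field and the derived value disagree, without deciding which of the two is correct.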

@CodyCBakerPhD
Member

Yeah, I'm starting to notice this kind of thing as well as I look at the segmentation data... Will post more as I dig into it

@CodyCBakerPhD
Member

Of course, ultimately what we really need is detailed tests for each interface tested on actual data like in https://github.com/catalystneuro/roiextractors/blob/master/tests/test_scan_image_tiff.py

@CodyCBakerPhD
Member

> This makes sense as well, so it seems to me that there is some inconsistency in the data that we have on gin.

Just a guess, but I do suppose this would be the natural result if the data array itself was stubbed, but not the 'total runtime' metadata.
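
To illustrate the guess above, here is a small worked sketch with hypothetical numbers (1000 frames over 100 s, stubbed to 200 frames). The function name and all values are assumptions chosen for the arithmetic, not taken from the gin file.

```python
def derived_sampling_frequency(num_frames: int, total_time_s: float) -> float:
    """Sampling frequency in Hz, computed as frames divided by total recording time."""
    if total_time_s <= 0:
        raise ValueError("total_time_s must be positive")
    return num_frames / total_time_s


# Hypothetical full recording: 1000 frames over 100 s, i.e. 10 Hz.
full_frames, total_time_s = 1000, 100.0
full_rate = derived_sampling_frequency(full_frames, total_time_s)

# The stub keeps only the first 200 frames but leaves the total-time metadata untouched.
stub_frames = 200
stubbed_rate = derived_sampling_frequency(stub_frames, total_time_s)

print(full_rate, stubbed_rate)  # 10.0 2.0
```

If this is what happened, the derived frequency shrinks by the fraction of frames kept. For the values reported in this issue, that fraction would be 1.8554811303400338 / 10.0, or roughly 0.186, though nothing in the thread confirms it.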

@h-mayorquin
Collaborator Author

> > This makes sense as well, so it seems to me that there is some inconsistency in the data that we have on gin.

> Just a guess, but I do suppose this would be the natural result if the data array itself was stubbed, but not the 'total runtime' metadata.

That's a good guess. I was thinking the units of the time might be wrong, but that makes more sense. That said, I think the extractor should get the sampling frequency from the designated place (to the degree that there is one):

# self._sampling_frequency = self._dataset_file[self._group0[0]]['inputOptions']["Fs"][...][0][0]
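
One way to sketch that preference order, with a fallback for files that lack the field: a plain mapping stands in for the `inputOptions` group. The function name `resolve_sampling_frequency` and its parameters are hypothetical and do not reflect the extractor's actual code.

```python
from typing import Mapping, Optional


def resolve_sampling_frequency(
    input_options: Mapping[str, float],
    num_frames: int,
    total_time_s: Optional[float],
) -> Optional[float]:
    """Prefer the stored "Fs" value; otherwise derive it from frames and total time."""
    stored = input_options.get("Fs")
    if stored is not None:
        return float(stored)
    if total_time_s:  # skips None and 0 to avoid division by zero
        return num_frames / total_time_s
    return None


# "Fs" present: the stored value wins over the derived one.
print(resolve_sampling_frequency({"Fs": 10.0}, num_frames=200, total_time_s=100.0))  # 10.0
# "Fs" absent: fall back to num_frames / total_time_s.
print(resolve_sampling_frequency({}, num_frames=200, total_time_s=100.0))  # 2.0
```

Keeping the derived value as a fallback means files written without an Fs field would still load.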

@CodyCBakerPhD
Member

> That said, I think the extractor should get the sampling frequency from the designated place (to the degree that there is one):

> self._sampling_frequency = self._dataset_file[self._group0[0]]['inputOptions']["Fs"][...][0][0]

Oh, absolutely. Many formats have that capability (either 100% baked in, or as an 'optional' field of their headers), but their extractors don't use it right now.
