[Bug]: 2-photon series movie not appearing in file. #1826

Closed · 3 tasks done
rcpeene opened this issue Jan 17, 2024 · 15 comments
Labels
category: bug (errors in the code or code behavior) · priority: medium (non-critical problem and/or affecting only a small set of NWB users)

Comments


rcpeene commented Jan 17, 2024

What happened?

We have packaged our NWB files and added 2-photon movies to them. This is evidenced by the fact that each file grows significantly in size, and that the movie can be seen when exploring the file in .h5 format. However, when the file is opened with PyNWB, the object does not appear in nwb.acquisition.

Steps to Reproduce

- Package a file using the following code snippet to insert a 2-photon series:

```python
ts = TwoPhotonSeries(
    name='raw_suite2p_motion_corrected',
    imaging_plane=plane,
    data=wrapped_data,
    format='raw',
    unit='SIunit',
    rate=10.71
)
input_nwb.add_acquisition(ts)
io.write(input_nwb)
```

- Open the file with PyNWB.
- Run `print(nwb.acquisition.keys())`; the series is missing (see the sketch below).
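
For concreteness, a minimal sketch of the open-and-check step (the file path is a placeholder):

```python
from pynwb import NWBHDF5IO

# Placeholder path to the repackaged NWB file
filepath = "packaged_file.nwb"

with NWBHDF5IO(filepath, mode="r", load_namespaces=True) as io:
    nwb = io.read()
    # 'raw_suite2p_motion_corrected' is expected here but does not appear
    print(nwb.acquisition.keys())
```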

Traceback

No traceback

Operating System

Windows

Python Executable

Python

Python Version

3.9

Package Versions

accessible-pygments==0.0.4
aiohttp==3.7.4
aiosignal==1.3.1
alabaster==0.7.12
anyio==3.6.2
appdirs==1.4.4
argon2-cffi==21.3.0
argon2-cffi-bindings==21.2.0
argschema==2.0.2
arrow==1.2.3
asciitree==0.3.3
asttokens==2.2.0
async-timeout==3.0.1
attrs==21.4.0
Babel==2.10.3
backcall==0.2.0
bcrypt==4.0.1
beautifulsoup4==4.11.1
bg-atlasapi==1.0.2
bg-space==0.6.0
bidsschematools==0.7.1
bleach==5.0.1
boto3==1.28.10
botocore==1.31.10
bqplot==0.12.36
brainrender==2.0.5.5
bs4==0.0.1
cachetools==4.2.4
ccfwidget==0.5.3
cebra==0.2.0
cellpose==2.2.2
certifi==2022.9.24
cffi==1.15.1
chardet==3.0.4
charset-normalizer==2.1.1
ci-info==0.3.0
click==8.1.3
click-didyoumean==0.3.0
cloudpickle==2.2.0
colorama==0.4.6
colorcet==3.0.1
commonmark==0.9.1
contourpy==1.0.6
coverage==7.2.1
cryptography==41.0.3
cycler==0.11.0
dandi==0.55.1
dandischema==0.8.4
dask==2022.11.1
-e git+https://github.com/AllenInstitute/openscope_databook.git@1739f7510547142849b480093ad6f2789b8045c2#egg=databook_utils
debugpy==1.6.4
decorator==5.1.1
defusedxml==0.7.1
Deprecated==1.2.13
distro==1.8.0
dnspython==2.2.1
docutils==0.17.1
elephant==0.12.0
email-validator==1.3.0
entrypoints==0.4
etelemetry==0.3.0
exceptiongroup==1.1.0
execnet==1.9.0
executing==1.2.0
fasteners==0.18
fastjsonschema==2.16.2
fastremap==1.13.5
filelock==3.12.2
fonttools==4.38.0
fqdn==1.5.1
frozenlist==1.3.3
fscacher==0.3.0
fsspec==2022.11.0
future==0.18.2
gast==0.4.0
gitdb==4.0.9
GitPython==3.1.27
Glymur==0.8.19
google==3.0.0
greenlet==1.1.3
h5py==3.7.0
hdmf==3.9.0
humanize==4.4.0
idna==3.4
imagecodecs==2022.9.26
imageio==2.22.4
imagesize==1.4.1
importlib-metadata==4.13.0
importlib-resources==5.10.0
iniconfig==2.0.0
interleave==0.2.1
ipycanvas==0.13.1
ipydatagrid==1.1.14
ipydatawidgets==4.3.2
ipyevents==2.0.1
ipykernel==6.17.1
ipympl==0.9.2
ipysheet==0.5.0
ipython==8.7.0
ipython-genutils==0.2.0
ipytree==0.2.2
ipyvolume==0.6.0a10
ipyvtklink==0.2.3
ipyvue==1.8.0
ipyvuetify==1.8.4
ipywebrtc==0.6.0
ipywidgets==7.7.2
isodate==0.6.1
isoduration==20.11.0
itk-core==5.3.0
itk-filtering==5.3.0
itk-meshtopolydata==0.10.0
itk-numerics==5.3.0
itkwidgets==0.32.4
jaraco.classes==3.2.3
jedi==0.18.2
Jinja2==3.1.2
JIT==0.0.1
jmespath==0.10.0
joblib==1.2.0
jsonpointer==2.3
jsonschema==3.2.0
jupyter==1.0.0
jupyter-book==0.15.1
jupyter-cache==0.6.1
jupyter-console==6.4.4
jupyter-server==1.23.3
jupyter-server-mathjax==0.2.6
jupyter-sphinx==0.3.2
jupyter_client==7.4.7
jupyter_core==5.1.0
jupyterlab-pygments==0.2.2
jupyterlab-widgets==1.1.1
K3D==2.7.4
keyring==23.11.0
keyrings.alt==4.2.0
kiwisolver==1.4.4
latexcodec==2.0.1
linkify-it-py==2.0.2
literate-dataclasses==0.0.6
llvmlite==0.40.1
locket==1.0.0
loguru==0.6.0
lxml==4.9.1
markdown-it-py==1.1.0
MarkupSafe==2.1.1
marshmallow==3.0.0rc6
matplotlib==3.6.2
matplotlib-inline==0.1.6
matplotlib-venn==0.11.9
mdit-py-plugins==0.3.5
meshio==5.3.4
mistune==0.8.4
more-itertools==9.0.0
morphapi==0.1.7
MorphIO==3.3.3
mpl-interactions==0.22.0
mpmath==1.3.0
msgpack==1.0.4
multidict==6.0.2
munkres==1.1.4
myst-nb==0.17.2
myst-parser==0.18.1
myterial==1.2.1
natsort==8.2.0
nbclassic==0.4.8
nbclient==0.5.13
nbconvert==6.5.4
nbdime==3.1.1
nbformat==5.7.0
nbmake==1.3.5
nd2==0.7.1
ndx-events==0.2.0
ndx-grayscalevolume==0.0.2
ndx-icephys-meta==0.1.0
ndx-spectrum==0.2.2
neo==0.12.0
nest-asyncio==1.5.6
networkx==2.8.8
neurom==3.2.2
notebook==6.5.2
notebook_shim==0.2.2
numba==0.57.1
numcodecs==0.10.2
numexpr==2.8.3
numpy==1.23.5
nwbinspector==0.4.29
nwbwidgets==0.10.0
opencv-python==4.6.0.66
opencv-python-headless==4.8.0.74
ophys-nway-matching @ git+https://github.com/AllenInstitute/ophys_nway_matching@545504ab55922717ab623f8ede2c521a60aa1458
packaging==21.3
pandas==1.5.2
pandocfilters==1.5.0
param==1.12.2
paramiko==3.3.1
parso==0.8.3
partd==1.3.0
patsy==0.5.3
pickleshare==0.7.5
Pillow==9.3.0
-e git+https://github.com/AllenNeuralDynamics/physiology_codeocean_pipelines_paper.git@3bfed5c03bbc0494227ead9fbfab332874926510#egg=pipelinedatabook_utils
pkgutil_resolve_name==1.3.10
platformdirs==2.5.4
plotly==5.11.0
pluggy==1.0.0
prometheus-client==0.15.0
prompt-toolkit==3.0.33
psutil==5.9.4
psycopg2-binary==2.9.5
pure-eval==0.2.2
py==1.11.0
py2vega==0.6.1
pybtex==0.24.0
pybtex-docutils==1.0.2
pycparser==2.21
pycryptodomex==3.16.0
pyct==0.4.8
pydantic==1.10.2
pydata-sphinx-theme==0.13.3
Pygments==2.13.0
pyinspect==0.1.0
PyNaCl==1.5.0
pynrrd==0.4.3
pynwb==2.2.0
pyout==0.7.2
pyparsing==3.0.9
PyPDF2==3.0.1
PyQt5==5.15.9
pyqt5-plugins==5.15.9.2.3
PyQt5-Qt5==5.15.2
PyQt5-sip==12.12.2
pyqt5-tools==5.15.9.3.3
pyqtgraph==0.13.3
pyrsistent==0.19.2
pytest==7.2.1
pytest-cov==4.0.0
pytest-xdist==3.2.1
python-dateutil==2.8.2
python-dotenv==1.0.0
pythreejs==2.4.1
pytz==2022.6
PyWavelets==1.4.1
pywin32==306
pywin32-ctypes==0.2.0
pywinpty==2.0.10
PyYAML==6.0
pyzmq==24.0.1
qt5-applications==5.15.2.2.3
qt5-tools==5.15.2.1.3
qtconsole==5.4.0
QtPy==2.3.0
quantities==0.14.1
rastermap==0.1.3
requests==2.28.1
requests-toolbelt==0.10.1
resource-backed-dask-array==0.1.0
retry==0.9.2
rfc3339-validator==0.1.4
rfc3987==1.3.8
rich==12.6.0
roifile==2023.5.12
ruamel.yaml==0.17.21
ruamel.yaml.clib==0.2.7
s3transfer==0.6.1
sbxreader==0.2.2
scanimage-tiff-reader==1.4.1.4
scikit-build==0.16.4
scikit-image==0.19.3
scikit-learn==1.1.2
scipy==1.9.3
seaborn==0.12.1
semantic-version==2.10.0
semver==2.13.0
Send2Trash==1.8.0
SimpleITK==2.2.1
simplejson==3.18.0
six==1.16.0
smmap==5.0.0
sniffio==1.3.0
snowballstemmer==2.2.0
soupsieve==2.3.2.post1
Sphinx==4.5.0
sphinx-argparse==0.4.0
sphinx-book-theme==1.0.1
sphinx-comments==0.0.3
sphinx-copybutton==0.5.0
sphinx-jupyterbook-latex==0.5.2
sphinx-multitoc-numbering==0.1.3
sphinx-thebe==0.2.1
sphinx-togglebutton==0.3.2
sphinx_design==0.3.0
sphinx_external_toc==0.3.1
sphinxcontrib-applehelp==1.0.2
sphinxcontrib-bibtex==2.5.0
sphinxcontrib-devhelp==1.0.2
sphinxcontrib-htmlhelp==2.0.0
sphinxcontrib-jsmath==1.0.1
sphinxcontrib-qthelp==1.0.3
sphinxcontrib-serializinghtml==1.1.5
SQLAlchemy==1.4.41
stack-data==0.6.2
statsmodels==0.14.0
strict-rfc3339==0.7
style==1.1.0
suite2p==0.12.1
sympy==1.12
tables==3.7.0
tabulate==0.9.0
tenacity==8.1.0
tensortools==0.4
terminado==0.17.0
threadpoolctl==3.1.0
tifffile==2022.10.10
tinycss2==1.2.1
tomli==2.0.1
toolz==0.12.0
torch==1.13.1
tornado==6.2
tqdm==4.64.1
traitlets==5.6.0
traittypes==0.2.1
treelib==1.6.1
trimesh==3.16.4
typing_extensions==4.4.0
uc-micro-py==1.0.1
update==0.0.1
uri-template==1.2.0
urllib3==1.26.13
util-colleenjg==0.0.1
vedo==2021.0.5
vtk==9.2.2
wcwidth==0.2.5
webcolors==1.12
webencodings==0.5.1
websocket-client==1.4.2
widgetsnbextension==3.6.1
win32-setctime==1.1.0
wrapt==1.14.1
wslink==1.8.4
xarray==2022.11.0
yarl==1.8.1
zarr==2.13.3
zarr-checksum==0.2.9
zipp==3.11.0
zstandard==0.19.0



rly commented Jan 17, 2024

Looking at your code, I see nothing unusual. Would you be able to share that file? You can upload it to this Google Drive folder.

rly added the labels category: bug and priority: medium on Jan 17, 2024
CodyCBakerPhD (Collaborator) commented

Missing details from the code: can you show us (a) how the data was wrapped, and (b) how the io was opened?


rcpeene commented Jan 17, 2024

Here's the whole function:

```python
# Imports inferred from usage:
import h5py
from hdmf.backends.hdf5 import H5DataIO
from pynwb import NWBHDF5IO
from pynwb.ophys import OpticalChannel, TwoPhotonSeries


def process_suit2p(raw_params):
    """Adds RAW info to an NWB.

    Parameters
    ----------
    raw_params: dict
        Contains the nwb's file path and other data
    """
    print("Processing timeseries data")
    with h5py.File(raw_params['suite_2p'], "r") as suite2p:
        data = suite2p['data']
        # Wrap the h5py.Dataset for chunked, gzip-compressed writing
        wrapped_data = H5DataIO(
            data=data,
            compression='gzip',
            compression_opts=4,
            chunks=True,
            maxshape=(None, 100)
        )
        nwb_file = raw_params['nwb_path']
        io = NWBHDF5IO(nwb_file, "r+", load_namespaces=True)
        input_nwb = io.read()
        try:
            # Reuse the imaging plane from the existing cell segmentation
            ts = TwoPhotonSeries(
                name='raw_suite2p_motion_corrected',
                imaging_plane=(
                    input_nwb.processing['ophys']['image_segmentation']
                    ['cell_specimen_table'].imaging_plane
                ),
                data=wrapped_data,
                format='raw',
                unit='SIunit',
                rate=10.71
            )
        except KeyError:
            # No segmentation present; create a placeholder imaging plane
            channel = OpticalChannel(
                name='place_holder Channel',
                description='place_holder Channel',
                emission_lambda=488.0
            )
            plane = input_nwb.create_imaging_plane(
                name='imaging_plane',
                optical_channel=channel,
                description='Failed Cell Segmentation',
                device=input_nwb.devices['MESO.2'],
                excitation_lambda=488.0,
                imaging_rate=10.71,
                indicator='GCaMP6f',
                location='Failed Cell Segmentation',
            )
            ts = TwoPhotonSeries(
                name='raw_suite2p_motion_corrected',
                imaging_plane=plane,
                data=wrapped_data,
                format='raw',
                unit='SIunit',
                rate=10.71
            )
        input_nwb.add_acquisition(ts)
        io.write(input_nwb)
```


rcpeene commented Jan 17, 2024

@rly the file is quite large. I've given you access to dandiset 000336 because that might be faster

CodyCBakerPhD (Collaborator) commented

Without seeing the resulting NWB file, it's hard to understand how it got larger without the dataset being added. My guess is that it has something to do with the fact that the data is, at that point in the code, an h5py.Dataset object from a separate file, which can affect how io.write applies the H5DataIO compression.

In general, I would suggest using the SliceableDataChunkIterator from neuroconv to wrap the suite2p['data'] object; it buffers how much is loaded into RAM and guarantees that the slices actually move data from one file to the other.
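
A rough sketch of that suggestion, assuming the import path neuroconv.tools.hdmf and the suite2p file handle from the function above:

```python
from hdmf.backends.hdf5 import H5DataIO
from neuroconv.tools.hdmf import SliceableDataChunkIterator  # assumed import path

# Wrap the source h5py.Dataset in an iterator so slices are buffered into RAM
# and explicitly copied into the destination file on write.
wrapped_data = H5DataIO(
    data=SliceableDataChunkIterator(data=suite2p['data']),
    compression='gzip',
    compression_opts=4,
)
```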


rly commented Jan 18, 2024

> @rly the file is quite large. I've given you access to dandiset 000336 because that might be faster

@rcpeene I downloaded and opened sub-621602_ses-1193555033-acq-1193675745-denoised-movies_ophys.nwb from dandiset 000336. In there, I see the following keys in nwbfile.acquisition in both pynwb and HDFView:

```python
>>> nwbfile.acquisition.keys()
dict_keys(['EyeTracking', 'denoised_suite2p_motion_corrected', 'v_in', 'v_sig'])
```

Am I looking at the right file? I was expecting to see a raw_suite2p_motion_corrected series in HDFView, as your code example describes.

I also think @CodyCBakerPhD's solution of using SliceableDataChunkIterator from neuroconv is worth trying.


rcpeene commented Jan 18, 2024

Maybe it's a problem with my environment? What versions of pynwb, h5py, and hdmf are you using?


rcpeene commented Jan 18, 2024

Context:
For my purposes, I use two methods from my own imported module, dandi_stream_open and dandi_download_open, to stream or download an NWB file from DANDI and return the io object. For reasons reported in the cited issue, nwb = io.read() and then returning the nwb object fails to work from the imported methods, so the io object is returned and then read in the outer scope.

New Info:
I discovered that when I open the NWB file directly, or when I stream the file, the 2-photon movie is available. It is only when using dandi_download_open and returning the io object from a separate file that the movie fails to appear. It seems likely that versioning is also a component of this problem, as discussed in the cited issue above.
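
A sketch of the failing pattern, with placeholder arguments (the helper itself is pasted a few comments below):

```python
# In a separate, imported module, dandi_download_open() downloads the file
# and returns an open NWBHDF5IO object.
from databook_utils import dandi_download_open  # hypothetical import path

io = dandi_download_open("000336", "path/to/file.nwb", download_loc=".")
nwb = io.read()  # read in the outer scope
print(nwb.acquisition.keys())  # movie missing here, though present in the file
```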

In light of this new information, are there any solutions in my code to fix this, or will we have to repackage our many 2P movies?


rly commented Jan 18, 2024

> Maybe it's a problem with my environment? What versions of pynwb, h5py, and hdmf are you using?

I'm on a Mac with Python 3.11, using the same pynwb and hdmf versions that you are using. My h5py version is 3.10.0 because 3.7 is not supported on Mac M1.

> It is only when using dandi_download_open and returning the io object from a separate file that the movie fails to appear.

I'm a bit confused. Can you share this function? It's possible that when raw_suite2p_motion_corrected was added to the file, a link was created instead of the data being copied, and that link fails in some contexts; but you said that the file size grows significantly, so I think something else is going on here.
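
One way to check for a link is to inspect the object without dereferencing it (a sketch; the dataset path is an assumption based on the series name):

```python
import h5py

with h5py.File("packaged_file.nwb", "r") as f:
    # getlink=True returns the link object itself rather than following it
    link = f.get("/acquisition/raw_suite2p_motion_corrected/data", getlink=True)
    print(type(link))  # h5py.HardLink: the data lives in this file;
                       # h5py.ExternalLink: it points at another file
```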


rcpeene commented Jan 18, 2024

Yes, not only do the file sizes grow, but I can also see the movie when viewing the file as a .h5 file with HDFView.

The methods are also in the other issue, but I'll paste the relevant portions here for convenience:

```python
# Imports inferred from usage:
import h5py
from dandi import dandiapi, download
from fsspec import filesystem
from fsspec.implementations.cached import CachingFileSystem
from pynwb import NWBHDF5IO


# streams an NWB file remotely from DANDI, opens it, and returns the IO object for the NWB
# dandi_api_key is required to access files from embargoed dandisets
def dandi_stream_open(dandiset_id, dandi_filepath, dandi_api_key=None):
    client = dandiapi.DandiAPIClient(token=dandi_api_key)
    dandiset = client.get_dandiset(dandiset_id)

    file = dandiset.get_asset_by_path(dandi_filepath)
    base_url = file.client.session.head(file.base_download_url)
    file_url = base_url.headers["Location"]

    fs = CachingFileSystem(
        fs=filesystem("http")
    )

    f = fs.open(file_url, "rb")
    file = h5py.File(f)
    io = NWBHDF5IO(file=file, mode='r', load_namespaces=True)
    return io


# downloads an NWB file from DANDI, opens it, and returns the IO object for the NWB
def dandi_download_open(dandiset_id, dandi_filepath, download_loc=None, dandi_api_key=None):
    client = dandiapi.DandiAPIClient(token=dandi_api_key)
    dandiset = client.get_dandiset(dandiset_id)

    file = dandiset.get_asset_by_path(dandi_filepath)
    file_url = file.download_url

    filename = dandi_filepath.split("/")[-1]
    filepath = f"{download_loc}/{filename}"

    download.download(file_url, output_dir=download_loc)
    print(f"Downloaded file to {filepath}")

    print("Opening file")
    io = NWBHDF5IO(filepath, mode="r", load_namespaces=True)
    return io
```


rcpeene commented Jan 18, 2024

When dandi_download_open is defined in the same file where io.read() is called, rather than imported, this problem does not occur and the movie is visible.


rly commented Jan 18, 2024

In dandi_download_open, downloading the file and then opening vs opening an existing file should not make a difference, so for debugging, the function can be reduced to:

```python
io = NWBHDF5IO(filepath, mode="r", load_namespaces=True)
return io
```

I have tried to reproduce the error:

In pynwb_1826/pynwb_1826a.py:

```python
from pynwb import NWBHDF5IO


def open_function():
    filepath = "/Users/rly/Downloads/sub-621602_ses-1193555033-acq-1193675745-denoised-movies_ophys.nwb"
    io = NWBHDF5IO(filepath, mode="r", load_namespaces=True)
    return io
```

In pynwb_1826/pynwb_1826b.py:

```python
from pynwb_1826a import open_function


def read_function(io):
    nwbfile = io.read()
    print(nwbfile.acquisition.keys())


if __name__ == "__main__":
    io = open_function()
    read_function(io)
    io.close()
```

On the command line:

```
test ❯ python pynwb_1826/pynwb_1826b.py
/Users/rly/mambaforge/envs/test/lib/python3.11/site-packages/hdmf/spec/namespace.py:531: UserWarning: Ignoring cached namespace 'hdmf-common' version 1.6.0 because version 1.8.0 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
/Users/rly/mambaforge/envs/test/lib/python3.11/site-packages/hdmf/spec/namespace.py:531: UserWarning: Ignoring cached namespace 'core' version 2.6.0-alpha because version 2.5.0 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
/Users/rly/mambaforge/envs/test/lib/python3.11/site-packages/hdmf/spec/namespace.py:531: UserWarning: Ignoring cached namespace 'hdmf-experimental' version 0.3.0 because version 0.5.0 is already loaded.
  warn("Ignoring cached namespace '%s' version %s because version %s is already loaded."
dict_keys(['EyeTracking', 'denoised_suite2p_motion_corrected', 'v_in', 'v_sig'])
```

To be clear, the issue exists on this file, right? Even though the TwoPhotonSeries that I see here is called denoised_suite2p_motion_corrected and not raw_suite2p_motion_corrected?

Ultimately this might be an issue with IO objects and Python scope that is easier to troubleshoot over video chat. Would you be available for a quick call today between 12 and 2pm PT?


rcpeene commented Jan 18, 2024

My code is running in a Jupyter notebook rather than in regular Python, which could affect the issue with the imported method and multiple scopes. I'd be available at 12:30.


rly commented Jan 18, 2024

Ah, I'll test this in a Jupyter notebook. I just sent an invite to your alleninstitute email. Thanks.


rcpeene commented Jan 18, 2024

The error was two mistakes compounded on my part: using the incorrect variable path for one file, and viewing a different file that in fact did not contain the movies. Solved!

rcpeene closed this as completed on Jan 18, 2024