I have some rather large (gzipped) NIFTI files I need to read without first buffering in memory (so reading in slices).
When loading the image via nibabel.load, slicer and dataobj appear to be re-reading from the beginning each time, resulting in quadratic time complexity with the number of slices taken.
Since I'm only interested in proceeding forward through the file, the time complexity here should be linear. Indeed, linear time complexity can be obtained by adapting the code from an old question:
```python
from gzip import GzipFile

from nibabel import FileHolder, Nifti1Image

# niftipath is the path to the gzipped NIfTI file
fh = FileHolder(fileobj=GzipFile(niftipath))
img = Nifti1Image.from_file_map({'header': fh, 'image': fh})
```
In this case, since GzipFile preserves the current decompression state, proceeding strictly forward in the file works at the expected speed (much faster).
Is there a way to obtain this same slicing performance using the nibabel.load() API (such as by passing a GzipFile, etc)? This would be greatly preferable, as it abstracts away file formats.
Thanks! @effigies
Looking at the indexed-gzip docs, I was able to find the flag: keep_file_open=True.
While indexed-gzip alone does help, adding keep_file_open=True is all that is needed for this use case.
I'm still confused as to why keep_file_open is off by default, but enabling it seems to be a solution.
Unless the defaults or relevant documentation need to be revisited to make keep_file_open/indexed-gzip more visible (the slicer section currently mentions neither), I'm good to close this.
With indexed-gzip installed, you should not need to set keep_file_open to get almost identical performance.
The reason it's off by default is that, when working with many files, you can exhaust file handle quotas, and the lifetimes of file handles are difficult to reason about.
Definitely good to update the docs.
ecc521 changed the title from "Quadratic Time Complexity reading gzipped NIFTIs in slices" to "Update slicer docs for indexed-gzip & keep_file_open" on Sep 27, 2021.