Support str, path object or file-like object on file read #302
Comments
Can you elaborate on the use case please? What type of format are you thinking of?
Appreciate the quick response. Basically, I'm hoping that rosettasciio would support an interface similar to, e.g., pandas or imageio: https://imageio.readthedocs.io/en/stable/_autosummary/imageio.v3.imread.html#imageio.v3.imread This would enable smoother use in distributed applications, where the code that actually loads the file has no access to the filesystem on which the file is stored and the file is instead passed in as a file-like object.
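To make the request concrete, here is a minimal sketch of a reader that accepts a str, a path object, or a file-like object. The names `open_binary` and `read_header` are hypothetical and not part of rosettasciio; the sketch only shows the input normalization, not any real format parsing.

```python
import io
import os
from contextlib import contextmanager


@contextmanager
def open_binary(source):
    """Yield a binary file object for a str, path-like, or file-like source.

    Hypothetical helper for illustration; not part of rosettasciio.
    """
    if isinstance(source, (str, os.PathLike)):
        # We opened the file ourselves, so we close it on exit.
        with open(source, "rb") as f:
            yield f
    elif hasattr(source, "read"):
        # Caller-owned file-like object: use it as-is and leave it open.
        yield source
    else:
        raise TypeError(f"Unsupported source type: {type(source).__name__}")


def read_header(source, nbytes=4):
    """Hypothetical reader: return the first `nbytes` bytes of the source."""
    with open_binary(source) as f:
        return f.read(nbytes)


# Usage with an in-memory buffer, e.g. bytes fetched from remote storage.
buf = io.BytesIO(b"abcdefgh")
header = read_header(buf)
```

The same `read_header` call works with a filename or a `pathlib.Path`, and a file-like object passed in by the caller is left open for the caller to manage.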
Off the top of my head, there may already be a few formats that can do that, but I suspect that rosettasciio supports a wider variety of file types than imageio and pandas, and the behaviour may differ depending on the type. Here is a list of the different types of files
There should be some low-hanging fruit, as this should be easy to implement for some types.
@hsuominen is the idea that you are loading data that isn't on the computer doing the operation? I think zarr might be a good place to start. https://zarr.readthedocs.io/en/stable/api/storage.html#zarr.storage.LRUStoreCache This store implementation uses an LRU cache over an S3 bucket, which might be interesting if AWS is hosting the data.
Yes, that's right. Our intent is to get the data out of proprietary formats and into, e.g., zarr (which looks great), but we need to run this extraction on compute that doesn't have the files sitting locally. There are fairly easy workarounds (e.g. using a TempFile), but I thought it would be good to get this discussion going, as I can see others eventually running into similar needs. Looking specifically at some of the file formats we are interested in, the changes needed in some cases would be pretty trivial (as @ericpre hinted): rosettasciio/rsciio/digitalmicrograph/_api.py Lines 1278 to 1279 in e499110
but likely harder in others: rosettasciio/rsciio/emd/_api.py Lines 171 to 173 in e499110
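For reference, the temporary-file workaround mentioned above could look roughly like the sketch below. `load_via_tempfile` is a hypothetical helper, and `reader` stands for any callable that takes a filename (for example a format plugin's reader function, assuming it accepts a path). The `suffix` argument is there on the assumption that the reader may rely on the file extension.

```python
import io
import os
import shutil
import tempfile


def load_via_tempfile(fileobj, reader, suffix=""):
    """Copy a file-like object to a temporary file and call `reader` on its path.

    Hypothetical helper for illustration; `reader` is any callable that
    takes a filename and returns the loaded data.
    """
    tmp = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
    try:
        with tmp:
            # Stream the contents to disk without loading them all into memory.
            shutil.copyfileobj(fileobj, tmp)
        # The file is closed here so it can be reopened by name on any platform.
        return reader(tmp.name)
    finally:
        # Remove the temporary copy whether or not the reader succeeded.
        os.remove(tmp.name)


def count_bytes(filename):
    """Stand-in reader used only for this example."""
    with open(filename, "rb") as f:
        return len(f.read())


size = load_via_tempfile(io.BytesIO(b"\x00" * 16), count_bytes, suffix=".dm3")
```

One caveat with this approach: the temporary copy is deleted as soon as `reader` returns, so it would not suit a reader that loads data lazily and needs the file to stay on disk.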
Describe the functionality you would like to see.
For a number of applications it would be preferable if file reading supported file-like objects as well as strings or paths.