Support variably-sized caches in 'RasterDataset' #1694
Comments
Do you mind expanding on the rationale section a bit (mainly for my curiosity)? E.g. how big are your files / how much RAM is being consumed, etc.
We've been using a fork of 'RasterDataset' and have tweaked the cache size, batch size, and number of workers to get things to behave nicely. We are now routinely using 60+ GB of RAM; before tweaking, we would run out of RAM on a machine configured with 125 GB. We're using NAIP quarter-quadrangle tiles downsampled to 1 m/pixel that are ~160 MB each (or, in some cases, quartered again to be ~40 MB/image). Allowing for variably-sized caches, in effect, lets users optimize the ratio of worker processes to cache memory per process for batch loading on their platform. We've found that preparing batches (prior to transfer to the GPU) is compute-bound rather than IO-bound (thanks to the cache), but we would like to speed up batch loading by exchanging smaller caches for additional worker processes. The optimal point would be when batch loading again becomes IO-bound due to files getting rotated out of the cache.
Also relates to #1438 (@patriksabol) and #1578 (@trettelbach)
Curious if any GDAL config options (especially …) would help here.
We've observed a similar "sawtooth" pattern in memory usage when training with a significant number of persistent dataloader workers. If we can decrease the cache size for each worker, then we should be able to have persistent workers whose maximum memory consumption stays below the system's physical memory constraints. It's easy to do some back-of-the-envelope math to see how the workers' memory consumption explodes: 16 workers * 128 files per worker cache * ~100 MB per file > 200 GB. That's without considering any Python/process overhead. @adamjstewart Is your idea that if we could adjust a GDAL config option, the issue could be avoided?
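For reference, the arithmetic quoted above can be written out as a quick sketch (all numbers are the illustrative ones from the comment, not measured values):

```python
# Back-of-the-envelope estimate of the aggregate cache footprint across workers.
num_workers = 16
cache_maxsize = 128   # files held per worker's LRU cache
avg_file_mb = 100     # ~100 MB per warped file (approximate)

total_gb = num_workers * cache_maxsize * avg_file_mb / 1024
print(f"worst-case cache footprint: ~{total_gb:.0f} GB")  # ~200 GB
```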
Yes, wondering if the bug could be avoided by a simple environment variable.
I don't have a sense, but that may actually explain the issue if it's per-process. Can you experiment with various values for that environment variable?
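The specific option under discussion is cut off in the comments above, so it is unknown here; purely as an illustration, one commonly tuned, per-process GDAL setting is the GDAL_CACHEMAX config option (raster block cache size). A minimal sketch of experimenting with it, assuming rasterio is used to open the files:

```python
import os

import rasterio

# Option 1: set the config option for the whole process before any GDAL work
# happens (the value is interpreted in MB by default).
os.environ["GDAL_CACHEMAX"] = "256"

# Option 2: scope the setting to a block with rasterio's context manager.
with rasterio.Env(GDAL_CACHEMAX=256):
    with rasterio.open("example.tif") as src:  # hypothetical file
        data = src.read(1)
```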
Summary
Currently, 'RasterDataset' caches warped files with a fixed-size LRU cache of 128 elements. I propose supporting variably-sized caches for subclasses.
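For context, the fixed-size behaviour described here boils down to a decorator whose size is baked in at class-definition time; a minimal illustration (not torchgeo's exact code):

```python
import functools

class RasterDataset:
    @functools.lru_cache(maxsize=128)  # size is fixed for every instance and subclass
    def _load_warp_file(self, filepath: str):
        """Open and warp a single raster file (details omitted)."""
        ...
```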
Rationale
When loading large raster files, the fixed-size cache consumes considerable memory. For a given machine, this fixed per-worker overhead restricts the number of parallel DataLoader workers that can be used.
In our application, training batch creation is limited by the number of parallel workers rather than by data-access speed. If we could reduce the size of the caches during training, we could spawn additional dataloader workers and remove the present bottleneck.
Implementation
We'd add a cache-size member to 'RasterDataset' and apply the LRU cache to '_load_warp_file' in the constructor.
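A minimal sketch of what that could look like; the cache_size parameter and the _load_warp_file_uncached name are placeholders for illustration, not existing torchgeo API:

```python
import functools
from typing import Any

class RasterDataset:
    def __init__(self, cache_size: int = 128) -> None:
        # Wrap the plain method with a per-instance LRU cache whose size is
        # chosen at construction time; a subclass can simply pass a different
        # value. This sidesteps the fact that a decorator applied at class
        # definition cannot see instance attributes.
        self._load_warp_file = functools.lru_cache(maxsize=cache_size)(
            self._load_warp_file_uncached
        )

    def _load_warp_file_uncached(self, filepath: str) -> Any:
        """Open and warp a single raster file (details omitted)."""
        ...
```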
Alternatives
There may be others, but this one plays (relatively) nicely with MyPy and works around the inability of method decorators to access class or instance members.
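To make that constraint concrete, here is a sketch (hypothetical names, not torchgeo code) of why simply parameterizing the decorator does not work:

```python
class VariableCacheDataset:
    def __init__(self, cache_size: int = 128) -> None:
        self.cache_size = cache_size

    # A class-level decorator cannot do this: 'self' does not exist when the
    # class body is evaluated, so the maxsize below is unresolvable (and
    # runtime rebinding tricks tend to upset MyPy).
    # @functools.lru_cache(maxsize=self.cache_size)
    def _load_warp_file(self, filepath: str):
        ...
```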
Additional information
No response