Memory use in the combine function exceeds (by a factor of 2-4 or more) the memory limit passed in #638
Running the memory profiles

To run the memory profiles shown below:

Note: if you want the marks where a particular function begins/ends, you need to mark the function with a decorator.

Parameters of profiling runs
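As a rough illustration of the decorator marking described above (a sketch under assumptions, since the actual profiling script is not shown here): the memory_profiler package's mprof runner draws begin/end brackets for any function decorated with its profile decorator, which matches the cyan brackets in the plots discussed below. The script name, file list, and combine arguments are placeholders.

```python
# Hypothetical sketch (profile_combine.py); assumes memory_profiler's mprof tool.
# Run with:  mprof run profile_combine.py   then   mprof plot
from memory_profiler import profile  # decorator mprof uses to mark a function's span

from ccdproc import combine


@profile  # marks where this function begins/ends so mprof plot can draw brackets
def run_combine(file_list):
    # mem_limit is in bytes; 32000000 is roughly the 32 MB limit used in the profiles
    return combine(file_list, method='average', mem_limit=32000000)


if __name__ == '__main__':
    # Placeholder file names; substitute real FITS images to reproduce a profile.
    run_combine(['image_{:03d}.fits'.format(i) for i in range(10)])
```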
Interpreting the output

The graphs below show snippets from the memory profiles of two commits: one the tip of the file refactor, the other master (not the tip of master as of today; it is a few commits back). Prior to the interval shown, memory use in both cases has reached roughly 150MB. That is prior to the beginning of the actual image combination. The cyan "brackets" show when the decorated functions begin and end.

After refactor to limit number of open files (commit c20a7bf)

(memory profile plot)

Discussion

I believe the jump from 150MB to what looks like a new "baseline" of 180MB is due to the creation of the CCDData object to hold the results. My expectation from the documentation is that during the combination, memory use for performing the combination will not exceed the limit set (32000000 bytes, i.e. 32MB), i.e. that memory use will not exceed 180 + 32 = 212MB. In fact, peak memory usage regularly reaches roughly 270MB, roughly 2x what I expected. The sharp drops after

Recent master (commit c289d4e)

(memory profile plot)

Discussion

A few things jump out here. One is that the peak memory usage is much higher above the 150MB baseline. Peak is 290MB, and the floor on memory use never drops below 240MB, i.e. about 60MB above starting, and 30MB above the limit. Not sure why that is, exactly, but the new version is definitely an improvement.

Summary

The easiest fix here is to increase the number of chunks the file is broken into by adding a "fudge factor" of 2 or 4. I don't think that will slow the code appreciably, and I don't think we will be able to accurately predict what memory usage will be in all cases. As an example, I plan to open a PR to switch us to using astropy sigma clipping instead of our own. Currently astropy sigma clipping makes copies of its input (because sorting is done in-place, I think), so memory usage will depend on whether sigma clipping is done. More broadly, we can't predict what memory use might be in other functions the user passes in. To some extent, I think that is not our problem to solve. On the other hand, it seems safe to assume that memory use for combination will be at least twice the memory limit if the number of chunks is calculated based only on the size of the images to be combined, and it isn't hard to imagine it being 4 times. I don't think we have to be super strict about the amount of memory... but in the PR I'll try to come up with a test we can add to make sure there are not any bad regressions on this front.
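To make the proposed "fudge factor" concrete, here is a minimal sketch (illustrative only; the function name, signature, and default factor are made up, and this is not ccdproc's actual chunking code) of computing the number of chunks from the input size and the memory limit, leaving headroom for the 2-4x overhead seen in the profiles:

```python
import math


def num_chunks(n_images, image_shape, bytes_per_pixel, mem_limit, fudge_factor=2):
    """Illustrative only: choose enough chunks that the input data per chunk
    stays under mem_limit / fudge_factor, leaving room for temporary copies
    made by the combine and sigma-clipping functions."""
    total_bytes = n_images * math.prod(image_shape) * bytes_per_pixel
    return max(1, math.ceil(fudge_factor * total_bytes / mem_limit))


# Example: ten 4000 x 4000 float64 images with a 32 MB limit.
# Based on input size alone this would be 40 chunks; the fudge factor doubles it.
print(num_chunks(10, (4000, 4000), 8, 32_000_000))  # -> 80
```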
Thanks Matt for the careful analysis. I wonder if it wouldn't be enough (especially given your "I don't think we will be able to accurately predict what memory usage will be in all cases" statement) to fix the documentation to clearly state that "roughly that amount of input data will be used and that the effective memory usage during combination can be significantly higher (in general 2-4 times is to be expected)". We probably should investigate whether we can reduce the memory usage, but given that we allow arbitrary combine and uncertainty functions it will be completely impossible to know what fudge factor makes sense. Even in the default case we would rely on numpy implementation details...
Just out of interest: did you happen to do memory profiling on #632? It probably isn't something we should use as-is (numba is cool, but without anaconda it is a real pain to install and use), but an equivalent Cython/C approach wouldn't be too complicated.
Can do today; would be interesting to see what happens. |
I'll try to write a more extensive set of profile cases today and a proper memory test runner, instead of hacking the files-open runner. To some extent the examples above are really "toy" cases, since the memory limit is being set to the size of one image. In a more real-world case the memory overhead may be much smaller...
Updated discussion and more (and more realistic) profiles in #642 |
The combine function allows one to set a limit on the amount of memory used. Unfortunately, the function actually uses more memory than that limit when performing the image combination. Still working on tracking down the root cause, but I will post some example memory profiles in a moment.
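For reference, a minimal sketch of the kind of call being discussed (the file names are placeholders; in ccdproc the limit is set through the mem_limit parameter of combine, in bytes):

```python
from ccdproc import combine

# Placeholder file names. mem_limit is given in bytes (roughly 32 MB here),
# but the profiles in the comments above show peak usage 2-4x above that
# on top of the pre-combination baseline.
stacked = combine(['img1.fits', 'img2.fits', 'img3.fits'],
                  method='average', mem_limit=32_000_000)
```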