-
Notifications
You must be signed in to change notification settings - Fork 768
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ubuntu 22.04 for python tbb. hopefully will solve the hang problem. #1719
Ubuntu 22.04 for python tbb. hopefully will solve the hang problem. #1719
Conversation
Any reason you're not first testing this on your fork's CI to verify correctness? |
I first test it on my ci before I did a PR to here. Thank you for your merge! |
I verified this change in another PR on my local CI and I can confirm that this fixes the CI issue for the Python TBB. Thanks @talregev! |
@varunagrawal You already revert this commit. Please consider again the changes here also: |
Yes, because I wanted to verify the charges one by one. The changes to the swap don't have any effect, so #1714 isn't needed. I'll be adding the necessary changes in a subsequent PR. |
That correct, and less code is not needed that it better. That mean that the ci don't need memory to compile tbb as it use to be. |
We need the swap space for the wrapper to compile otherwise the process will sometimes get killed due to being out of memory. We don't add changes unless they are needed. I understand you don't have context for this. |
Can you share a link to recent ci fail due to out of memory error? |
This was from a few years ago. Sorry but I don't have the time to devote to this issue anymore. |
I don't think you're referring to the same thing. The CI would kill the wrapper compilation due to lack of RAM, and ever since we increased the swap space we haven't had that issue. Until I see an analysis showing that the extra swap is no longer needed, I don't agree with removing it and potentially reintroducing the compilation memory issue. The OOM issue this PR fixed is specific to the wrapper+TBB. This OOM issue seems to stem from running the tests. so at run time, not compile time. Also, I think I accidentally deleted your last comment. I was on my phone when responding and might have swiped incorrectly. I have quoted your comment for posterity. |
This is the exact issue I am talking about. It happen on compilation wrapper + python in windows and not happen any more because microsoft upgrade the vms memory. The same apply for linux vms. |
You already see the CI run for 2-3 weeks (until you decide to remove this commit) with compile wrapper + tbb and not fail on lack of memory. |
I'll need to see some proof of this, such as an upgrade notice from GitHub or Microsoft. Your comment is conjecture at this point. |
Change the os for python + tbb to be ubuntu 22.04.
It may solve the hang problem.
The hang problem, is a know problem as we can see in this issue:
actions/runner#1326
actions/runner#2454
I am not sure what cause the hang problem. Maybe gtsam code, maybe the machine itself, or maybe python version.
But I want to try change the os. Better then to stay on this state. We can always revert this PR.
I didn't do much tests (I run the CI on my end) or I don't have any prove, but I think the ubuntu 22.04 machine are stronger. Also maybe the os image github action is using for this os is better.
Eventually the ci moving forward with the os version, and delete old version and get new one.
This can be tested only on python tbb to see his behavior for long run.