Some daemons need to be killed twice #673
I also observed the same problem with the receiver on another configuration ...
If you look closely, the PIDs are different. I am pretty sure this is a pkill issue, because it does not guarantee which process is killed first. Try ps -ef | grep alignak: you will see the PPID, and so the parent process. You can see that leftovers are usually attached to 1 and no longer to a parent (if they were child processes).
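To make the PPID check above easier to repeat, here is a minimal Python sketch that lists matching processes with their parent PID and picks out the ones re-parented to PID 1. The helper names (`parse_ps`, `list_processes`, `orphans`) are hypothetical and not part of Alignak; the sketch assumes a POSIX `ps` that accepts `-e` and repeated `-o` options.

```python
import subprocess


def parse_ps(output, pattern):
    """Parse 'pid ppid command' lines and keep those whose command contains pattern."""
    rows = []
    for line in output.splitlines():
        parts = line.split(None, 2)  # pid, ppid, rest of the command line
        if len(parts) < 3:
            continue
        pid, ppid, command = parts
        if pattern in command:
            rows.append((int(pid), int(ppid), command))
    return rows


def list_processes(pattern):
    """Run ps (assumed POSIX-compatible) and return the matching (pid, ppid, command) rows."""
    result = subprocess.run(
        ["ps", "-e", "-o", "pid=", "-o", "ppid=", "-o", "args="],
        capture_output=True, text=True, check=True,
    )
    return parse_ps(result.stdout, pattern)


def orphans(rows):
    """Return the rows whose parent is PID 1, i.e. processes that lost their original parent."""
    return [row for row in rows if row[1] == 1]
```

Calling `orphans(list_processes("alignak"))` before and after the kill would show whether the leftover processes are attached to PID 1 or still to a parent daemon.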
I will try what you suggest to understand what is happening. Thanks
They do not look to be attached to 1... Before killing. I started 2 receivers in
The parent process attachment seems to be consistent. I quit the screens and got this:
Note that all the other daemons are launched the same way and they are correctly stopped when I quit the screens. Only the receiver has this behavior :/
You are using screen, that's why they won't attach to 1. The issue is maybe in the signal handling part then. You should be able to reproduce this when launching the receiver in the foreground.
Note that on the demo server currently, this happens with the broker daemon (and always the broker) ... and I noticed the same behavior on another server with the poller daemon (and always the poller). I updated the issue title ...
I am re-uploading my log from Alignak-monitoring/alignak-packaging#28 here. For the scheduler, a stop writes the following lines in the log:
The broker shows only one line when it stops:
If I understand the code, the log message "process X received a signal 15" comes from here So now So we should see a line in the log with
But we don't see it for the broker. Another thing: the daemons seem to have trouble stopping only once they have received some configuration. I will investigate further (with the log level set to debug).
Great job @fpeyre and thanks for investigating this problem. I got this problem several times today when restarting Alignak on the demo server ... but I currently did not have time to investigate more :/
OK. At last I found the problem! The problem happens with daemons that have attached modules, if those modules are waiting on a message queue. When the daemon tries to stop its external modules, it sends a SIGTERM, tries to The problem with this solution is that the There is probably nothing to do for this in the Alignak core ... only the modules should be concerned, but I am leaving this issue open for the moment.
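As an illustration of the module-side concern described above, here is a minimal Python sketch of one way an external module could stay responsive to SIGTERM while waiting on a queue. It is not Alignak's actual code: `module_main` and `stop_module` are hypothetical names, and the sketch assumes the module runs as a `multiprocessing` process on a POSIX system. The idea is to poll the queue with a timeout and check a stop flag set by the signal handler, instead of blocking indefinitely.

```python
import multiprocessing
import queue
import signal


def module_main(to_module):
    """Hypothetical module loop: reads messages until SIGTERM is received."""
    stop_requested = False

    def on_sigterm(signum, frame):
        # Only record the request; the loop below exits at its next check.
        nonlocal stop_requested
        stop_requested = True

    signal.signal(signal.SIGTERM, on_sigterm)

    while not stop_requested:
        try:
            # Wait with a timeout so the stop flag is re-checked regularly.
            message = to_module.get(timeout=0.2)
        except queue.Empty:
            continue
        # A real module would process `message` here.


def stop_module(process, timeout=3.0):
    """Send SIGTERM to the module process and wait for it to exit.

    Returns True if the process stopped within `timeout` seconds.
    """
    process.terminate()  # sends SIGTERM on POSIX
    process.join(timeout)
    return not process.is_alive()
```

With a loop like this, a single SIGTERM followed by a join with a timeout should be enough to stop the module, at the cost of the module waking up every 0.2 seconds while idle.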
Started Alignak and got this (ps -aux | grep alignak-):
Then I did:
Then I got this (ps -aux | grep alignak-):
The poller and reactionner daemons are not killed. Only the poller and reactionner workers are killed...
I must then send another kill to make the daemons stop