
Running custom script before node reboot #992

Open
DimitarStefanovMihov opened this issue Oct 15, 2024 · 3 comments


DimitarStefanovMihov commented Oct 15, 2024

Hello, kured team,
I am running kured:1.16.0 on my Kubernetes cluster, which hosts a number of different databases (postgres, redis, mongo) on different nodes. Before I let kured do its magic, I need to execute a script that steps down the masters of those databases:
--reboot-sentinel-command=nsenter --target 1 --mount --uts --ipc --net /usr/bin/python3 /var/run/pre-reboot-script.py - the script returns 0 if all is okay, and the lock-and-reboot process may continue.

The problem is that I need to automate the whole process. To do that, I copy the script into /var/run/. Unfortunately, kured on each node starts looking for the script and trying to execute it, which turns into a big mess, because nodes start trading database masters. I tried combining the script with the actual reboot command in:
--reboot-command - this way the script would surely be executed on one node at a time. But the reboot command didn't work that way.
Can --reboot-command combine several commands together (even in a python script)?
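(For context, a common shell pattern for chaining commands is to wrap them in a single sh -c invocation, so the whole sequence can be passed as one command; this is only a sketch, and the script path and final reboot invocation below are assumptions, not something kured documents for this exact use case.)

```shell
# Hypothetical: run the pre-reboot script first, and only reboot if it exits 0.
# Adjust the script path and reboot invocation for your setup.
--reboot-command="/bin/sh -c '/usr/bin/python3 /var/run/pre-reboot-script.py && systemctl reboot'"
```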

I decided to put a manual lock at the beginning of my script: I put a label on the node that is visible to all other nodes and to the scripts running there. This way, whichever kured starts the script on a node also puts a label on that node; the other scripts see it and don't continue executing. The problem came with the postRebootNodeLabels flag: I couldn't make it remove the label I put at the beginning, so that the node would be released and the other scripts could run against their nodes.

Is it possible for the postRebootNodeLabels flag to remove labels?
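(As an aside, kubectl itself can remove a label with the trailing-dash syntax, independently of kured; the node and label names below are made up for illustration.)

```shell
# Acquire a manual "lock" by labelling the node (hypothetical names):
kubectl label node worker-1 example.com/db-master-lock=held
# Release it: a trailing '-' on the key removes the label.
kubectl label node worker-1 example.com/db-master-lock-
```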

Next, I tried putting an annotation on the daemonset that controls the kured pods - weave.works/kured-node-lock: - which is the official lock kured uses, and I know it disappears after the node reboots. This way it would act as a manual lock at the beginning of my script; when the official lock from kured takes over, it puts whatever info it needs there and deletes the annotation after rebooting, allowing other nodes to take the lock and continue their scripts. But each time, kured returned a parsing error saying it expected a different character at some position in the word I used as the lock value.

Is it possible to manually add an annotation to the daemonset so that it works with kured's lock?
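(For reference: kured's lock annotation holds a JSON value rather than a plain word, which would explain parse errors on an arbitrary string. The kured README describes manually blocking kured by setting it to a JSON object; the namespace and daemonset name below assume a default-style install.)

```shell
# Manually acquire kured's lock (the value must be valid JSON):
kubectl -n kube-system annotate ds kured weave.works/kured-node-lock='{"nodeID":"manual"}'
# Release it by deleting the annotation with the trailing '-' syntax:
kubectl -n kube-system annotate ds kured weave.works/kured-node-lock-
```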

Overall, is there another way to run scripts before lock-drain-reboot, or after the reboot itself (or just to run simple commands after reboot)?

Thank you for your time, and thank you for this great tool!

@DimitarStefanovMihov (Author)

EDIT: I was able to achieve a manual lock with the annotateNodes flag, but if you could answer my previous questions, I would very much appreciate it.

@evrardjp evrardjp self-assigned this Oct 18, 2024
evrardjp (Collaborator) commented Oct 19, 2024

I had a tough time trying to understand your goals. I am not sure I can answer your questions without some clarification first.

First, the sentinel: whether it's a command or a file's presence, it should only be used to determine whether a reboot is required.
Don't overcomplicate it; it's going to be a pain later.

Second, the blockers: it looks like what you're trying to do is prevent a node from rebooting while it's the master. Kured can already prevent a reboot if a certain pod is present. I suppose there is a way you can use a filter to prevent the active master from rebooting. I think it's a bad idea - overall it won't help you to have the master blocked for reboots - but it's possible.

Assuming you go the clean route of not blocking when a database exists, you have PDBs. What's the problem with cordoning and draining the master? Isn't your database recovering from the drain? Don't you have an operator handling the database state? This should be the way. Kured should not "compensate" for something that's outside its job.

Yet, if you still want to do it, there is the reboot command. Keep in mind it happens after the drain/cordon. Here it seems that you want to ensure, if a pod on that node is a master, that the pod gracefully stops before the rest of the work happens. This is totally doable. Push a script onto the host for kured's use, and point your kured daemonset to the script, which will be executed with nsenter -m /proc/1/ns/mnt -- yourexecutable. I don't see where the mess is (outside the fact of using a script at all, especially because this is something better suited for a controller). If you have anti-affinity on nodes and only certain nodes host the databases, then make sure the script is only on those nodes, and configure the daemonset accordingly. But it's most likely error prone, should something be scheduled on another node one day...
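(The exit-code contract described here - step down any local masters, then exit 0 only when it's safe to proceed - can be sketched as a small script. Everything below is a stub: the master-detection and step-down functions are placeholders for real postgres/redis/mongo calls.)

```python
import sys

def has_local_master() -> bool:
    """Placeholder: query postgres/redis/mongo on this node for a master role."""
    return False  # stubbed for illustration

def step_down_masters() -> bool:
    """Placeholder: trigger a stepDown/failover; return True on success."""
    return True  # stubbed for illustration

def main() -> int:
    # Exit 0 only when no masters remain on this node, so the caller
    # (e.g. a wrapper around the reboot command) can proceed safely.
    if has_local_master() and not step_down_masters():
        return 1  # masters still present: abort the reboot
    return 0

if __name__ == "__main__":
    sys.exit(main())
```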

As you can see, I am really confused about the problem you're hitting...

@evrardjp evrardjp removed their assignment Oct 19, 2024
@DimitarStefanovMihov (Author)

Apologies for the unclear writing. I will try my best to clarify.

First, the sentinel: whether it's a command or a file present, it should only be used to determine if reboot is required. - unfortunately, I need it to do more than just determine whether a reboot is required, so I need to push kured to the limits of its abilities.

Second, the blockers: - no, here I am NOT trying to prevent a node from rebooting, but rather to lock that node manually before the drain/cordon, so that I can run my script and make sure the current node no longer hosts any masters; then I can safely return status 0 and the drain/cordon -> reboot can continue.

Assuming you go the clean route of not blocking in case a database exists, you have PDBs. What's the problem with cordoning and draining the master? - I don't know the exact reason, but I am not allowed to drain/evict pods while they are masters (it is a human restriction, not a Kubernetes one).
