improve security of reboot mechanism #416

rptaylor · 2021-07-28T01:07:55Z

Hello,

We would really like to use kured but there is reluctance about a fully privileged daemonset that has access to execute arbitrary commands as root on every node. I have reviewed some options that could allow rebooting more securely.

Option 0 (current kured behaviour): nsenter and execute a configurable shutdown command (/bin/systemctl reboot by default) in the host namespace to reboot. Requires full privileges.
Option 1: grant CAP_SYS_BOOT to kured to allow the reboot(2) syscall
- Option 1A semi-dirty reboot: call sync() and then reboot(). Should avoid any filesystem corruption, but systemd will not be aware of the reboot so it is not very friendly to system services. Was discussed in Support unprivileged container #172
- Option 1B graceful reboot via Ctrl Alt Del: invoke reboot syscall with LINUX_REBOOT_CMD_CAD_OFF, then issue a CtrlAltDel keystroke to trigger a graceful systemd-managed shutdown via the ctrl-alt-del.target unit file.
Option 2: graceful reboot via a kill signal. Grant CAP_KILL to kured so it can send SIGRTMIN+5 to PID 1 (systemd), which is an equivalent way to achieve a graceful systemd-managed reboot (without relying on the ctrl-alt-del.target unit, thanks to evrardjp's comment, though this may be less portable than SIGINT).

Comparisons

Option 0
- pros: currently working this way
- cons: not very secure, pod runs with far more privileges than should be needed to reboot
Option 1A:
- pros: easy to do, most secure (only CAP_SYS_BOOT required)
- cons: not graceful (not systemd-managed)
Option 1B:
- would be the best of both worlds but after further investigation I am not sure if/how it is possible. Here is example code to issue a CtrlAltDel keystroke but using uinput which requires root privilege so that does not help. It might be possible with ioctl but that could require additional capabilities (using TIOCSTI, CAP_SYS_ADMIN?). (Also users may have masked ctrl-alt-del.target or changed the reboot keystroke from the default CtrlAltDel)
Option 2:
- pros: easy to do, graceful, requires far fewer privileges (only CAP_KILL), more secure than Option 0
- cons: sending any signal to any host process is still a significant attack surface.

All the options still require hostPID: true.

Anyway do you think option 2 is at least a clear improvement over 0 and might be suitable as a new default behaviour? And maybe Option 1 A (or B if possible) could be configurable as more secure alternatives?


evrardjp · 2021-07-28T06:51:35Z

Option 2 is, as you mentioned, about equal to Option 1B.
There is an option 2B, if you follow systemd code. I did a PoC in one of my testbeds, in which I sent SIGRTMIN+15 with kill. It was not portable, as some other people tried it without success. I did it with CAP_KILL and hostPID.

Based on our tests, I would say Option 2 should not be the default.

However, I believe it would be nice to allow people to go for option2 or option3.
For that, I discussed these approaches in the past, and updated the code to go towards it.

I feel it's only a slight change to have an option to not wrap with nsenter.
If we bring that option, it would be easier for a user to override the command to kill -SIGRTMIN+15 1 with the right privileges.

You might be interested in #359, whose security section is relevant to you.

rptaylor · 2021-07-28T19:44:52Z

@evrardjp thanks for the comments!
Okay, SIGRTMIN+15 should "immediately reboot the machine", which has the advantage of not relying on ctrl-alt-del.target, but it sounds like it might be less graceful (?), in the same way the reboot(2) call immediately reboots. SIGRTMIN+5 sounds like the best choice IMHO (starts the reboot.target unit).

If anything, sending a signal to PID 1 should be the most portable option, in the sense that it has at least some non-zero possibility of working on SysV or other non-systemd systems, unlike /bin/systemctl reboot. A priori most systems should behave according to the systemd documentation, but it could require further testing and investigation on some platforms.

I feel it's only a slight change to have an option to not wrap with nsenter.

That sounds nice and seems relatively easy. But I wonder if there might be any gotchas with executing /bin/kill (packaged in the kured container image), as opposed to doing process.Signal(sig) in the code?
The latter should be cleaner. Not sure if there could be situations where the kill executable might have portability issues with different host OSes or kernels.

Another option would be to have a configurable property --reboot-signal which, if configured, is used instead of --reboot-command, specifying which signal to send to PID 1. What do you think?

Either way, the final part would be to actually realize the security benefit by adapting the PSP to grant limited capabilities based on which approach is selected. With Helm that could be done e.g.

{{- if .Values.configuration.rebootSignal}}
  privileged: false
  allowedCapabilities: ['KILL']
{{- else }}
  privileged: true
  allowedCapabilities: ['*']
{{ end }}

or similarly {{- if not .Values.configuration.use_nsenter }}

However, if the use_nsenter option would apply to both the reboot command and the sentinel command, the Helm logic could get slightly more complicated. Personally I like the idea of not using an executable command at all to reboot, just send the signal directly, just my 2c.

evrardjp · 2021-07-28T20:30:05Z

My notes are very close to what you're proposing. I used both signals.

Even when you think portability isn't a problem with systemd, it actually is. Who really reads what's packaged in the main OSes? Our tests have shown that it sometimes doesn't work. Depends on OS behaviour, apparmor, selinux, etc. Kind + ubuntu without apparmor was not happy last time I tried.

I am not against adding this new feature, as long as it's properly tested.

PS: You should probably check #359, maybe that could interest you, security-wise. Because reducing the scope of the ds isn't a complete solution.

PS2: I would say that it's indeed easier to have this in our code. However, the refactoring will be bigger. Not impossible though ;)

rptaylor · 2021-07-28T22:46:06Z

@evrardjp Thanks, I did look at #359 but I think some significant security improvements can be achieved without a major architectural redesign (though that could also have valuable security improvements in its own right).

Even when you think portability isn't a problem with systemd, it actually is.
I am not against adding this new feature, as long as it's properly tested.

Certainly. I already tested that SIGINT to PID 1 works as expected and documented on EL7,8. Likewise for SIGRTMIN+5:

Jul 28 22:33:46 el8-test.novalocal systemd[1]: Received SIGRTMIN+5 from PID 1764 (n/a).
...
Jul 28 22:33:47 el8-test.novalocal systemd[1]: Reached target Shutdown.

If we add a new option that is non-default behaviour, from my point of view as long as kured does what I tell it to do (send the specified signal to PID 1), it is working correctly, and it is my responsibility (as the user) to make sure I configured kured (and my cluster) correctly to have the desired outcome when the nodes receive that signal. Does that seem reasonable?

PS2: I would say that it's indeed easier to have this in our code. However, the refactoring will be bigger. Not impossible though ;)

Okay, do you suggest proceeding with option 2, by signaling in the code and a --reboot-signal option, rather than invoking /bin/kill?
I can probably at least propose a PR and do testing but might need help with some details on code changes.

Thanks!

evrardjp · 2021-07-29T07:55:21Z

I don't see why it couldn't be done in two PRs, to iterate on this in a "simpler" way.

In any case, we can help you with the PRs! :)

First PR could tackle adding kill to the kured container, for which you can add tests: It's just a different rebootCommand. That should be a very simple PR.

In the second PR we could probably remove that kill package from the image and refactor the code. That will take longer, since we need to define the right refactor.

rptaylor · 2021-07-30T00:57:38Z

@evrardjp Okay thanks! Sorry I am not sure, do you mean for the 1st PR, it would involve basically copying https://github.com/weaveworks/kured/blob/main/tests/kind/follow-coordinated-reboot.sh with a different rebootCommand configured? I'm not sure where the kured config would go in that script.

github-actions · 2021-09-28T01:48:05Z

This issue was automatically considered stale due to lack of activity. Please update it and/or join our slack channels to promote it, before it automatically closes (in 7 days).

rptaylor · 2021-09-28T02:14:40Z

Still relevant but not sure how to proceed.

pjbgf · 2021-11-24T10:22:06Z

Thank you @rptaylor for the great work detailing the options.

Options 1 and 2 allow end-users to run kured without being privileged, which then also opens the door to further lock-down capabilities (apparmor, seccomp, etc). Once this is in place it would allow myself and the folks from security-profiles-operator to create some security profiles for kured.

In terms of next steps, are we happy to proceed with a PR for option 2 that is gated by some sort of feature switch? This would allow for backwards compatibility, so no negative impact on users using the default nsenter approach.
I am happy to support @rptaylor with the PR and tests if that's the case.

rptaylor · 2021-11-24T21:41:14Z

Thanks @pjbgf. As far as I can tell, a rebootSignal option (non-default) would be needed in addition to the existing rebootCommand, so that the container can work with only CAP_KILL instead of privileged. I am happy to at least take a stab at that if the maintainers agree.

Also if seccomp can limit which kill signals may be sent that could be a possible future improvement, out of scope of current issue.

admincasper · 2022-03-31T12:33:43Z

This has been an open issue for almost two years now, but there is still no implementation on the issue which I find concerning. We want to run Kured in production clusters but are unable to because the baseline security policy to restrict capabilities cannot be applied to Kured.

evrardjp · 2022-03-31T12:40:56Z

I am pretty sure the team is willing to accept any contributions that improve the security of kured... Keep in mind that it's a tough topic... like many engineering topics, it's all about tradeoffs.

I think it's important to know what you want to do with kured, as kured is very flexible. Some of the concerns here can be done without a code change. For the rest, please don't hesitate to contribute too :)

admincasper · 2022-03-31T13:00:30Z

We're using Kured to restart updated nodes every week during downtime, we get alerts in Teams and are very happy with how it's working. It's only a matter of security, or specifically the securityContext for the daemonset. Since Kured is specifically mentioned in Microsoft Documentation and the use-case for Kured is very useful, I was hopeful it was ready for production environments.

I would like to contribute but I don't have the Linux expertise and have no downtime. But it's an issue I'm highly anticipating and supporting!

rptaylor · 2022-03-31T17:35:13Z

@evrardjp okay, can we proceed with adding a rebootSignal option then, to send a configurable signal to PID 1?

ckotzbauer · 2022-04-02T15:42:17Z

@rptaylor Yes, I think this would be the best option.

rptaylor · 2022-04-04T22:43:31Z

Going over the code again it seems to me like the best way to proceed IMHO would be:

1st PR

- add a new configurable option: rebootMethod, default value "command"
- check and complain if it has an invalid value, for now only "command" would be possible
- refactor the rebooting code to first determine how to reboot based on rebootMethod, if "command" then invoke the same code path

No change in behaviour.

2nd PR

- add support for another non-default value of rebootMethod, "signal"
- add a new configurable option: rebootSignal, default value "SIGRTMIN+5"
- add function to send the configured signal to PID 1
- when performing the reboot, add: else if rebootMethod is "signal", then invoke the new function to send the signal

However while reviewing the code I also noticed that although the documentation indicates the default behaviour is to check for existence of a sentinel file, the code nevertheless achieves this by executing a test -f sentinel command:
https://github.com/weaveworks/kured/blob/main/cmd/kured/main.go#L661
So a parallel effort would be needed on the sentinel side in order to realize the final goal of not executing commands on the host with nsenter. It could be achieved via read-only hostPath usage instead. #526

rptaylor · 2022-04-05T17:56:49Z

For the record I'm a cluster operator not a developer, and I don't know Go per se. If someone wants to take a stab at a PR please feel free. Otherwise I could put my copy + paste skills to the test at some point but the refactoring involved to do it properly seems a bit more than I expected at first.

cloud-az · 2023-06-14T07:58:41Z

Hi. What is the status of using Kured without setting privileged: true?

ckotzbauer · 2023-08-04T08:04:36Z

There's no new status right now @cloud-az. But we need to bring things forward for the security-topic, would be glad if you could help us.

ckotzbauer · 2023-08-04T08:59:41Z

Just re-read all security-related threads. As discussed in this issue, I think we should start with the implementation of a second reboot-type in addition to the current "command" implementation as described by @rptaylor here.

Parallel to that the check logic needs to be adapted with the ro-hostPath approach from #526.
Both changes are not that hard to implement I think, but they need very good testing, where we need all of you folks @cloud-az @rptaylor @pjbgf @evrardjp @jackfrancis

TODOs:

- Implement the --reboot-method flag with a "command" option, proper validation logic, and a small code refactoring to prepare for the second type afterwards. (Add signal-reboot #814)
- Implement an additional "signal" type which sends SIGRTMIN+5 to PID 1 (p.Signal(syscall.Signal(34 + 5))). (Add signal-reboot #814)
- When no custom --reboot-sentinel-command is given, execute the test -f without nsenter. (Sentinel-command without nsenter by default #813)
- Change the helm-chart to support the new --reboot-method flag. (Integrate "reboot-signal" charts#51)
- Change the helm-chart to add a read-only hostPath-mount to the directory of the sentinel-file and configure the --reboot-sentinel flag accordingly. Only add the mount and the flag if there's no custom --reboot-sentinel-command given. (Mount sentinel-location without sentinel-command charts#49)

I can implement some parts on my own, but I would be happy if someone could help 😉

ckotzbauer · 2023-08-04T14:49:05Z

Current state: I've also implemented the first two todos locally and tested it in my home-cluster. The signal in general reboots the server (Ubuntu 20.04 LTS) gracefully, that looks good. However, with this securityContext the command errors with "permission denied":

    securityContext:
      allowPrivilegeEscalation: false
      capabilities:
        add:
        - CAP_KILL
        drop:
        - '*'
      privileged: false

Does somebody know which other capabilities might be needed?

rptaylor · 2023-08-04T18:36:53Z

Does somebody know which other capabilities might be needed?

Is the process still running as UID 0?
As far as capabilities go, it seems like CAP_KILL should be sufficient. To confirm whether your process actually has that capability in the container try getpcaps, or grep ^Cap /proc/<PID>/status and then use capsh --decode= on the displayed strings. Other things that come to mind which could possibly get in the way would be SELinux and seccomp. In the old way, that would be controlled by Pod Security Policy. I am not familiar with the new Pod Security Standards yet so not sure if/how SELinux and seccomp would be applicable there.

The pod will also need hostPID but that was also an old PSP concept. In any case if you set the Pod Security Admission mode to 'warn' it should bypass any of those issues for debugging purposes.

ckotzbauer · 2023-08-04T19:35:29Z

Thanks for your reply, I will have a more detailed look tomorrow.

ckotzbauer · 2023-08-05T09:26:33Z

@rptaylor

Is the process still running as UID 0?

yes

The granted capabilities for the container-process are the following, cap_kill is available.

0x00000000a80425fb=cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap

Neither PSPs nor Pod Security Admission are turned on in the cluster.
By default the RuntimeDefault Seccomp-Profile is used. However, also with the permissive Unconfined profile, the signal is denied. SELinux is disabled on the host.

Edit: Also with securityContext.capabilities.add='*' there are no additional capabilities permitted.
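For anyone repeating this check, the capability mask shown above can also be decoded by testing individual bits. The Go sketch below is a hypothetical helper (hasCap is not a real library call); the bit numbers 5 for CAP_KILL and 22 for CAP_SYS_BOOT come from the Linux capability definitions, and the input is the hex string as it appears in the Cap* lines of /proc/<PID>/status.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Bit positions from the Linux capability definitions.
const (
	capKill    = 5
	capSysBoot = 22
)

// hasCap reports whether the given capability bit is set in a hex mask
// such as the CapEff value in /proc/<PID>/status. (Hypothetical helper.)
func hasCap(hexMask string, bit uint) (bool, error) {
	s := strings.TrimPrefix(strings.TrimSpace(hexMask), "0x")
	mask, err := strconv.ParseUint(s, 16, 64)
	if err != nil {
		return false, err
	}
	return mask&(1<<bit) != 0, nil
}

func main() {
	const mask = "00000000a80425fb" // value reported in this thread
	kill, _ := hasCap(mask, capKill)
	boot, _ := hasCap(mask, capSysBoot)
	fmt.Println("CAP_KILL:", kill)     // prints CAP_KILL: true
	fmt.Println("CAP_SYS_BOOT:", boot) // prints CAP_SYS_BOOT: false
}
```

This agrees with the decoded list: cap_kill is present and cap_sys_boot is not, so a missing CAP_KILL does not explain the "permission denied" error.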

ckotzbauer · 2023-08-05T13:39:43Z

PR is opened with further testing-instructions: #814

sftim · 2023-11-24T01:34:27Z

An idea:

- set up a systemd unit that triggers a reboot
- have a Pod that is able to activate the previous unit by writing to a path

…

also have a unit that does the actual reboot

[Unit]
Description=Trigger patch and reboot
BindsTo=prepare-patch-and-reboot-trigger.service
After=prepare-patch-and-reboot-trigger.service

[Path]
# To trigger it, create a file or directory named /run/trigger-patch-and-reboot
PathExists=/run/trigger-patch-and-reboot

[Install]
# Enable the filesystem watching by default
WantedBy=multi-user.target

along with

# thinking of something like https://bootlin.com/pub/conferences/2022/elce/opdenacker-implementing-A-B-system-updates-with-u-boot/opdenacker-implementing-A-B-system-updates-with-u-boot.pdf
[Unit]
Description=Trigger reboot for update

[Service]
Type=oneshot
ExecStart=/bin/systemctl isolate prepare-patch-and-reboot.target

rptaylor · 2023-12-14T23:37:50Z

An idea:

@sftim I like that idea, it would only require write privilege to a hostPath, so the container could be fully unprivileged, with no need to execute commands or send signals. I didn't know systemd could trigger based on a path, neat. Another reboot method "path" in addition to the "command" and "signal" ones would be needed.

That being said, setting up systemd unit files on the node means more of the solution is living outside kured and would need to be configured out of band. However, a systemd timer (like a cron job) would be useful to apply the OS updates in the first place on nodes and set a sentinel flag. So, that would make a valid rationale for cluster admins to be adding systemd files on their nodes already.

sftim · 2023-12-15T09:30:45Z

The systemd method could be one option of several, and would suit cluster admins who make (or consume) custom OS images. A custom node image can include a reboot trigger path for Pods to use.

rptaylor · 2023-12-15T23:25:06Z

This will be closed hopefully by #814 but I made #868 to retain the path idea.

rptaylor mentioned this issue Jul 28, 2021

Redesign idea: "Central nervous system for Kured" #359

Open

github-actions bot added the no-issue-activity label Sep 28, 2021

ckotzbauer removed the no-issue-activity label Sep 28, 2021

dholbach added the keep This won't be closed by the stale bot. label Sep 28, 2021

rptaylor mentioned this issue Mar 31, 2022

Tighten permissions (security best practices) #451

Open

rptaylor mentioned this issue Apr 4, 2022

improve security of sentinel mechanism #526

Closed

ckotzbauer added the security label Apr 5, 2022

ckotzbauer mentioned this issue Feb 3, 2023

Kured pods need to run without privileged permissions #722

Closed

ckotzbauer mentioned this issue Aug 5, 2023

Add signal-reboot #814

Merged

ckotzbauer added this to the 1.15.0 milestone Aug 12, 2023

rptaylor mentioned this issue Dec 15, 2023

path-based reboot mechanism #868

Open

ckotzbauer closed this as completed in #814 Jan 6, 2024
