Add job for graceful node shutdown feature #32728
base: master
[APPROVALNOTIFIER] This PR is NOT APPROVED.
This pull-request has been approved by: torredil.
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
5362a44 to d27b4c7
7459705 to dfb8330
The regex change made in
/assign @mrunalp
/cc @wzshiming @bobbypage
/hold
The direction of these changes is not clear to me: the feature does not seem to have e2e tests (only e2e_node), and the change causes some existing tests to be skipped.
That is correct - today, see this PR for more context: kubernetes/kubernetes#125070. I synced up with @msau42 offline and we agreed to implement the e2e test written in that PR under
e67751e to 8ebdda9
Signed-off-by: torredil <[email protected]>
Sorry, I might be missing some context here, but what is the purpose of having an e2e test for this vs. relying on the existing node e2e tests? All of the logic for graceful shutdown is node-specific, so I'm trying to understand what additional signal a cluster e2e would provide.
@bobbypage, the node-specific logic is already covered by
For example, users often observe 6-minute delays for stateful pods to enter a running state after a node is gracefully terminated, because they have to wait for the A/D controller to issue a force detach if volumes were not unmounted in time.
^ kubernetes/kubernetes#125070 addresses that delay by taking volume status into account before proceeding with termination, and it includes an e2e test to validate that stateful pods enter a running state in a timely manner (as expected when nodes are gracefully terminated).
My understanding is that this level of validation is not possible with
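To make the timing argument above concrete, here is a toy model of it. This is not Kubernetes code: the function name and parameters are hypothetical, and the only number taken from the discussion is the 6-minute wait before the A/D controller force-detaches a volume that was not unmounted in time.

```python
from datetime import timedelta

# Delay quoted in the comment above: how long stateful pods wait for the
# A/D controller to force-detach a volume that was never unmounted.
FORCE_DETACH_DELAY = timedelta(minutes=6)


def time_until_volume_reusable(
    unmounted_before_termination: bool,
    termination_time: timedelta,
) -> timedelta:
    """Toy estimate of when a replacement pod can attach the volume.

    termination_time is how long after shutdown began the node went away
    (a hypothetical input for illustration only).
    """
    if unmounted_before_termination:
        # Clean unmount: the volume can be detached as soon as the node is gone.
        return termination_time
    # Volume still mounted at termination: wait for the force detach.
    return termination_time + FORCE_DETACH_DELAY


if __name__ == "__main__":
    shutdown = timedelta(seconds=30)
    print("clean unmount:", time_until_volume_reusable(True, shutdown))
    print("unmount missed:", time_until_volume_reusable(False, shutdown))
```

In this simplified picture, a cluster-level test is what observes the second branch, since the extra delay shows up only when the replacement pod tries to start on another node.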
cc: @bobbypage for review.
This PR introduces a new job to validate the graceful node shutdown feature:
[Feature:GracefulNodeShutdown].
SHUTDOWN_GRACE_PERIOD & SHUTDOWN_GRACE_PERIOD_CRITICAL_PODS environment variables to enable the graceful shutdown feature; see "Conditionally add the graceful shutdown Kubelet parameters", kubernetes#125413 (comment).
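As a rough illustration of what "conditionally add" could mean here, the sketch below emits graceful shutdown settings only when both environment variables are set. It assumes the variables map onto the KubeletConfiguration fields `shutdownGracePeriod` and `shutdownGracePeriodCriticalPods`; the actual wiring lives in the linked PR and may differ, and the helper name is hypothetical.

```python
import os
from typing import Dict, Mapping


def graceful_shutdown_kubelet_settings(env: Mapping[str, str]) -> Dict[str, str]:
    """Return kubelet graceful shutdown settings, or {} if not fully configured.

    Assumption: both variables must be non-empty for the feature to be enabled.
    """
    period = env.get("SHUTDOWN_GRACE_PERIOD", "")
    critical = env.get("SHUTDOWN_GRACE_PERIOD_CRITICAL_PODS", "")
    if not period or not critical:
        # Leave the kubelet defaults untouched when either value is missing.
        return {}
    return {
        "shutdownGracePeriod": period,
        "shutdownGracePeriodCriticalPods": critical,
    }


if __name__ == "__main__":
    # Reads the real environment; prints {} unless both variables are exported.
    print(graceful_shutdown_kubelet_settings(os.environ))
```

Returning an empty mapping when either value is absent keeps jobs that do not set the variables on their existing kubelet configuration.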