vpa-admission-controller: Wire contexts #6891

Open
ialidzhikov opened this issue Jun 4, 2024 · 1 comment · May be fixed by #6899
Labels
area/vertical-pod-autoscaler kind/bug Categorizes issue or PR as related to a bug.

Comments

@ialidzhikov
Contributor

Which component are you using?:

vertical-pod-autoscaler

What version of the component are you using?:

Component version: 1.1.2

What k8s version are you using (kubectl version)?:

v1.29

What did you expect to happen?:
Right now, the requests the vpa-admission-controller handles are not contextified. For example, in the handler for Pods, context.TODO() is used in a few places where the admission-controller makes requests along the way.

Due to the usages of context.TODO(), when the caller (kube-apiserver) cancels the request (due to a client-side timeout), the admission-controller's Pod handler is not notified and continues to process the request even though it has already been cancelled client-side.
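For illustration, a minimal sketch of what the wiring could look like, assuming the handler passes the webhook request's context down instead of context.TODO() (the handler and client names here are hypothetical, not the actual vpa-admission-controller code):

```go
// Sketch only: illustrative webhook handler, not the actual VPA code.
package main

import (
	"net/http"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

type server struct {
	client kubernetes.Interface
}

func (s *server) serveVPA(w http.ResponseWriter, r *http.Request) {
	// r.Context() is cancelled when the kube-apiserver gives up on the
	// webhook call (e.g. after its 10s timeout), so every downstream
	// request made with it is aborted as well.
	ctx := r.Context()

	// Before: context.TODO() detaches this call from the caller's lifetime.
	// After: the wired ctx propagates the kube-apiserver's cancellation.
	if _, err := s.client.CoreV1().Pods("default").List(ctx, metav1.ListOptions{}); err != nil {
		http.Error(w, err.Error(), http.StatusInternalServerError)
		return
	}
	// ... build and write the admission response ...
}

func main() {
	s := &server{} // client initialization omitted in this sketch
	http.HandleFunc("/", s.serveVPA)
	_ = http.ListenAndServe(":8080", nil)
}
```

Since client-go accepts a context on every call, cancelling r.Context() aborts both in-flight requests and pending rate-limiter waits.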

We recently faced a VPA-related outage (described in #6884) in which the vpa-admission-controller was client-side throttled due to the low default kube-api-qps/burst settings.

From the logs we can see that it was throttled for more than 50 minutes:

{"log":"Waited for 51m21.05416376s due to client-side throttling, not priority and fairness, request: GET:https://kube-apiserver/apis/monitoring.coreos.com/v1/namespaces/foo/prometheuses/bar/scale","pid":"1","severity":"INFO","source":"request.go:697"}
{"log":"Waited for 51m21.024486679s due to client-side throttling, not priority and fairness, request: GET:https://kube-apiserver/apis/monitoring.coreos.com/v1/namespaces/foo/prometheuses/bar/scale","pid":"1","severity":"INFO","source":"request.go:697"}
{"log":"Waited for 51m20.527328217s due to client-side throttling, not priority and fairness, request: GET:https://kube-apiserver/apis/monitoring.coreos.com/v1/namespaces/foo/prometheuses/bar/scale","pid":"1","severity":"INFO","source":"request.go:697"}
{"log":"Waited for 51m19.975656855s due to client-side throttling, not priority and fairness, request: GET:https://kube-apiserver/apis/monitoring.coreos.com/v1/namespaces/foo/prometheuses/bar/scale","pid":"1","severity":"INFO","source":"request.go:697"}
{"log":"Waited for 51m19.466347921s due to client-side throttling, not priority and fairness, request: GET:https://kube-apiserver/apis/monitoring.coreos.com/v1/namespaces/foo/prometheuses/bar/scale","pid":"1","severity":"INFO","source":"request.go:697"}
{"log":"Waited for 51m18.572692764s due to client-side throttling, not priority and fairness, request: GET:https://kube-apiserver/apis/monitoring.coreos.com/v1/namespaces/foo/prometheuses/bar/scale","pid":"1","severity":"INFO","source":"request.go:697"}

Hence, the vpa-admission-controller currently waits out the client-side throttling (> 50min) instead of canceling the request.
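The throttling wait itself is context-aware: client-go's token-bucket rate limiter exposes Wait(ctx), which returns early once the context is cancelled. A standalone sketch of that behaviour (not VPA code; the tiny QPS value is chosen only to force throttling):

```go
package main

import (
	"context"
	"fmt"
	"time"

	"k8s.io/client-go/util/flowcontrol"
)

func main() {
	// Deliberately tiny QPS/burst to force client-side throttling:
	// one token per ~100s after the single burst token is spent.
	limiter := flowcontrol.NewTokenBucketRateLimiter(0.01, 1)
	limiter.TryAccept() // consume the burst token

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	// With a cancellable context the wait ends after ~2s instead of ~100s.
	// With context.TODO()/Background() it would block for the full interval.
	if err := limiter.Wait(ctx); err != nil {
		fmt.Println("wait aborted:", err) // context deadline exceeded
	}
}
```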

Meanwhile, the kube-apiserver cancelled the request after the timeout configured for the webhook (10s in our case):

```
E0604 13:21:25.379831       1 dispatcher.go:214] failed calling webhook "vpa.k8s.io": failed to call webhook: Post "https://vpa-webhook:443/?timeout=10s": context deadline exceeded
```

What happened instead?:

See above.

How to reproduce it (as minimally and precisely as possible):

Add a long sleep (longer than the kube-apiserver's webhook timeout) to the Pod handler and verify that the admission-controller continues processing the admission request after the kube-apiserver has cancelled it client-side. See the sketch below.
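A hedged sketch of such a repro, using a stand-in handler (handlePod here is hypothetical, not the actual VPA handler):

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// handlePod stands in for the vpa-admission-controller's Pod handler.
func handlePod(ctx context.Context) error {
	select {
	case <-time.After(30 * time.Second): // simulated slow work, longer than the 10s webhook timeout
		// Without wired contexts this branch is always taken: the handler
		// keeps working even though the kube-apiserver gave up long ago.
		fmt.Println("kept working after the caller gave up")
		return nil
	case <-ctx.Done():
		// With wired contexts the handler observes the cancellation instead.
		return ctx.Err()
	}
}

func main() {
	// Simulate the kube-apiserver's 10s client-side webhook timeout.
	ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
	defer cancel()
	fmt.Println(handlePod(ctx)) // prints "context deadline exceeded" after ~10s
}
```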

Anything else we need to know?:

N/A

@ialidzhikov ialidzhikov added the kind/bug Categorizes issue or PR as related to a bug. label Jun 4, 2024
@Shubham82
Contributor

/area vertical-pod-autoscaler
