This repository has been archived by the owner on Jun 6, 2023. It is now read-only.

Zombi processes from readiness / liveness probes #50

Open
viceice opened this issue May 24, 2022 · 7 comments

@viceice
Contributor

viceice commented May 24, 2022

I've got a lot of zombie processes.

       0 2382791  0.8  0.2 712820  8716 ?        Sl   May17  88:11 /var/lib/rancher/k3s/data/8c2b0191f6e36ec6f3cb68e2302fcc4be850c6db31ec5f8a74e4b3be403101d8/bin/containerd-shim-runc-v2 -namespace k8s.io -id 21da19fa6f824bc4dd21aafe9148d07e95886390c7ef9caad10dcb181b585f58 -address /run/k3s/containerd/containerd.sock
   65535 2382814  0.0  0.0    972     4 ?        Ss   May17   0:00  \_ /pause
       0 1523052  1.5  0.6 648720 23988 ?        Ssl  May19  96:49  \_ keydb-server 0.0.0.0:6379
       0 2207040  0.0  0.0      0     0 ?        Z    May20   0:00      \_ [ping_readiness_] <defunct>
       0 2398171  0.0  0.0      0     0 ?        Z    May20   0:00      \_ [ping_readiness_] <defunct>
       0 2419093  0.0  0.0      0     0 ?        Z    May20   0:00      \_ [ping_readiness_] <defunct>
       0 2921360  0.0  0.0      0     0 ?        Z    May20   0:00      \_ [ping_readiness_] <defunct>
       0 2921383  0.0  0.0      0     0 ?        Z    May20   0:00      \_ [ping_liveness_l] <defunct>
       0 3941935  0.0  0.0      0     0 ?        Z    May20   0:00      \_ [ping_liveness_l] <defunct>
       0 3941970  0.0  0.0      0     0 ?        Z    May20   0:00      \_ [ping_readiness_] <defunct>
       0 3942325  0.0  0.0      0     0 ?        Z    May20   0:00      \_ [ping_readiness_] <defunct>
       0  517206  0.0  0.0      0     0 ?        Z    May20   0:00      \_ [ping_readiness_] <defunct>
       0  517224  0.0  0.0      0     0 ?        Z    May20   0:00      \_ [ping_liveness_l] <defunct>
       0 1082427  0.0  0.0      0     0 ?        Z    May21   0:00      \_ [ping_readiness_] <defunct>
       0 1292829  0.0  0.0      0     0 ?        Z    May21   0:00      \_ [ping_readiness_] <defunct>
       0 3612252  0.0  0.0      0     0 ?        Z    May21   0:00      \_ [ping_readiness_] <defunct>
       0 3999899  0.0  0.0      0     0 ?        Z    May22   0:00      \_ [ping_readiness_] <defunct>
       0  316962  0.0  0.0      0     0 ?        Z    May22   0:00      \_ [ping_readiness_] <defunct>
       0 1221761  0.0  0.0      0     0 ?        Z    May22   0:00      \_ [ping_readiness_] <defunct>
       0 2383088  0.0  0.0      0     0 ?        Z    May22   0:00      \_ [ping_readiness_] <defunct>
       0 2770818  0.0  0.0      0     0 ?        Z    May23   0:00      \_ [ping_readiness_] <defunct>
       0 2899448  0.0  0.0      0     0 ?        Z    May23   0:00      \_ [ping_readiness_] <defunct>
       0 4044700  0.0  0.0      0     0 ?        Z    May23   0:00      \_ [ping_readiness_] <defunct>
       0  235003  0.0  0.0      0     0 ?        Z    May23   0:00      \_ [ping_readiness_] <defunct>
       0 1007972  0.0  0.0      0     0 ?        Z    May23   0:00      \_ [ping_readiness_] <defunct>
       0 1203442  0.0  0.0      0     0 ?        Z    May23   0:00      \_ [ping_readiness_] <defunct>
       0 1203464  0.0  0.0      0     0 ?        Z    May23   0:00      \_ [ping_liveness_l] <defunct>
       0 1203886  0.0  0.0      0     0 ?        Z    May23   0:00      \_ [ping_liveness_l] <defunct>
       0 1203888  0.0  0.0      0     0 ?        Z    May23   0:00      \_ [ping_readiness_] <defunct>
       0 1204235  0.0  0.0      0     0 ?        Z    May23   0:00      \_ [ping_readiness_] <defunct>
       0 2429265  0.0  0.0      0     0 ?        Z    06:33   0:00      \_ [ping_readiness_] <defunct>
       0 2451119  0.0  0.0      0     0 ?        Z    06:42   0:00      \_ [ping_readiness_] <defunct>
       0 2466469  0.0  0.0      0     0 ?        Z    06:49   0:00      \_ [ping_readiness_] <defunct>
       0 2557980  0.0  0.0      0     0 ?        Z    07:32   0:00      \_ [ping_readiness_] <defunct>

values.yml

persistentVolume:
  enabled: true
  storageClass: local-path
  size: 1Gi

resources:
  requests:
    memory: 64Mi
  limits:
    memory: 256Mi

loadBalancer:
  enabled: true
  extraSpec:
    externalTrafficPolicy: Local
    loadBalancerIP: 1.2.3.4

existingSecret: some-secret

@viceice changed the title from "Zombi processes from readiness probes" to "Zombi processes from readiness / liveness probes" on May 24, 2022

@Antiarchitect
Contributor

I couldn't reproduce this on bare metal. Could you please confirm it on other platforms such as minikube or kind?

@viceice
Contributor Author

viceice commented May 25, 2022

I'm seeing this on k3s.

@Antiarchitect
Contributor

Which k3s version are you using? I see some possibly related issues in the k3s project: k3s-io/k3s#2722

@viceice
Contributor Author

viceice commented May 25, 2022

I use v1.23.6+k3s1, so I don't think it's the containerd issue. I also run k3s on plain Ubuntu 20.04 virtual machines.

@thejan2009

thejan2009 commented Jun 14, 2022

I found the same bug, but with an in-house chart. Replacing the exec probes with any other probe type fixed the problem. It seems related to the notes at the bottom of https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#configure-probes, although that doesn't quite make sense, as k3s uses containerd rather than dockershim.

The root cause seems to be that the probe processes end up as children of the pod's main process, which needs to reap them once they exit.

Edit: this is also on k3s v1.23.6+k3s1 on Ubuntu 20.04.
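
To illustrate the reaping mentioned above: a process that has exited stays in the process table as a zombie (`<defunct>` in `ps`) until its parent collects its exit status with a wait call. The Python sketch below shows a non-blocking reaping loop. The function name `reap_exited_children` is made up for this example, and none of this is taken from keydb or the chart; it only demonstrates the general mechanism on a POSIX system.

```python
import os
import time


def reap_exited_children():
    """Collect every child that has already exited, without blocking.

    Returns the list of PIDs that were reaped.
    """
    reaped = []
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children at all
        if pid == 0:
            break  # children exist, but none of them has exited yet
        reaped.append(pid)
    return reaped


if __name__ == "__main__":
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # stand-in for a probe script that finishes immediately
    time.sleep(0.5)  # until the parent waits, the child stays a zombie
    print(reap_exited_children() == [pid])
```

A main process that never makes such a wait call for children it did not start itself would leave them defunct, which would match the `ps` output in the first comment.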

@viceice
Contributor Author

viceice commented Jun 14, 2022

For my own images I use dumb-init as the entrypoint, which does this job very well.

@viceice
Contributor Author

viceice commented Dec 16, 2022

I've now built a custom image that starts dumb-init before keydb, so hopefully no more zombies 🤞
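
For readers unfamiliar with the idea, here is a rough Python sketch of the kind of work a minimal init wrapper does when it runs as PID 1 in a container: start the real program, forward termination signals to it, and wait on every child, including orphans that get re-parented to it. This is not dumb-init's actual implementation; the function name `run_as_init` and the exit-code mapping are assumptions made for the example, and `os.waitstatus_to_exitcode` needs Python 3.9 or newer.

```python
import os
import signal
import sys


def run_as_init(argv):
    """Run argv as a child, forward SIGTERM/SIGINT to it, and reap every
    child that exits. Returns a shell-style exit code for the main child."""
    child = os.fork()
    if child == 0:
        try:
            os.execvp(argv[0], argv)  # replaces the child; returns only on error
        except OSError:
            os._exit(127)  # command not found or not executable

    def forward(signum, _frame):
        os.kill(child, signum)

    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, forward)

    while True:
        pid, status = os.wait()  # blocks until any child exits
        if pid != child:
            continue  # an orphan re-parented to us; it is now reaped
        code = os.waitstatus_to_exitcode(status)
        # Negative values mean "killed by signal N"; map to 128 + N.
        return code if code >= 0 else 128 - code


if __name__ == "__main__" and len(sys.argv) > 1:
    sys.exit(run_as_init(sys.argv[1:]))
```

Orphans are only re-parented to this process when it really is PID 1 in its PID namespace (or a registered subreaper), so the sketch has the intended effect only when used as the container entrypoint.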
