Error while scaling a deployment #54

Open
abuechler opened this issue Apr 19, 2024 · 8 comments

@abuechler (Contributor) commented Apr 19, 2024

I guess this has been seen before, but I couldn't find any related issue. I'm not sure whether this is actually a bug or just a "wrong" log level for this kind of message.

How to reproduce

  1. Create a deployment with replicas > 1
  2. Wait until the PDB gets created by the pdb-controller
  3. Scale the deployment down to replicas=1
  4. The pdb-controller logs an error as shown below
  5. The PDB for the deployment gets deleted as expected

In that case the pdb-controller logs the following error (on GKE 1.27.11 and Docker Desktop Kubernetes 1.28.2):

pdb-controller-7bb68c7cc7-xqg2l pdb-controller time="2024-04-19T13:01:13Z" level=info action=added namespace=web pdb=nginx-deployment-pdb-controller selector="&LabelSelector{MatchLabels:map[string]string{app: nginx-test,},MatchExpressions:[]LabelSelectorRequirement{},}"
pdb-controller-7bb68c7cc7-xqg2l pdb-controller time="2024-04-19T13:02:13Z" level=info action=removed namespace=web pdb=nginx-deployment-pdb-controller selector="&LabelSelector{MatchLabels:map[string]string{app: nginx-test,},MatchExpressions:[]LabelSelectorRequirement{},}"
pdb-controller-7bb68c7cc7-xqg2l pdb-controller time="2024-04-19T13:02:13Z" level=error msg="Failed to update PDB: Operation cannot be fulfilled 

Sample deployment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx-test
  template:
    metadata:
      labels:
        app: nginx-test
    spec:
      containers:
        - name: nginx-container
          image: nginx:latest
          ports:
            - containerPort: 80
@abuechler (Contributor, Author)

> To report a new bug you should open an issue that summarizes the bug and set the label to "bug".

It seems I can't add labels, sorry.

@mikkeloscar (Owner)

@abuechler Is maybe part of the last log line missing? I assume the log says that it can't modify the PDB because it was modified by another resource just before.

@abuechler (Contributor, Author) commented Apr 19, 2024

> @abuechler Is maybe part of the last log line missing? I assume the log says that it can't modify the PDB because it was modified by another resource just before.

No, there are no more log lines afterwards. The logs look exactly the same on GKE and on Docker Desktop (with no other components running).

@mikkeloscar (Owner)

Ok, I also see this sort of error in our setup. It looks like this:

Failed to update PDB: Operation cannot be fulfilled on poddisruptionbudgets.policy \"foo\": StorageError: invalid object, Code: 4, Key: /registry-foo1/poddisruptionbudgets/default/foo, ResourceVersion: 0, AdditionalErrorMsg: Precondition failed: UID in precondition: 9263b1a8-a10e-4e6f-93ad-fbd42ddbcf18, UID in object meta: 

This could be because it tries to update the PDB after it was already deleted. This seems like a small bug. It should still work, though, as you describe.
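
Roughly, something along these lines could treat that case as benign instead of logging it as an error. This is just a client-go sketch to illustrate the idea, not the current controller code; updatePDB and the log wording are made up for the example, and whether the StorageError above actually surfaces as a conflict or as a plain not-found would need to be verified:

package controller

import (
	"context"

	log "github.com/sirupsen/logrus"
	policyv1 "k8s.io/api/policy/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// updatePDB is a hypothetical helper: it tries to update a PDB and treats
// "the PDB was deleted or changed underneath us" as an expected race rather
// than an error, since the next resync reconciles the desired state anyway.
func updatePDB(ctx context.Context, client kubernetes.Interface, pdb *policyv1.PodDisruptionBudget) {
	_, err := client.PolicyV1().PodDisruptionBudgets(pdb.Namespace).Update(ctx, pdb, metav1.UpdateOptions{})
	switch {
	case err == nil:
		// Updated successfully, nothing to log.
	case apierrors.IsNotFound(err) || apierrors.IsConflict(err):
		// The PDB was removed or modified concurrently, e.g. because the
		// deployment was just scaled down to 1 replica and the controller
		// deleted the PDB in the same resync. Log at info level only.
		log.Infof("Skipping update, PDB %s/%s changed concurrently: %v", pdb.Namespace, pdb.Name, err)
	default:
		log.Errorf("Failed to update PDB %s/%s: %v", pdb.Namespace, pdb.Name, err)
	}
}

With something like that, a scale-down that removes the PDB in the middle of an update would show up as an info line instead of the error above.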

@abuechler (Contributor, Author)

@mikkeloscar Thank you for maintaining this project and reacting so quickly! 🙏

@michohl (Contributor) commented Aug 5, 2024

Is there any update on this? We're seeing the same error in our production logs. Is this something that can be safely ignored, or does it need to be addressed?

@szuecs (Collaborator) commented Aug 15, 2024

@michohl I merged your PR, thanks for that one!
Please report back whether the log entries are reduced or gone. As far as I understand from @mikkeloscar's response, the problem is only a logging issue and nothing serious.

If we can close the issue, that would of course be great 😊

@michohl (Contributor) commented Aug 18, 2024

@szuecs I can confirm that the change I submitted resolves the erroneous errors (agreed, they weren't actually harmful) in all of our own clusters. I don't run the upstream code exactly, though; we run a very slightly tweaked fork.

So I would defer the final confirmation to @abuechler, just in case there's some weird change in our fork that makes the change work better. I can't imagine any scenario where that would be true, but I would hate to tell you it's good without testing the exact code in question.
