Resolve Errors Around Failed PDB Updates Against Deleted PDBs #57
…rageErrors Signed-off-by: Michael Riesberg-Timmer <[email protected]>
4962a5b to 6a69c00
After some more testing it seems this PR does resolve the second error message but it does not yet resolve the first one:
Even with a longer backoff than the default, it doesn't seem to care much for the update we're trying to make. I'm going to dig some more and try to find where the conflict is coming from.
After digging some more into this I still can't find exactly which fields are causing issues but because we're doing a I was able to work around this in our own clusters by adding a step after the

I think the PR can still go as it is now, and then I can come back with a second PR with the described logic for a separate review. I don't want to slow down the fix for the error around the controller attempting to update deleted PDBs. So I will modify the description of this PR to be more focused on that specific issue and will leave the failed updates for another PR afterwards.
The way retryOnConflict is used here doesn't seem right, and it doesn't seem to have any effect, according to how it's documented here: https://pkg.go.dev/k8s.io/client-go/util/retry#RetryOnConflict

I would propose removing those changes from this PR; then we can merge the continue fix to avoid updating deleted resources. Also, it's not really a problem if an update fails, because the controller will automatically retry in the next reconciliation loop.
6a69c00 to 896034a
Agreed on removing the I have been running a local fork of it in our clusters with a
👍
👍
This PR resolves the following error we were seeing in production logs, which is related to #54.
The issue occurs because the `reconcilePDBs` loop deletes the PDB and then attempts to update it after it has been deleted. This is resolved by ending the current loop iteration early if we delete the PDB.
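The control flow described above can be sketched roughly as follows. This is a standard-library illustration only: `pdb`, `fakeCluster`, `remove`, `update` and `reconcile` are hypothetical names, and the real `reconcilePDBs` code in the controller is not reproduced here.

```go
package main

import "fmt"

// pdb is a hypothetical stand-in for a PodDisruptionBudget.
type pdb struct {
	name     string
	obsolete bool // true when the PDB should be deleted
}

// fakeCluster is a hypothetical in-memory replacement for the API server.
type fakeCluster struct {
	objects map[string]bool
}

func (c *fakeCluster) remove(name string) { delete(c.objects, name) }

// update fails if the object no longer exists, like an update against a
// deleted PDB would.
func (c *fakeCluster) update(name string) error {
	if !c.objects[name] {
		return fmt.Errorf("pdb %q not found", name)
	}
	return nil
}

// reconcile deletes obsolete PDBs and updates the remaining ones.
func reconcile(c *fakeCluster, pdbs []pdb) []error {
	var errs []error
	for _, p := range pdbs {
		if p.obsolete {
			c.remove(p.name)
			continue // the object is gone, so skip the update below
		}
		if err := c.update(p.name); err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}

func main() {
	c := &fakeCluster{objects: map[string]bool{"a": true, "b": true}}
	errs := reconcile(c, []pdb{{"a", true}, {"b", false}})
	fmt.Println(len(errs))
}
```

Without the `continue`, the loop would fall through to `update` for the PDB it had just removed and record a "not found" error on every pass.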
I've also added a `RetryOnConflict` wrapper around the `update` and `create` methods, since it is easy to add from the `client-go` library and may prevent possible issues in the future around updating existing PDBs.