Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Branch protection rules prevent first PR of a merge-batch from being merged #32097

Closed
oliver-goetz opened this issue Feb 28, 2024 · 9 comments
Closed
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. sig/testing Categorizes an issue or PR as relevant to SIG Testing.

Comments

@oliver-goetz
Copy link
Contributor

What happened:
When tide is creating a new merge-batch it starts the required tests for the batch, but also trigger those tests for its member with the lowest PR number (PR 9200 in my screenshot below).

Screenshot 2024-02-28 at 16 14 50

tide starts merging the PRs directly when the tests for the merge-batch are completed.
If there are branch-protection rules those rules prevent the PR with the lowest number (PR 9200) from being merged until its individual tests succeeded.
This results in a race between both test runs. When the merge-batch test run wins, all PRs except the first are merged and the first PR becomes part of the next merge-batch.
It is likely that the merge-batch test run wins the race, because it is created first.

Screenshot 2024-02-28 at 16 47 21

As you can see in tide history, poor PR 9200 already lost a couple of races and is still not merged yet.

Screenshot 2024-02-28 at 17 56 15

While I'm writing this issue, 9200 lost another race 😢

What you expected to happen:
I except that all PRs of a merge-batch are merged by tide if the tests succeed.

I could imagine two approaches to solve the issue:

  1. tide could wait until all tests of the single PR are finished too before it starts merging
  2. tide could still merge directly when the tests of the merge-batch succeed. In this case it could overwrite the contexts of the tests of the single PR

How to reproduce it (as minimally and precisely as possible):

Please provide links to example occurrences, if any:
You could check the history of PR 9200 in our prow instance:
https://prow.gardener.cloud/tide-history?repo=gardener%2Fgardener&pull=9200

Anything else we need to know?:
I'm happy to implement a solution if we can agree on an approach.

@oliver-goetz oliver-goetz added the kind/bug Categorizes issue or PR as related to a bug. label Feb 28, 2024
@k8s-ci-robot k8s-ci-robot added the needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. label Feb 28, 2024
@oliver-goetz
Copy link
Contributor Author

/sig testing

@k8s-ci-robot k8s-ci-robot added sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Feb 28, 2024
@petr-muller
Copy link
Member

Ouch, that's painful. Nice report 👍

Personally I like the second approach:

tide could still merge directly when the tests of the merge-batch succeed. In this case it could overwrite the contexts of the tests of the single PR

When a batch passes, Tide could abort the individual jobs for the single PR and backfill their results from the jobs run within a batch. Not sure how feasible that is, but it feels better than waiting for the single PR (which may take more time and even fail in the presence of flakes)

@oliver-goetz
Copy link
Contributor Author

Okay 👍
I did not find the time yet, but maybe I can do something about it the next weeks 😅

@oliver-goetz
Copy link
Contributor Author

/assign

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 8, 2024
@BenTheElder
Copy link
Member

unfortunately github doesn't support migrating issues across organizations, this appears to be active? @oliver-goetz can you re-file this against kubernetes-sigs/prow, we're trying to consolidate prow issues there as part of the migration.

I dropped a ping in #prow slack to nudge re: reviewing your PR

@oliver-goetz
Copy link
Contributor Author

Thanks @BenTheElder 😄
I just re-filed this issue: kubernetes-sigs/prow#209

/close

@k8s-ci-robot
Copy link
Contributor

@oliver-goetz: Closing this issue.

In response to this:

Thanks @BenTheElder 😄
I just re-filed this issue: kubernetes-sigs/prow#209

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@BenTheElder
Copy link
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. sig/testing Categorizes an issue or PR as relevant to SIG Testing.
Projects
None yet
5 participants