Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Git resolver timeout > 1m doesn't work #6025

Open
lbernick opened this issue Jan 24, 2023 · 6 comments · May be fixed by #8366
Open

Git resolver timeout > 1m doesn't work #6025

lbernick opened this issue Jan 24, 2023 · 6 comments · May be fixed by #8366
Labels
area/resolution Issues related to remote resolution kind/bug Categorizes issue or PR as related to a bug. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now.

Comments

@lbernick
Copy link
Member

Expected Behavior

Git resolution times out after the amount of time specified by "fetch-timeout" in the git-resolver-config configmap in the tekton-pipelines-resolvers namespace.

Actual Behavior

Resolution times out after 1m. Possibly because of this?

Additional Info

  • Kubernetes version:

    Output of kubectl version:

Client Version: version.Info{Major:"1", Minor:"24", GitVersion:"v1.24.3", GitCommit:"aef86a93758dc3cb2c658dd9657ab4ad4afc21cb", GitTreeState:"clean", BuildDate:"2022-07-13T14:21:56Z", GoVersion:"go1.18.4", Compiler:"gc", Platform:"darwin/arm64"}
Kustomize Version: v4.5.4
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.13-gke.900", GitCommit:"900b919b3b8275ecb9f70a489a318b5b08a4ab9c", GitTreeState:"clean", BuildDate:"2022-10-26T09:25:23Z", GoVersion:"go1.17.13b7", Compiler:"gc", Platform:"linux/amd64"}
  • Tekton Pipeline version:

    Output of tkn version or kubectl get pods -n tekton-pipelines -l app=tekton-pipelines-controller -o=jsonpath='{.items[0].metadata.labels.version}'

Client version: 0.25.0
Chains version: v0.8.0
Pipeline version: v0.41.0
Triggers version: v0.22.0
Dashboard version: v0.31.0
@lbernick lbernick added the kind/bug Categorizes issue or PR as related to a bug. label Jan 24, 2023
@abayer
Copy link
Contributor

abayer commented Jan 24, 2023

Yup, because of that. I missed that TODO. Whoops.

@lbernick
Copy link
Member Author

A brief explanation of why this issue is occurring:

When Pipelines sees a reference to a remote task or a remote pipeline, it creates a resolution request object that is labeled with the resolver that should handle the request. Resolvers, including the git resolver, watch for resolution requests to be created that they can handle. The git resolver will time out fetching something from git after the time configured in the git resolver feature flags. However, pipelines will cancel the resolution request after 1 minute. So if the git timeout is > 1m, the git resolver won't time out after 1m, but the request will get timed out anyway.

One idea for solving this is to allow the resolver to update the resolution request with something indicating "resolution is in progress, please don't time this out". The global timeout would then only apply if the resolver hadn't updated the resolution request at all.

@EmmaMunley
Copy link
Contributor

/assign @EmmaMunley

I can take this on. I think we can start with making the global timeout configurable, and later remove the git timeout if it's too confusing.

abayer added a commit to abayer/tektoncd-pipeline that referenced this issue Feb 2, 2023
fixes tektoncd#6091

(this can be merged before tektoncd/plumbing#1333, but it'd definitely be beter for that Plumbing PR to merge first)

This switches how we resolve the `publish-release` `Task` in `release-pipeline` from an anonymous git clone to using the GitHub API. The full clone approach is almost always timing out, in part thanks to tektoncd#6025, but even if it finished successfully every time, it'd still be adding at least a minute of extra time to the pipeline execution for no particularly good reason. Using the GitHub API-based resolution instead means no clone is needed, with the resolver just making a couple very specific API calls to get the contents of the specified file. So yeah, much faster!

Signed-off-by: Andrew Bayer <[email protected]>
tekton-robot pushed a commit that referenced this issue Feb 3, 2023
fixes #6091

(this can be merged before tektoncd/plumbing#1333, but it'd definitely be beter for that Plumbing PR to merge first)

This switches how we resolve the `publish-release` `Task` in `release-pipeline` from an anonymous git clone to using the GitHub API. The full clone approach is almost always timing out, in part thanks to #6025, but even if it finished successfully every time, it'd still be adding at least a minute of extra time to the pipeline execution for no particularly good reason. Using the GitHub API-based resolution instead means no clone is needed, with the resolver just making a couple very specific API calls to get the contents of the specified file. So yeah, much faster!

Signed-off-by: Andrew Bayer <[email protected]>
JeromeJu pushed a commit to JeromeJu/pipeline that referenced this issue Feb 6, 2023
fixes tektoncd#6091

(this can be merged before tektoncd/plumbing#1333, but it'd definitely be beter for that Plumbing PR to merge first)

This switches how we resolve the `publish-release` `Task` in `release-pipeline` from an anonymous git clone to using the GitHub API. The full clone approach is almost always timing out, in part thanks to tektoncd#6025, but even if it finished successfully every time, it'd still be adding at least a minute of extra time to the pipeline execution for no particularly good reason. Using the GitHub API-based resolution instead means no clone is needed, with the resolver just making a couple very specific API calls to get the contents of the specified file. So yeah, much faster!

Signed-off-by: Andrew Bayer <[email protected]>
@lbernick lbernick added the area/resolution Issues related to remote resolution label Feb 15, 2023
@EmmaMunley EmmaMunley removed their assignment May 11, 2023
@tekton-robot
Copy link
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 9, 2023
@vdemeester
Copy link
Member

/lifecycle frozen

@tekton-robot tekton-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Aug 10, 2023
@anithapriyanatarajan
Copy link

/remove-lifecycle stale

Please prioritize this issue since this is impacting timely completion of many pipeline runs relying on remote tasks

@vdemeester vdemeester added the priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. label Oct 29, 2024
@jkhelil jkhelil linked a pull request Nov 5, 2024 that will close this issue
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/resolution Issues related to remote resolution kind/bug Categorizes issue or PR as related to a bug. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants