Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid repeat installs (#6) #7

Closed
wants to merge 1 commit into from

Conversation

rvesse
Copy link
Contributor

@rvesse rvesse commented Oct 14, 2024

This commit adds tracking and detection of when the current job has already called setup-trivy (whether directly/indirectly) and avoids repeatedly installing it once it has been installed

This is my proposed fix for #6, basically the action now does the following:

  • It sets a TRIVY_INSTALLED environment variable when it finishes installation which contains the version and path from the action inputs.
  • As the first step of the action checks whether that environment variable is populated with the values from the inputs and if so sets an output indicating that the requested version is already installed.
  • In all subsequent steps makes them conditional on that check reporting false

Example workflow with the fix in place:

Screenshot 2024-10-14 at 11 16 17

This is from my debugging repository (rvesse/setup-trivy-debugging) and you can see the full output at https://github.com/rvesse/setup-trivy-debugging/actions/runs/11325497781/job/31492526052

@CLAassistant
Copy link

CLAassistant commented Oct 14, 2024

CLA assistant check
All committers have signed the CLA.

action.yaml Show resolved Hide resolved
if [ "$TRIVY_INSTALLED" == "${{ inputs.version }}-${{ inputs.path }}" ]; then
echo "Trivy '${{ inputs.version }}' has already been installed by the current job, skipping reinstalling it again"
echo "installed=true" >> $GITHUB_OUTPUT
else
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be an argument for having an elif [ -n "$TRIVY_INSTALLED" ]; then branch here that generates a warning because if people are installing multiple versions of Trivy in a single job as there's a few pitfalls there:

  • Only the first install will be automatically found via the PATH as the action always appends to $GITHUB_PATH
  • If they don't set the path input to different values then one version overwrites the other which may/may not be what the user intended

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebased the PR to fix conflicts and added this extra check in:

Screenshot 2024-10-15 at 09 32 55

@DmitriyLewen
Copy link
Contributor

Hi @rvesse
I don't see similar logic in similar actions (e.g. setup-go) and don't see a real use case for these changes.
Can you give more real world examples for this?

@knqyf263 what do you think about these changes? I'm not sure if this can be useful for more users

It looks like if you want these changes, you can use the path input to set the path to the Trivy binary, and use that path in your action.

@knqyf263
Copy link
Collaborator

Am I understanding correctly that you are concerned about the overhead when setup-trivy is called multiple times? I can't think of many such cases, and I think it is fast enough since it is restored from the cache.

@rvesse
Copy link
Contributor Author

rvesse commented Oct 15, 2024

Hi @rvesse I don't see similar logic in similar actions (e.g. setup-go) and don't see a real use case for these changes. Can you give more real world examples for this?

So the difference there is that people are directly calling setup-go/setup-java etc once, usually at the start of their workflow and they control when that action is invoked.

In the case of setup-trivy users are not calling it directly, rather it is getting indirectly called whenever a user invokes trivy-action in their workflow.

As described in my issue #6 many users, including my company, are calling the trivy-action multiple times for legitimate reasons e.g.

  • Obtaining scan results in multiple output formats
  • Obtaining complete scan results for all severities for reporting/audit purposes while separately failing the build based on vulnerabilities exceeding a specific severity threshold
  • Multiple scan types in the same workflow e.g. Docker misconfiguration (config) and Docker image (image)

@knqyf263 what do you think about these changes? I'm not sure if this can be useful for more users

Note also that this repeated install risks creating a new source of rate limiting errors because now user workflows are querying the GitHub API for releases every time their workflow calls setup-trivy (which as noted above they typically don't directly do, it happens indirectly via the trivy-action action). While release downloads themselves are not subject to rate limits the API calls to find the release are subject to those.

It looks like if you want these changes, you can use the path input to set the path to the Trivy binary, and use that path in your action.

If I do that then I can't use trivy-action anymore because it is always going to invoke setup-trivy, I then have to manually call setup-trivy and manually invoke trivy myself in generic run based steps

The only way that approach would work was if you had a skip-setup input (or equivalent) in the trivy-action that allowed skipping the step that calls setup-trivy in the main action

@rvesse rvesse force-pushed the avoid-repeat-install branch from 9029297 to cdb9d8d Compare October 15, 2024 08:17
This commit adds tracking and detection of when the current job has
already called setup-trivy (whether directly/indirectly) and avoids
repeatedly installing it once it has been installed
@rvesse rvesse force-pushed the avoid-repeat-install branch from cdb9d8d to 5b3d6a1 Compare October 15, 2024 08:19
@rvesse
Copy link
Contributor Author

rvesse commented Oct 15, 2024

Am I understanding correctly that you are concerned about the overhead when setup-trivy is called multiple times? I can't think of many such cases, and I think it is fast enough since it is restored from the cache.

It's only restored from the cache if users are using a fixed version, if they are using latest then the cache is always skipped

Yes installing it is generally fast but that's still not a justification to install it multiple times unnecessarily:

  1. It pollutes the workflow logs with avoidable duplicated log messages
  2. As setup-trivy is called implicitly from trivy-action and can't be disabled then this is out of the users control
  3. It has the potential to hit GitHub API rate limits if you have an active repository with lots of workflows all calling trivy-action multiple times implicitly calling setup-trivy which potentially queries the GitHub Releases API each time

@DmitriyLewen
Copy link
Contributor

hm... that makes sense.
I'm still not sure about adding TRIVY_INSTALLED, but skip-trivy-setup input for trivy-action seems like a good solution to me.

It will add flexibility for users:

  • they can install trivy themselves (setup-trivy, brew, apt, etc.).
  • they can change the path for trivy binaries
  • it also solves this problem (users just install Trivy once (like with setup-go) and use trivy in trivy-action))

@rvesse
Copy link
Contributor Author

rvesse commented Oct 15, 2024

@DmitriyLewen Ok, skip-setup approach in draft PR aquasecurity/trivy-action#414

@knqyf263
Copy link
Collaborator

It's only restored from the cache if users are using a fixed version, if they are using latest then the cache is always skipped

The version is pinned by default, right? So, we can just add a caveat that using latest may cause unneeded API calls.

@rvesse
Copy link
Contributor Author

rvesse commented Oct 16, 2024

Superseded by aquasecurity/trivy-action#414

@rvesse rvesse closed this Oct 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants