feat: store artifacts in cache by default #399

Merged
merged 13 commits into from
Oct 8, 2024
3 changes: 2 additions & 1 deletion .github/workflows/bump-trivy.yaml
@@ -17,8 +17,9 @@ jobs:
       - uses: actions/checkout@v4
       - name: Update Trivy versions
         run: |
-          sed -r -i "s/ghcr.io\/aquasecurity\/trivy:[0-9]+\.[0-9]+\.[0-9]+/ghcr.io\/aquasecurity\/trivy:${{ inputs.trivy_version }}/" Dockerfile
           find test/data -type f -name '*.test' | xargs sed -r -i 's/"version": "[0-9]+\.[0-9]+\.[0-9]+"/"version": "${{ inputs.trivy_version }}"/'
+          sed -r -i '/^\| `version`/ s/[0-9]+\.[0-9]+\.[0-9]+/${{ inputs.trivy_version }}/g' README.md
+          sed -r -i 's/(default:[ ]*'"'"')v[0-9]+\.[0-9]+\.[0-9]+/\1v${{ inputs.trivy_version }}/' action.yaml

       - name: Create PR
         id: create-pr
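
The `sed` expression that targets `action.yaml` uses the `'"'"'` idiom to match a literal single quote from inside a single-quoted script. The following is a minimal sketch of that substitution that can be run outside the workflow. It rests on a few assumptions: `bump_default_version` and the sample input line are illustrative names, `-E` is used instead of the workflow's GNU-specific `-r`, and a plain shell argument stands in for `${{ inputs.trivy_version }}`.

```shell
# Sketch: rewrite the version in a line such as "default: 'v0.56.1'".
# Reads from stdin and writes to stdout, so no file is modified in place.
bump_default_version() {
  new_version="$1"
  # '"'"' closes the single-quoted script, emits a literal single quote,
  # and reopens the script, so the pattern matches: default: 'vX.Y.Z
  sed -E 's/(default:[ ]*'"'"')v[0-9]+\.[0-9]+\.[0-9]+/\1v'"$new_version"'/'
}

# Illustrative input; the real workflow edits action.yaml with -i.
printf "    default: 'v0.56.1'\n" | bump_default_version 0.57.0
```

Lines whose default is not a `vX.Y.Z` string, such as `default: 'true'`, pass through unchanged because the pattern requires the `v` prefix followed by three numeric components.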
10 changes: 5 additions & 5 deletions .github/workflows/test.yaml
@@ -24,8 +24,8 @@ jobs:
           trivy --version

       - name: Test
-        run: |
-          chmod +x entrypoint.sh
-          GITHUB_REPOSITORY_OWNER=aquasecurity\
-          TRIVY_CACHE_DIR=.cache TRIVY_DISABLE_VEX_NOTICE=true TRIVY_DEBUG=true\
-          bats --recursive --timing --verbose-run .
+        run: bats --recursive --timing --verbose-run .
+        env:
+          TRIVY_CACHE_DIR: .cache
+          TRIVY_DISABLE_VEX_NOTICE: true
+          TRIVY_DEBUG: true
5 changes: 0 additions & 5 deletions Dockerfile

This file was deleted.

158 changes: 99 additions & 59 deletions README.md
@@ -10,12 +10,22 @@

## Table of Contents

- [Usage](#usage)
- [Workflow](#workflow)
- [Docker Image Scanning](#using-trivy-with-github-code-scanning)
- [Git Repository Scanning](#using-trivy-to-scan-your-git-repo)
- [Customizing](#customizing)
- [Inputs](#inputs)
* [Usage](#usage)
* [Scan CI Pipeline](#scan-ci-pipeline)
* [Scan CI Pipeline (w/ Trivy Config)](#scan-ci-pipeline-w-trivy-config)
* [Cache](#cache)
* [Scanning a Tarball](#scanning-a-tarball)
* [Using Trivy with GitHub Code Scanning](#using-trivy-with-github-code-scanning)
* [Using Trivy to scan your Git repo](#using-trivy-to-scan-your-git-repo)
* [Using Trivy to scan your rootfs directories](#using-trivy-to-scan-your-rootfs-directories)
* [Using Trivy to scan Infrastructure as Code](#using-trivy-to-scan-infrastructure-as-code)
* [Using Trivy to generate SBOM](#using-trivy-to-generate-sbom)
* [Using Trivy to scan your private registry](#using-trivy-to-scan-your-private-registry)
* [Using Trivy if you don't have code scanning enabled](#using-trivy-if-you-dont-have-code-scanning-enabled)
* [Customizing](#customizing)
* [inputs](#inputs)
* [Environment variables](#environment-variables)
* [Trivy config file](#trivy-config-file)

## Usage

@@ -36,8 +46,7 @@ jobs:
- name: Checkout code
uses: actions/checkout@v3
- name: Build an image from Dockerfile
run: |
docker build -t docker.io/my-organization/my-app:${{ github.sha }} .
run: docker build -t docker.io/my-organization/my-app:${{ github.sha }} .
- name: Run Trivy vulnerability scanner
uses: aquasecurity/[email protected]
with:
@@ -95,6 +104,86 @@ Trivy uses [Viper](https://github.com/spf13/viper) which has a defined precedenc
- Config file
- Default

### Cache
The action has built-in functionality for caching and restoring [the vulnerability DB](https://github.com/aquasecurity/trivy-db), [the Java DB](https://github.com/aquasecurity/trivy-java-db), and [the checks bundle](https://github.com/aquasecurity/trivy-checks) if they are downloaded during the scan.
The cache is stored in the `$GITHUB_WORKSPACE/.cache/trivy` directory by default.
The cache is restored before the scan starts and saved after the scan finishes.

It uses [actions/cache](https://github.com/actions/cache) under the hood but requires less configuration.
The `cache` input is optional, and caching is enabled by default.
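
The restore step can be pictured as a key lookup: an exact match on today's key wins, otherwise the newest key that shares the prefix is used. The sketch below is an illustration only. It assumes the `cache-trivy-<date>` key and the `cache-trivy-` restore prefix that this PR adds in `action.yaml`; `select_cache_key` is a hypothetical helper, and it approximates "most recently created" by sorting the date suffix, whereas `actions/cache` decides by creation time.

```shell
# Sketch: choose which cache entry would be restored.
# Usage: <list of keys on stdin> | select_cache_key <wanted-key> <prefix>
# Returns non-zero when nothing matches (a cache miss).
select_cache_key() {
  wanted="$1"
  prefix="$2"
  best=""
  while IFS= read -r key; do
    # An exact hit on the primary key takes priority.
    if [ "$key" = "$wanted" ]; then
      printf '%s\n' "$key"
      return 0
    fi
    case "$key" in
      "$prefix"*)
        # Keep the lexicographically largest match; ISO dates sort chronologically.
        if [ -z "$best" ] || [ "$(printf '%s\n%s\n' "$best" "$key" | sort | tail -n 1)" = "$key" ]; then
          best="$key"
        fi
        ;;
    esac
  done
  [ -n "$best" ] && printf '%s\n' "$best"
}

# No entry exists for 2024-10-08 yet, so the previous day's entry is selected.
printf '%s\n' cache-trivy-2024-10-06 cache-trivy-2024-10-07 \
  | select_cache_key cache-trivy-2024-10-08 cache-trivy-
```

In practice this means the first scan of a day typically starts from the previous day's databases, and Trivy then refreshes them if they are out of date.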

#### Disabling caching
To disable caching, set the `cache` input to `false`. We recommend keeping it enabled to avoid rate-limiting issues.

```yaml
- name: Run Trivy scanner without cache
  uses: aquasecurity/[email protected]
  with:
    scan-type: 'fs'
    scan-ref: '.'
    cache: 'false'
```

#### Updating caches in the default branch
Please note that there are [restrictions on cache access](https://docs.github.com/en/actions/writing-workflows/choosing-what-your-workflow-does/caching-dependencies-to-speed-up-workflows#restrictions-for-accessing-a-cache) between branches in GitHub Actions.
By default, a workflow can access and restore a cache created in either the current branch or the default branch (usually `main` or `master`).
If you need to share caches across branches, you may need to create a cache in the default branch and restore it in the current branch.

To optimize your workflow, you can set up a cron job to regularly update the cache in the default branch.
This allows subsequent scans to use the cached DB without downloading it again.

```yaml
# Note: This workflow only updates the cache. You should create a separate workflow for your actual Trivy scans.
# In your scan workflow, set TRIVY_SKIP_DB_UPDATE=true and TRIVY_SKIP_JAVA_DB_UPDATE=true.
name: Update Trivy Cache

on:
  schedule:
    - cron: '0 0 * * *'  # Run daily at midnight UTC
  workflow_dispatch:  # Allow manual triggering

jobs:
  update-trivy-db:
    runs-on: ubuntu-latest
    steps:
      - name: Get current date
        id: date
        run: echo "date=$(date +'%Y-%m-%d')" >> $GITHUB_OUTPUT

      - name: Download and extract the vulnerability DB
        run: |
          mkdir -p $GITHUB_WORKSPACE/.cache/trivy/db
          oras pull ghcr.io/aquasecurity/trivy-db:2
          tar -xzf db.tar.gz -C $GITHUB_WORKSPACE/.cache/trivy/db
          rm db.tar.gz

      - name: Download and extract the Java DB
        run: |
          mkdir -p $GITHUB_WORKSPACE/.cache/trivy/java-db
          oras pull ghcr.io/aquasecurity/trivy-java-db:1
          tar -xzf javadb.tar.gz -C $GITHUB_WORKSPACE/.cache/trivy/java-db
          rm javadb.tar.gz

      - name: Cache DBs
        uses: actions/cache/save@v4
        with:
          path: ${{ github.workspace }}/.cache/trivy
          key: cache-trivy-${{ steps.date.outputs.date }}
```

When running a scan, set the environment variables `TRIVY_SKIP_DB_UPDATE` and `TRIVY_SKIP_JAVA_DB_UPDATE` to skip the download process.

```yaml
- name: Run Trivy scanner without downloading DBs
  uses: aquasecurity/[email protected]
  with:
    scan-type: 'image'
    scan-ref: 'myimage'
  env:
    TRIVY_SKIP_DB_UPDATE: true
    TRIVY_SKIP_JAVA_DB_UPDATE: true
```
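
Skipping the DB update only helps when a cached DB was actually restored; if it was not, the scan has no database to work with. One way to guard against that is sketched below. It is not part of the action: `skip_db_update_flag` is a hypothetical helper, and it assumes the extracted vulnerability DB sits at `<cache-dir>/db/trivy.db`, the layout implied by the `mkdir -p .../.cache/trivy/db` step in the cache workflow.

```shell
# Sketch: decide whether it is safe to skip the vulnerability DB download.
# Prints "true" only when a non-empty DB file exists in the cache directory.
skip_db_update_flag() {
  cache_dir="$1"
  if [ -s "$cache_dir/db/trivy.db" ]; then
    echo "true"
  else
    echo "false"
  fi
}

# GITHUB_WORKSPACE is set on GitHub runners; fall back to the current
# directory so the sketch also runs locally.
cache_dir="${GITHUB_WORKSPACE:-$PWD}/.cache/trivy"
echo "TRIVY_SKIP_DB_UPDATE=$(skip_db_update_flag "$cache_dir")"
```

In a workflow, the printed line could be appended to `$GITHUB_ENV` in a step that runs before the scan, so the variable is set only when the cache was populated.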

### Scanning a Tarball
```yaml
name: build
@@ -123,56 +212,6 @@ jobs:
severity: 'CRITICAL,HIGH'
```

### Using cache for Trivy databases
Recently, there has been an increase in cases of receiving the `TOOMANYREQUESTS` error when downloading the Trivy databases (`trivy-db`, `trivy-java-db` and `trivy-checks`).

If you’re performing multiple scans, it makes sense to use [action/cache](https://github.com/actions/cache) to cache one or more databases.

The example below saves the `trivy-db` for each day in the cache:

```yaml
name: build
on:
  push:
    branches:
      - main
  pull_request:

jobs:
  build:
    name: Build
    runs-on: ubuntu-20.04
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      ## To avoid the trivy-db becoming outdated, we save the cache for one day
      - name: Get data
        id: date
        run: echo "date=$(date +%Y-%m-%d)" >> $GITHUB_OUTPUT

      - name: Restore trivy cache
        uses: actions/cache@v4
        with:
          path: cache/db
          key: trivy-cache-${{ steps.date.outputs.date }}
          restore-keys:
            trivy-cache-

      - name: Run Trivy vulnerability scanner in fs mode
        uses: aquasecurity/[email protected]
        with:
          scan-type: 'fs'
          scan-ref: '.'
          cache-dir: "./cache"

      ## Trivy-db uses `0600` permissions.
      ## But `action/cache` use `runner` user by default
      ## So we need to change the permissions before caching the database.
      - name: change permissions for trivy.db
        run: sudo chmod 0644 ./cache/db/trivy.db
```

### Using Trivy with GitHub Code Scanning
If you have [GitHub code scanning](https://docs.github.com/en/github/finding-security-vulnerabilities-and-errors-in-your-code/about-code-scanning) available you can use Trivy as a scanning tool as follows:
```yaml
@@ -630,7 +669,7 @@ Following inputs can be used as `step.with` keys:
| `severity` | String | `UNKNOWN,LOW,MEDIUM,HIGH,CRITICAL` | Severities of vulnerabilities to be scanned for and displayed |
| `skip-dirs` | String | | Comma separated list of directories where traversal is skipped |
| `skip-files` | String | | Comma separated list of files where traversal is skipped |
| `cache-dir` | String | | Cache directory |
| `cache-dir` | String | `$GITHUB_WORKSPACE/.cache/trivy` | Cache directory |
| `timeout` | String | `5m0s` | Scan timeout duration |
| `ignore-policy` | String | | Filter vulnerabilities with OPA rego language |
| `hide-progress` | String | `false` | Suppress progress bar and log output |
@@ -641,6 +680,7 @@ Following inputs can be used as `step.with` keys:
| `github-pat` | String | | Authentication token to enable sending SBOM scan results to GitHub Dependency Graph. Can be either a GitHub Personal Access Token (PAT) or GITHUB_TOKEN |
| `limit-severities-for-sarif` | Boolean | false | By default *SARIF* format enforces output of all vulnerabilities regardless of configured severities. To override this behavior set this parameter to **true** |
| `docker-host` | String | | By default it is set to `unix:///var/run/docker.sock`, but can be updated to help with containerized infrastructure values |
| `version` | String | `v0.56.1` | Trivy version to use, e.g. `latest` or `v0.56.1` |

### Environment variables
You can use [Trivy environment variables][trivy-env] to set the necessary options (including flags that are not supported by [Inputs](#inputs), such as `--secret-config`).
104 changes: 71 additions & 33 deletions action.yaml
Contributor

Since we are adding the GitHub cache, some users may want to change the update interval (e.g. update trivy-db every 2 days).

I think the `skip-db-update` flag (and the equivalent flags for the other DBs) can be added for this.

Contributor

On the other hand, we already have these flags in the config file.
In that case, perhaps we should mention them in the documentation.

Contributor Author

I've also added an example for a cron job.
a8b935f
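
The thread above settles on the skip-update flags plus a scheduled workflow. For a cache that users manage themselves (as in the cron example), another way to get a longer refresh interval is to make the cache key change every N days instead of daily. The sketch below is an alternative that is not discussed in the PR: `interval_key` and the key format are hypothetical, and it does not apply to the key built into the action itself.

```shell
# Sketch: a cache key that stays constant for <interval-days> days.
# Usage: interval_key <epoch-seconds> <interval-days>
interval_key() {
  epoch="$1"
  days="$2"
  # Integer division groups consecutive days into one bucket.
  bucket=$(( epoch / 86400 / days ))
  printf 'cache-trivy-%sd-%s\n' "$days" "$bucket"
}

# Key for "now" with a 2-day interval.
interval_key "$(date +%s)" 2
```

Because the bucket is derived from the Unix epoch, the key rolls over at fixed UTC boundaries rather than N days after the cache was last saved.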

@@ -1,6 +1,7 @@
 name: 'Aqua Security Trivy'
 description: 'Scans container images for vulnerabilities with Trivy'
 author: 'Aqua Security'
+
 inputs:
   scan-type:
     description: 'Scan type to use for scanning vulnerability'
@@ -24,7 +25,7 @@ inputs:
     description: 'ignore unfixed vulnerabilities'
     required: false
     default: 'false'
-  vuln-type:
+  vuln-type: # TODO: rename to pkg-types
     description: 'comma-separated list of vulnerability types (os,library)'
     required: false
     default: 'os,library'
@@ -55,7 +56,7 @@ inputs:
   cache-dir:
     description: 'specify where the cache is stored'
     required: false
-    default: ''
+    default: '${{ github.workspace }}/.cache/trivy'
   timeout:
     description: 'timeout (default 5m0s)'
     required: false
@@ -79,9 +80,6 @@ inputs:
     description: 'comma-separated list of relative paths in repository to one or more .trivyignore files'
     required: false
     default: ''
-  artifact-type:
-    description: 'input artifact type (image, fs, repo, archive) for SBOM generation'
-    required: false
   github-pat:
     description: 'GitHub Personal Access Token (PAT) for submitting SBOM to GitHub Dependency Snapshot API'
     required: false
@@ -97,33 +95,73 @@ inputs:
   docker-host:
     description: 'unix domain socket path to use for docker scanning, ex. unix:///var/run/docker.sock'
     required: false
+  version:
+    description: 'Trivy version to use'
+    required: false
+    default: 'v0.56.1'
+  cache:
+    description: 'Used to specify whether caching is needed. Set to false, if you would like to disable caching.'
+    required: false
+    default: 'true'

 runs:
-  using: 'docker'
-  image: "Dockerfile"
-  args:
-    - '-a ${{ inputs.scan-type }}'
-    - '-b ${{ inputs.format }}'
-    - '-c ${{ inputs.template }}'
-    - '-d ${{ inputs.exit-code }}'
-    - '-e ${{ inputs.ignore-unfixed }}'
-    - '-f ${{ inputs.vuln-type }}'
-    - '-g ${{ inputs.severity }}'
-    - '-h ${{ inputs.output }}'
-    - '-i ${{ inputs.image-ref }}'
-    - '-j ${{ inputs.scan-ref }}'
-    - '-k ${{ inputs.skip-dirs }}'
-    - '-l ${{ inputs.input }}'
-    - '-m ${{ inputs.cache-dir }}'
-    - '-n ${{ inputs.timeout }}'
-    - '-o ${{ inputs.ignore-policy }}'
-    - '-p ${{ inputs.hide-progress }}'
-    - '-q ${{ inputs.skip-files }}'
-    - '-r ${{ inputs.list-all-pkgs }}'
-    - '-s ${{ inputs.scanners }}'
-    - '-t ${{ inputs.trivyignores }}'
-    - '-u ${{ inputs.github-pat }}'
-    - '-v ${{ inputs.trivy-config }}'
-    - '-x ${{ inputs.tf-vars }}'
-    - '-z ${{ inputs.limit-severities-for-sarif }}'
-    - '-y ${{ inputs.docker-host }}'
+  using: 'composite'
+  steps:
+    - name: Install Trivy
+      shell: bash
+      run: curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sudo sh -s -- -b /usr/local/bin ${{ inputs.version }}
Comment on lines +108 to +112


Why was this switched from docker to running on the local system? It's not a good practice to grant sudo access to github action runners. I just had some actions fail because they attempted to use sudo. Unless I misunderstand what "composite" means?


It also means curl is a requirement where it wasn't previously.

Contributor

Hello @mattnakama-skytap @danielnitsche
We migrated to the local system to add the ability to store databases in the cache.

Contributor

@DmitriyLewen You might want to consider installing via something like https://github.com/jaxxstorm/action-install-gh-release


Bad request - jaxxstorm/[email protected] is not allowed to be used in apache/pulsar.
https://github.com/apache/pulsar/actions/runs/11267174066/job/31332113823?pr=23429#step:1:45
Addressing this for Apache Pulsar in this way: apache/pulsar#23431
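
One way to address the `sudo` concern raised in this thread, without adding a third-party action, would be to install into a directory the runner user can already write to. The sketch below is an assumption-laden illustration, not what the PR does: `choose_bindir` is a hypothetical helper, the fallback directory under `RUNNER_TEMP` is an arbitrary choice, and the `-b <bindir> <version>` arguments are taken from the `install.sh` invocation in the diff above. `curl` is still required.

```shell
# Sketch: pick an install directory that does not need sudo.
# Prints the preferred directory when it is writable, otherwise creates
# and prints the fallback directory.
choose_bindir() {
  preferred="$1"
  fallback="$2"
  if [ -d "$preferred" ] && [ -w "$preferred" ]; then
    printf '%s\n' "$preferred"
  else
    mkdir -p "$fallback"
    printf '%s\n' "$fallback"
  fi
}

bindir="$(choose_bindir /usr/local/bin "${RUNNER_TEMP:-/tmp}/trivy-bin")"
echo "would install Trivy into: $bindir"

# The download itself is left commented out so the sketch has no network side effects:
# curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sh -s -- -b "$bindir" v0.56.1
# echo "$bindir" >> "$GITHUB_PATH"
```

Appending the chosen directory to `$GITHUB_PATH` mirrors the "Set GitHub Path" step in the new `action.yaml`, so later steps can find the `trivy` binary.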


+    - name: Get current date
+      id: date
+      shell: bash
+      run: echo "date=$(date +'%Y-%m-%d')" >> $GITHUB_OUTPUT
+
+    - name: Restore DB from cache
+      if: ${{ inputs.cache == 'true' }}
+      uses: actions/cache@v4
+      with:
+        path: ${{ inputs.cache-dir }}
+        key: cache-trivy-${{ steps.date.outputs.date }}
+        restore-keys: cache-trivy-
+
+    - name: Set GitHub Path
+      run: echo "$GITHUB_ACTION_PATH" >> $GITHUB_PATH
+      shell: bash
+      env:
+        GITHUB_ACTION_PATH: ${{ github.action_path }}
+
+    - name: Run Trivy
+      shell: bash
+      run: entrypoint.sh
+      env:
+        # For shell script
+        # > If the action is written using a composite, then it will not automatically get INPUT_<VARIABLE_NAME>
+        # cf. https://docs.github.com/en/actions/sharing-automations/creating-actions/metadata-syntax-for-github-actions#example-specifying-inputs
+        INPUT_SCAN_TYPE: ${{ inputs.scan-type }}
+        INPUT_IMAGE_REF: ${{ inputs.image-ref }}
+        INPUT_SCAN_REF: ${{ inputs.scan-ref }}
+        INPUT_TRIVYIGNORES: ${{ inputs.trivyignores }}
+        INPUT_GITHUB_PAT: ${{ inputs.github-pat }}
+        INPUT_LIMIT_SEVERITIES_FOR_SARIF: ${{ inputs.limit-severities-for-sarif }}
+
+        # For Trivy
+        # cf. https://aquasecurity.github.io/trivy/latest/docs/configuration/#environment-variables
+        TRIVY_INPUT: ${{ inputs.input }}
+        TRIVY_EXIT_CODE: ${{ inputs.exit-code }}
+        TRIVY_IGNORE_UNFIXED: ${{ inputs.ignore-unfixed }}
+        TRIVY_PKG_TYPES: ${{ inputs.vuln-type }}
+        TRIVY_SEVERITY: ${{ inputs.severity }}
+        TRIVY_FORMAT: ${{ inputs.format }}
+        TRIVY_TEMPLATE: ${{ inputs.template }}
+        TRIVY_OUTPUT: ${{ inputs.output }}
+        TRIVY_SKIP_DIRS: ${{ inputs.skip-dirs }}
+        TRIVY_SKIP_FILES: ${{ inputs.skip-files }}
+        TRIVY_CACHE_DIR: ${{ inputs.cache-dir }}
+        TRIVY_TIMEOUT: ${{ inputs.timeout }}
+        TRIVY_IGNORE_POLICY: ${{ inputs.ignore-policy }}
+        TRIVY_QUIET: ${{ inputs.hide-progress }}
+        TRIVY_LIST_ALL_PKGS: ${{ inputs.list-all-pkgs }}
+        TRIVY_SCANNERS: ${{ inputs.scanners }}
+        TRIVY_CONFIG: ${{ inputs.trivy-config }}
+        TRIVY_TF_VARS: ${{ inputs.tf-vars }}
+        TRIVY_DOCKER_HOST: ${{ inputs.docker-host }}
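
The `env:` block splits the inputs into two groups: `TRIVY_*` variables that Trivy reads directly, and `INPUT_*` variables that only the wrapper script consumes. The real `entrypoint.sh` is not shown in this diff, so the following is a hypothetical stand-in for how such a script might turn the `INPUT_*` values into a command line. `build_trivy_command`, the `image` default, and the target selection logic are all assumptions made for illustration.

```shell
# Sketch: derive the Trivy subcommand and target from INPUT_* variables.
# The command is printed rather than executed so the sketch runs without Trivy.
build_trivy_command() {
  scan_type="${INPUT_SCAN_TYPE:-image}"
  # Assumed convention: image scans take image-ref, other scan types take scan-ref.
  if [ "$scan_type" = "image" ]; then
    target="${INPUT_IMAGE_REF:-}"
  else
    target="${INPUT_SCAN_REF:-.}"
  fi
  printf 'trivy %s %s\n' "$scan_type" "$target"
}

# Run in a subshell so the sample values do not leak into the caller.
( INPUT_SCAN_TYPE=fs; INPUT_SCAN_REF=src; build_trivy_command )
```

Options such as severity, format, and cache directory need no handling in a script like this, because Trivy picks them up from the `TRIVY_*` environment variables on its own.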