refactor(k8s): scan config files as a folder #7690

Merged
8 commits merged into aquasecurity:main from fix/k8s-scantime
Oct 21, 2024

@afdesk afdesk commented Oct 9, 2024

Description

The Trivy Kubernetes scan tries to process the k8s configs in parallel,
but this causes a large overhead because the scanner is initialized for each file separately (each initialization has to read the Rego policies).

This PR stores all k8s configs in one folder and scans that folder in a single pass.
This cuts user CPU time by about 9x (and wall-clock time by about 5x) on my test minikube instance.

Before

$ time trivy k8s --report all --skip-images -f json -o result.json
2024-10-10T10:12:01+06:00	INFO	Node scanning is enabled
2024-10-10T10:12:01+06:00	INFO	If you want to disable Node scanning via an in-cluster Job, please try '--disable-node-collector' to disable the Node-Collector job.
235 / 235 [-----------------------------------------------------------------------------------------------------------------] 100.00% 3 p/s
trivy k8s --report all --skip-images -f json -o result.json  147,50s user 8,63s system 148% cpu 1:45,12 total

After

$ time ./tr k8s --report all --skip-images -f json -o result.json
2024-10-10T17:04:04+06:00	INFO	Node scanning is enabled
2024-10-10T17:04:04+06:00	INFO	If you want to disable Node scanning via an in-cluster Job, please try '--disable-node-collector' to disable the Node-Collector job.
./tr k8s --report all --skip-images -f json -o result.json  16,83s user 1,12s system 85% cpu 21,002 total

Related issues

Checklist

  • I've read the guidelines for contributing to this repository.
  • I've followed the conventions in the PR title.
  • I've added tests that prove my fix is effective or that my feature works.
  • I've updated the documentation with the relevant information (if needed).
  • I've added usage information (if the PR introduces new options)
  • I've included a "before" and "after" example to the description (if the PR is a user interface change).

@afdesk afdesk changed the title from "fix(k8s): scan config files as a folder" to "refactor(k8s): scan config files as a folder" Oct 10, 2024
@afdesk afdesk marked this pull request as ready for review October 10, 2024 04:21
@afdesk afdesk requested review from knqyf263 and itaysk October 10, 2024 04:22
@afdesk
Contributor Author

afdesk commented Oct 10, 2024

@itaysk @knqyf263 Could you take a look at this PR when you have time?

I'm unsure about logs. maybe it's a bit confusing for people.

also I can't reproduce #7684 with this update.

wdyt?

thanks a lot for your time.

@itaysk
Contributor

itaysk commented Oct 10, 2024

> I'm unsure about logs. maybe it's a bit confusing for people.

cluster scanning can have many resources, and logging each individual resource scan is too much IMO. we can make it a debug level log.

> also I can't reproduce #7684 with this update.

Good, that means you think it solved #7684 too?

@afdesk
Contributor Author

afdesk commented Oct 10, 2024

> > I'm unsure about logs. maybe it's a bit confusing for people.
> cluster scanning can have many resources, and logging each individual resource scan is too much IMO. we can make it a debug level log.

Thanks!
I understand your point and agree with it.
So I've removed the log update from this PR and will create a new PR for logs only.

> > also I can't reproduce #7684 with this update.
> Good, that means you think it solved #7684 too?

I'm unsure, because I don't yet understand the cause of #7684.
I'm investigating it now.

@afdesk
Contributor Author

afdesk commented Oct 16, 2024

@itaysk @knqyf263 could you take a look at this PR again? thanks!

@afdesk
Contributor Author

afdesk commented Oct 17, 2024

@itaysk I finally found the cause of #7684: a race condition occurs between multiple misconfig scanners, so this PR should resolve it too.

pkg/k8s/scanner/io.go: 2 outdated review threads (resolved)
@nikpivkin
Contributor

@afdesk Left a couple small comments

@afdesk afdesk requested a review from nikpivkin October 17, 2024 10:54
Contributor

@nikpivkin nikpivkin left a comment

LGTM

return report.CreateResource(artifact, configReport, err), err
return nil, xerrors.Errorf("failed to scan filesystem: %w", err)
}
resources := make([]report.Resource, 0, len(k8sArtifacts))
Member

Is there any benefit to preallocating this with a fixed capacity? Why not simplify like so:

Suggested change
- resources := make([]report.Resource, 0, len(k8sArtifacts))
+ var resources []report.Resource

Contributor Author

@afdesk afdesk Oct 18, 2024

It tries to avoid a few copy operations and memory allocations by allocating everything up front (that's a quote :)).
I'm not sure it matters here, so we can change it.

Do you think we should use the simpler construction?

Member

I haven't benchmarked this, so I can't say for sure, but I would assume the Go runtime can grow the slice as needed without much overhead.

Collaborator

Preallocating with make is faster than appending to a nil slice declared with var, so if we know the length beforehand, we should use make for performance reasons.
https://sampath04.hashnode.dev/make-your-go-code-efficient-using-make-when-creating-slices

If the slice is small enough, I personally prefer var because it's easier to read. In this case, there can be a lot of k8s resources, so it makes sense to use make.
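
For readers following this thread, the trade-off can be observed with a small sketch. It is an illustration only and does not use the PR's types; `countGrowths` is a hypothetical helper that counts how often `append` has to move to a larger backing array, detected as a change in `cap`. The exact count for the nil slice depends on the Go runtime's growth policy, so no specific number is claimed here.

```go
package main

import "fmt"

// countGrowths appends n ints to s and reports how many times the backing
// array was reallocated, observed as a change in capacity after append.
func countGrowths(s []int, n int) int {
	growths := 0
	for i := 0; i < n; i++ {
		before := cap(s)
		s = append(s, i)
		if cap(s) != before {
			growths++
		}
	}
	return growths
}

func main() {
	const n = 10000

	var grown []int               // nil slice: append must grow it repeatedly
	prealloc := make([]int, 0, n) // length 0, capacity n: room for all n appends

	fmt.Println("nil slice reallocations:", countGrowths(grown, n))
	fmt.Println("preallocated reallocations:", countGrowths(prealloc, n))
}
```

With the capacity reserved up front, none of the n appends needs to reallocate or copy, while the nil slice reallocates several times as it grows. Whether that difference matters in practice depends on how many resources the cluster has, which is the point made above.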

Member

It seems that this file is a little short on test coverage, maybe we can improve that a bit?

Contributor Author

yes, you're right. we should improve test cases

Member

OK - Since this PR also fixes a critical bug for k8s scanning, we can merge it first.

I opened #7768 to track improving the test coverage.

pkg/k8s/scanner/io.go: 2 outdated review threads (resolved)
Member

@simar7 simar7 left a comment

lgtm, but I'll wait for @knqyf263 to take a look as well.

@afdesk
Contributor Author

afdesk commented Oct 19, 2024

> lgtm, but I'll wait for @knqyf263 to take a look as well.

@simar7 thanks for your review!!


// generateTempFolder creates a folder with yaml files generated from kubernetes artifacts
// returns a folder name, a map for mapping a temp target file to k8s artifact and error
func generateTempFolder(arts []*artifacts.Artifact) (string, map[string]*artifacts.Artifact, error) {
Collaborator

nit: I think directory is more common than folder in UNIX. This function actually calls MkdirTemp, not MkfolderTemp.

Contributor Author

Renamed in 84cacc5

Collaborator

@knqyf263 knqyf263 left a comment

LGTM

@afdesk
Contributor Author

afdesk commented Oct 21, 2024

> LGTM

@knqyf263 thanks for your review.

when tests are completed, I'll add this PR to merge queue, ok?

@knqyf263
Collaborator

> when tests are completed, I'll add this PR to merge queue, ok?

Sure. Just FYI: There are many places where it still says "folder".
84cacc5

@afdesk afdesk added this pull request to the merge queue Oct 21, 2024
Merged via the queue into aquasecurity:main with commit 010b213 Oct 21, 2024
12 checks passed
@afdesk afdesk deleted the fix/k8s-scantime branch October 21, 2024 18:48
Development

Successfully merging this pull request may close these issues.

bug (k8s): intermittent failures in k8s scanning
bug(k8s): k8s scan works too long
5 participants