Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: auto-generate ruleset cache on source change #2133

Open
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

s-ff
Copy link
Collaborator

@s-ff s-ff commented Jun 7, 2024

Closes #1898.
This is a proposal for the issue discussed in #1898.

Let's test out its feasibility and make sure it does not introduce an overhead.

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add bug fixes, new features, breaking changes and anything else you think is worthwhile mentioning to the master (unreleased) section of CHANGELOG.md. If no CHANGELOG update is needed add the following to the PR description: [x] No CHANGELOG update needed

Copy link
Collaborator

@williballenthin williballenthin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

required changes marked with 👀

Great first pass!

I don't love this logic living somewhat deep in the library code, since it depends on the environment of the program (capa). For example, another project that uses capa as a library may not want to have this behavior. Do you think it's possible to cleanly move the dev/git logic into capa.main?

capa/rules/cache.py Outdated Show resolved Hide resolved
capa/rules/cache.py Outdated Show resolved Hide resolved
capa/rules/cache.py Outdated Show resolved Hide resolved
capa/rules/cache.py Outdated Show resolved Hide resolved
capa/rules/cache.py Outdated Show resolved Hide resolved
@github-actions github-actions bot dismissed their stale review June 7, 2024 14:28

CHANGELOG updated or no update needed, thanks! 😄

capa/rules/cache.py Outdated Show resolved Hide resolved
@mr-tz
Copy link
Collaborator

mr-tz commented Jun 7, 2024

Do you think it's possible to cleanly move the dev/git logic into capa.main?

agree here to move this to capa.main or a utils file.

capa/rules/cache.py Outdated Show resolved Hide resolved
capa/rules/cache.py Outdated Show resolved Hide resolved
capa/rules/cache.py Outdated Show resolved Hide resolved
"developer environment detected, ruleset cache will be auto-generated upon each source modification"
)
return hash.hexdigest()
except Exception as e:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please select explicit Exceptions so we don't mask unknown errors

Copy link
Collaborator

@mr-tz mr-tz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

beyond splitting this up we'd prefer to call this from capa.main (and not deep in the library)
can we add the check if this is a dev environment there instead? (e.g. around get_rules)

@s-ff
Copy link
Collaborator Author

s-ff commented Jun 10, 2024

beyond splitting this up we'd prefer to call this from capa.main (and not deep in the library) can we add the check if this is a dev environment there instead? (e.g. around get_rules)

Moving the dev environment checks to capa.main and importing them from capa.main would create circular dependencies when I tested it.

@mr-tz
Copy link
Collaborator

mr-tz commented Jun 10, 2024

beyond splitting this up we'd prefer to call this from capa.main (and not deep in the library) can we add the check if this is a dev environment there instead? (e.g. around get_rules)

Moving the dev environment checks to capa.main and importing them from capa.main would create circular dependencies when I tested it.

I mean to call it from there not to implement them in main.

s-ff and others added 9 commits July 3, 2024 12:01
This commits groups the computing cache identifier logic
into a a single function: compute_cache_identifier.

Initially, this the cache id computing was split into two functions
compute_ruleset_cache_identifier and compute_cache_identifier.
This commits groups the computing cache identifier logic
into a a single function: compute_cache_identifier.

Initially, this the cache id computing was split into two functions
compute_ruleset_cache_identifier and compute_cache_identifier.
@mr-tz mr-tz force-pushed the autogen-ruleset-cache branch 4 times, most recently from 4f26e07 to 8b89a70 Compare July 4, 2024 12:13
Comment on lines +249 to +251
if "site-packages" in __file__:
# running from a site-packages installation
return False
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we also consider dist-packages here?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

image

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doesn't seem important but if you can demonstrate it's relevant this shouldn't be hard

@mr-tz
Copy link
Collaborator

mr-tz commented Jul 22, 2024

Any concerns or blockers before merging this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

main: add option to ignore rule cache
3 participants