Display analysis information #2134

s-ff · 2024-06-10T08:42:07Z

Closes #857.

This commit introduces two new metadata fields to result_document. Would this be considered a breaking change?

This would require regenerating the rdoc test files. See mandiant/capa-testfiles#239.

Checklist

No CHANGELOG update needed

No new tests needed

No documentation update needed

This commit introduces two new metadata fields:

- apicall_count: total count of all API calls made in the sample
- import_count: total count of Import symbols in the sample

Note, when using rutils.warn(), flake8 raises an error. So using rutils.bold() for now.

mr-tz

looks good, pending the tests to succeed

capa/render/proto/capa.proto

mr-tz · 2024-06-10T14:06:56Z

I think this requires regenerating the files in tests/data/rd/

s-ff · 2024-06-10T18:20:07Z

Should be good to go once mandiant/capa-testfiles#239 is merged.

williballenthin · 2024-06-10T19:03:35Z

capa/capabilities/static.py

@@ -96,7 +98,7 @@ def find_basic_block_capabilities(

 def find_code_capabilities(
    ruleset: RuleSet, extractor: StaticFeatureExtractor, fh: FunctionHandle
-) -> Tuple[MatchResults, MatchResults, MatchResults, int]:
+) -> Tuple[MatchResults, MatchResults, MatchResults, FeatureSet]:


changing the signature of a function is a breaking change, so this should wait until the next major release.
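To illustrate the concern, here is a minimal sketch of how an existing caller could break when the fourth tuple element changes from an int to a FeatureSet. The function bodies, the `FeatureSet` alias, and `total_feature_count` are hypothetical stand-ins, not capa's actual code.

```python
from typing import Callable, Dict, Set, Tuple

# Hypothetical stand-in for capa's FeatureSet type (feature -> addresses).
FeatureSet = Dict[str, Set[int]]


def find_code_capabilities_old() -> Tuple[dict, dict, dict, int]:
    # Old shape: the last element is a plain feature count.
    return {}, {}, {}, 3


def find_code_capabilities_new() -> Tuple[dict, dict, dict, FeatureSet]:
    # New shape: the last element is the feature set itself.
    features: FeatureSet = {
        "api(CreateFileA)": {0x401000},
        "number(0x10)": {0x401005},
        "mnemonic(mov)": {0x401000, 0x401005},
    }
    return {}, {}, {}, features


def total_feature_count(find: Callable[[], tuple]) -> int:
    # A downstream caller written against the old signature.
    *_, feature_count = find()
    return 0 + feature_count  # raises TypeError if feature_count is a dict
```

A caller like `total_feature_count` works with the old return type but raises `TypeError` with the new one, which is why the change would need to wait for a major release or be introduced under a new function name.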


williballenthin · 2024-06-10T19:05:59Z

capa/capabilities/static.py

    feature_counts.file = feature_count

+    # cumulatively count the total number of Import features
+    for feature, _ in file_features.items():


use .keys() here to indicate that you won't use the value
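A small sketch of the suggested loop, assuming `file_features` maps feature objects to sets of addresses. The `Import` and `String` classes and the `count_imports` helper are simplified, hypothetical stand-ins for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class Import:
    # Hypothetical stand-in for an Import feature.
    value: str


@dataclass(frozen=True)
class String:
    # Hypothetical stand-in for any non-Import feature.
    value: str


def count_imports(file_features: Dict[object, Set[int]]) -> int:
    # Iterating over .keys() signals that the address sets are not used.
    count = 0
    for feature in file_features.keys():
        if isinstance(feature, Import):
            count += 1
    return count
```

Iterating the dict directly would behave the same; `.keys()` only makes the intent explicit and avoids the unused `_` binding.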

williballenthin · 2024-06-10T19:06:53Z

capa/render/default.py

@@ -19,6 +19,9 @@

 tabulate.PRESERVE_WHITESPACE = True

+MIN_LIBFUNCS_RATIO = 0.4
+MIN_API_CALLS = 10


where did these numbers come from? and how should i interpret them?

MIN_LIBFUNCS_RATIO: When fewer than 40% of the functions in a sample are identified as library functions, we inform users that capa might report false positive matches from functions that should have been classified as library functions. I don't have any statistical data to back this up other than this hex-rays blogpost.

MIN_API_CALLS: When the sample has very few API calls, it is a strong indication that it might be packed/encrypted, as regular programs tend to make a lot more than 10 calls (though we have to run a benchmark across multiple samples to decide what's a good number here). For example, this packed capa-testfile emits 0 API features; luckily we detect that it is packed with UPX. If that weren't the case, this banner could serve as an indication that the sample might be packed.

good explanations!

would you include the key parts here as a comment?

also i'm interested to see how frequently this message is shown to users. I don't think our dogs will identify 40% of functions in most binaries, so i'm a little concerned this message will be shown too often.

have you had a chance to collect these stats against a large number of samples?

I think it's still helpful information since we know there's most likely more library code than we've identified.
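One way the thresholds and their rationale from this thread could be captured in comments, as a rough sketch. The `analysis_warnings` helper, its parameters, and the message wording are hypothetical, and the ratio is assumed to mean library functions divided by all functions.

```python
from typing import List

# Below this share of identified library functions, capa may report false
# positives from code that is really library code. The value is a guess from
# the thread and is not backed by benchmark data.
MIN_LIBFUNCS_RATIO = 0.4

# Regular programs tend to make many more API calls than this; very few calls
# hint at a packed or encrypted sample. Also not yet benchmarked.
MIN_API_CALLS = 10


def analysis_warnings(
    library_function_count: int, function_count: int, apicall_count: int
) -> List[str]:
    warnings = []
    # Guard against division by zero for samples with no recovered functions.
    if function_count > 0:
        ratio = library_function_count / function_count
        if ratio < MIN_LIBFUNCS_RATIO:
            warnings.append("few library functions identified; expect false positives")
    if apicall_count < MIN_API_CALLS:
        warnings.append("few API calls found; sample may be packed or encrypted")
    return warnings
```

Running a helper like this over a large sample set would also give the statistics asked for above, namely how often each warning fires.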

mr-tz · 2024-06-11T09:23:43Z

Stepping back here for a moment, let's consider if we want to implement this differently:

- add new characteristics: few imports, few detected library functions
- add new limitation rules using these features
- update behavior to handle has_file_limitation or similar

That way we can handle the various limitations/warnings consistently. The core extraction logic still resides in capa but we don't have to extend the meta data.
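The alternative could look roughly like the sketch below: extraction emits the two proposed characteristics, and limitation rules match on them. Everything here is hypothetical (function names, rule names, the rule representation as sets, and the threshold parameters); real capa rules are YAML files and are matched by capa's own engine.

```python
from typing import Dict, List, Set


def extract_file_characteristics(
    import_count: int,
    library_function_count: int,
    function_count: int,
    *,
    min_imports: int,
    min_libfuncs_ratio: float,
) -> Set[str]:
    # Emit characteristics as features instead of extending the metadata.
    characteristics: Set[str] = set()
    if import_count < min_imports:
        characteristics.add("few imports")
    if function_count > 0 and library_function_count / function_count < min_libfuncs_ratio:
        characteristics.add("few detected library functions")
    return characteristics


# Hypothetical limitation rules: rule name -> characteristics that must be present.
LIMITATION_RULES: Dict[str, Set[str]] = {
    "(internal) few imports limitation": {"few imports"},
    "(internal) few library functions limitation": {"few detected library functions"},
}


def matched_limitations(characteristics: Set[str]) -> List[str]:
    # A rule matches when all of its required characteristics were extracted.
    return sorted(
        name for name, required in LIMITATION_RULES.items() if required <= characteristics
    )
```

With this shape, the warning logic lives in rules alongside the existing limitation handling, and the result document format stays unchanged.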

Related: should we provide functionality to make this easier to leverage in other tools? Right now other tools need to reimplement the logic we have in capa.main to handle special cases/detections.

williballenthin · 2024-06-11T13:07:31Z

@mr-tz this would require many fewer breaking changes, which i like

s-ff added 4 commits June 9, 2024 23:08

add api call and import count metadata fields

ef846fc

This commit introduces two new metadata fields:

- apicall_count: total count of all API calls made in the sample
- import_count: total count of Import symbols in the sample

remove unsued flag is_standalone

76df545

minor fix: use rutils.bold

64565da

Note, when using rutils.warn(), flake8 raises an error. So using rutils.bold() for now.

changelog: display analysis information

5d75052

mr-tz reviewed Jun 10, 2024


capa/render/proto/capa.proto (outdated, resolved)

remove unwanted whitespace

04c93dc

s-ff mentioned this pull request Jun 10, 2024

regenerate result document test files mandiant/capa-testfiles#239

Open

fix: missing metadata fields

3a1504a

williballenthin approved these changes Jun 10, 2024
