Extracts web domains and IP address and implements tests #2031

Open
wants to merge 23 commits into
base: master
1b6bdaa
This pull request extracts web domains and IP addresses from files an…
aaronatp Mar 19, 2024
999f656
Update domain_ip_helpers.py
aaronatp Mar 19, 2024
80fc4d4
Update extract_domain_and_ip.py
aaronatp Mar 19, 2024
2bb09f8
Update test_domain_ip_extractor.py
aaronatp Mar 19, 2024
61c4ad5
Adds to changelog and fixes string concatenation style error
aaronatp Mar 19, 2024
627f187
Merge branch 'master' into master
aaronatp Mar 19, 2024
9b41074
Help debug why it won't build with PyInstaller 3.11
aaronatp Mar 19, 2024
6236b8f
Update capa/capabilities/extract_domain_and_ip.py
aaronatp Mar 19, 2024
f9f6bb4
Update capa/capabilities/extract_domain_and_ip.py
aaronatp Mar 19, 2024
aa5f542
Update verbose.py
aaronatp Mar 19, 2024
2e9c1a9
Update vverbose.py
aaronatp Mar 19, 2024
11585c3
Update extract_domain_and_ip.py
aaronatp Mar 19, 2024
58b0336
Fix PyInstaller 3.11 installation error with get_signatures
aaronatp Mar 19, 2024
c55169b
Fix get_extractor_from_doc issue for CAPE
aaronatp Mar 19, 2024
85dfa55
Fix 'backend' assignment in get_extractor_from_doc
aaronatp Mar 20, 2024
5afd175
Fix 'format' in get_extractor_from_doc
aaronatp Mar 20, 2024
78d0efc
Merge branch 'master' into master
aaronatp Mar 20, 2024
681e6c3
Reformat imports
aaronatp Mar 20, 2024
fb3ed8a
Reformat multi-line string
aaronatp Mar 20, 2024
0443caf
Correct Flake8 errors
aaronatp Mar 20, 2024
a000461
Update domain_ip_helpers.py
aaronatp Mar 22, 2024
93943e5
Update domain_ip_helpers.py
aaronatp Mar 22, 2024
32b4778
Update capa/render/verbose.py
aaronatp Mar 22, 2024
84 changes: 84 additions & 0 deletions capa/capabilities/domain_ip_helpers.py
@@ -0,0 +1,84 @@
import logging
from pathlib import Path

from capa.helpers import get_auto_format
from capa.features.common import FORMAT_CAPE
from capa.render.result_document import ResultDocument
from capa.features.extractors.base_extractor import FeatureExtractor
from capa.features.extractors.cape.extractor import CapeExtractor

logger = logging.getLogger(__name__)

BACKEND_VIV = "vivisect"
BACKEND_DOTNET = "dotnet"
BACKEND_BINJA = "binja"
BACKEND_PEFILE = "pefile"


def get_file_path(doc: ResultDocument) -> Path:
    return Path(doc.meta.sample.path)


def get_sigpaths_from_doc(doc: ResultDocument):
    import capa.loader

    if doc.meta.argv:
        try:
            if "-s" in list(doc.meta.argv):
                idx = doc.meta.argv.index("-s")
                sigpath = Path(doc.meta.argv[idx + 1])
                if "./" in str(sigpath):
                    fixed_str = str(sigpath).split("./")[1]
                    sigpath = Path(fixed_str)

            elif "--signatures" in list(doc.meta.argv):
                idx = doc.meta.argv.index("--signatures")
                sigpath = Path(doc.meta.argv[idx + 1])
                if "./" in str(sigpath):
                    fixed_str = str(sigpath).split("./")[1]
                    sigpath = Path(fixed_str)

            else:
                sigpath = "(embedded)"  # type: ignore

            return capa.loader.get_signatures(sigpath)

        except AttributeError:
            raise NotImplementedError("Confirm that argv is an attribute of doc.meta")

    else:
        print("in 'get_sigpaths_from_doc', run in debug (-d) mode")
Contributor

this was for debugging? Why are you not using logger here?

Contributor Author

Hi @VascoSch92 yes trying to figure out why it's failing to build with PyInstaller 3.11

Contributor

ah ok easy... I just saw it and I wanted to check :-)

Contributor Author

Haha yes thank you for the question though, I appreciate the comments and feedback you're giving me

        logger.debug("'doc.meta' has not attribute 'argv', this is probably a bad sign...")
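
Following up on the print-versus-logger question in the thread above, here is a minimal sketch of how the two near-identical "-s" / "--signatures" branches could be folded into one lookup that reports problems through the module logger only. The helper name `find_signature_path` is hypothetical and not part of capa; it returns None where the diff falls back to "(embedded)", and it deliberately leaves out the "./" stripping step.

```python
import logging
from pathlib import Path
from typing import Optional, Sequence

logger = logging.getLogger(__name__)


def find_signature_path(argv: Optional[Sequence[str]]) -> Optional[Path]:
    """Hypothetical helper: return the path given after -s/--signatures, else None."""
    if not argv:
        # Use the logger instead of print so the message respects the log level.
        logger.debug("no argv recorded, cannot look up a signature path")
        return None

    args = list(argv)
    # Same precedence as the diff: "-s" is checked before "--signatures".
    for flag in ("-s", "--signatures"):
        if flag in args:
            idx = args.index(flag)
            if idx + 1 < len(args):
                return Path(args[idx + 1])
            logger.debug("flag %s was given without a value", flag)
            return None

    # Neither flag present: the caller decides on a default.
    return None
```

The bounds check on `idx + 1` avoids an IndexError when the flag is the last argument, a case the try/except AttributeError in the diff would not catch.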


def get_extractor_from_doc(doc: ResultDocument) -> FeatureExtractor:
    import capa.loader

    path = get_file_path(doc)
    format = doc.meta.analysis.format
    os = doc.meta.analysis.os

    _ = get_auto_format(get_file_path(doc))
    if format == FORMAT_CAPE:
        report = capa.helpers.load_json_from_path(path)
        return CapeExtractor.from_report(report)
    elif _ == BACKEND_VIV:
        backend = BACKEND_VIV
    elif _ == BACKEND_PEFILE:
        backend = BACKEND_PEFILE
    elif _ == BACKEND_BINJA:
        backend = BACKEND_BINJA
    elif _ == BACKEND_DOTNET:
        backend = BACKEND_DOTNET
    else:
        backend = BACKEND_VIV  # according to main.py this is the default

    sigpath = get_sigpaths_from_doc(doc)

    return capa.loader.get_extractor(
        input_path=path,
        input_format=format,
        os_=os,
        backend=backend,
        sigpaths=sigpath,
    )
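
The if/elif chain in get_extractor_from_doc assigns the detected value back to `backend` whenever it matches one of the four constants, and falls back to vivisect otherwise. A possible simplification, as a sketch only: the constants are copied from the top of this file, while `KNOWN_BACKENDS` and `choose_backend` are hypothetical names introduced for illustration. Whether the detected value can ever equal a backend name is not something this diff shows.

```python
BACKEND_VIV = "vivisect"
BACKEND_DOTNET = "dotnet"
BACKEND_BINJA = "binja"
BACKEND_PEFILE = "pefile"

# Hypothetical set of the backend names the chain compares against.
KNOWN_BACKENDS = frozenset({BACKEND_VIV, BACKEND_DOTNET, BACKEND_BINJA, BACKEND_PEFILE})


def choose_backend(detected: str) -> str:
    """Return `detected` if it names a known backend, else the vivisect default."""
    return detected if detected in KNOWN_BACKENDS else BACKEND_VIV
```

With this shape, adding a backend means adding one constant to the set rather than another elif branch.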