Feature: Related files #725

Open
wants to merge 38 commits into
base: master
d72d752
migration and background in progress
Dec 12, 2022
c33d5c1
working upload (I think)
Dec 12, 2022
da5f8bc
adding related files in database
Dec 12, 2022
f9b3006
(probably) working backend
Dec 12, 2022
434e5e8
Raise exception when the same related file is uploaded second time
Dec 12, 2022
6f57b7e
Initial front-end commit
Dec 12, 2022
735fc1f
Front-end adjustments
Dec 12, 2022
1fd981a
Fix problem with too many rerenders
Dec 13, 2022
8dd170e
progress in front-end
Dec 13, 2022
55ea824
working download
Dec 13, 2022
d247ee2
little fix in removing related files
Dec 13, 2022
d589b0c
Working upload
Dec 13, 2022
eb66667
almost working front-end with better error handling
Dec 13, 2022
8c42fb1
visual improvements
Dec 13, 2022
3ee25b0
more visual improvements
Dec 13, 2022
2c7cdce
auto rerender tab after change
Dec 14, 2022
252cc2f
rename variables for clarity
Dec 14, 2022
d5b593f
RelatedFiles Capabilities, migration and small visual improvement
Dec 14, 2022
561418f
allow one RelatedFile to be associated with many Files, delete Relate…
Dec 14, 2022
a58c7ed
implement requested changes
Dec 15, 2022
2c417a3
visual improvements
Dec 16, 2022
e4d0b5e
New action - dowload all
Dec 16, 2022
9d0d600
Rename variables, improve descriptions
Dec 16, 2022
7f61450
fix: unable to delete object with related files
Dec 19, 2022
0003992
add tests
Dec 19, 2022
dc4c917
Documentation update
Dec 19, 2022
64d653a
Querying by related files
Dec 21, 2022
06251c5
Documentation update
Dec 21, 2022
52a3497
Documentation for querying and automated tests
Dec 22, 2022
b9e7a47
Merge branch 'master' into feature/related-files
psrok1 Jan 23, 2023
60e97d3
js => jsx
psrok1 Jan 23, 2023
10bff59
Merge branch 'master' into feature/related-files
psrok1 Jan 25, 2023
6c9bbce
fix migrations
Repumba Mar 15, 2023
d41537b
Merge branch 'master' into feature/related-files
Repumba Mar 15, 2023
05d729d
fix migrations 2
Repumba Mar 15, 2023
97733f1
Merge branch 'master' into feature/related-files
Repumba Mar 31, 2023
441612d
add toastify for alerts
Repumba Mar 31, 2023
8b4a38f
Merge branch 'master' into feature/related-files
Repumba Apr 3, 2023
Binary file added docs/_static/related-files-tab.png
20 changes: 20 additions & 0 deletions docs/user-guide/2-Storing-malware-samples.rst
Expand Up @@ -98,3 +98,23 @@ Using that feature we can chain together various files extracted during the anal
- two memory dumps with unpacked code

Finally, from both of these dumps we got a malware configuration.


Related files
-------------

Sometimes we want to attach a file that is related to the object but does not contain any malware, e.g. a PDF report. Every object of type **file** has a designated tab for this purpose.

.. image:: ../_static/related-files-tab.png
   :target: ../_static/related-files-tab.png
   :alt: related files tab

In this tab you can:

- view the list of all related files associated with this object
- upload a new related file
- delete a related file
- download a single related file
- download a .zip file containing every related file

Keep in mind that these actions might require special capabilities. For more details, see :ref:`9. Sharing objects with other collaborators`.
40 changes: 40 additions & 0 deletions docs/user-guide/7-Lucene-search.rst
Expand Up @@ -549,3 +549,43 @@ Afterwards, you can see your newly added query as another black-coloured badge.
.. image:: ../_static/7dXJkSH.png
:target: ../_static/7dXJkSH.png
:alt:

Related file field (\ ``related.<field>:``\ )
------------------------------------------------------------

You can query objects by their related files. There are 4 ways to do it:

* ``related.name:`` - query by related file's name

.. code-block:: python

    related.name:"name.txt"


This field accepts wildcards.

* ``related.size:`` - query by related file's size

.. code-block:: python

    related.size:"<5kb"


* ``related.sha256:`` - query by related file's sha256

.. code-block:: python

    related.sha256:"2ed91d820157c0530ffbae54122d998e0de6d958f266b682f7c528942f770470"


* ``related.count:`` - query by number of files related to the object

.. code-block:: python

    # get files, which have at least 0 related files and no more than 2
    related.count:[0 TO 2]
    # get files, which have at least 2 related files
    related.count:">=2"
    # get files, which have more than 2 related files
    related.count:">2"
    # get files, which have exactly 1 related file
    related.count:"1"
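
The query forms documented above can be assembled programmatically. The following is a minimal sketch, not part of this PR: the helper names (``related_name``, ``related_sha256``, ``related_count_range``, ``related_count_cmp``) are hypothetical, no escaping of quotes is attempted, and only the comparison operators that appear in the documentation (``>`` and ``>=``) are accepted.

```python
# Hypothetical helpers that only build query strings in the documented shapes.
# They do not talk to MWDB and are not part of the PR.

HEX_DIGITS = set("0123456789abcdefABCDEF")


def related_name(name: str) -> str:
    """Query by related file's name (the field accepts wildcards)."""
    return f'related.name:"{name}"'


def related_sha256(digest: str) -> str:
    """Query by related file's SHA-256 digest (64 hex characters)."""
    if len(digest) != 64 or not set(digest) <= HEX_DIGITS:
        raise ValueError("expected a 64-character hexadecimal digest")
    return f'related.sha256:"{digest}"'


def related_count_range(low: int, high: int) -> str:
    """Query by an inclusive range of related-file counts."""
    if low < 0 or high < low:
        raise ValueError("expected 0 <= low <= high")
    return f"related.count:[{low} TO {high}]"


def related_count_cmp(op: str, value: int) -> str:
    """Query by a one-sided comparison; only documented operators allowed."""
    if op not in (">", ">="):
        raise ValueError("only '>' and '>=' are shown in the documentation")
    return f'related.count:"{op}{value}"'
```

For example, ``related_count_range(0, 2)`` yields the first query from the ``related.count:`` block, and ``related_count_cmp(">=", 2)`` yields the second.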
47 changes: 31 additions & 16 deletions docs/user-guide/9-Sharing-objects.rst
Expand Up @@ -155,22 +155,22 @@ By default, ``admin`` private group has enabled all capabilities. All other grou

Each capability has its own name and scope:

*
*
**manage_users - Managing users and groups (system administration)**

Allows to access all users and groups in MWDB. Rules described in *Who is who?* don't apply to users with that permission. Enables user to create new user accounts, new groups and change their capabilities and membership. Allows to manage attribute keys, define new ones, delete and set the group permissions for them.

*
*
**share_queried_objects - Query for all objects in system**

That one is a bit tricky and will be possibly deprecated. MWDB will automatically share object and all descendants with group if member directly accessed it via identifier (knows the hash e.g. have direct link to the object). It can be used for bot accounts, so they have access only to these objects that are intended to be processed by them. Internally, we abandoned that idea, so that capability may not be stable.

*
*
**access_all_objects - Has access to all new uploaded objects into system**

Capability used by ``everything`` group, useful when you want to make additional "everything" that is separate from the original one. Keep in mind that it applies only to the **uploads made during the capability was enabled**\ , so if you want the new group to be truly "everything", you may need to share the old objects manually.

*
*
**sharing_with_all - Can share objects with all groups in system**

Implies the access to the list of all group names, but without access to the membership information and management features. Allows to share object with arbitrary group in MWDB. It also allows the user to view full history of sharing an object (if the user has access to the object).
Expand All @@ -180,27 +180,27 @@ Each capability has its own name and scope:

Can view who uploaded object and filter by uploader. Without this capability users can filter by / see only users in their workspaces.

*
*
**adding_tags - Can add tags**

Allows to tag objects. This feature is disabled by default, as you may want to have only tags from automated analyses.

*
*
**removing_tags - Can remove tags**

Allows to remove tags. Tag doesn't have "owner", so user will be able to remove all tags from the object.

*
*
**adding_comments - Can add comments**

Allows to add comments to the objects. Keep in mind that comments are public.

*
*
**removing_comments - Can remove (all) comments**

Allows to remove **all** comments, not only these authored by the user.

*
*
**adding_parents - Can add parents**

Allows to add new relationships by specifying object parent during upload or adding new relationship between existing objects.
Expand All @@ -215,22 +215,22 @@ Each capability has its own name and scope:

Enables upload of files. Enabled by default for ``registered`` group.

*
*
**adding_configs - Can upload configs**

Enables upload of configurations. Configurations are intended to be uploaded by automated systems or trusted entities that follow the conventions.

*
*
**adding_blobs - Can upload text blobs**

Enables upload of blobs. Blobs may have similar meaning as configurations in terms of user roles.

*
*
**reading_all_attributes - Has access to all attributes of object (including hidden)**

With that capability, you can read all the attributes, even if you don't have ``read`` permission for that attribute key. It allows to list hidden attribute values.

*
*
**adding_all_attributes - Can add all attributes to object**

Enables group to add all the attributes, even if it doesn't have ``set`` permission for that attribute key.
Expand All @@ -240,12 +240,12 @@ Each capability has its own name and scope:

Allows to remove attribute from object. To remove attribute, you need to have ``set`` permission for key. Combined with ``adding_all_attributes``\ , allows to remove all attributes.

*
*
**unlimited_requests - API requests are not rate-limited for this group**

Disables rate limiting for users from that group, if rate limiting feature is enabled.

*
*
**removing_objects - Can remove objects**

Can remove all accessible objects from the MWDB. May be quite destructive, we suggest to keep that capability enabled only for ``admin`` account.
Expand All @@ -260,7 +260,7 @@ Each capability has its own name and scope:

Allows to use personalization features like favorites or quick queries.

*
*
**karton_assign - Can assign existing analysis to the object**

Allows to assign Karton analysis to the object by setting ``karton`` attribute or using dedicated API.
Expand All @@ -269,6 +269,21 @@ Each capability has its own name and scope:
**karton_reanalyze - Can resubmit any object for analysis**

Can manually resubmit object to Karton.

*
**access_related_files - Can view and download RelatedFiles**

Allows to view list of RelatedFiles and download them.

*
**adding_related_files - Can upload new RelatedFiles**

Allows to upload new RelatedFiles.

*
**removing_related_files - Can remove existing RelatedFiles**

Allows to remove existing RelatedFiles.


User capabilities are the sum of all group capabilities. If you want to enable capability system-wide (e.g. enable all users to add tags), enable that capability for ``registered`` group or ``public`` group if you want to include guests.
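
The "sum of all group capabilities" rule can be illustrated with a short sketch. It is not taken from the MWDB code base: the functions ``user_capabilities`` and ``can_perform`` and the ``REQUIRED`` mapping are hypothetical, and the assumption that the .zip download needs only ``access_related_files`` is inferred from the capability descriptions above rather than confirmed by the PR.

```python
from typing import Iterable, Set

# Capability names as introduced by this PR.
ACCESS = "access_related_files"
ADDING = "adding_related_files"
REMOVING = "removing_related_files"

# Hypothetical mapping from a related-files tab action to the capability
# it would need (an assumption based on the descriptions in the docs).
REQUIRED = {
    "list": ACCESS,
    "download": ACCESS,
    "download_zip": ACCESS,
    "upload": ADDING,
    "delete": REMOVING,
}


def user_capabilities(groups: Iterable[Set[str]]) -> Set[str]:
    """Union of the capabilities of every group the user belongs to."""
    result: Set[str] = set()
    for capabilities in groups:
        result |= set(capabilities)
    return result


def can_perform(action: str, groups: Iterable[Set[str]]) -> bool:
    """True if any of the user's groups grants the required capability."""
    return REQUIRED[action] in user_capabilities(groups)
```

A user who belongs to one group with ``access_related_files`` and another with ``adding_related_files`` could, under these assumptions, list, download and upload related files but not delete them.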

19 changes: 19 additions & 0 deletions mwdb/app.py
Expand Up @@ -44,6 +44,9 @@
    FileDownloadZipResource,
    FileItemResource,
    FileResource,
    RelatedFileItemResource,
    RelatedFileResource,
    RelatedFileZipDownloadResource,
)
from mwdb.resources.group import GroupListResource, GroupMemberResource, GroupResource
from mwdb.resources.karton import KartonAnalysisResource, KartonObjectResource
Expand Down Expand Up @@ -263,6 +266,22 @@ def require_auth():
api.add_resource(FileDownloadResource, "/file/<hash64:identifier>/download")
api.add_resource(FileDownloadZipResource, "/file/<hash64:identifier>/download/zip")

# RelatedFiles endpoints
api.add_resource(
    RelatedFileResource,
    "/<any(file, config, blob, object):type>/<hash64:main_obj_identifier>/related_file",
)
api.add_resource(
    RelatedFileItemResource,
    "/<any(file, config, blob, object):type>/<hash64:main_obj_identifier>"
    "/related_file/<hash64:identifier>",
)
api.add_resource(
    RelatedFileZipDownloadResource,
    "/<any(file, config, blob, object):type>/<hash64:main_obj_identifier>"
    "/related_file/zip",
)

# Config endpoints
api.add_resource(ConfigResource, "/config")
api.add_resource(ConfigStatsResource, "/config/stats")
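
The three RelatedFiles routes registered above share a common prefix. The sketch below shows how a client might derive the paths; it is an illustration only. The function ``related_file_paths`` is hypothetical, the regular expression is an assumption about what the ``hash64`` converter accepts (64 hexadecimal characters), and nothing here says which HTTP methods each resource supports, since the resource classes are not shown in this part of the diff.

```python
import re
from typing import Dict, Optional

# Object types accepted by the any(...) converter in the routes above.
ALLOWED_TYPES = ("file", "config", "blob", "object")

# Assumption: hash64 identifiers are 64 hexadecimal characters.
_HASH64 = re.compile(r"^[0-9a-fA-F]{64}$")


def related_file_paths(
    obj_type: str,
    main_obj_identifier: str,
    related_identifier: Optional[str] = None,
) -> Dict[str, str]:
    """Build the relative paths of the RelatedFiles routes for one object."""
    if obj_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported object type: {obj_type!r}")
    if not _HASH64.match(main_obj_identifier):
        raise ValueError("main_obj_identifier must be 64 hex characters")

    collection = f"/{obj_type}/{main_obj_identifier}/related_file"
    paths = {"collection": collection, "zip": f"{collection}/zip"}

    if related_identifier is not None:
        if not _HASH64.match(related_identifier):
            raise ValueError("related_identifier must be 64 hex characters")
        paths["item"] = f"{collection}/{related_identifier}"
    return paths
```

Under the stated assumption about ``hash64``, the literal segment ``zip`` can never be mistaken for a related file identifier, so the item route and the zip route do not collide.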
6 changes: 6 additions & 0 deletions mwdb/core/capabilities.py
Expand Up @@ -47,6 +47,12 @@ class Capabilities(object):
    karton_reanalyze = "karton_reanalyze"
    # Can remove Karton analysis from the object
    karton_unassign = "karton_unassign"
    # Can view and download RelatedFiles
    access_related_files = "access_related_files"
    # Can upload new RelatedFiles
    adding_related_files = "adding_related_files"
    # Can remove existing RelatedFiles
    removing_related_files = "removing_related_files"

    @classmethod
    def all(cls):
102 changes: 102 additions & 0 deletions mwdb/core/file_util.py
@@ -0,0 +1,102 @@
import io
import os
import shutil
import tempfile

from mwdb.core.config import StorageProviderType, app_config
from mwdb.core.util import get_s3_client


def write_to_storage(file_stream, file_object):
    file_stream.seek(0, os.SEEK_SET)
    if app_config.mwdb.storage_provider == StorageProviderType.S3:
        get_s3_client(
            app_config.mwdb.s3_storage_endpoint,
            app_config.mwdb.s3_storage_access_key,
            app_config.mwdb.s3_storage_secret_key,
            app_config.mwdb.s3_storage_region_name,
            app_config.mwdb.s3_storage_secure,
            app_config.mwdb.s3_storage_iam_auth,
        ).put_object(
            Bucket=app_config.mwdb.s3_storage_bucket_name,
            Key=file_object._calculate_path(),
            Body=file_stream,
        )
    elif app_config.mwdb.storage_provider == StorageProviderType.DISK:
        with open(file_object._calculate_path(), "wb") as f:
            shutil.copyfileobj(file_stream, f)
    else:
        raise RuntimeError(
            f"StorageProvider {app_config.mwdb.storage_provider} is not supported"
        )


def get_from_storage(file_object):
    if app_config.mwdb.storage_provider == StorageProviderType.S3:
        # Stream coming from Boto3 get_object is not buffered and not seekable.
        # We need to download it to the temporary file first.
        stream = tempfile.TemporaryFile(mode="w+b")
        try:
            get_s3_client(
                app_config.mwdb.s3_storage_endpoint,
                app_config.mwdb.s3_storage_access_key,
                app_config.mwdb.s3_storage_secret_key,
                app_config.mwdb.s3_storage_region_name,
                app_config.mwdb.s3_storage_secure,
                app_config.mwdb.s3_storage_iam_auth,
            ).download_fileobj(
                Bucket=app_config.mwdb.s3_storage_bucket_name,
                Key=file_object._calculate_path(),
                Fileobj=stream,
            )
            stream.seek(0, io.SEEK_SET)
            return stream
        except Exception:
            stream.close()
            raise
    elif app_config.mwdb.storage_provider == StorageProviderType.DISK:
        return open(file_object._calculate_path(), "rb")
    else:
        raise RuntimeError(
            f"StorageProvider {app_config.mwdb.storage_provider} is not supported"
        )


def delete_from_storage(file_object):
    if app_config.mwdb.storage_provider == StorageProviderType.S3:
        get_s3_client(
            app_config.mwdb.s3_storage_endpoint,
            app_config.mwdb.s3_storage_access_key,
            app_config.mwdb.s3_storage_secret_key,
            app_config.mwdb.s3_storage_region_name,
            app_config.mwdb.s3_storage_secure,
            app_config.mwdb.s3_storage_iam_auth,
        ).delete_object(
            Bucket=app_config.mwdb.s3_storage_bucket_name,
            Key=file_object._calculate_path(),
        )
    elif app_config.mwdb.storage_provider == StorageProviderType.DISK:
        os.remove(file_object._calculate_path())
    else:
        raise RuntimeError(
            f"StorageProvider {app_config.mwdb.storage_provider} is not supported"
        )


def iterate_buffer(file_object, chunk_size=1024 * 256):
    """
    Iterates over bytes in the file contents
    """
    fh = file_object.open()
    try:
        if hasattr(fh, "stream"):
            yield from fh.stream(chunk_size)
        else:
            while True:
                chunk = fh.read(chunk_size)
                if chunk:
                    yield chunk
                else:
                    return
    finally:
        fh.close()
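
To see how the disk branch and the chunked iteration in this file behave together, here is a self-contained sketch that runs without MWDB. It is a simplified stand-in, not the PR's code: ``DiskObject``, ``write_to_disk``, ``iterate_chunks`` and ``roundtrip`` are hypothetical names, the S3 branch and the ``stream`` attribute check are left out, and the object exposes a plain ``path`` instead of ``_calculate_path()``.

```python
import io
import os
import shutil
import tempfile


class DiskObject:
    """Hypothetical stand-in for an object stored at a known disk path."""

    def __init__(self, path):
        self.path = path

    def open(self):
        return open(self.path, "rb")


def write_to_disk(file_stream, obj):
    # Rewind first, as write_to_storage does, so the whole stream is copied
    # even if the caller has already read from it.
    file_stream.seek(0, os.SEEK_SET)
    with open(obj.path, "wb") as f:
        shutil.copyfileobj(file_stream, f)


def iterate_chunks(obj, chunk_size=1024 * 256):
    # Same read loop as the fallback branch of iterate_buffer: yield
    # fixed-size chunks until read() returns an empty bytes object.
    fh = obj.open()
    try:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                return
            yield chunk
    finally:
        fh.close()


def roundtrip(payload, chunk_size):
    """Write payload, read it back in chunks, delete it; return both results."""
    with tempfile.TemporaryDirectory() as tmp:
        obj = DiskObject(os.path.join(tmp, "sample.bin"))
        stream = io.BytesIO(payload)
        stream.read()  # move the position to the end on purpose
        write_to_disk(stream, obj)
        chunks = list(iterate_chunks(obj, chunk_size))
        os.remove(obj.path)  # disk counterpart of delete_from_storage
        return chunks, os.path.exists(obj.path)
```

Calling ``roundtrip(b"abcdefghij", 4)`` returns the payload split into chunks of at most four bytes, together with ``False`` for the existence check after deletion. The ``finally`` clause in the generator closes the file handle even if the consumer stops iterating early.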