0.0.3 release
s-m-e committed May 1, 2019
2 parents d91e346 + 3522415 commit 9f7279c
Showing 18 changed files with 1,184 additions and 316 deletions.
3 changes: 3 additions & 0 deletions .gitignore
@@ -46,6 +46,7 @@ nosetests.xml
coverage.xml
*,cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
@@ -97,3 +98,5 @@ test_mount
test_logs
fsx-linux*
screenshots/
notebooks/
notes.md
13 changes: 13 additions & 0 deletions CHANGES.rst
@@ -1,6 +1,19 @@
Changes
=======

0.0.3 (2019-05-01)
------------------

* FEATURE: LoggedFS-python can be used as a library in other Python software, enabling a user to specify callback functions on filesystem events. The relevant infrastructure is exported as ``loggedfs.loggedfs_notify``. See library example under ``docs``.
* FEATURE: New programmable filter pipeline, see ``loggedfs.filter_field_class``, ``loggedfs.filter_item_class`` and ``loggedfs.filter_pipeline_class``
* FEATURE: New flag ``-b``, explicitly activating logging of read and write buffers
* FEATURE: In "traditional" logging mode (not JSON), read and write buffers are also logged zlib-compressed and BASE64 encoded.
* FEATURE: Convenience function for decoding logged buffers, see ``loggedfs.decode_buffer``
* FIX: LoggedFS-python would have crashed if no XML configuration file had been specified.
* FIX: **Directory listing (``ls``) was broken.**
* FIX: Testing infrastructure did not catch all exceptions in tests.
* FIX: Testing infrastructure did not handle timeouts on individual tests correctly.
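
The buffer logging mentioned in the 0.0.3 entries can be illustrated with a short round-trip sketch. It assumes that a buffer is first compressed with zlib and then BASE64 encoded, as the wording above suggests. The helper names ``encode_buffer_sketch`` and ``decode_buffer_sketch`` are hypothetical and only use the standard library; they are not the implementation or signature of ``loggedfs.decode_buffer``.

```python
import base64
import zlib


def encode_buffer_sketch(data: bytes) -> str:
    """Compress with zlib, then BASE64-encode (assumed order, see lead-in)."""
    return base64.b64encode(zlib.compress(data)).decode('ascii')


def decode_buffer_sketch(text: str) -> bytes:
    """Reverse the assumed encoding: BASE64-decode, then decompress."""
    return zlib.decompress(base64.b64decode(text.encode('ascii')))


payload = b'some text written to a file\n' * 4
logged = encode_buffer_sketch(payload)        # printable ASCII, safe for a log line
restored = decode_buffer_sketch(logged)       # original bytes again
```

BASE64 keeps the compressed bytes printable, which is presumably why it is used for the line-oriented "traditional" log format.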

0.0.2 (2019-04-23)
------------------

94 changes: 94 additions & 0 deletions docs/library_example.rst
@@ -0,0 +1,94 @@
Using LoggedFS-python as a library
==================================

Create a new directory, for instance in your current working directory, named ``demo_dir``. Then fire up an interactive Python shell such as Jupyter Notebook or Jupyter Lab. Now you can try the following:

.. code:: python

    import re
    import loggedfs

    demo_filter = loggedfs.filter_pipeline_class(
        include_list = [loggedfs.filter_item_class([loggedfs.filter_field_class(
            name = 'proc_cmd', value = re.compile('.*kate.*').match
        )])]
    )

    demo_data = []
    demo_err = []

    demo = loggedfs.loggedfs_notify(
        'demo_dir',
        background = True,
        log_filter = demo_filter,
        consumer_out_func = demo_data.append,
        consumer_err_func = demo_err.append
    )

You have just started recording all filesystem events that involve a command containing the string ``kate``. Leave the Python shell and write some stuff into ``demo_dir`` using ``Kate``, the KDE text editor. Once you are finished, go back to your Python shell and terminate the recording.
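
The filter above passes the bound method ``re.compile('.*kate.*').match`` as the ``value`` of the field. The following sketch shows how such a bound method behaves as a predicate on plain strings. It assumes the callable is invoked with the field's value and its result is treated as truthy or falsy; how LoggedFS-python calls it internally is not shown here. The second command line is a made-up example.

```python
import re

# Same callable as in the filter definition: returns a match object or None.
matches_kate = re.compile('.*kate.*').match

commands = [
    '/usr/bin/kate -b /test/demo_dir/demo_file.txt',
    '/usr/bin/vim notes.md',  # hypothetical command for contrast
]

# A match object is truthy, None is falsy, so the method works as a predicate.
selected = [cmd for cmd in commands if matches_kate(cmd)]
```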

.. code:: python

    demo.terminate()

Notice that the recorded data ends with an "end of transmission" marker. For convenience, remove it first:

.. code:: python

    assert isinstance(demo_data[-1], loggedfs.end_of_transmission)
    demo_data = demo_data[:-1]

Let's have a look at what you have recorded:

.. code:: python

    print(demo_data[44]) # index 44 might show something different in your case

::

    {'proc_cmd': '/usr/bin/kate -b /test/demo_dir/demo_file.txt',
     'proc_uid': 1000,
     'proc_uid_name': 'ernst',
     'proc_gid': 100,
     'proc_gid_name': 'users',
     'proc_pid': 11716,
     'action': 'read',
     'status': True,
     'param_path': '/test/demo_dir/demo_file.txt',
     'param_length': 4096,
     'param_offset': 0,
     'param_fip': 5,
     'return_len': 1486,
     'return': '',
     'time': 1556562162704772619}

Every single event is represented as a dictionary. ``demo_data`` is therefore a list of dictionaries. The following columns / keys are always present:

- proc_cmd: Command line of the process ordering the operation.
- proc_uid: UID (user ID) of the owner of the process ordering the operation.
- proc_uid_name: User name of the owner of the process ordering the operation.
- proc_gid: GID (group ID) of the owner of the process ordering the operation.
- proc_gid_name: Group name of the owner of the process ordering the operation.
- proc_pid: PID (process ID) of the process ordering the operation.
- action: Name of filesystem operation, such as ``open``, ``read`` or ``write``.
- status: Boolean, describing the success of the operation.
- return: Return value of operation. ``None`` if there is none.
- time: System time, nanoseconds, UTC
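
Because every event is a plain dictionary, the keys listed above can already be processed with the standard library alone. The sketch below uses three abbreviated, hand-written records (only ``action``, ``status`` and ``time``, with timestamps taken from the example outputs in this document) to count events per action and to convert the nanosecond ``time`` value into a timezone-aware ``datetime``. The helper name ``event_datetime`` is hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone

# Abbreviated stand-ins for recorded events; real records carry more keys.
events = [
    {'action': 'read', 'status': True, 'time': 1556562162704772619},
    {'action': 'write', 'status': True, 'time': 1556562164301499774},
    {'action': 'write', 'status': True, 'time': 1556562164304043463},
]


def event_datetime(event):
    """Convert the nanosecond UTC timestamp to a datetime (microsecond precision)."""
    seconds, nanoseconds = divmod(event['time'], 1_000_000_000)
    moment = datetime.fromtimestamp(seconds, tz = timezone.utc)
    return moment.replace(microsecond = nanoseconds // 1000)


# Count how often each filesystem operation occurred.
actions = Counter(event['action'] for event in events)

# Timestamps of all successful write operations.
write_times = [
    event_datetime(event) for event in events
    if event['action'] == 'write' and event['status']
]
```

``datetime`` only resolves microseconds, so the last three digits of the nanosecond value are dropped in the conversion.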

Other columns / keys are optional and depend on the operation and its status. With this knowledge, you can run typical Python data analysis frameworks across this data. Pandas for instance:

.. code:: python

    import pandas as pd
    data_df = pd.DataFrame.from_records(demo_data, index = 'time')
    data_df[data_df['action'] == 'write'][['param_buf_len', 'param_offset', 'return']]

::

                         param_buf_len  param_offset  return
    time
    1556562164301499774           57.0           0.0      57
    1556562164304043463            2.0          57.0       2
    1556562164621417400         1487.0           0.0    1487
    1556562165260276486           53.0           0.0      53
    1556562165532797611         1486.0           0.0    1486
4 changes: 2 additions & 2 deletions setup.py
@@ -43,7 +43,7 @@


# Bump version HERE!
_version_ = '0.0.2'
_version_ = '0.0.3'


# List all versions of Python which are supported
@@ -99,7 +99,7 @@
keywords = ['filesystem', 'fuse', 'logging', 'monitoring'],
include_package_data = True,
install_requires = [
'click',
'click>=7.0',
'fusepy @ git+https://github.com/s-m-e/fusepy@master#egg=fusepy-2.0.99',
'xmltodict'
],
14 changes: 11 additions & 3 deletions src/loggedfs/__init__.py
@@ -29,8 +29,16 @@
# IMPORT
# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

from .cli import cli_entry
from .core import (
loggedfs,
from ._core.cli import cli_entry
from ._core.filter import (
filter_field_class,
filter_item_class,
filter_pipeline_class
)
from ._core.fs import (
_loggedfs,
loggedfs_factory
)
from ._core.ipc import end_of_transmission
from ._core.notify import notify_class as loggedfs_notify
from ._core.out import decode_buffer
25 changes: 25 additions & 0 deletions src/loggedfs/_core/__init__.py
@@ -0,0 +1,25 @@
# -*- coding: utf-8 -*-

"""
LoggedFS-python
Filesystem monitoring with Fuse and Python
https://github.com/pleiszenburg/loggedfs-python
src/loggedfs/_core/__init__.py: Module core init
Copyright (C) 2017-2019 Sebastian M. Ernst <[email protected]>
<LICENSE_BLOCK>
The contents of this file are subject to the Apache License
Version 2 ("License"). You may not use this file except in
compliance with the License. You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
https://github.com/pleiszenburg/loggedfs-python/blob/master/LICENSE
Software distributed under the License is distributed on an "AS IS" basis,
WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License for the
specific language governing rights and limitations under the License.
</LICENSE_BLOCK>
"""
67 changes: 42 additions & 25 deletions src/loggedfs/cli.py → src/loggedfs/_core/cli.py
@@ -6,7 +6,7 @@
Filesystem monitoring with Fuse and Python
https://github.com/pleiszenburg/loggedfs-python
src/loggedfs/cli.py: Command line interface
src/loggedfs/_core/cli.py: Command line interface
Copyright (C) 2017-2019 Sebastian M. Ernst <[email protected]>
@@ -31,8 +31,9 @@

import click

from .core import loggedfs_factory
from .filter import parse_filters
from .defaults import LOG_ENABLED_DEFAULT, LOG_PRINTPROCESSNAME_DEFAULT
from .fs import loggedfs_factory
from .filter import filter_pipeline_class


# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
@@ -70,54 +71,70 @@
is_flag = True,
help = 'Format output as JSON instead of traditional loggedfs format.'
)
@click.option(
'-b', '--buffers',
is_flag = True,
help = 'Include read/write-buffers (compressed, BASE64) in log.'
)
@click.option(
'--lib',
is_flag = True,
help = 'Run in library mode. DO NOT USE THIS FROM THE COMMAND LINE!',
hidden = True
)
@click.argument(
'directory',
type = click.Path(exists = True, file_okay = False, dir_okay = True, resolve_path = True)
)
def cli_entry(f, p, c, s, l, json, directory):
def cli_entry(f, p, c, s, l, json, buffers, lib, directory):
"""LoggedFS-python is a transparent fuse-filesystem which allows to log
every operations that happens in the backend filesystem. Logs can be written
to syslog, to a file, or to the standard output. LoggedFS comes with an XML
every operation that happens in the backend filesystem. Logs can be written
to syslog, to a file, or to the standard output. LoggedFS-python allows to specify an XML
configuration file in which you can choose exactly what you want to log and
what you don't want to log. You can add filters on users, operations (open,
read, write, chown, chmod, etc.), filenames and return code. Filename
filters are regular expressions.
read, write, chown, chmod, etc.), filenames, commands and return code.
"""

loggedfs_factory(
directory,
**__process_config__(c, l, s, f, p, json)
**__process_config__(c, l, s, f, p, json, buffers, lib)
)


def __process_config__(
config_fh,
log_file,
log_syslog_off,
fuse_foreground_bool,
fuse_allowother_bool,
log_json
fuse_foreground,
fuse_allowother,
log_json,
log_buffers,
lib_mode
):

if config_fh is not None:
config_xml_str = config_fh.read()
config_data = config_fh.read()
config_fh.close()
(
log_enabled, log_printprocessname, filter_obj
) = filter_pipeline_class.from_xmlstring(config_data)
config_file = config_fh.name
else:
config_file = '[None]'
config_xml_str = None

config_dict = parse_filters(config_xml_str)
log_enabled = LOG_ENABLED_DEFAULT
log_printprocessname = LOG_PRINTPROCESSNAME_DEFAULT
filter_obj = filter_pipeline_class()
config_file = None

return {
'log_includes': config_dict['log_includes'],
'log_excludes': config_dict['log_excludes'],
'log_enabled': config_dict['log_enabled'],
'log_printprocessname': config_dict['log_printprocessname'],
'fuse_foreground': fuse_foreground,
'fuse_allowother': fuse_allowother,
'lib_mode': lib_mode,
'log_buffers': log_buffers,
'_log_configfile' : config_file,
'log_enabled': log_enabled,
'log_file': log_file,
'log_syslog': not log_syslog_off,
'log_configmsg': 'LoggedFS-python using configuration file %s' % config_file,
'log_filter': filter_obj,
'log_json': log_json,
'fuse_foreground_bool': fuse_foreground_bool,
'fuse_allowother_bool': fuse_allowother_bool
'log_printprocessname': log_printprocessname,
'log_syslog': not log_syslog_off
}
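
The fallback behaviour visible in ``__process_config__`` (use the parsed configuration if a file handle is given, otherwise fall back to the defaults) can be tried in isolation with the sketch below. It is a simplified stand-in, not the project's code: ``parse_config_stub`` is a hypothetical replacement for ``filter_pipeline_class.from_xmlstring`` that only mimics the three-value return shape seen in the diff, a plain dictionary stands in for the filter object, and the configuration text is a placeholder rather than real LoggedFS XML.

```python
import io

# Values copied from src/loggedfs/_core/defaults.py in this commit.
LOG_ENABLED_DEFAULT = True
LOG_PRINTPROCESSNAME_DEFAULT = True


def parse_config_stub(config_data):
    """Hypothetical stand-in for the XML parser; returns three values."""
    return True, False, {'raw_config': config_data}


def resolve_log_config(config_fh = None):
    """Return logging settings, falling back to defaults without a file."""
    if config_fh is not None:
        config_data = config_fh.read()
        config_fh.close()
        log_enabled, log_printprocessname, filter_obj = parse_config_stub(config_data)
        # Real file handles carry a name; in-memory streams may not.
        config_file = getattr(config_fh, 'name', None)
    else:
        log_enabled = LOG_ENABLED_DEFAULT
        log_printprocessname = LOG_PRINTPROCESSNAME_DEFAULT
        filter_obj = {}  # stand-in for an empty filter pipeline
        config_file = None
    return {
        '_log_configfile': config_file,
        'log_enabled': log_enabled,
        'log_filter': filter_obj,
        'log_printprocessname': log_printprocessname,
    }


without_file = resolve_log_config()
with_file = resolve_log_config(io.StringIO('placeholder config text'))
```

Handling the ``None`` case explicitly is what avoids the crash described in the changelog when no configuration file is specified.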
41 changes: 41 additions & 0 deletions src/loggedfs/_core/defaults.py
@@ -0,0 +1,41 @@
# -*- coding: utf-8 -*-

"""
LoggedFS-python
Filesystem monitoring with Fuse and Python
https://github.com/pleiszenburg/loggedfs-python
src/loggedfs/_core/defaults.py: Default configurations
Copyright (C) 2017-2019 Sebastian M. Ernst <[email protected]>
<LICENSE_BLOCK>
The contents of this file are subject to the Apache License
Version 2 ("License"). You may not use this file except in
compliance with the License. You may obtain a copy of the License at
https://www.apache.org/licenses/LICENSE-2.0
https://github.com/pleiszenburg/loggedfs-python/blob/master/LICENSE
Software distributed under the License is distributed on an "AS IS" basis,
WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License for the
specific language governing rights and limitations under the License.
</LICENSE_BLOCK>
"""


# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# CONST
# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

FUSE_ALLOWOTHER_DEFAULT = False
FUSE_FOREGROUND_DEFAULT = False

LIB_MODE_DEFAULT = False

LOG_BUFFERS_DEFAULT = False
LOG_ENABLED_DEFAULT = True
LOG_JSON_DEFAULT = False
LOG_PRINTPROCESSNAME_DEFAULT = True
LOG_SYSLOG_DEFAULT = False