Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support auditing binary model fields instead of throwing error #689

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

suricactus
Copy link

When diffing model fields that are binary, represent their value as a hex representation.

Prior to this change, all fields that are not datetime or JSON will be passed to Django's smart_str/force_str from within django-auditlog's get_field_value.
The value will then be encoded as a utf8 string, which does not work for all binary values, as they may contain arbitrary bytes - for example a md5 sum or a raster image.

This commit introduces a new special handling for the values of BinaryField. The value will be converted to the hex representation of the binary. Hex was chosen over the default Python representation with \x escape characters to save some space from excessive \x in the generated string for the audit diff.


Example error that motivated this PR:

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 0: invalid start byte
[...]
  File "/usr/local/lib/python3.10/site-packages/auditlog/diff.py", line 186, in model_instance_diff
    new_value = get_field_value(new, field)
  File "/usr/local/lib/python3.10/site-packages/auditlog/diff.py", line 84, in get_field_value
   value = smart_str(getattr(obj, field.name, None))
  File "/usr/local/lib/python3.10/site-packages/django/utils/encoding.py", line 33, in smart_str
    return force_str(s, encoding, strings_only, errors)
  File "/usr/local/lib/python3.10/site-packages/django/utils/encoding.py", line 74, in force_str
    raise DjangoUnicodeDecodeError(s, *e.args)
django.utils.encoding.DjangoUnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 0: invalid start byte. You passed in b'\x96\x9ax\x83\xee\x8b\n3\xa6!\xdb\x9f\xe0\xe7\x8c$' (<class 'bytes'>)

When diffing model fields that are binary, represent their value as a
hex representation.

Prior to this change, all fields that are not datetime or JSON will be
passed to Django's `smart_str`/`force_str` from within django-auditlog's
`get_field_value`.
The value will then be encoded as a utf8 string, which does not work for
all binary values, as they may contain arbitrary bytes - for example a
md5 sum or a raster image.

This commit introduces a new special handling for the values of
`BinaryField`. The value will be converted to the hex representation of
the binary. Hex was chosen over the default Python representation with
`\x` escape characters to save some space from excessive `\x` in the
generated string for the audit diff.

---

Example error that motivated this PR:

```
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 0: invalid start byte
[...]
  File "/usr/local/lib/python3.10/site-packages/auditlog/diff.py", line 186, in model_instance_diff
    new_value = get_field_value(new, field)
  File "/usr/local/lib/python3.10/site-packages/auditlog/diff.py", line 84, in get_field_value
   value = smart_str(getattr(obj, field.name, None))
  File "/usr/local/lib/python3.10/site-packages/django/utils/encoding.py", line 33, in smart_str
    return force_str(s, encoding, strings_only, errors)
  File "/usr/local/lib/python3.10/site-packages/django/utils/encoding.py", line 74, in force_str
    raise DjangoUnicodeDecodeError(s, *e.args)
django.utils.encoding.DjangoUnicodeDecodeError: 'utf-8' codec can't decode byte 0x96 in position 0: invalid start byte. You passed in b'\x96\x9ax\x83\xee\x8b\n3\xa6!\xdb\x9f\xe0\xe7\x8c$' (<class 'bytes'>)
```
@hramezani
Copy link
Member

Thanks @suricactus for reporting this and providing the breaking case.

It would be great to include the fix as well.

Please remember to add a changelog entry in case you want to continue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants