[BUG] config does not trigger encoder when dumping to string #551

Open
MartinXPN opened this issue Dec 12, 2024 · 1 comment
Labels
bug Something isn't working


MartinXPN commented Dec 12, 2024

Description

The encoder passed through config() in a field's metadata is not called when serializing through the marshmallow schema (TestCase.schema().dumps(...)).
The same encoder is called as expected by to_dict() and to_json().

Code snippet that reproduces the issue

import base64
from dataclasses import dataclass, field

from dataclasses_json import DataClassJsonMixin, LetterCase, Undefined, config


def base64_to_bytes(data: dict[str, str] | None) -> dict[str, bytes] | None:
    print(f'Triggering base64_to_bytes on {data}')
    print('base64_to_bytes:', {filename: type(content) for filename, content in (data or {}).items()})
    if data is not None and all(isinstance(content, str) for content in data.values()):
        return {filename: base64.b64decode(content.encode('utf-8')) for filename, content in data.items()}
    return data


def bytes_to_base64(data: dict[str, bytes] | None) -> dict[str, str] | None:
    print(f'Triggering bytes_to_base64 on {data}')
    print('bytes_to_base64:', {filename: type(content) for filename, content in (data or {}).items()})
    if data is not None and all(isinstance(content, bytes) for content in data.values()):
        return {filename: base64.b64encode(content).decode('utf-8') for filename, content in data.items()}
    return data


class DataClassJsonCamelMixIn(DataClassJsonMixin):
    dataclass_json_config = config(letter_case=LetterCase.CAMEL, undefined=Undefined.EXCLUDE)['dataclasses_json']


@dataclass
class TestCase(DataClassJsonCamelMixIn):
    input: str
    input_files: dict[str, str] | None = None               # mapping filename -> textual content
    input_assets: dict[str, bytes] | None = field(          # mapping filename -> binary content
        metadata=config(encoder=bytes_to_base64, decoder=base64_to_bytes),
        default=None,
    )


tests = [
    TestCase(
        input='input1',
        input_files={'file1': 'content1'},
        input_assets={'asset1': b'content1'},
    ),
    TestCase(
        input='input2',
        input_files={'file3': 'content3'},
        input_assets={'asset3': b'content3'},
    ),
]

print('\nThe ones below DO NOT trigger the bytes_to_base64 encoder')
print(TestCase.schema().dumps(tests, many=True))
print(TestCase.schema().dumps(tests[0]))

print('\nThe ones below trigger the bytes_to_base64 encoder properly')
print(tests[0].to_dict())
print(tests[0].to_json())

Describe the results you expected

I expected bytes_to_base64 to be called when calling schema().dumps(). It is never called; instead, the bytes values are converted to numbers.

Python version you are using

3.12.7

Environment description

python -m pip freeze
annotated-types==0.7.0
anyio==4.3.0
appnope==0.1.4
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
arrow==1.3.0
asttokens==2.4.1
async-lru==2.0.4
attrs==23.2.0
Babel==2.14.0
beautifulsoup4==4.12.3
bleach==6.1.0
boto3==1.34.122
botocore==1.34.122
bs4==0.0.2
CacheControl==0.14.0
cached-property==1.5.2
cachetools==5.3.2
certifi==2024.2.2
cffi==1.16.0
charset-normalizer==3.3.2
comm==0.2.1
commonmark==0.9.1
contourpy==1.2.0
cryptography==42.0.4
cycler==0.12.1
dataclasses-json==0.6.7
debugpy==1.8.1
decorator==5.1.1
defusedxml==0.7.1
dictdiffer==0.9.0
distro==1.9.0
executing==2.0.1
fastjsonschema==2.19.1
firebase-admin==6.4.0
fonttools==4.49.0
fqdn==1.5.1
google-api-core==2.17.1
google-api-python-client==2.118.0
google-auth==2.28.1
google-auth-httplib2==0.2.0
google-cloud-core==2.4.1
google-cloud-firestore==2.15.0
google-cloud-storage==2.14.0
google-crc32c==1.5.0
google-resumable-media==2.7.0
googleapis-common-protos==1.62.0
greenlet==3.1.1
grpcio==1.62.0
grpcio-status==1.62.0
h11==0.14.0
httpcore==1.0.4
httplib2==0.22.0
httpx==0.27.0
idna==3.6
ipykernel==6.29.2
ipython==8.21.0
ipywidgets==8.1.2
isoduration==20.11.0
jedi==0.19.1
Jinja2==3.1.3
jiter==0.6.1
jmespath==1.0.1
json5==0.9.17
jsonpointer==2.4
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
jupyter==1.0.0
jupyter-console==6.6.3
jupyter-events==0.9.0
jupyter-lsp==2.2.2
jupyter_client==8.6.0
jupyter_core==5.7.1
jupyter_server==2.12.5
jupyter_server_terminals==0.5.2
jupyterlab==4.1.2
jupyterlab_pygments==0.3.0
jupyterlab_server==2.25.3
jupyterlab_widgets==3.0.10
kiwisolver==1.4.5
MarkupSafe==2.1.5
marshmallow==3.23.1
matplotlib==3.8.3
matplotlib-inline==0.1.6
mistune==3.0.2
msgpack==1.0.7
mypy-extensions==1.0.0
nbclient==0.9.0
nbconvert==7.16.1
nbformat==5.9.2
nest-asyncio==1.6.0
notebook==7.1.0
notebook_shim==0.2.4
notion==0.0.28
numpy==1.26.4
openai==1.52.2
outcome==1.3.0.post0
overrides==7.7.0
packaging==23.2
pandas==2.2.0
pandocfilters==1.5.1
parso==0.8.3
pexpect==4.9.0
pillow==10.2.0
platformdirs==4.2.0
playwright==1.48.0
prometheus_client==0.20.0
prompt-toolkit==3.0.43
proto-plus==1.23.0
protobuf==4.25.3
psutil==5.9.8
ptyprocess==0.7.0
pure-eval==0.2.2
pyasn1==0.5.1
pyasn1-modules==0.3.0
pycparser==2.21
pydantic==2.9.2
pydantic_core==2.23.4
pyee==12.0.0
Pygments==2.17.2
PyJWT==2.8.0
pyparsing==3.1.1
PySocks==1.7.1
python-dateutil==2.8.2
python-dotenv==1.0.1
python-json-logger==2.0.7
python-slugify==8.0.4
pytz==2024.1
PyYAML==6.0.1
pyzmq==25.1.2
qtconsole==5.5.1
QtPy==2.4.1
referencing==0.33.0
requests==2.31.0
rfc3339-validator==0.1.4
rfc3986-validator==0.1.1
rpds-py==0.18.0
rsa==4.9
s3transfer==0.10.1
selenium==4.25.0
Send2Trash==1.8.2
six==1.16.0
sniffio==1.3.0
sortedcontainers==2.4.0
soupsieve==2.5
stack-data==0.6.3
tabulate==0.9.0
terminado==0.18.0
text-unidecode==1.3
tinycss2==1.2.1
tornado==6.4
tqdm==4.66.2
traitlets==5.14.1
trio==0.27.0
trio-websocket==0.11.1
types-python-dateutil==2.8.19.20240106
typing-inspect==0.9.0
typing_extensions==4.12.2
tzdata==2024.1
tzlocal==5.2
uri-template==1.3.0
uritemplate==4.1.1
urllib3==1.26.20
wcwidth==0.2.13
webcolors==1.13
webdriver-manager==4.0.2
webencodings==0.5.1
websocket-client==1.8.0
widgetsnbextension==4.0.10
wsproto==1.2.0
@MartinXPN MartinXPN added the bug Something isn't working label Dec 12, 2024

MartinXPN commented Dec 12, 2024

Another minor issue is the following warning, emitted from dataclasses_json/mm.py:

/var/task/dataclasses_json/mm.py:288: UserWarning: Unknown type <class 'bytes'> at TestCase.input_assets: dict[str, bytes] | None It's advised to pass the correct marshmallow type to `mm_field`.

It would be great to support bytes fields when metadata=config(encoder=..., decoder=...) is provided.
