Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/handle generics #209

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

mciszczon
Copy link
Collaborator

@mciszczon mciszczon commented Jan 9, 2023

This is a draft PR for introducing the handling of typing generics. Feel free to review this PR, but keep in mind it is not yet finished—publishing it just for the sake of code reviews and brainstorming.

Fixes #131
Fixes #157
Fixes #160

TODO

  • Add much more test cases that use typing.Generic and typing.TypeVar.
  • Consider handling other typing shenanigans, like Annotated, TypeAlias, TypeVarTuple.
  • Backport typing.get_origin and typing.get_args for lower Python versions.
  • Refactor and clean up the code, it's super ugly at the moment.
  • Consider using typing. TypeGuard for typehinting the is_instance and its related functions.

@mciszczon mciszczon self-assigned this Jan 9, 2023
@mciszczon mciszczon force-pushed the feature/handle-generics branch 2 times, most recently from c1aa6da to 1c0b66c Compare January 10, 2023 12:26
@mciszczon mciszczon force-pushed the feature/handle-generics branch 4 times, most recently from c9b8e71 to 8ce7a93 Compare January 16, 2023 08:26
@mciszczon mciszczon force-pushed the feature/handle-generics branch 2 times, most recently from 0d11d70 to 3addf2d Compare January 16, 2023 09:09
@mciszczon mciszczon force-pushed the feature/handle-generics branch from 3addf2d to 53f2f32 Compare January 16, 2023 09:18
@mciszczon mciszczon force-pushed the feature/handle-generics branch from bd877e3 to 9218ebc Compare January 16, 2023 11:28
@mciszczon mciszczon force-pushed the feature/handle-generics branch from 8e33ec7 to 912b13e Compare January 16, 2023 16:34
@mciszczon mciszczon closed this May 12, 2023
@mciszczon mciszczon force-pushed the feature/handle-generics branch from 912b13e to 10a9ec4 Compare May 12, 2023 11:26
@mciszczon mciszczon reopened this May 15, 2023
@mciszczon mciszczon marked this pull request as ready for review May 15, 2023 11:44
@mciszczon
Copy link
Collaborator Author

@konradhalas After consideration I think it could already be merged as is feature-wise. Maybe we might want to write a few more tests, but that would be it. What do you think?

Comment on lines +105 to +111
type_hints = cache(get_type_hints)(type(value))
for field_name, field_type in type_hints.items():
if isinstance(field_type, TypeVar):
args = get_args(type_)
return True if not args else any(isinstance(getattr(value, field_name, None), arg) for arg in args)
else:
return isinstance(value, type_)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using this branch for a project and found this. This block doesn't make sense and also doesn't work.

get_type_hints gets a mapping from field name to field type for the dataclass value. You're iterating over that, but you only ever check the first item because both branches end with return. I'm pretty sure both of those returns in the loop should only be returning if the value is False.

On the first branch, if field_type is a TypeVar, you check the field value against all the arguments for the subscripted generic type_. Why is get_args in the loop? Why is getattr in the inner loop? Neither one is changing at that point. I believe this would also fail to detect an incorrect type if, say, a field is hinted AnyStr, the value is None, and the dataclass has two type arguments, one str and one None.

I assume the second branch is trying to check if the field value matches the field type? But what you're actually doing is checking if the dataclass matches the subscripted dataclass type. This crashes with the exception TypeError: Subscripted generics cannot be used with class and instance checks for obvious reasons.

For the false negative issue with multiple type arguments, possibly something could be done with __orig_bases__? Not sure.

Here's my suggestion for a fixed version. I haven't written test cases, but I did try it with my fairly complicated use case and it seems to work.

Suggested change
type_hints = cache(get_type_hints)(type(value))
for field_name, field_type in type_hints.items():
if isinstance(field_type, TypeVar):
args = get_args(type_)
return True if not args else any(isinstance(getattr(value, field_name, None), arg) for arg in args)
else:
return isinstance(value, type_)
type_args = get_args(type_)
type_hints = cache(get_type_hints)(type(value))
for field_name, field_type in type_hints.items():
field_value = getattr(value, field_name, None)
if isinstance(field_type, TypeVar):
# TODO: this will fail to detect incorrect type in some cases
# see comments on https://github.com/konradhalas/dacite/pull/209
if not any(is_instance(field_value, arg) for arg in type_args):
return False
elif get_origin(field_type) is not ClassVar:
if not is_instance(field_value, field_type):
return False

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants