Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix #66: dataclass from Generic now serializes #68

Open
wants to merge 6 commits into
base: develop
Choose a base branch
from

Conversation

rhaps0dy
Copy link
Contributor

I hit this bug reported in #66 and fixed it. I think the fix is correct, is it?

@@ -245,7 +245,7 @@ class A(Generic[T]):
pass

# Continue with the base classes.
for base in hint.bases or hint.type.__bases__:
for base in (*hint.bases, *hint.type.__bases__):
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure atm. whether the order plays a role here, i.e. if maybe the two unrolls should be the other way around. 🤔 Will dig a bit when I get the time, unless you want to hash out what it means.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, quickly: this comes from typeapi ClassTypeHint .bases , which uses get_type_hint_original_bases. That on purpose accesses __orig_bases__, which seems to be a CPython attribute of classes which is not very well documented.

According to this PEP, __orig_bases__ contains the instantiated generics for a particular type (e.g. A[int]), as opposed to just the bare types (e.g. A). The former is useful because it gives more information to Databind about how to de/serialize a particular field. However, subclasses overrides methods, and it's not clear that that's going to be in the correct order.

We can get the correct method resolution order by calling .mro() on the class that we're dealing with. Given the test case with inheritance class A(Generic[T]) -> class B(A[int]) -> class C(B), calling C.mro() would give us (C, B, A) which would correctly tell where fields come from. But, we'd be losing type information.

So, my recommendation is:

  • We should just call Class.mro() before this entire loop to get the correct iteration order
  • For every typehint we get, we get its __orig_bases__ to get the instantiated-Generic types. If TypeHint(t).type matches anything in the MRO list, we use the instantiated-generic type instead of the original bare type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I were to commit a solution like this, would you merge it?

Copy link
Owner

@NiklasRosenstein NiklasRosenstein May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @rhaps0dy , I'm sorry for the late response.

Thanks for explaining your train of thought so thoroughly. I'm following you, but I think that if this should, it should probably be on the typeapi level instead. Thatis because recursing through ClassTypeHint.bases should allow you to reconstruct the MRO while at the same type getting access to the typed bases. (Just as you can reconstruct the MRO manually by recursing through type.__bases__, only that you don't have the additional type information).

The behaviour we're seeing seems to stem from the fact that __orig_bases__ does not get set on a subclass that inherits from a generic without parameterizing it.

from typing import Generic, TypeVar

T = TypeVar("T")


class Base(Generic[T]):
    a: int


class Correct(Base[T]):
    b: str


class Incorrect(Base):
    b: str


print(Correct.__orig_bases__)
assert "__orig_bases__" in vars(Correct)

print(Incorrect.__orig_bases__)
assert "__orig_bases__" not in vars(Incorrect)

This is a relatively old issue (so the requirement to always add Generic[T] into the mix is no longer present), but it hints at the fact that inheriting from a generic should be done with parameterization (even if that is with the same or another type variable). It seems there is no clear semantic assigned to inheriting from a generic without parameterizing it.

python/typing#85

This leads me to consider that this is not necessarily a bug in Databind or Typeapi but in your code. :)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW I think the need to add # type: ignore[type-arg] is a pretty good hint that the way the inheritance is defined is actually invalid in Python's (aka. Mypy's) type system, and thus I don't think we should support the case.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants