Optimize is_unsafe_overlapping_overload_signatures for overlapping typevars case #18159

alexdrydew · 2024-11-16T16:44:08Z

This change optimizes is_unsafe_overlapping_overload_signatures so overloads containing many type variables can be checked faster.

Let:

$N$ - number of function overloads
$K$ - total number of type variables in function definition
$V$ - max number of constraints in type variables

It seems that currently typechecking a functions with overloads have $O(N^2V^{2K})$ time complexity:

we need to check each pair of overrides
then we expand both signatures to at most $V^K$ variants

If I am not mistaken (please correct me if I am) it is not necessary to check pairs of overrides where same type variable is expanded to different variants across different signatures, so we can expand variables jointly for both signatures reducing time complexity to $O(N^2V^K)$

On my original example (using 6 type variables) this change gives 40x time improvement:

Example code snippet

from __future__ import annotations

from typing import Generic, TypeVar, overload

T1 = TypeVar("T1", int, str, bool)
T2 = TypeVar("T2", int, str, bool)
T3 = TypeVar("T3", int, str, bool)
T4 = TypeVar("T4", int, str, bool)
T5 = TypeVar("T5", int, str, bool)
T6 = TypeVar("T6", int, str, bool)

class A(Generic[T1, T2, T3, T4, T5, T6]):
    @overload
    def foo(self: A[int, T2, T3, T4, T5, T6]) -> int: ...

    @overload
    def foo(self: A[str, T2, T3, T4, T5, T6]) -> str: ...

    def foo(self) -> int | str:
        raise NotImplementedError

py-spy results without change

py-spy results after change

sterliakov · 2024-11-16T19:12:32Z

This looks like an important performance fix, but unfortunately seems to change the mypy behaviour significantly. Two testcases based on your example:

from __future__ import annotations

from typing import Generic, TypeVar, overload

T1 = TypeVar("T1", int, str, bool)
T2 = TypeVar("T2", int, str, bool)
T3 = TypeVar("T3", int, str, bool)
T4 = TypeVar("T4", int, str, bool)

class A(Generic[T1, T2, T3, T4]):
    @overload
    def foo(self: A[int, T2, T3, T4]) -> int: ...

    @overload
    def foo(self: A[int, T2, T3, bool]) -> str: ...

    @overload
    def foo(self: A[str, T2, T3, T4]) -> str: ...

    def foo(self) -> int | str:
        return 0

And

from __future__ import annotations

from typing import Generic, TypeVar, overload

T1 = TypeVar("T1", int, str, bool)
T2 = TypeVar("T2", int, str, bool)
T3 = TypeVar("T3", int, str, bool)
T4 = TypeVar("T4", int, str, bool)

class A(Generic[T1, T2, T3, T4]):
    @overload
    def foo(self: A[int, T2, T3, bool]) -> str: ...

    @overload
    def foo(self: A[int, T2, T3, T4]) -> int: ...

    @overload
    def foo(self: A[str, T2, T3, T4]) -> str: ...

    def foo(self) -> int | str:
        return 0

(they only differ in overloads order)

Current mypy master correctly emits

error: Overloaded function signatures 1 and 2 overlap with incompatible return types

in both cases. With this patch the first one is green (which is invalid) and the second one emits

error: Overloaded function signatures 1 and 2 overlap with incompatible return types
note: Flipping the order of overloads will fix this error

alexdrydew · 2024-11-17T21:27:48Z

@sterliakov
Thank you for finding this case. It seems that the problem is deeper than I've initially thought.

Initially I thought that we get overload-overlap error for your provided examples because of T4: bool variant, but it turns out that the real source of the error is bool being subtype of int. It can be simplified to this example:

from typing import Generic, TypeVar, Union, overload


class A:
    pass


class B(A):
    pass


T1 = TypeVar("T1", A, B)
T2 = TypeVar("T2", A, B)


class T(Generic[T1, T2]):
    @overload
    def foo(self: T[T1, T2]) -> int: ...

    @overload
    def foo(self: T[T1, A]) -> str: ...  # same for flip signatures case

    def foo(self) -> Union[int, str]:
        raise NotImplementedError

Actual variant that triggers the error:

@overload
def foo(self: T[A, B]) -> int: ...

@overload
def foo(self: T[B, A]) -> str: ...

alternative example where `B` is not a subtype of `A` does not raise any errors

from typing import Generic, TypeVar, Union, overload


class A:
    pass


class B:
    pass


T1 = TypeVar("T1", A, B)
T2 = TypeVar("T2", A, B)


class T(Generic[T1, T2]):
    @overload
    def foo(self: T[T1, T2]) -> int: ...

    @overload
    def foo(self: T[T1, A]) -> str: ...  # same for flip signatures case

    def foo(self) -> Union[int, str]:
        raise NotImplementedError

It seems that mypy behavior here (at least according to testOverloadedPartiallyOverlappingTypeVarsAndUnion test case) is to assume that self type is covariant and to allow overload signatures to shadow later narrower signatures for some variants of generic.

I am actually not sure what is the correct behavior here. Currently this one seems most consistent to me:

all overloads are checked jointly, so each type variable can have only one value at a time across all overloads (I don't think that we should match against multiple variants of generic self when calling method at any point because we should already have specific type)
unsafe overlap error may only be triggered when there are partially overlapping signature variants and the earlier signature arguments are strictly narrower than the later ones but return types are incompatible (otherwise the second incompatible signature is considered shadowed). This case is actually triggered in your second example.

Maybe it is a good idea to allow only strictly narrower function signature variants to shadow later ones to prevent cases when there is a pair of signature variants with exactly same arguments but incompatible return types? I've changed the last check in is_unsafe_overlapping_overload_signatures to handle this edge case in my last commit, but as you may see even that already changes behavior of at least one test case.

sterliakov · 2024-11-18T00:15:23Z

Maybe it is a good idea to allow only strictly narrower function signature variants to shadow later ones to prevent cases when there is a pair of signature variants with exactly same arguments but incompatible return types?

Could you illustrate this with some example? As far as I understand, you suggest the following to become allowed:

from typing import overload

class A: pass
class B(A): pass

@overload
def fn(x: B) -> str: ...
@overload
def fn(x: A) -> float: ...
def fn(x: A) -> str | float:
    return "" if isinstance(x, B) else 0

It's quite important to report overloads-overlap here since otherwise the following (safe) upcasting results in unsafe code that raises at runtime and reports no mypy errors:

def handle(t: A) -> float:
    return fn(t) + 1

handle(B())

BTW, there's a bug in current mypy (#10143): if I replace A and B with float and int resp., the error isn't reported... Also note #10157.

alexdrydew · 2024-11-18T17:55:11Z

Could you illustrate this with some example? As far as I understand, you suggest the following to become allowed

Not really, sorry for the confusion, I meant that strictly wider in terms of the arguments earlier overload signatures should shadow the later ones (emphasis on strictly here, otherwise it is just the current behavior), so:

current variant of testOverloadedPartiallyOverlappingTypeVarsAndUnion is still valid:

@overload
def g(x: Union[B, C]) -> int: ...
@overload
def g(x: B) -> str: ...
def g(x): pass

but testOverloadOnExactSameTypeVariantWithIncompetibleReturnTypes is now invalid due to the T: B variant

class A: pass
class B: pass
T = TypeVar('T', A, B)
@overload
def f(x: T) -> int: ...  # E: Overloaded function signatures 1 and 2 overlap with incompatible return types
@overload
def f(x: B) -> str: ...
def f(x): ...

but this is still valid (as in testOverloadsSafeOverlapAllowed)

T = TypeVar('T', A, B)
@overload
def f(x: T) -> int: ...
@overload
def f(x: B = ...) -> str: ...
def f(x): ...

sterliakov · 2024-11-18T23:54:00Z

Ohh. I have to ask: why don't we produce "signature 2 will never be matched [overload-cannot-match]" there? It is indeed type-safe as the second overload is just ignored entirely, but also indicates a programming error.

I don't think [overloads-overlap] is appropriate here, not emitting it in your case is fine, and that becomes a separate [overload-cannot-match] bug, but I have a feeling that something is deeply wrong with overloads compatibility checking - perhaps it's time to rewrite that part...

sterliakov · 2024-11-19T00:01:54Z

TBH, I think changes in this PR improve mypy - even though not all corner cases are handled, they weren't handled previously either, and the performance problem is rather dramatic. Could you add a testcase exercising this specific behaviour (many vars in overloads) to prevent introducing similar perf regressions in future?

alexdrydew · 2024-11-19T22:13:45Z

Ohh. I have to ask: why don't we produce "signature 2 will never be matched [overload-cannot-match]" there? It is indeed type-safe as the second overload is just ignored entirely, but also indicates a programming error.

The problem is that [overload-cannot-match] check is done using an approximation where type variables are replaced with unions of all their respective bounds, so we can't detect errors which are true only for some specific variants of type variables. I guess to fix that one would need to rewrite this part to expand signatures by type variables as well or rewrite check_overlapping_overloads completely.

github-actions · 2024-11-19T22:23:44Z

According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅

alexdrydew added 4 commits November 16, 2024 15:50

optimize is_unsafe_overlapping_overload_signatures

be43234

add docstring

aa95616

fix dict union for 3.8

c9b465f

fix type annotation

ba26d12

This comment has been minimized.

Sign in to view

alexdrydew marked this pull request as ready for review November 16, 2024 17:20

disallow exactly same overloads with different return value

ba86157

This comment has been minimized.

Sign in to view

add test case for many type variables overload

5ddfc47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize is_unsafe_overlapping_overload_signatures for overlapping typevars case #18159

Optimize is_unsafe_overlapping_overload_signatures for overlapping typevars case #18159

alexdrydew commented Nov 16, 2024

This comment has been minimized.

sterliakov commented Nov 16, 2024

This comment has been minimized.

alexdrydew commented Nov 17, 2024

sterliakov commented Nov 18, 2024 •

edited

Loading

alexdrydew commented Nov 18, 2024

sterliakov commented Nov 18, 2024

sterliakov commented Nov 19, 2024

alexdrydew commented Nov 19, 2024 •

edited

Loading

github-actions bot commented Nov 19, 2024

Optimize is_unsafe_overlapping_overload_signatures for overlapping typevars case #18159

Are you sure you want to change the base?

Optimize is_unsafe_overlapping_overload_signatures for overlapping typevars case #18159

Conversation

alexdrydew commented Nov 16, 2024

This comment has been minimized.

sterliakov commented Nov 16, 2024

This comment has been minimized.

alexdrydew commented Nov 17, 2024

sterliakov commented Nov 18, 2024 • edited Loading

alexdrydew commented Nov 18, 2024

sterliakov commented Nov 18, 2024

sterliakov commented Nov 19, 2024

alexdrydew commented Nov 19, 2024 • edited Loading

github-actions bot commented Nov 19, 2024

sterliakov commented Nov 18, 2024 •

edited

Loading

alexdrydew commented Nov 19, 2024 •

edited

Loading