## Summary

Part of https://github.com/astral-sh/ty/issues/1240

Stop unioning `Unknown` into the types of un-annotated container literals.

We discussed perhaps continuing to union `Unknown` if the inferred type is a singleton type like `None`. I'd like to explore this as a separate change so we can see the ecosystem impact more clearly.

## Test Plan

Adjusted many mdtest expectations.

There's one test case that regresses with this change, because we don't fully support union type contexts (it can require a lot of repeat inference in pathological cases). So `x10: list[int | str] | list[int | None] = [1, 2, 3]` previously passed only because we inferred the RHS as `list[Unknown | int]` -- now we infer it as `list[int]` and the assignment fails due to invariance. I've kept this test as a TODO since it's not trivial to fix. Mypy errors in the same way we now do, suggesting it's not necessarily a huge priority either.

## Ecosystem

This change is expected to cause new diagnostics and some false positives, since we are replacing very-forgiving gradual types with non-gradual inference heuristics.

Many of these issues could be solved or significantly mitigated by https://github.com/astral-sh/ty/issues/1473, depending how far we are able to go with that, and particularly whether we can afford to apply it also to container literals which are not empty at construction. The downside of broad application of this approach is that in some cases it could cause us to widen container types when the user actually just made a mistake and added the wrong thing to a container, and would prefer an error at that location.
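To make the trade-off in the last paragraph concrete, here is a minimal hypothetical sketch (the variable name is illustrative and not taken from the ecosystem report). It assumes the inference described in this PR, where the literal is inferred as `list[str]`; the widening behaviour mentioned in the comments is only what the paragraph above anticipates for a broad application of that approach, not something implemented here.

```python
# Hypothetical illustration; `names` is not from any real project.
names = ["alice", "bob"]  # with this PR: inferred as list[str], no Unknown in the element type

# Suppose this append is a typo for names.append("3").
# With list[str] inferred, a checker can flag the call right here.
# If later uses were allowed to widen the literal's type instead, the
# list could silently become list[str | int], and the error would move
# to wherever the elements are consumed as strings.
names.append(3)

# At runtime the append succeeds, so only a static check can catch it early.
print(names)
```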
Some categories of new error that show up in the ecosystem report:

### Implicit TypedDicts

These are cases where the dictionary is heterogeneous and would ideally be typed as a `TypedDict` but isn't, for example:

```py
def make_person(photo: bytes | None):
    person = {"name": "Pat", "age": 29}
    if photo is not None:
        person["photo"] = photo
```

We (and pyrefly, and pyright in strict mode) error on the last line here because we already inferred `dict[str, str | int]`, so we can't add a `bytes` value.

Mypy prefers common-base joins over union joins, so it infers `dict[str, object]`, which avoids the error when adding a `bytes` value. This means the value type is less precise, which in theory means more errors when using values from the dict later. But in practice with this heterogeneous pattern, either `object` or the union will cause similar problems when using values from the dict -- in either case you'd probably have to cast or narrow.

Pyright (in non-strict mode) has a special case where it falls back to `Unknown` when it sees heterogeneous value types, so it infers this as `dict[str, Unknown]`.

I think we could consider either the mypy or pyright approaches here, but we don't need to do it in this PR; we can file an issue and consider it as a follow-up.

Another symptom of this same root cause is repetitive diagnostics arising from a large union inferred as the value type; the same fixes would address this.

### Negative intersections

These show up particularly with e.g. `~AlwaysFalsy` or `~None`. Example:

```py
class A: ...

def _(a: A | None) -> dict[str, A]:
    if a:
        d = {"a": a}
        return d
    return {}
```

We error on `return d` because "expected `dict[str, A]`, found `dict[str, A & ~AlwaysFalsy]`". This is an issue specific to intersection types, so no other type checker has this problem.
I think when we "promote literals" (we may need to give this operation a broader name -- it's really "type promotion to give a better inferred type when invariance means too-precise is bad") we should also eliminate all negative types from intersections. I would prefer to do this as a separate PR for easier review and better visibility of ecosystem impact, but I think it's high priority to land soon after this PR (ideally before a release).

### Overly-precise inference for singleton `None`

This did show up, to the tune of ~100 new diagnostics ([example](https://github.com/pytorch/ignite/blob/b73a4c20e991b3e14949f2a69651ed2a7219f2fd/tests/ignite/engine/test_engine.py#L158)), so I think it is worth addressing as a follow-up.
# Cycles

## Function signature

Deferred annotations can result in cycles in resolving a function signature:

```py
from __future__ import annotations

# error: [invalid-type-form]
def f(x: f):
    pass

reveal_type(f)  # revealed: def f(x: Unknown) -> Unknown
```
## Unpacking

See: https://github.com/astral-sh/ty/issues/364

```py
class Point:
    def __init__(self, x: int = 0, y: int = 0) -> None:
        self.x = x
        self.y = y

    def replace_with(self, other: "Point") -> None:
        self.x, self.y = other.x, other.y

p = Point()
reveal_type(p.x)  # revealed: Unknown | int
reveal_type(p.y)  # revealed: Unknown | int
```
## Self-referential bare type alias

```toml
[environment]
python-version = "3.12" # typing.TypeAliasType
```

```py
from typing import Union, TypeAliasType, Sequence, Mapping

A = list["A" | None]

def f(x: A):
    # TODO: should be `list[A | None]`?
    reveal_type(x)  # revealed: list[Divergent]
    # TODO: should be `A | None`?
    reveal_type(x[0])  # revealed: Divergent

JSONPrimitive = Union[str, int, float, bool, None]
JSONValue = TypeAliasType("JSONValue", 'Union[JSONPrimitive, Sequence["JSONValue"], Mapping[str, "JSONValue"]]')

def _(x: JSONValue):
    reveal_type(x)  # revealed: Sequence[JSONValue] | int | float | None | Mapping[str, JSONValue]
```
## Self-referential legacy type variables

```py
from typing import Generic, TypeVar

B = TypeVar("B", bound="Base")

class Base(Generic[B]):
    pass
```
## Parameter default values

This is a regression test for https://github.com/astral-sh/ty/issues/1402. When a parameter has a
default value that references the callable itself, we currently prevent infinite recursion by simply
falling back to `Unknown` for the type of the default value, which does not have any practical
impact except for the displayed type. We could also consider inferring `Divergent` when we encounter
too many layers of nesting (instead of just one), but that would require a type traversal which
could have performance implications. So for now, we mainly make sure not to panic or stack overflow
for these seemingly rare cases.
### Functions

```py
class C:
    def f(self: "C"):
        def inner_a(positional=self.a):
            return

        self.a = inner_a
        # revealed: def inner_a(positional=...) -> Unknown
        reveal_type(inner_a)

        def inner_b(*, kw_only=self.b):
            return

        self.b = inner_b
        # revealed: def inner_b(*, kw_only=...) -> Unknown
        reveal_type(inner_b)

        def inner_c(positional_only=self.c, /):
            return

        self.c = inner_c
        # revealed: def inner_c(positional_only=..., /) -> Unknown
        reveal_type(inner_c)

        def inner_d(*, kw_only=self.d):
            return

        self.d = inner_d
        # revealed: def inner_d(*, kw_only=...) -> Unknown
        reveal_type(inner_d)
```
We do, however, still check assignability of the default value to the parameter type:

```py
class D:
    def f(self: "D"):
        # error: [invalid-parameter-default] "Default value of type `Unknown | (def inner_a(a: int = ...) -> Unknown)` is not assignable to annotated parameter type `int`"
        def inner_a(a: int = self.a): ...

        self.a = inner_a
```
### Lambdas

```py
class C:
    def f(self: "C"):
        self.a = lambda positional=self.a: positional
        self.b = lambda *, kw_only=self.b: kw_only
        self.c = lambda positional_only=self.c, /: positional_only
        self.d = lambda *, kw_only=self.d: kw_only

        # revealed: (positional=...) -> Unknown
        reveal_type(self.a)

        # revealed: (*, kw_only=...) -> Unknown
        reveal_type(self.b)

        # revealed: (positional_only=..., /) -> Unknown
        reveal_type(self.c)

        # revealed: (*, kw_only=...) -> Unknown
        reveal_type(self.d)
```
## Self-referential implicit attributes

```py
class Cyclic:
    def __init__(self, data: str | dict):
        self.data = data

    def update(self):
        if isinstance(self.data, str):
            self.data = {"url": self.data}

# revealed: Unknown | str | dict[Unknown, Unknown] | dict[str, str]
reveal_type(Cyclic("").data)
```
## Decorator defined on a base class with constrained typevars, accessed from a subclass with decorated generic parameters

This example was minimized from a real issue in robotframework. It created a complicated cycle with
multiple cycle heads, which also involved a tricky Salsa behavior that comes up when a query
oscillates between being a cycle head and not being one.
`entry.py`:

```py
from derived import Derived

Derived.decorate

# revealed: bound method <class 'Derived'>.decorate[T](item_class: type[T]) -> type[T]
reveal_type(Derived.decorate)
```
`derived.py`:

```py
from ty_extensions import reveal_mro
import bases

class Derived(bases.GenericBase["Foo", "Bar"]): ...

@Derived.decorate
class Foo(bases.Foo): ...

# revealed: <class 'Foo'>
reveal_type(Foo)
# revealed: (<class 'derived.Foo'>, <class 'bases.Foo'>, <class 'object'>)
reveal_mro(Foo)

@Derived.decorate
class Bar(bases.Bar): ...

# revealed: <class 'Bar'>
reveal_type(Bar)
# revealed: (<class 'derived.Bar'>, <class 'bases.Bar'>, <class 'object'>)
reveal_mro(Bar)
```
`bases.py`:

```py
from typing import Generic, TypeVar, Type
from ty_extensions import reveal_mro

T = TypeVar("T")
B1 = TypeVar("B1", bound="Foo")
B2 = TypeVar("B2", bound="Bar")

class GenericBase(Generic[B1, B2]):
    @classmethod
    def decorate(cls, item_class: Type[T]) -> Type[T]:
        return item_class

# revealed: <class 'GenericBase'>
reveal_type(GenericBase)
# revealed: (<class 'GenericBase[Unknown, Unknown]'>, typing.Generic, <class 'object'>)
reveal_mro(GenericBase)
# revealed: (<class 'GenericBase[Foo, Bar]'>, typing.Generic, <class 'object'>)
reveal_mro(GenericBase["Foo", "Bar"])

class Foo: ...
class Bar: ...
```