Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support type inference for defaultdict() #8167

Merged
merged 6 commits into from
Dec 18, 2019
Merged

Support type inference for defaultdict() #8167

merged 6 commits into from
Dec 18, 2019

Conversation

JukkaL
Copy link
Collaborator

@JukkaL JukkaL commented Dec 18, 2019

This allows inferring type of x, for example:

from collections import defaultdict

x = defaultdict(list)  # Infer defaultdict[str, List[int]]
x['foo'].append(1)

The implemention is not pretty and we have probably
reached about the maximum reasonable level of special
casing in type inference now.

There is a hack to work around the problem with leaking
type variable types in nested generics calls (I think).
This will break some (likely very rare) use cases.

This allows inferring type of `x`, for example:

```
from collections import defaultdict

x = defaultdict(list)  # defaultdict[str, List[int]]
x['foo'].append(1)
```

The implemention is not pretty and we have probably
reached about the maximum reasonable level of special
casing in type inference now.

There is a hack to work around the problem with leaking
type variable types in nested generics calls (I think).
This will break some (likely very rare) use cases.
Copy link
Member

@ilevkivskyi ilevkivskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! This is the last big missing piece for type inference of empty collections. I have some comments, also why do you need the typeshed change?

mypy/checker.py Outdated
arg1 = get_proper_type(init_type.args[1])
if (isinstance(arg0, (NoneType, UninhabitedType)) and
isinstance(arg1, Instance) and
self.is_valid_ordereddict_partial_value_type(arg1)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name is really weird:

  • Why ordereddict and not defaultdict?
  • I don't think it should be called valid, maybe empty?

For example is_empty_defaultdict_value_type().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I'll rename it.

arg0 = get_proper_type(init_type.args[0])
arg1 = get_proper_type(init_type.args[1])
if (isinstance(arg0, (NoneType, UninhabitedType)) and
isinstance(arg1, Instance) and
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be cleaner to move this check to the below helper.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. It's here since we rely on the narrowed down type below, so we'd need a type check anyway.

mypy/checker.py Outdated
@@ -2813,14 +2813,24 @@ def infer_partial_type(self, name: Var, lvalue: Lvalue, init_type: Type) -> bool
partial_type = PartialType(None, name)
elif isinstance(init_type, Instance):
fullname = init_type.type.fullname
if (isinstance(lvalue, (NameExpr, MemberExpr)) and
is_ref = isinstance(lvalue, (NameExpr, MemberExpr))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you are at it maybe change this to RefExpr?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense.

mypy/checker.py Outdated
@@ -2829,6 +2839,19 @@ def infer_partial_type(self, name: Var, lvalue: Lvalue, init_type: Type) -> bool
self.partial_types[-1].map[name] = lvalue
return True

def is_valid_ordereddict_partial_value_type(self, t: Instance) -> bool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a short docstring with an example (also the name needs fixing, see above).

mypy/checker.py Outdated
# they leak in cases like defaultdict(list) due to a bug.
# This can result in incorrect types being inferred, but only
# in rare cases.
if isinstance(arg, (TypeVarType, UninhabitedType)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this also include NoneType? Otherwise it may not work if strict_optional=False (see the check for key type). Add a test for this.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add a test.

mypy/types.py Outdated
# the type argument is not final and will be replaced later (it's TypeVarType
# or UninhabitedType).
#
# TODO: The type argument can only be TypeVarType due to leaking type variables
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just store an erased instance here? Then we can remove this TODO and only keep the one in checker.py.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good point -- I'll try this out.

@@ -26,6 +26,9 @@ class ellipsis: pass
# Primitive types are special in generated code.

class int:
@overload
def __init__(self) -> None: pass
@overload
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the change to mypyc needed here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A test case relies on the int() constructor since it uses defaultdict(int).

from collections import defaultdict
x = defaultdict(int)
x[''] = 1
reveal_type(x) # N: Revealed type is 'collections.defaultdict[builtins.str, builtins.int]'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a similar test case where value type is generic, e.g.:

x = defaultdict(list)  # No error
x['a'] = [1, 2, 3]

and

x = defaultdict(list)  # Error here
x['a'] = []

@JukkaL
Copy link
Collaborator Author

JukkaL commented Dec 18, 2019

The typeshed change is accidental, I'll revert it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants