Split type-checking into interface and implementation in parallel workers#21119
Split type-checking into interface and implementation in parallel workers#21119ilevkivskyi wants to merge 14 commits intopython:masterfrom
Conversation
|
Oh btw, @JukkaL I think there is a bug in |
|
Diff from mypy_primer, showing the effect of this PR on open source code: cki-lib (https://gitlab.com/cki-project/cki-lib)
- cki_lib/krb_ticket_refresher.py:26: error: Call to untyped function "_close_to_expire_ticket" in typed context [no-untyped-call]
+ cki_lib/krb_ticket_refresher.py:26: error: Call to untyped function "_close_to_expire_ticket" of "RefreshKerberosTicket" in typed context [no-untyped-call]
discord.py (https://github.com/Rapptz/discord.py)
- discord/backoff.py:63: error: Incompatible default for parameter "integral" (default has type "Literal[False]", parameter has type "Literal[True]") [assignment]
+ discord/backoff.py:63: error: Incompatible default for parameter "integral" (default has type "Literal[False]", parameter has type "T") [assignment]
- discord/interactions.py:1109: error: Incompatible default for parameter "delay" (default has type "float | None", parameter has type "float") [assignment]
- discord/interactions.py:1255: error: Incompatible default for parameter "delay" (default has type "float | None", parameter has type "float") [assignment]
- discord/interactions.py:1645: error: Incompatible default for parameter "delay" (default has type "float | None", parameter has type "float") [assignment]
- discord/webhook/async_.py:969: error: Incompatible default for parameter "delay" (default has type "float | None", parameter has type "float") [assignment]
|
|
All things in (small) |
|
Could be worth adding a test for the discord.py improvement |
|
I'm planning to test this on a big internal repo (probably tomorrow). I'll also try parallel checking again -- last time memory usage was too high to use many workers, but things should be better now. |
|
I'm seeing mypy parallel run crashes with this PR when type checking the biggest internal codebase at work, but I'm not sure if they are caused by this -- this may just change the order of processing so that a pre-existing issue gets triggered. I will continue the investigation after the long weekend. |
|
@JukkaL can you post a traceback (and maybe a snippet of code where the crash happens)? It may well be some implicit assumption breaks when type-checking functions after top-levels. |
The general idea is very straightforward: when doing type-checking, we first type-check only module top-levels and those functions/methods that define/infer externally visible variables. Then we write cache and send new interface hash back to coordinator to unblock more SCCs early. This makes parallel type-checking ~25% faster.
However, this simple idea surfaced multiple quirks and old hacks. I address some of them in this PR, but I decided to handle the rest in follow up PR(s) to limit the size of this one.
First, important implementation details:
select()call, coordinator collects all responses, both interface and implementation ones, and processes them as a single batch. This simplifies reasoning and shouldn't affect performance.foo.meta_ex.ff. Not 100% sure about the name, couldn't find anything more meaningful.testWalrus.local_definitions()now do not yield methods of classes nested in functions. We add such methods to both symbol table of their actual class, and to the module top-level symbol table, thus causing double-processing.Now some smaller things I already fixed:
TypeFormsupport. I think two is enough, so I deleted the last one.AwaitableGeneratorreturn type wrapping used to happen during processing of function body, which is obviously wrong.Finally, some remaining problems and how I propose to address them in followups:
testNarrowingOfFinalPersistsInFunctions. Supporting this will be tricky/expensive, it would require preserving binder state at the point of each function definition, and restoring it later. IMO this is a relatively niche edge case, and we can simply "un-support" it (there is a simple workaround, add an assert in function body). To be clear, there are no problems with a much more common use of this feature: preserving narrowing in nested functions/lambdas.--disallow-incomplete-defsin plugins doesn't work, seetestDisallowIncompleteDefsAttrsPartialAnnotations. I think this should be not hard to fix (with some dedicated cleaner support). I can do this in a follow-up PR soon.testPEP695InferVarianceNotReadyWhenNeeded. However, when processing function/method bodies in a later phase, variance is ready more often. Although this is an improvement, it creates an inconsistency between parallel mode, and regular mode. I propose to address this by making the two-phase logic default even without parallel checking, see below.--local-partial-typeswhen behavior is different in parallel mode, see e.g.testLocalPartialTypesWithGlobalInitializedToNone. Again the new behavior is IMO clearly better. However, it again creates an inconsistency with non-parallel mode. I propose to address this by enabling two-phase (interface then implementation) checking whenever--local-partial-typesis enabled (globally, not per-file), even without parallel checking. Since--local-partial-typeswill be default behavior soon (and hopefully the only behavior at some point), this will allow us to avoid discrepancies between parallel and regular checking. @JukkaL what do you think?