JIT: skip stack allocation for unused locals #128541
Extend escape analysis with a parallel "used" bit. A tracked local is used if it has at least one non-trivial use -- not a copy into another tracked local, and not a use the allocation satisfies statically (compare against null/zero, NULLCHECK, ARR_LENGTH, BOUNDS_CHECK, discarded value, LCL_ADDR as a store destination, non-GC IND/BLK). The relation is closed over the connection graph the same way escapes are.

Locals that neither escape nor are used have no externally visible effect. Leave them as heap allocations with reason "[unused]"; the unreferenced helper call is then removed by later liveness/DCE. Stack-allocating these locals instead produces a dead temp whose initialization stores survive, costing code size.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
If we have a dead un-escaping allocation, we're usually better off leaving it as is, rather than stack allocating it and hoping we can clean up the resulting stores. This addresses some of the regressions seen in #128513.
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
Pull request overview
This PR extends ObjectAllocator’s escape/connection-graph analysis with a parallel “used” attribute and uses it to avoid stack-allocating allocation temps that have no non-trivial uses, leaving them as heap allocations (with a diagnostic reason) so later optimization can eliminate the now-dead allocation path.
Changes:
- Add `m_DefinitelyUsedPointers` tracking plus `Is*Used` queries and `MarkIndexAsUsed`.
- Seed “used” from the computed escape set and propagate “used” backwards through the existing connection graph via a generalized closure routine.
- Gate stack allocation in `CanAllocateLclVarOnStack` on “used” and classify certain parent-stack patterns as “trivial” uses in `AnalyzeParentStack`.
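
The seeding and backward propagation described in the list above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the code in `objectalloc.cpp`: the type `UsedAnalysisSketch`, its members, and the helper names `AddCopy`, `Close`, `Analyze`, and `ShouldStackAllocate` are all hypothetical, and plain `std::vector<bool>` stands in for the JIT's bit vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model, not the JIT's data structures. sources[dst] lists the
// tracked locals whose value may be copied into dst.
struct UsedAnalysisSketch
{
    std::vector<std::vector<std::size_t>> sources;
    std::vector<bool>                     escapes;
    std::vector<bool>                     used;

    explicit UsedAnalysisSketch(std::size_t count)
        : sources(count), escapes(count, false), used(count, false)
    {
    }

    // Record a copy "dst = src" between two tracked locals.
    void AddCopy(std::size_t src, std::size_t dst)
    {
        assert(src < sources.size() && dst < sources.size());
        sources[dst].push_back(src);
    }

    // Generalized closure: if dst has the property, every local copied into
    // dst gets it too. The same routine serves both bit vectors.
    void Close(std::vector<bool>& bits)
    {
        std::vector<std::size_t> worklist;
        for (std::size_t i = 0; i < bits.size(); i++)
        {
            if (bits[i])
            {
                worklist.push_back(i);
            }
        }

        while (!worklist.empty())
        {
            const std::size_t dst = worklist.back();
            worklist.pop_back();
            for (std::size_t src : sources[dst])
            {
                if (!bits[src])
                {
                    bits[src] = true;
                    worklist.push_back(src);
                }
            }
        }
    }

    void Analyze()
    {
        Close(escapes);

        // Seed "used" from the escape set, then close it the same way.
        for (std::size_t i = 0; i < used.size(); i++)
        {
            if (escapes[i])
            {
                used[i] = true;
            }
        }
        Close(used);
    }

    // In this sketch, stack allocation is only worthwhile for a local that
    // does not escape and has some non-trivial use.
    bool ShouldStackAllocate(std::size_t lcl) const
    {
        assert(lcl < used.size());
        return !escapes[lcl] && used[lcl];
    }
};
```

In this model, a local whose only use is a copy into another local becomes used only if the destination is used, which mirrors the "propagate backwards" wording; a local with no uses at all fails the gate and would stay a heap allocation.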
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/coreclr/jit/objectalloc.h | Adds the “used” bitvector state and inline query helpers. |
| src/coreclr/jit/objectalloc.cpp | Implements “used” marking/closure and uses it to block stack allocation for unused allocation temps. |
Comments suppressed due to low confidence (1)
src/coreclr/jit/objectalloc.cpp:2048
- In `AnalyzeParentStack`, the enumerator-copy propagation block guarded by `if (isCopy)` looks effectively dead: `isCopy` is set to `false` at the top of each loop iteration and is not set back to `true` in the `GT_STORE_LCL_VAR` case, so this condition will never be true here. If the intent is to treat a direct local->local store (possibly through BOX) as a copy, this likely needs to test `wasCopy` (the pre-reset value) or otherwise update `isCopy` before the check; otherwise consider removing this block to avoid misleading future maintenance.
    // If the source of this store is an enumerator local,
    // then the dest also becomes an enumerator local.
    //
    if (isCopy)
    {
        CheckForEnumeratorUse(lclNum, dstLclNum);
    }
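
The flag pattern this comment describes can be illustrated with a small stand-alone sketch. It is not the JIT code: `ParentKind`, `CopyCheck`, and `WalkParents` are hypothetical names, the initial value of `isCopy` is an assumption, and so is the `Box` case carrying the flag forward. It only shows why a flag that is reset at the top of each iteration cannot be true in a case that never sets it again, while the saved pre-reset value still can be.

```cpp
#include <cassert>
#include <vector>

// Hypothetical parent node kinds, standing in for GT_BOX and GT_STORE_LCL_VAR.
enum class ParentKind
{
    Box,
    StoreLclVar
};

// What each flag held when the walk reached the store.
struct CopyCheck
{
    bool isCopyAtStore  = false;
    bool wasCopyAtStore = false;
};

CopyCheck WalkParents(const std::vector<ParentKind>& parents)
{
    CopyCheck result;
    bool      isCopy = true; // assumption: the walk starts out as a direct copy

    for (ParentKind kind : parents)
    {
        const bool wasCopy = isCopy; // pre-reset value
        isCopy             = false;  // reset at the top of each iteration

        switch (kind)
        {
            case ParentKind::Box:
                // Assumption: a box passes the copy property through.
                isCopy = wasCopy;
                break;

            case ParentKind::StoreLclVar:
                // Nothing in this case sets isCopy again, so it is always
                // false here; only wasCopy can still be true.
                assert(!isCopy);
                result.isCopyAtStore  = isCopy;
                result.wasCopyAtStore = wasCopy;
                break;
        }
    }

    return result;
}
```

For both a direct store and a store reached through a box, `isCopyAtStore` stays false in this sketch while `wasCopyAtStore` is true, which is the distinction the comment draws between testing `isCopy` and testing `wasCopy`.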