Skip to content

[wasm] Bump Binaryen to Emsdk 5.0.6 pin (54f9f7af)#355

Open
pavelsavara wants to merge 2358 commits into
dotnet:dotnet/mainfrom
pavelsavara:dotnet/main-bump-binaryen-5.0.6
Open

[wasm] Bump Binaryen to Emsdk 5.0.6 pin (54f9f7af)#355
pavelsavara wants to merge 2358 commits into
dotnet:dotnet/mainfrom
pavelsavara:dotnet/main-bump-binaryen-5.0.6

Conversation

@pavelsavara
Copy link
Copy Markdown
Member

@pavelsavara pavelsavara commented Apr 29, 2026

Contributes to dotnet/runtime#113786

Summary

Bumps dotnet/binaryen to upstream WebAssembly/binaryen at 54f9f7afa703 — the revision pinned by Emscripten 5.0.6.

Branch contents

https://github.com/pavelsavara/binaryen/commits/dotnet/main-bump-binaryen-5.0.6/

Validation

  • linux-x64 build+pack succeeded (22s, 0 errors)
  • Package: runtime.linux-x64.Microsoft.NETCore.Runtime.Wasm.Binaryen.Transport.11.0.0-alpha.1.26253.2.nupkg (7.6M)

tlively and others added 30 commits February 9, 2026 09:07
Extract logic for noting subtyping, cast, and descriptor relationships
from the function-parallel analysis and fixed point analysis phases of
Unsubtyping into a shared CRTP utility. This will help avoid additional
duplication in a follow-on PR that newly notes casts outside the
function-parallel context.
wasm-opt will just fatally error on invalid module inputs, but
binaryen.js is a library and users want to get something they can handle, and see
the actual error. This PR throws a C++ exception instead, and converts
it on the JS side.

Fixes WebAssembly#8256
Before, `tuple.extract 3 2` would end up printed as `2 2`, which means
read index 2 from size 2, which was invalid. We need to take the index
size into account.
Part of WebAssembly#8180 and WebAssembly#8261. Fixes the semantics/spec test when the same tag
is imported in different instances, in which case the tag should behave
as a new identity, which was previously not the case (see the tags in
the modified instance.wast in this PR).
Fix copy-paste bug in `BinaryenAddTagImport` where `getGlobalOrNull`
was used instead of `getTagOrNull`

Fixes WebAssembly#8272
Take into account the implicit casts and conversions that happen on the
boundary with JS as well as the fact that JS can essentially read the
first field on descriptors with configured prototypes.
Fix for [fuzzer-detected crash when ctor-eval runs on a module that
imports a
tag](WebAssembly#8254 (comment)).
Prior to WebAssembly#8254, ctor-eval would
[crash](https://github.com/WebAssembly/binaryen/blob/23d218d0bd469a399ff17b26fdd71164beeb63fa/src/tools/wasm-ctor-eval.cpp#L396)
when an imported tag was evaluated, but not when imported. Change the
code to allow imported tags even during evaluation.

Note that we can't reason about the identity of imported tags. In the
following code, $t1 and $t2 may be the same or different tags:
```wasm
(import "foo" "bar" (tag $t1))
(import "foo" "bar2" (tag $t2))
```

In this PR, we assume that $t1 and $t2 are different tags, and that
they're the same tag if the import name is the same (this is also not
true in general, the hosting environment may provide two different
values for the same exact import name). This may cause some correctness
issues. As a followup, we can make equality comparison of two imported
tags throw FailToEvalException to make evaluation correct.

Part of WebAssembly#8180.
When IRBuilder creates a multivalue block wrapping a set and get of a
tuple scratch local, it might be implicitly adding a function type to
the module. Text round-tripping could previously fail if that function
type would conflict with another function type in the module after
binary writing, for example if it contained a bottom reference type with
GC disabled. Fix the problem by generalizing the type of scratch locals
to be the types that will eventually be written to a binary given the
enabled features.

This further pessimizes our handling of multivalue code by losing type
information in the scratch locals, but handling multivalue better will
require a much more systematic change anyway.

Fixes WebAssembly#8279.

---------

Co-authored-by: Alon Zakai <azakai@google.com>
…mbly#8282)

## Summary
- Fix swapped `mergeIf` arguments in `doVisitIf` when an `if` has no
`else` branch
- Add GTest to verify phi node values are correctly associated with
true/false conditions

Fixes WebAssembly#8273.

In the no-else case, `mergeIf(initialState, afterIfTrueState, ...)`
incorrectly paired the initial state (before the if body ran) with the
`ifTrue` condition and the after-if-true state with the `ifFalse`
condition. The fix swaps the arguments to `mergeIf(afterIfTrueState,
initialState, ...)`, matching the convention used in the if-with-else
case.

## Test plan
- Added `DataflowTest.IfNoElseMergeOrder` GTest that builds a dataflow
graph for an if-no-else function and verifies the phi node selects the
correct values for each branch
- Verified the test fails before the fix (values 10 and 42 are swapped)
and passes after
- All 306 existing unit tests continue to pass
The `visitTableCopy` and `visitTableFill` guards in
`LLVMMemoryCopyFillLowering` were defined with an uppercase `V`
(`VisitTableCopy`, `VisitTableFill`). The PostWalker dispatches to
lowercase `visit*` methods, so these guards were never called.
Additionally, `VisitTableFill` had the wrong parameter type
(`TableCopy*` instead of `TableFill*`).

This means a module containing `table.copy` or `table.fill` would not
get the intended `Fatal()` error from this pass.

Three changes:
- `VisitTableCopy` -> `visitTableCopy`
- `VisitTableFill` -> `visitTableFill`
- `TableCopy*` -> `TableFill*` in `visitTableFill`'s parameter

Tests: added two lit tests that verify the pass now Fatal()s on
`table.copy` and `table.fill`.
In `processResumeHandlers` (`src/ir/subtype-exprs.h`), the inner loop
uses the outer loop variable `i` instead of `j` to index into
`tagSig.params` and `expected`:

```cpp
for (Index i = 0; i < handlerTags.size(); ++i) {
  // ...
  for (Index j = 0; j < tagSig.params.size(); ++j) {
    self()->noteSubtype(tagSig.params[i], expected[i]); // should be [j]
  }
```

This produces wrong subtyping constraints for resume handlers whose tags
have multiple parameters. It also causes out-of-bounds access when the
handler index `i` is >= the tag's parameter count.

Fix: `[i]` -> `[j]` (one-character change).

Test: added a case with two handlers — one single-param tag and one
two-param tag — and verified the correct (Type, Type) pairs are noted.
WebAssembly#8297)

## Summary

When comparing stacks of different sizes, `Stack::compare` iterated from
the top matching elements pairwise, accumulating a comparison result.
However, when one stack was exhausted, it returned `LESS` or `GREATER`
based solely on which stack was taller, **ignoring any contradicting
result** from the element-wise comparisons.

For example, with `Stack<Bool>` comparing `[true, false]` against
`[true]`:
- Top elements: `false` vs `true` → `LESS`
- Then `b` is exhausted, `a` has extra non-bottom elements → code
returned `GREATER`
- Correct answer: `NO_RELATION` (the element ordering conflicts with the
size ordering)

The fix checks the accumulated `result` before returning at the
size-difference branches, returning `NO_RELATION` when there is a
conflict.

Note: This bug was not caught by the lattice fuzzer because `Stack` is
not included in the fuzz variants in `wasm-fuzz-lattices.cpp`.

## Test plan

- [x] Added `StackLattice.CompareDifferentSizeConflict` test case
- [x] All existing lattice tests still pass
- [x] Full unit test suite passes (304 tests)
…ionWithoutAdd (WebAssembly#8294)

Move `updateSymbol` calls for `prologLocation`/`epilogLocation`
outside the `debugLocations` loop in `copyFunctionWithoutAdd`, matching
the pattern of the `updateLocation` calls in the `fileIndexMap` block
above

The `updateSymbol` calls were a copy-paste error from the `fileIndexMap`
block. Being inside the loop caused two issues:
1. When `debugLocations` is empty, the loop never executes and
`prologLocation`/`epilogLocation` symbol indices are never remapped
2. When `debugLocations` has multiple entries, the indices are
double-remapped (or cause OOB access)
…bals (WebAssembly#8295)

## Summary

- `Interpreter::instantiate()` iterates all globals without checking
`imported()`, causing `ExpressionIterator` to be constructed with a null
`init` expression for imported globals. This triggers an assertion
failure in `PostWalker::walk()` (debug) or a null-pointer dereference
(release).
- Added a `global->imported()` check to skip imported globals during
instantiation, matching the pattern used by `walkModule()` in
`wasm-traversal.h`.
- Added two test cases: one for a module with only imported globals, and
one for a module mixing imported and local globals.

## Test plan

- [x] `InterpreterTest.ImportedGlobalI32` — verifies `addInstance`
succeeds with an imported global
- [x] `InterpreterTest.MixedImportedAndLocalGlobals` — verifies a local
global is correctly initialized when an imported global is also present
- [x] All 56 interpreter tests pass
## Summary

- `WasmStore::instances` was a `std::vector<Instance>`, but `Frame`
holds `Instance&` references. Vector reallocation when adding new
instances invalidates all frame references, leading to use-after-free.
- Changed `instances` to `std::deque<Instance>`, which guarantees that
references to existing elements are not invalidated by
`push_back`/`emplace_back`.
- Added a test that verifies Instance addresses remain stable after
adding many instances while a frame holds a reference.

## Test plan

- [x] `InterpreterTest.InstanceReferenceStability` — verifies that
adding 100 instances does not relocate existing instances or invalidate
frame references
- [x] All 55 interpreter tests pass
Previously we would validate that some usages (e.g. globals) of shared
types required the shared-everything feature, but that was not enough
to prevent fuzzer issues because it was still possible to write a file
that used shared types and passed validation without shared-everything,
so would fail when fuzzed on V8. Fix the problem by validating features
when building shared types in the first place.
These functions receive a reference value. We cannot log their
internals, but we can try to do some operations on them, and
this gets more such values moving across the JS/wasm
boundary. Future fuzzing of configureAll will build on this.

To achieve this, the `logValue` method had to be moved up in
the fuzzer, so we can use it from another place. The only
change there is to make it stop printing newlines all the time
(and let the caller do it, if needed).
Based on discussion in

WebAssembly#7574 (comment)

the `@binaryen.removable.if.unused` code annotation has the meaning that
if the result is unused (dropped), then the code can be considered
dead (no side effects, removable).

This can be used on a function to affect all calls to it, or on specific
call instructions. The optimizer then finds relevant dropped calls and
can remove them (in Vacuum).
and replace them with unreachables. This might be useful in situations
where a module needs to be processed by tools that do not support
relaxed SIMD, but where the relaxed SIMD usage also does not affect the
output of the tool.
We started validating that shared-everything is enabled when we defined
shared types in WebAssembly#8298, but this missed the case where a non-shared
type definition used a shared abstract heap type, which has no
definition. Update the validation to check that the used types are
allowed by the enabled feature set as well. Refactor the validation
logic into several functions to avoid duplication of logic.
…#8303)

These are handled with a different visitor than the scalar loads and
stores.
…ctCmpxchg (WebAssembly#8304)

## Summary

`visitStructRMW` and `visitStructCmpxchg` in `Struct2Local` are missing
the `Type::unreachable` guard that all sibling visitor methods have
(`visitRefGetDesc`, `visitStructGet`, `visitRefIsNull`, `visitRefEq`).
When a `struct.atomic.rmw` or `struct.atomic.rmw.cmpxchg` has an
unreachable operand (e.g., unreachable value), the expression type is
`unreachable` but the code asserts it equals the concrete field type,
causing an assertion failure:

```
Assertion failed: (type == field.type), function visitStructRMW, file Heap2Local.cpp, line 1043.
```

This is the same bug pattern that was fixed in WebAssembly#8283 for
`visitRefGetDesc` — it was just missed in these two methods.
## Summary

Two bugs in the experimental `TypeGeneralizing` pass's backward
analysis:

### 1. `visitStructSet` pushes non-ref field types onto the stack

`visitStructSet` unconditionally pushes the struct field type as a type
requirement onto the backward analysis stack (line 690). When the field
is a non-reference type (i32, f64, etc.), this corrupts the stack
because non-ref producers (like `visitConst`, `visitBinary`) are no-ops
that don't pop. The spurious non-ref value on the stack causes
subsequent `pop()` calls to retrieve wrong type requirements.

The analogous methods all correctly guard with `isRef()`:
- `visitArraySet` (line 792): `if (elemType.isRef())`
- `visitStructNew` (line 620): `if (field.type.isRef())`
- `handleCall` (line 364): `if (param.isRef())`

**Fix:** Add `if (fieldType.isRef())` guard before pushing.

### 2. `visitRefAs` crashes on `Type::none` from empty stack

When the backward analysis stack is empty (no downstream consumer
imposes a type requirement), `pop()` returns `Type::none`. `visitRefAs`
then calls `type.getHeapType()` on `Type::none`, triggering
`assert(isRef())` — crashing on any `ref.as_non_null` whose result is
dropped.

**Fix:** Check for `Type::none` before accessing heap type, and
propagate "no requirement" through.

Both bugs are in the experimental (not-yet-sound) pass and do not affect
production optimization pipelines.

## Test plan

- [x] New lit test `type-generalizing-fixes.wast` covering:
- `struct.set` on non-ref field followed by ref field (stack alignment)
  - `drop(ref.as_non_null(...))` (empty stack crash)
- `drop(any.convert_extern(extern.convert_any(...)))` (empty stack with
convert ops)
- [x] All 309 unit tests pass
Return an error when building a type that contains a string type when
strings are not enabled. This prevents the fuzzer from trying to run
modules that contain string types on V8.
Toolchains should remove these annotations before shipping, using

--strip-toolchain-annotations
tlively and others added 16 commits April 7, 2026 16:24
In WebAssembly#8568 we optimized the grouping of locals in the binary writer to
account for how types will be written given the enabled features.
However, that change did not properly update the handling of scratch
locals accordingly, leading to inconsistencies in the indices assigned
to local types in different locations. Fix the problem by reverting the
changes from WebAssembly#8568 and handling the mapping from IR types to written
types at a lower level; specifically, create a new `TypeIndexMap` type
that extends `InsertOrderedMap` but always applies
`asWrittenGivenFeatures` to its keys. Use this new map type both for the
`numLocalsByType` map and the `scratchLocals` map.
Split off from my [WIP for improving effects analysis for indirect
calls](https://github.com/WebAssembly/binaryen/compare/indirect-effects-3?expand=1).
These methods don't mutate the Module so they can be const.

Also move getModuleElement into the anonymous namespace to prevent the
name from leaking.

Since these getters are now const, I also change some usages of
Module&/Module* to const e.g. EffectsAnalyzer, since these usages also
only need read-only access to the Module.
WebAssembly#8581)

Continuing WebAssembly#8571, use a constexpr check to see when we are about
to visit something that has no children. In that case we don't need to
push a task for it and pop it later, we can just do the visit inline.
This is basically NFC but in the new place more code paths end up using
the flag, so this may increase our coverage slightly.
This refactors the code a bit to allow the VM classes in the fuzzer to
run JS. The function also allows running it in a checked (Python
exception on a non-0 return code) or unchecked way.

A future fuzzer will use this `run_js` method.
…art function (WebAssembly#8589)

The start may be needed for the ABI between the wasm and the outside.

The point of preserve-imports-and-exports is to not break such ABIs (or
at least have a chance of not doing so), so it doesn't seem like we need
a new option here.
Fix various spelling typos in source and test files, as reported by
Debian Lintian.
This avoids large slowdowns in cases with very long string names, etc.
The default in musl is apparently tiny, and MergeSimilarFunctions has
recursion which can hit it.

We can perhaps improve that pass to avoid recursion, but this change
seems generally good for robustness. It just makes us use the usual
8 MB stack size on Linux that all other Linuxes use.

Fixes WebAssembly#8594
Refactor GlobalEffects to not compute the transitive call graph
explicitly but instead aggregate effects as we go. This improves the
runtime of the pass by 4.3% on calcworker (1.21792 s -> 1.16586 s
averaged over 20 compilations). It also helps prepare the code for
future changes to support effects for indirect calls.

Another potential future improvement here is to use SCC, which would let
us stop processing children early in cases where there are no effects to
update. Currently we can't do this because we add trap effects to
potentially-recursive call loops, so even if no effects were updated, we
need to keep going to find potential cycles.
This starts from wasm+js testcases and then modifies the wasm in a way
that preserves imports and exports, so the wasm+js can still be run. This is
very different from our usual approach of starting with only wasm, then
bashing it into the shape that our general js code can handle.

The main benefit here is testing of more interesting wasm+js
interactions, specifically for the JS Interop proposal. Three
wasm+js combinations are added in this PR that test features
from that proposal.
The lexer previously used its own internal `LexerCtx` abstraction that
allowed it to consume the characters that made up a token without
changing the lexer state, then update the state at once when committing
to consuming the characters. However, manually resetting the lexer to
the original position when giving up on parsing a token is simple enough
that this abstraction was not holding its weight. Simplify the lexer by
removing internal contexts, and move the simplified method bodies to
lexer.h. Generally we try to avoid putting lots of code in headers, but
in this case making the code available to the inliner, along with
removing the extra layer of abstraction, makes the parser about 20%
faster.
@pavelsavara pavelsavara self-assigned this Apr 29, 2026
@pavelsavara pavelsavara force-pushed the dotnet/main-bump-binaryen-5.0.6 branch from 9edfb9a to e698793 Compare May 4, 2026 19:32
@pavelsavara pavelsavara changed the title Merge upstream WebAssembly/binaryen 54f9f7af for Emscripten 5.0.6 [wasm] Bump Binaryen to Emsdk 5.0.6 pin (54f9f7af) May 4, 2026
@pavelsavara pavelsavara force-pushed the dotnet/main-bump-binaryen-5.0.6 branch from e698793 to aa1f919 Compare May 5, 2026 10:26
@pavelsavara pavelsavara force-pushed the dotnet/main-bump-binaryen-5.0.6 branch from aa1f919 to 5031c30 Compare May 5, 2026 10:32
- use LLVM 23.1.0-alpha.1.26257.1 from general-testing
- fix symlinks
- fix -lc++
@pavelsavara pavelsavara force-pushed the dotnet/main-bump-binaryen-5.0.6 branch from 212e73c to ed514b9 Compare May 8, 2026 18:02
@pavelsavara pavelsavara marked this pull request as ready for review May 8, 2026 18:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.