[wasm] Bump Binaryen to Emsdk 5.0.6 pin (54f9f7af) by pavelsavara · Pull Request #355 · dotnet/binaryen

pavelsavara · 2026-04-29T17:28:57Z

Contributes to dotnet/runtime#113786

Summary

Bumps dotnet/binaryen to upstream WebAssembly/binaryen at 54f9f7afa703 — the revision pinned by Emscripten 5.0.6.

Branch contents

https://github.com/pavelsavara/binaryen/commits/dotnet/main-bump-binaryen-5.0.6/

Fresh from upstream pin (no merge, clean history)
Drop tests, fuzzers, docs, vendored fixtures, unused tooling for scan-area reduction (keeps third_party/llvm-project and FP16)
Add Arcade scaffolding + cmake patches for 5.0.6 (arcade synced from dotnet/main)
Merge dotnet/main to establish common history (strategy=ours, no content changes)

Validation

linux-x64 build+pack succeeded (22s, 0 errors)
Package: runtime.linux-x64.Microsoft.NETCore.Runtime.Wasm.Binaryen.Transport.11.0.0-alpha.1.26253.2.nupkg (7.6M)

Extract logic for noting subtyping, cast, and descriptor relationships from the function-parallel analysis and fixed point analysis phases of Unsubtyping into a shared CRTP utility. This will help avoid additional duplication in a follow-on PR that newly notes casts outside the function-parallel context.

wasm-opt will just fatally error on invalid module inputs, but binaryen.js is a library and users want to get something they can handle, and see the actual error. This PR throws a C++ exception instead, and converts it on the JS side. Fixes WebAssembly#8256

Before, `tuple.extract 3 2` would end up printed as `2 2`, which means read index 2 from size 2, which was invalid. We need to take the index size into account.

Part of WebAssembly#8180 and WebAssembly#8261. Fixes the semantics/spec test when the same tag is imported in different instances, in which case the tag should behave as a new identity, which was previously not the case (see the tags in the modified instance.wast in this PR).

Fix copy-paste bug in `BinaryenAddTagImport` where `getGlobalOrNull` was used instead of `getTagOrNull` Fixes WebAssembly#8272

Take into account the implicit casts and conversions that happen on the boundary with JS as well as the fact that JS can essentially read the first field on descriptors with configured prototypes.

Fix for [fuzzer-detected crash when ctor-eval runs on a module that imports a tag](WebAssembly#8254 (comment)). Prior to WebAssembly#8254, ctor-eval would [crash](https://github.com/WebAssembly/binaryen/blob/23d218d0bd469a399ff17b26fdd71164beeb63fa/src/tools/wasm-ctor-eval.cpp#L396) when an imported tag was evaluated, but not when imported. Change the code to allow imported tags even during evaluation.

Note that we can't reason about the identity of imported tags. In the following code, $t1 and $t2 may be the same or different tags:

```wasm
(import "foo" "bar" (tag $t1))
(import "foo" "bar2" (tag $t2))
```

In this PR, we assume that $t1 and $t2 are different tags, and that they're the same tag if the import name is the same (this is also not true in general, the hosting environment may provide two different values for the same exact import name). This may cause some correctness issues. As a followup, we can make equality comparison of two imported tags throw FailToEvalException to make evaluation correct.

Part of WebAssembly#8180.

When IRBuilder creates a multivalue block wrapping a set and get of a tuple scratch local, it might be implicitly adding a function type to the module. Text round-tripping could previously fail if that function type would conflict with another function type in the module after binary writing, for example if it contained a bottom reference type with GC disabled.

Fix the problem by generalizing the type of scratch locals to be the types that will eventually be written to a binary given the enabled features. This further pessimizes our handling of multivalue code by losing type information in the scratch locals, but handling multivalue better will require a much more systematic change anyway.

Fixes WebAssembly#8279.

Co-authored-by: Alon Zakai <azakai@google.com>

…mbly#8282)

## Summary

- Fix swapped `mergeIf` arguments in `doVisitIf` when an `if` has no `else` branch
- Add GTest to verify phi node values are correctly associated with true/false conditions

Fixes WebAssembly#8273.

In the no-else case, `mergeIf(initialState, afterIfTrueState, ...)` incorrectly paired the initial state (before the if body ran) with the `ifTrue` condition and the after-if-true state with the `ifFalse` condition. The fix swaps the arguments to `mergeIf(afterIfTrueState, initialState, ...)`, matching the convention used in the if-with-else case.

## Test plan

- Added `DataflowTest.IfNoElseMergeOrder` GTest that builds a dataflow graph for an if-no-else function and verifies the phi node selects the correct values for each branch
- Verified the test fails before the fix (values 10 and 42 are swapped) and passes after
- All 306 existing unit tests continue to pass

The `visitTableCopy` and `visitTableFill` guards in `LLVMMemoryCopyFillLowering` were defined with an uppercase `V` (`VisitTableCopy`, `VisitTableFill`). The PostWalker dispatches to lowercase `visit*` methods, so these guards were never called. Additionally, `VisitTableFill` had the wrong parameter type (`TableCopy*` instead of `TableFill*`). This means a module containing `table.copy` or `table.fill` would not get the intended `Fatal()` error from this pass.

Three changes:

- `VisitTableCopy` -> `visitTableCopy`
- `VisitTableFill` -> `visitTableFill`
- `TableCopy*` -> `TableFill*` in `visitTableFill`'s parameter

Tests: added two lit tests that verify the pass now Fatal()s on `table.copy` and `table.fill`.
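The silent failure described above can be illustrated with a minimal sketch of CRTP dispatch. This is not Binaryen's actual `PostWalker`; `MiniWalker`, `BuggyPass`, `FixedPass`, and `countGuardHits` are hypothetical names, and the IR node types are empty stand-ins. It only shows why a misspelled method name compiles cleanly yet is never invoked when the base class supplies no-op defaults.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, empty stand-ins for the IR node types.
struct TableCopy {};
struct TableFill {};

// Simplified CRTP walker (an assumption, not Binaryen's PostWalker). It
// dispatches to lowercase visit* methods on SubType and provides no-op
// defaults, so a misspelled override is silently ignored.
template<typename SubType> struct MiniWalker {
  void visitTableCopy(TableCopy*) {}
  void visitTableFill(TableFill*) {}

  void walk(TableCopy* curr) {
    static_cast<SubType*>(this)->visitTableCopy(curr);
  }
  void walk(TableFill* curr) {
    static_cast<SubType*>(this)->visitTableFill(curr);
  }
};

// Buggy variant: uppercase 'V' and a wrong parameter type. Both methods are
// ordinary members that the walker never calls.
struct BuggyPass : MiniWalker<BuggyPass> {
  std::vector<std::string> hits;
  void VisitTableCopy(TableCopy*) { hits.push_back("table.copy"); }
  void VisitTableFill(TableCopy*) { hits.push_back("table.fill"); }
};

// Fixed variant: names and parameter types match what the walker calls.
struct FixedPass : MiniWalker<FixedPass> {
  std::vector<std::string> hits;
  void visitTableCopy(TableCopy*) { hits.push_back("table.copy"); }
  void visitTableFill(TableFill*) { hits.push_back("table.fill"); }
};

// Walks one node of each kind and reports how many guards actually ran.
template<typename Pass> std::size_t countGuardHits() {
  Pass pass;
  TableCopy copy;
  TableFill fill;
  pass.walk(&copy);
  pass.walk(&fill);
  return pass.hits.size();
}
```

In this sketch `countGuardHits<BuggyPass>()` reports no hits while `countGuardHits<FixedPass>()` reports two, which mirrors why the real pass needed lit tests to confirm the guards now fire.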

In `processResumeHandlers` (`src/ir/subtype-exprs.h`), the inner loop uses the outer loop variable `i` instead of `j` to index into `tagSig.params` and `expected`:

```cpp
for (Index i = 0; i < handlerTags.size(); ++i) {
  // ...
  for (Index j = 0; j < tagSig.params.size(); ++j) {
    self()->noteSubtype(tagSig.params[i], expected[i]); // should be [j]
  }
}
```

This produces wrong subtyping constraints for resume handlers whose tags have multiple parameters. It also causes out-of-bounds access when the handler index `i` is >= the tag's parameter count.

Fix: `[i]` -> `[j]` (one-character change).

Test: added a case with two handlers (one single-param tag and one two-param tag) and verified the correct (Type, Type) pairs are noted.
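As a rough, self-contained illustration of the corrected indexing (not the actual Binaryen code), the sketch below records (param, expected) pairs for each handler using the inner index. `TypeName`, `Handler`, and `noteHandlerSubtypes` are hypothetical names, types are modeled as plain strings, and each handler is assumed to have at least as many expected types as tag parameters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a wasm type; real code would use Binaryen's Type.
using TypeName = std::string;
using SubtypePairs = std::vector<std::pair<TypeName, TypeName>>;

// Hypothetical handler description: the tag's parameter types and the types
// expected at the handler's target. Assumes expected.size() >= tagParams.size().
struct Handler {
  std::vector<TypeName> tagParams;
  std::vector<TypeName> expected;
};

// Notes one (param, expected) pair per tag parameter of each handler.
SubtypePairs noteHandlerSubtypes(const std::vector<Handler>& handlers) {
  SubtypePairs noted;
  for (std::size_t i = 0; i < handlers.size(); ++i) {
    const Handler& handler = handlers[i];
    for (std::size_t j = 0; j < handler.tagParams.size(); ++j) {
      // Index with the inner variable j, not the outer handler index i.
      noted.emplace_back(handler.tagParams[j], handler.expected[j]);
    }
  }
  return noted;
}
```

With one single-parameter handler followed by one two-parameter handler, this yields three pairs, each matching a parameter with the expected type at the same position.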

WebAssembly#8297)

## Summary

When comparing stacks of different sizes, `Stack::compare` iterated from the top matching elements pairwise, accumulating a comparison result. However, when one stack was exhausted, it returned `LESS` or `GREATER` based solely on which stack was taller, **ignoring any contradicting result** from the element-wise comparisons.

For example, with `Stack<Bool>` comparing `[true, false]` against `[true]`:

- Top elements: `false` vs `true` → `LESS`
- Then `b` is exhausted, `a` has extra non-bottom elements → code returned `GREATER`
- Correct answer: `NO_RELATION` (the element ordering conflicts with the size ordering)

The fix checks the accumulated `result` before returning at the size-difference branches, returning `NO_RELATION` when there is a conflict.

Note: This bug was not caught by the lattice fuzzer because `Stack` is not included in the fuzz variants in `wasm-fuzz-lattices.cpp`.

## Test plan

- [x] Added `StackLattice.CompareDifferentSizeConflict` test case
- [x] All existing lattice tests still pass
- [x] Full unit test suite passes (304 tests)
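The comparison rule described above can be sketched for a stack of booleans as follows. This is a simplified model, not Binaryen's `Stack` lattice: `compareBool`, `combine`, and `compareStacks` are hypothetical helpers, stacks are vectors with the top at the back, and extra elements that are all bottom (`false`) are assumed not to affect the ordering.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

enum Comparison { NO_RELATION, EQUAL, LESS, GREATER };

// Bool lattice used for the elements: false is bottom, true is top.
Comparison compareBool(bool a, bool b) {
  if (a == b) {
    return EQUAL;
  }
  return a ? GREATER : LESS;
}

// Folds a new partial result into the accumulated one. Opposite directions
// contradict each other and yield NO_RELATION.
Comparison combine(Comparison acc, Comparison next) {
  if (acc == NO_RELATION || next == NO_RELATION) {
    return NO_RELATION;
  }
  if (acc == EQUAL) {
    return next;
  }
  if (next == EQUAL || next == acc) {
    return acc;
  }
  return NO_RELATION;
}

// Compares two stacks whose top element is at the back of the vector.
Comparison compareStacks(const std::vector<bool>& a,
                         const std::vector<bool>& b) {
  Comparison result = EQUAL;
  std::size_t common = std::min(a.size(), b.size());
  // Compare the overlapping elements pairwise, starting from the top.
  for (std::size_t k = 1; k <= common; ++k) {
    result = combine(result, compareBool(a[a.size() - k], b[b.size() - k]));
  }
  // The taller stack's extra elements sit below the overlapping ones.
  const std::vector<bool>& taller = a.size() > b.size() ? a : b;
  bool extraNonBottom = false;
  for (std::size_t k = 0; k + common < taller.size(); ++k) {
    if (taller[k]) {
      extraNonBottom = true;
    }
  }
  if (extraNonBottom) {
    // Fold the size ordering into the accumulated result instead of
    // returning it unconditionally, so conflicts become NO_RELATION.
    result = combine(result, a.size() > b.size() ? GREATER : LESS);
  }
  return result;
}
```

Under these assumptions, comparing `[true, false]` against `[true]` gives `NO_RELATION`, matching the example in the summary.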

Address feedback missed in WebAssembly#8267.

…ionWithoutAdd (WebAssembly#8294)

Move `updateSymbol` calls for `prologLocation`/`epilogLocation` outside the `debugLocations` loop in `copyFunctionWithoutAdd`, matching the pattern of the `updateLocation` calls in the `fileIndexMap` block above.

The `updateSymbol` calls were a copy-paste error from the `fileIndexMap` block. Being inside the loop caused two issues:

1. When `debugLocations` is empty, the loop never executes and `prologLocation`/`epilogLocation` symbol indices are never remapped
2. When `debugLocations` has multiple entries, the indices are double-remapped (or cause OOB access)
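A minimal sketch of the corrected structure, under loudly stated assumptions: `DebugLocation`, `FunctionDebugInfo`, and `remapSymbols` are hypothetical simplifications, not the types or helpers used by `copyFunctionWithoutAdd`. The point is only that the prolog and epilog are remapped exactly once, after the per-location loop, so the result does not depend on how many debug locations exist.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified debug location carrying only a symbol index.
struct DebugLocation {
  unsigned symbolIndex = 0;
};

// Hypothetical container for a function's debug info.
struct FunctionDebugInfo {
  std::vector<DebugLocation> debugLocations;
  DebugLocation prolog;
  DebugLocation epilog;
};

// Remaps every symbol index through symbolMap (old index -> new index).
void remapSymbols(FunctionDebugInfo& info,
                  const std::vector<unsigned>& symbolMap) {
  // Each per-expression location is remapped once inside the loop.
  for (DebugLocation& loc : info.debugLocations) {
    loc.symbolIndex = symbolMap.at(loc.symbolIndex);
  }
  // Prolog and epilog are remapped once, outside the loop. Inside the loop
  // they would be skipped for an empty list and remapped repeatedly for a
  // list with several entries.
  info.prolog.symbolIndex = symbolMap.at(info.prolog.symbolIndex);
  info.epilog.symbolIndex = symbolMap.at(info.epilog.symbolIndex);
}
```

With a map such as `{2, 0, 1}`, a prolog index of 0 becomes 2 whether the function has zero or three debug locations; a double remap would instead have produced 1.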

…bals (WebAssembly#8295)

## Summary

- `Interpreter::instantiate()` iterates all globals without checking `imported()`, causing `ExpressionIterator` to be constructed with a null `init` expression for imported globals. This triggers an assertion failure in `PostWalker::walk()` (debug) or a null-pointer dereference (release).
- Added a `global->imported()` check to skip imported globals during instantiation, matching the pattern used by `walkModule()` in `wasm-traversal.h`.
- Added two test cases: one for a module with only imported globals, and one for a module mixing imported and local globals.

## Test plan

- [x] `InterpreterTest.ImportedGlobalI32`: verifies `addInstance` succeeds with an imported global
- [x] `InterpreterTest.MixedImportedAndLocalGlobals`: verifies a local global is correctly initialized when an imported global is also present
- [x] All 56 interpreter tests pass
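The guard described above might look roughly like the following sketch. It is not the interpreter's real code: `Global` and `instantiateGlobals` are hypothetical, an initializer is modeled as a pointer to a constant integer rather than an expression tree, and "imported" is modeled as "has no initializer".

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified global. A null init models an imported global,
// whose value is supplied by the host rather than by an initializer.
struct Global {
  std::string name;
  const int* init;
  bool imported() const { return init == nullptr; }
};

// Evaluates initializers for defined globals only.
std::map<std::string, int>
instantiateGlobals(const std::vector<Global>& globals) {
  std::map<std::string, int> values;
  for (const Global& global : globals) {
    if (global.imported()) {
      // Nothing to evaluate; dereferencing init here would be the crash.
      continue;
    }
    values[global.name] = *global.init;
  }
  return values;
}
```

A module mixing one imported and one defined global then yields a single initialized entry for the defined one.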

## Summary

- `WasmStore::instances` was a `std::vector<Instance>`, but `Frame` holds `Instance&` references. Vector reallocation when adding new instances invalidates all frame references, leading to use-after-free.
- Changed `instances` to `std::deque<Instance>`, which guarantees that references to existing elements are not invalidated by `push_back`/`emplace_back`.
- Added a test that verifies Instance addresses remain stable after adding many instances while a frame holds a reference.

## Test plan

- [x] `InterpreterTest.InstanceReferenceStability`: verifies that adding 100 instances does not relocate existing instances or invalidate frame references
- [x] All 55 interpreter tests pass
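The container guarantee this fix relies on can be shown with a small sketch. `Instance`, `Frame`, and `frameReferenceStaysValid` here are hypothetical simplifications of the interpreter types; only the `std::deque` behavior (insertion at either end does not invalidate references to existing elements) is taken from the C++ standard library.

```cpp
#include <cassert>
#include <deque>

// Hypothetical, simplified instance and frame types.
struct Instance {
  int id;
};

struct Frame {
  Instance& instance; // a frame keeps a reference to its instance
};

// Returns true if a frame's reference still points at the same, intact
// instance after many more instances have been appended.
bool frameReferenceStaysValid(int extraInstances) {
  std::deque<Instance> instances;
  instances.push_back(Instance{0});
  Frame frame{instances.back()};
  const Instance* before = &frame.instance;

  // With std::vector these appends could reallocate and leave the frame
  // dangling; std::deque keeps existing elements where they are.
  for (int i = 1; i <= extraInstances; ++i) {
    instances.push_back(Instance{i});
  }
  return before == &instances.front() && frame.instance.id == 0;
}
```

Calling `frameReferenceStaysValid(100)` mirrors the shape of the added test: 100 appends while a frame holds a reference.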

Previously we would validate that some usages (e.g. globals) of shared types required the shared-everything feature, but that was not enough to prevent fuzzer issues because it was still possible to write a file that used shared types and passed validation without shared-everything, so would fail when fuzzed on V8. Fix the problem by validating features when building shared types in the first place.

These functions receive a reference value. We cannot log their internals, but we can try to do some operations on them, and this gets more such values moving across the JS/wasm boundary. Future fuzzing of configureAll will build on this. To achieve this, the `logValue` method had to be moved up in the fuzzer, so we can use it from another place. The only change there is to make it stop printing newlines all the time (and let the caller do it, if needed).

Based on discussion in WebAssembly#7574 (comment) the `@binaryen.removable.if.unused` code annotation has the meaning that if the result is unused (dropped), then the code can be considered dead (no side effects, removable). This can be used on a function to affect all calls to it, or on specific call instructions. The optimizer then finds relevant dropped calls and can remove them (in Vacuum).

and replace them with unreachables. This might be useful in situations where a module needs to be processed by tools that do not support relaxed SIMD, but where the relaxed SIMD usage also does not affect the output of the tool.

We started validating that shared-everything is enabled when we defined shared types in WebAssembly#8298, but this missed the case where a non-shared type definition used a shared abstract heap type, which has no definition. Update the validation to check that the used types are allowed by the enabled feature set as well. Refactor the validation logic into several functions to avoid duplication of logic.

…#8303) These are handled with a different visitor than the scalar loads and stores.

…ctCmpxchg (WebAssembly#8304)

## Summary

`visitStructRMW` and `visitStructCmpxchg` in `Struct2Local` are missing the `Type::unreachable` guard that all sibling visitor methods have (`visitRefGetDesc`, `visitStructGet`, `visitRefIsNull`, `visitRefEq`).

When a `struct.atomic.rmw` or `struct.atomic.rmw.cmpxchg` has an unreachable operand (e.g., unreachable value), the expression type is `unreachable` but the code asserts it equals the concrete field type, causing an assertion failure:

```
Assertion failed: (type == field.type), function visitStructRMW, file Heap2Local.cpp, line 1043.
```

This is the same bug pattern that was fixed in WebAssembly#8283 for `visitRefGetDesc`; it was just missed in these two methods.

## Summary

Two bugs in the experimental `TypeGeneralizing` pass's backward analysis:

### 1. `visitStructSet` pushes non-ref field types onto the stack

`visitStructSet` unconditionally pushes the struct field type as a type requirement onto the backward analysis stack (line 690). When the field is a non-reference type (i32, f64, etc.), this corrupts the stack because non-ref producers (like `visitConst`, `visitBinary`) are no-ops that don't pop. The spurious non-ref value on the stack causes subsequent `pop()` calls to retrieve wrong type requirements.

The analogous methods all correctly guard with `isRef()`:

- `visitArraySet` (line 792): `if (elemType.isRef())`
- `visitStructNew` (line 620): `if (field.type.isRef())`
- `handleCall` (line 364): `if (param.isRef())`

**Fix:** Add `if (fieldType.isRef())` guard before pushing.

### 2. `visitRefAs` crashes on `Type::none` from empty stack

When the backward analysis stack is empty (no downstream consumer imposes a type requirement), `pop()` returns `Type::none`. `visitRefAs` then calls `type.getHeapType()` on `Type::none`, triggering `assert(isRef())` and crashing on any `ref.as_non_null` whose result is dropped.

**Fix:** Check for `Type::none` before accessing heap type, and propagate "no requirement" through.

Both bugs are in the experimental (not-yet-sound) pass and do not affect production optimization pipelines.

## Test plan

- [x] New lit test `type-generalizing-fixes.wast` covering:
  - `struct.set` on non-ref field followed by ref field (stack alignment)
  - `drop(ref.as_non_null(...))` (empty stack crash)
  - `drop(any.convert_extern(extern.convert_any(...)))` (empty stack with convert ops)
- [x] All 309 unit tests pass

Return an error when building a type that contains a string type when strings are not enabled. This prevents the fuzzer from trying to run modules that contain string types on V8.

Toolchains should remove these annotations before shipping, using --strip-toolchain-annotations

In WebAssembly#8568 we optimized the grouping of locals in the binary writer to account for how types will be written given the enabled features. However, that change did not properly update the handling of scratch locals accordingly, leading to inconsistencies in the indices assigned to local types in different locations. Fix the problem by reverting the changes from WebAssembly#8568 and handling the mapping from IR types to written types at a lower level; specifically, create a new `TypeIndexMap` type that extends `InsertOrderedMap` but always applies `asWrittenGivenFeatures` to its keys. Use this new map type both for the `numLocalsByType` map and the `scratchLocals` map.
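The idea of a map that normalizes its keys while preserving insertion order can be sketched as below. This is not Binaryen's `TypeIndexMap` or `InsertOrderedMap`: `NormalizingCountMap` and `asWrittenWithoutGC` are hypothetical, types are modeled as strings, and the particular mapping of bottom reference types to `funcref`/`externref` is an illustrative assumption standing in for `asWrittenGivenFeatures`.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for asWrittenGivenFeatures with GC disabled. The
// specific mapping is assumed for illustration only.
std::string asWrittenWithoutGC(const std::string& type) {
  if (type == "nullfuncref") {
    return "funcref";
  }
  if (type == "nullexternref") {
    return "externref";
  }
  return type;
}

// Insertion-ordered counter that normalizes every key before lookup, so all
// users of the map agree on which written type an IR type belongs to.
class NormalizingCountMap {
  using Normalizer = std::function<std::string(const std::string&)>;
  using Entry = std::pair<std::string, std::size_t>;

  Normalizer normalize;
  std::vector<Entry> entries; // kept in first-insertion order
  std::unordered_map<std::string, std::size_t> indexOf;

public:
  explicit NormalizingCountMap(Normalizer n) : normalize(std::move(n)) {}

  // Returns the count for the normalized key, creating it at zero if needed.
  // The reference is only valid until the next insertion of a new key.
  std::size_t& operator[](const std::string& key) {
    std::string written = normalize(key);
    auto it = indexOf.find(written);
    if (it == indexOf.end()) {
      it = indexOf.emplace(written, entries.size()).first;
      entries.emplace_back(written, 0);
    }
    return entries[it->second].second;
  }

  const std::vector<Entry>& ordered() const { return entries; }
};
```

Because normalization happens inside the map, a local counted under its IR type and a scratch local looked up under its written type land in the same slot, which is the consistency property the commit message is after.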

Split off from my [WIP for improving effects analysis for indirect calls](https://github.com/WebAssembly/binaryen/compare/indirect-effects-3?expand=1). These methods don't mutate the Module so they can be const. Also move getModuleElement into the anonymous namespace to prevent the name from leaking. Since these getters are now const, I also change some usages of Module&/Module* to const e.g. EffectsAnalyzer, since these usages also only need read-only access to the Module.

WebAssembly#8581) Continuing WebAssembly#8571, use a constexpr check to see when we are about to visit something that has no children. In that case we don't need to push a task for it and pop it later, we can just do the visit inline.

…ly#8576) fixes WebAssembly#8537

See https://github.com/emscripten-core/setup-emsdk

This is basically NFC but in the new place more code paths end up using the flag, so this may increase our coverage slightly.

…o globals (WebAssembly#8585) Continuations cannot be serialized.

Diff without whitespace is trivial.

This refactors the code a bit to allow the VM classes in the fuzzer to run JS. The function also allows running it in a checked (Python exception on a non-0 return code) or unchecked way. A future fuzzer will use this `run_js` method.

…art function (WebAssembly#8589) The start may be needed for the ABI between the wasm and the outside. The point of preserve-imports-and-exports is to not break such ABIs (or at least have a chance of not doing so), so it doesn't seem like we need a new option here.

Fix various spelling typos in source and test files, as reported by Debian Lintian.

This avoids large slowdowns in cases with very long string names, etc.

The default stack size in musl is apparently tiny, and MergeSimilarFunctions has recursion which can hit it. We can perhaps improve that pass to avoid recursion, but this change seems generally good for robustness. It simply makes us use the 8 MB stack size that is the usual default on other Linux systems. Fixes WebAssembly#8594

Refactor GlobalEffects to not compute the transitive call graph explicitly but instead aggregate effects as we go. This improves the runtime of the pass by 4.3% on calcworker (1.21792 s -> 1.16586 s averaged over 20 compilations). It also helps prepare the code for future changes to support effects for indirect calls. Another potential future improvement here is to use SCC, which would let us stop processing children early in cases where there are no effects to update. Currently we can't do this because we add trap effects to potentially-recursive call loops, so even if no effects were updated, we need to keep going to find potential cycles.

This starts from wasm+js testcases and then modifies the wasm in a way that preserves imports and exports, so the wasm+js can still be run. This is very different from our usual approach of starting with only wasm, then bashing it into the shape that our general js code can handle. The main benefit here is testing of more interesting wasm+js interactions, specifically for the JS Interop proposal. Three wasm+js combinations are added in this PR that test features from that proposal.

The lexer previously used its own internal `LexerCtx` abstraction that allowed it to consume the characters that made up a token without changing the lexer state, then update the state at once when committing to consuming the characters. However, manually resetting the lexer to the original position when giving up on parsing a token is simple enough that this abstraction was not holding its weight. Simplify the lexer by removing internal contexts, and move the simplified method bodies to lexer.h. Generally we try to avoid putting lots of code in headers, but in this case making the code available to the inliner, along with removing the extra layer of abstraction, makes the parser about 20% faster.

… scan-area reduction

- use LLVM 23.1.0-alpha.1.26257.1 from general-testing
- fix symlinks
- fix -lc++

tlively and others added 30 commits February 9, 2026 09:07

Fix unreachable tuple.extract printing (WebAssembly#8277)

24c2a9f

Before, `tuple.extract 3 2` would end up printed as `2 2`, which means read index 2 from size 2, which was invalid. We need to take the index size into account.

Format recent CHANGELOG entries. NFC (WebAssembly#8278)

ef3c33a

[C API] Fix BinaryenAddTagImport (WebAssembly#8281)

93267c5

Fix copy-paste bug in `BinaryenAddTagImport` where `getGlobalOrNull` was used instead of `getTagOrNull` Fixes WebAssembly#8272

Add const annotation for dump [NFC] (WebAssembly#8276)

5f91ba9

Heap2Local: Fix unreachability handling in visitRefGetDesc (WebAssemb…

8a12ceb

…ly#8283)

Fix Unsubtyping and GTO for configureAll (WebAssembly#8267)

e103d6d

Take into account the implicit casts and conversions that happen on the boundary with JS as well as the fact that JS can essentially read the first field on descriptors with configured prototypes.

[NFC] Update comments in gto-jsinterop.wast (WebAssembly#8293)

561b8d3

Address feedback missed in WebAssembly#8267.

Add comments to dataflow.h after WebAssembly#8282 [NFC] (WebAssembly#…

0c15503

…8299)

Add SIMD load and store support to Memory64Lowering pass (WebAssembly…

aebfa3c

…#8303) These are handled with a different visitor than the scalar loads and stores.

Validate strings in types (WebAssembly#8314)

bc4838e

Return an error when building a type that contains a string type when strings are not enabled. This prevents the fuzzer from trying to run modules that contain string types on V8.

Add a pass to remove toolchain annotations (WebAssembly#8301)

ec6d7ef

Toolchains should remove these annotations before shipping, using --strip-toolchain-annotations

tlively and others added 16 commits April 7, 2026 16:24

[JS & C API] Rename MemorySegment functions to DataSegment (WebAssemb…

b9a9afb

…ly#8576) fixes WebAssembly#8537

[ci] Use emsdk-setup github action (WebAssembly#8584)

68ea908

See https://github.com/emscripten-core/setup-emsdk

Move a v8 fuzzer flag to a more prominent place (WebAssembly#8582)

1527ce0

This is basically NFC but in the new place more code paths end up using the flag, so this may increase our coverage slightly.

[Stack Switching] wasm-ctor-eval: stop on serializing continuations t…

d918f9c

…o globals (WebAssembly#8585) Continuations cannot be serialized.

[NFC] Move fuzzer VMs out of CompareVMs (WebAssembly#8587)

f7b08ed

Diff without whitespace is trivial.

[NFC] Fix spelling typos (WebAssembly#8591)

baa1564

Fix various spelling typos in source and test files, as reported by Debian Lintian.

[NFC] Use unordered_set in effects.h and CodePushing (WebAssembly#8586)

3990615

This avoids large slowdowns in cases with very long string names, etc.

pavelsavara requested review from akoeplinger and radekdoulik April 29, 2026 17:28

pavelsavara self-assigned this Apr 29, 2026

pavelsavara mentioned this pull request Apr 29, 2026

Upgrade Emscripten to 5.0.6 dotnet/runtime#113786

Open

pavelsavara force-pushed the dotnet/main-bump-binaryen-5.0.6 branch from 9edfb9a to e698793 Compare May 4, 2026 19:32

pavelsavara changed the title ~~Merge upstream WebAssembly/binaryen 54f9f7af for Emscripten 5.0.6~~ [wasm] Bump Binaryen to Emsdk 5.0.6 pin (54f9f7af) May 4, 2026

del: drop tests, fuzzers, docs, vendored fixtures, unused tooling for…

1ef8989

… scan-area reduction

pavelsavara force-pushed the dotnet/main-bump-binaryen-5.0.6 branch from e698793 to aa1f919 Compare May 5, 2026 10:26

eng: dotnet Arcade scaffolding + cmake patches for binaryen 5.0.6

5031c30

pavelsavara force-pushed the dotnet/main-bump-binaryen-5.0.6 branch from aa1f919 to 5031c30 Compare May 5, 2026 10:32

pavelsavara added 2 commits May 5, 2026 12:36

merge: dotnet/main (resolve unrelated histories, ours wins)

28be7ec

- use azurelinux-3.0-net11.0

ed514b9

- use LLVM 23.1.0-alpha.1.26257.1 from general-testing
- fix symlinks
- fix -lc++

pavelsavara force-pushed the dotnet/main-bump-binaryen-5.0.6 branch from 212e73c to ed514b9 Compare May 8, 2026 18:02

pavelsavara marked this pull request as ready for review May 8, 2026 18:03

[wasm] Bump Binaryen to Emsdk 5.0.6 pin (54f9f7af) #355
pavelsavara wants to merge 2358 commits into dotnet:dotnet/main from pavelsavara:dotnet/main-bump-binaryen-5.0.6

18 participants
