[wasm] Bump Binaryen to Emsdk 5.0.6 pin (54f9f7af)#355
Open
pavelsavara wants to merge 2358 commits into
Open
Conversation
Extract logic for noting subtyping, cast, and descriptor relationships from the function-parallel analysis and fixed point analysis phases of Unsubtyping into a shared CRTP utility. This will help avoid additional duplication in a follow-on PR that newly notes casts outside the function-parallel context.
wasm-opt will just fatally error on invalid module inputs, but binaryen.js is a library and users want to get something they can handle, and see the actual error. This PR throws a C++ exception instead, and converts it on the JS side. Fixes WebAssembly#8256
Before, `tuple.extract 3 2` would end up printed as `2 2`, which means read index 2 from size 2, which was invalid. We need to take the index size into account.
Part of WebAssembly#8180 and WebAssembly#8261. Fixes the semantics/spec test when the same tag is imported in different instances, in which case the tag should behave as a new identity, which was previously not the case (see the tags in the modified instance.wast in this PR).
Fix copy-paste bug in `BinaryenAddTagImport` where `getGlobalOrNull` was used instead of `getTagOrNull` Fixes WebAssembly#8272
Take into account the implicit casts and conversions that happen on the boundary with JS as well as the fact that JS can essentially read the first field on descriptors with configured prototypes.
Fix for [fuzzer-detected crash when ctor-eval runs on a module that imports a tag](WebAssembly#8254 (comment)). Prior to WebAssembly#8254, ctor-eval would [crash](https://github.com/WebAssembly/binaryen/blob/23d218d0bd469a399ff17b26fdd71164beeb63fa/src/tools/wasm-ctor-eval.cpp#L396) when an imported tag was evaluated, but not when imported. Change the code to allow imported tags even during evaluation. Note that we can't reason about the identity of imported tags. In the following code, $t1 and $t2 may be the same or different tags: ```wasm (import "foo" "bar" (tag $t1)) (import "foo" "bar2" (tag $t2)) ``` In this PR, we assume that $t1 and $t2 are different tags, and that they're the same tag if the import name is the same (this is also not true in general, the hosting environment may provide two different values for the same exact import name). This may cause some correctness issues. As a followup, we can make equality comparison of two imported tags throw FailToEvalException to make evaluation correct. Part of WebAssembly#8180.
When IRBuilder creates a multivalue block wrapping a set and get of a tuple scratch local, it might be implicitly adding a function type to the module. Text round-tripping could previously fail if that function type would conflict with another function type in the module after binary writing, for example if it contained a bottom reference type with GC disabled. Fix the problem by generalizing the type of scratch locals to be the types that will eventually be written to a binary given the enabled features. This further pessimizes our handling of multivalue code by losing type information in the scratch locals, but handling multivalue better will require a much more systematic change anyway. Fixes WebAssembly#8279. --------- Co-authored-by: Alon Zakai <azakai@google.com>
…mbly#8282) ## Summary - Fix swapped `mergeIf` arguments in `doVisitIf` when an `if` has no `else` branch - Add GTest to verify phi node values are correctly associated with true/false conditions Fixes WebAssembly#8273. In the no-else case, `mergeIf(initialState, afterIfTrueState, ...)` incorrectly paired the initial state (before the if body ran) with the `ifTrue` condition and the after-if-true state with the `ifFalse` condition. The fix swaps the arguments to `mergeIf(afterIfTrueState, initialState, ...)`, matching the convention used in the if-with-else case. ## Test plan - Added `DataflowTest.IfNoElseMergeOrder` GTest that builds a dataflow graph for an if-no-else function and verifies the phi node selects the correct values for each branch - Verified the test fails before the fix (values 10 and 42 are swapped) and passes after - All 306 existing unit tests continue to pass
The `visitTableCopy` and `visitTableFill` guards in `LLVMMemoryCopyFillLowering` were defined with an uppercase `V` (`VisitTableCopy`, `VisitTableFill`). The PostWalker dispatches to lowercase `visit*` methods, so these guards were never called. Additionally, `VisitTableFill` had the wrong parameter type (`TableCopy*` instead of `TableFill*`). This means a module containing `table.copy` or `table.fill` would not get the intended `Fatal()` error from this pass. Three changes: - `VisitTableCopy` -> `visitTableCopy` - `VisitTableFill` -> `visitTableFill` - `TableCopy*` -> `TableFill*` in `visitTableFill`'s parameter Tests: added two lit tests that verify the pass now Fatal()s on `table.copy` and `table.fill`.
In `processResumeHandlers` (`src/ir/subtype-exprs.h`), the inner loop
uses the outer loop variable `i` instead of `j` to index into
`tagSig.params` and `expected`:
```cpp
for (Index i = 0; i < handlerTags.size(); ++i) {
// ...
for (Index j = 0; j < tagSig.params.size(); ++j) {
self()->noteSubtype(tagSig.params[i], expected[i]); // should be [j]
}
```
This produces wrong subtyping constraints for resume handlers whose tags
have multiple parameters. It also causes out-of-bounds access when the
handler index `i` is >= the tag's parameter count.
Fix: `[i]` -> `[j]` (one-character change).
Test: added a case with two handlers — one single-param tag and one
two-param tag — and verified the correct (Type, Type) pairs are noted.
WebAssembly#8297) ## Summary When comparing stacks of different sizes, `Stack::compare` iterated from the top matching elements pairwise, accumulating a comparison result. However, when one stack was exhausted, it returned `LESS` or `GREATER` based solely on which stack was taller, **ignoring any contradicting result** from the element-wise comparisons. For example, with `Stack<Bool>` comparing `[true, false]` against `[true]`: - Top elements: `false` vs `true` → `LESS` - Then `b` is exhausted, `a` has extra non-bottom elements → code returned `GREATER` - Correct answer: `NO_RELATION` (the element ordering conflicts with the size ordering) The fix checks the accumulated `result` before returning at the size-difference branches, returning `NO_RELATION` when there is a conflict. Note: This bug was not caught by the lattice fuzzer because `Stack` is not included in the fuzz variants in `wasm-fuzz-lattices.cpp`. ## Test plan - [x] Added `StackLattice.CompareDifferentSizeConflict` test case - [x] All existing lattice tests still pass - [x] Full unit test suite passes (304 tests)
Address feedback missed in WebAssembly#8267.
…ionWithoutAdd (WebAssembly#8294) Move `updateSymbol` calls for `prologLocation`/`epilogLocation` outside the `debugLocations` loop in `copyFunctionWithoutAdd`, matching the pattern of the `updateLocation` calls in the `fileIndexMap` block above The `updateSymbol` calls were a copy-paste error from the `fileIndexMap` block. Being inside the loop caused two issues: 1. When `debugLocations` is empty, the loop never executes and `prologLocation`/`epilogLocation` symbol indices are never remapped 2. When `debugLocations` has multiple entries, the indices are double-remapped (or cause OOB access)
…bals (WebAssembly#8295) ## Summary - `Interpreter::instantiate()` iterates all globals without checking `imported()`, causing `ExpressionIterator` to be constructed with a null `init` expression for imported globals. This triggers an assertion failure in `PostWalker::walk()` (debug) or a null-pointer dereference (release). - Added a `global->imported()` check to skip imported globals during instantiation, matching the pattern used by `walkModule()` in `wasm-traversal.h`. - Added two test cases: one for a module with only imported globals, and one for a module mixing imported and local globals. ## Test plan - [x] `InterpreterTest.ImportedGlobalI32` — verifies `addInstance` succeeds with an imported global - [x] `InterpreterTest.MixedImportedAndLocalGlobals` — verifies a local global is correctly initialized when an imported global is also present - [x] All 56 interpreter tests pass
## Summary - `WasmStore::instances` was a `std::vector<Instance>`, but `Frame` holds `Instance&` references. Vector reallocation when adding new instances invalidates all frame references, leading to use-after-free. - Changed `instances` to `std::deque<Instance>`, which guarantees that references to existing elements are not invalidated by `push_back`/`emplace_back`. - Added a test that verifies Instance addresses remain stable after adding many instances while a frame holds a reference. ## Test plan - [x] `InterpreterTest.InstanceReferenceStability` — verifies that adding 100 instances does not relocate existing instances or invalidate frame references - [x] All 55 interpreter tests pass
Previously we would validate that some usages (e.g. globals) of shared types required the shared-everything feature, but that was not enough to prevent fuzzer issues because it was still possible to write a file that used shared types and passed validation without shared-everything, so would fail when fuzzed on V8. Fix the problem by validating features when building shared types in the first place.
These functions receive a reference value. We cannot log their internals, but we can try to do some operations on them, and this gets more such values moving across the JS/wasm boundary. Future fuzzing of configureAll will build on this. To achieve this, the `logValue` method had to be moved up in the fuzzer, so we can use it from another place. The only change there is to make it stop printing newlines all the time (and let the caller do it, if needed).
Based on discussion in WebAssembly#7574 (comment) the `@binaryen.removable.if.unused` code annotation has the meaning that if the result is unused (dropped), then the code can be considered dead (no side effects, removable). This can be used on a function to affect all calls to it, or on specific call instructions. The optimizer then finds relevant dropped calls and can remove them (in Vacuum).
and replace them with unreachables. This might be useful in situations where a module needs to be processed by tools that do not support relaxed SIMD, but where the relaxed SIMD usage also does not affect the output of the tool.
We started validating that shared-everything is enabled when we defined shared types in WebAssembly#8298, but this missed the case where a non-shared type definition used a shared abstract heap type, which has no definition. Update the validation to check that the used types are allowed by the enabled feature set as well. Refactor the validation logic into several functions to avoid duplication of logic.
…#8303) These are handled with a different visitor than the scalar loads and stores.
…ctCmpxchg (WebAssembly#8304) ## Summary `visitStructRMW` and `visitStructCmpxchg` in `Struct2Local` are missing the `Type::unreachable` guard that all sibling visitor methods have (`visitRefGetDesc`, `visitStructGet`, `visitRefIsNull`, `visitRefEq`). When a `struct.atomic.rmw` or `struct.atomic.rmw.cmpxchg` has an unreachable operand (e.g., unreachable value), the expression type is `unreachable` but the code asserts it equals the concrete field type, causing an assertion failure: ``` Assertion failed: (type == field.type), function visitStructRMW, file Heap2Local.cpp, line 1043. ``` This is the same bug pattern that was fixed in WebAssembly#8283 for `visitRefGetDesc` — it was just missed in these two methods.
## Summary Two bugs in the experimental `TypeGeneralizing` pass's backward analysis: ### 1. `visitStructSet` pushes non-ref field types onto the stack `visitStructSet` unconditionally pushes the struct field type as a type requirement onto the backward analysis stack (line 690). When the field is a non-reference type (i32, f64, etc.), this corrupts the stack because non-ref producers (like `visitConst`, `visitBinary`) are no-ops that don't pop. The spurious non-ref value on the stack causes subsequent `pop()` calls to retrieve wrong type requirements. The analogous methods all correctly guard with `isRef()`: - `visitArraySet` (line 792): `if (elemType.isRef())` - `visitStructNew` (line 620): `if (field.type.isRef())` - `handleCall` (line 364): `if (param.isRef())` **Fix:** Add `if (fieldType.isRef())` guard before pushing. ### 2. `visitRefAs` crashes on `Type::none` from empty stack When the backward analysis stack is empty (no downstream consumer imposes a type requirement), `pop()` returns `Type::none`. `visitRefAs` then calls `type.getHeapType()` on `Type::none`, triggering `assert(isRef())` — crashing on any `ref.as_non_null` whose result is dropped. **Fix:** Check for `Type::none` before accessing heap type, and propagate "no requirement" through. Both bugs are in the experimental (not-yet-sound) pass and do not affect production optimization pipelines. ## Test plan - [x] New lit test `type-generalizing-fixes.wast` covering: - `struct.set` on non-ref field followed by ref field (stack alignment) - `drop(ref.as_non_null(...))` (empty stack crash) - `drop(any.convert_extern(extern.convert_any(...)))` (empty stack with convert ops) - [x] All 309 unit tests pass
Return an error when building a type that contains a string type when strings are not enabled. This prevents the fuzzer from trying to run modules that contain string types on V8.
Toolchains should remove these annotations before shipping, using --strip-toolchain-annotations
In WebAssembly#8568 we optimized the grouping of locals in the binary writer to account for how types will be written given the enabled features. However, that change did not properly update the handling of scratch locals accordingly, leading to inconsistencies in the indices assigned to local types in different locations. Fix the problem by reverting the changes from WebAssembly#8568 and handling the mapping from IR types to written types at a lower level; specifically, create a new `TypeIndexMap` type that extends `InsertOrderedMap` but always applies `asWrittenGivenFeatures` to its keys. Use this new map type both for the `numLocalsByType` map and the `scratchLocals` map.
Split off from my [WIP for improving effects analysis for indirect calls](https://github.com/WebAssembly/binaryen/compare/indirect-effects-3?expand=1). These methods don't mutate the Module so they can be const. Also move getModuleElement into the anonymous namespace to prevent the name from leaking. Since these getters are now const, I also change some usages of Module&/Module* to const e.g. EffectsAnalyzer, since these usages also only need read-only access to the Module.
WebAssembly#8581) Continuing WebAssembly#8571, use a constexpr check to see when we are about to visit something that has no children. In that case we don't need to push a task for it and pop it later, we can just do the visit inline.
This is basically NFC but in the new place more code paths end up using the flag, so this may increase our coverage slightly.
…o globals (WebAssembly#8585) Continuations cannot be serialized.
Diff without whitespace is trivial.
This refactors the code a bit to allow the VM classes in the fuzzer to run JS. The function also allows running it in a checked (Python exception on a non-0 return code) or unchecked way. A future fuzzer will use this `run_js` method.
…art function (WebAssembly#8589) The start may be needed for the ABI between the wasm and the outside. The point of preserve-imports-and-exports is to not break such ABIs (or at least have a chance of not doing so), so it doesn't seem like we need a new option here.
Fix various spelling typos in source and test files, as reported by Debian Lintian.
This avoids large slowdowns in cases with very long string names, etc.
The default in musl is apparently tiny, and MergeSimilarFunctions has recursion which can hit it. We can perhaps improve that pass to avoid recursion, but this change seems generally good for robustness. It just makes us use the usual 8 MB stack size on Linux that all other Linuxes use. Fixes WebAssembly#8594
Refactor GlobalEffects to not compute the transitive call graph explicitly but instead aggregate effects as we go. This improves the runtime of the pass by 4.3% on calcworker (1.21792 s -> 1.16586 s averaged over 20 compilations). It also helps prepare the code for future changes to support effects for indirect calls. Another potential future improvement here is to use SCC, which would let us stop processing children early in cases where there are no effects to update. Currently we can't do this because we add trap effects to potentially-recursive call loops, so even if no effects were updated, we need to keep going to find potential cycles.
This starts from wasm+js testcases and then modifies the wasm in a way that preserves imports and exports, so the wasm+js can still be run. This is very different from our usual approach of starting with only wasm, then bashing it into the shape that our general js code can handle. The main benefit here is testing of more interesting wasm+js interactions, specifically for the JS Interop proposal. Three wasm+js combinations are added in this PR that test features from that proposal.
The lexer previously used its own internal `LexerCtx` abstraction that allowed it to consume the characters that made up a token without changing the lexer state, then update the state at once when committing to consuming the characters. However, manually resetting the lexer to the original position when giving up on parsing a token is simple enough that this abstraction was not holding its weight. Simplify the lexer by removing internal contexts, and move the simplified method bodies to lexer.h. Generally we try to avoid putting lots of code in headers, but in this case making the code available to the inliner, along with removing the extra layer of abstraction, makes the parser about 20% faster.
9edfb9a to
e698793
Compare
54f9f7af for Emscripten 5.0.6… scan-area reduction
e698793 to
aa1f919
Compare
aa1f919 to
5031c30
Compare
- use LLVM 23.1.0-alpha.1.26257.1 from general-testing - fix symlinks - fix -lc++
212e73c to
ed514b9
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Contributes to dotnet/runtime#113786
Summary
Bumps
dotnet/binaryento upstreamWebAssembly/binaryenat54f9f7afa703— the revision pinned by Emscripten 5.0.6.Branch contents
https://github.com/pavelsavara/binaryen/commits/dotnet/main-bump-binaryen-5.0.6/
Validation
runtime.linux-x64.Microsoft.NETCore.Runtime.Wasm.Binaryen.Transport.11.0.0-alpha.1.26253.2.nupkg(7.6M)