-
Notifications
You must be signed in to change notification settings - Fork 746
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Flatten pass removes block results required by br_on_*
instructions
#6924
Comments
This is a current limitation of flatten, I believe. In general the idea of flatten was to remove block results in order to simplify the IR. But things like I'm curious what you are using flatten for? In Binaryen itself we've used it in wasm2js, but that isn't important for WasmGC code anyhow (since wasm2js doesn't support GC in general, yet, and no current plans to). |
Thanks @kripken. I'm interested in using binaryen/src/passes/LoopInvariantCodeMotion.cpp Lines 21 to 22 in 432ed1c
binaryen/src/passes/LoopInvariantCodeMotion.cpp Lines 212 to 213 in 432ed1c
I'm not yet sure how impactful LICM is on our artifact (generated from the Scala.js wasm-backend), I'm just exploring how LICM moves our code 😄 Also, I tried to flatten on compiler side (by inserting
Yeah, that's not great unless we can transform those instructions back to |
I see, interesting. About licm, it is not enabled by default because in practice it doesn't seem that helpful: VMs will do licm themselves, and it makes the code larger (it adds locals), so in general it increases size without speeding things up. Though, it might help baseline tiers and interpreters, perhaps. But I'd be interested to hear thoughts or data otherwise!
Hmm, yeah, in general the passes will "un-flatten" as that is more efficient. You might experiment with running licm as the first pass, though. (But, again, I am not sure how useful licm is, as I mentioned above.) |
Previously, we used mutable array-based interface tables (itables) without a specific reason. However, accessing mutable arrays can lead to missed optimization opportunities. For instance, loop-invariant code motion (LICM) won't be able to move certain parts of interface dispatching operations out of loops: When we invoke a method using interface table dispatching, it requires retrieving an itable from the class's itable array, which is considered non-pure. In fact, reading from an array is marked as unsafe to move in wasm-opt's LICM implementation: https://github.com/WebAssembly/binaryen/blob/34ad6a7598e662e9ff357987f2c81fde1e05c522/src/ir/effects.h#L207-L210 https://github.com/WebAssembly/binaryen/blob/34ad6a7598e662e9ff357987f2c81fde1e05c522/src/passes/LoopInvariantCodeMotion.cpp#L125-L128 (Note: We may not run wasm-opt's LICM in favor of the VM's LICM, as discussed in WebAssembly/binaryen#6924) This commit changes itables to be immutable structs, making itable retrieval a pure operation. Here's a comparison of benchmark results between commit 4c9494e and this change: | | array itables | struct itables | | ----------- | ------------- | -------------- | | sha512 | 12128.73213 | 12107.94142 | | queens | 3.632068223 | 3.655815179 | | list | 84.61223471 | 65.69730564 | | richards | 92.12833022 | 91.19028905 | | cd | 46267.10318 | 46574.81215 | | gcbench | 116540.568 | 112594.324 | | tracerFloat | 2569.675034 | 2551.227714 | | tracer | 1725.591249 | 1697.907466 | | sudoku | 14863.33909 | 12066.23882 | | nbody | 366388.3292 | 199625.971 | | permute | 1006.215249 | 983.9858283 | | deltaBlue | 2205.002528 | 2320.038418 | | kmeans | 1752457.829 | 1754425.973 | | brainfuck | 36983.97583 | 36093.7837 | | json | 1205.92764 | 1191.768986 | | bounce | 21.14001572 | 20.8851082 | While most benchmark results remain largely unchanged, some tests such as `list`, `sudoku`, and `nbody` show improved performance compared to the array-based itables. This improvement may be attributed to the VM's optimization capabilities. **immutable array?** In this change, we implement itables using immutable structs However, we can use immutable arrays as an alternative. We implemented and benchmarked itables using immutable arrays, and the results show nearly same performance as the immutable struct-based implementation. WebAssembly/gc#250 v8/v8@6e36e3e Let's proceed with the immutable struct-based implementation for now because - While V8 seems to optimize immutable array-related operations, other VMs may not offer similar optimizations (?) - Currently, wasm-opt doesn't take into account the mutability of arrays in its optimization process.
Previously, we used mutable array-based interface tables (itables) without a specific reason. However, accessing mutable arrays can lead to missed optimization opportunities. For instance, loop-invariant code motion (LICM) won't be able to move certain parts of interface dispatching operations out of loops: When we invoke a method using interface table dispatching, it requires retrieving an itable from the class's itable array, which is considered non-pure. In fact, reading from an array is marked as unsafe to move in wasm-opt's LICM implementation: https://github.com/WebAssembly/binaryen/blob/34ad6a7598e662e9ff357987f2c81fde1e05c522/src/ir/effects.h#L207-L210 https://github.com/WebAssembly/binaryen/blob/34ad6a7598e662e9ff357987f2c81fde1e05c522/src/passes/LoopInvariantCodeMotion.cpp#L125-L128 (Note: We may not run wasm-opt's LICM in favor of the VM's LICM, as discussed in WebAssembly/binaryen#6924) This commit changes itables to be immutable structs, making itable retrieval a pure operation. Here's a comparison of benchmark results between commit 4c9494e and this change: | | array itables | struct itables | | ----------- | ------------- | -------------- | | sha512 | 12128.73213 | 12107.94142 | | queens | 3.632068223 | 3.655815179 | | list | 84.61223471 | 65.69730564 | | richards | 92.12833022 | 91.19028905 | | cd | 46267.10318 | 46574.81215 | | gcbench | 116540.568 | 112594.324 | | tracerFloat | 2569.675034 | 2551.227714 | | tracer | 1725.591249 | 1697.907466 | | sudoku | 14863.33909 | 12066.23882 | | nbody | 366388.3292 | 199625.971 | | permute | 1006.215249 | 983.9858283 | | deltaBlue | 2205.002528 | 2320.038418 | | kmeans | 1752457.829 | 1754425.973 | | brainfuck | 36983.97583 | 36093.7837 | | json | 1205.92764 | 1191.768986 | | bounce | 21.14001572 | 20.8851082 | While most benchmark results remain same, some tests such as `list`, `sudoku`, and `nbody` show improved performance compared to the array-based itables (may be thanks to the VM's optimization capabilities). **immutable array?** In this change, we implement itables using immutable structs However, we can use immutable arrays as an alternative. We implemented and benchmarked itables using immutable arrays, and the results show nearly same performance as the immutable struct-based implementation. WebAssembly/gc#250 v8/v8@6e36e3e Let's proceed with the immutable struct-based implementation for now because - While V8 seems to optimize immutable array-related operations, other VMs may not offer similar optimizations (?) - Currently, wasm-opt doesn't take into account the mutability of arrays in its optimization process.
Previously, we used mutable array-based interface tables (itables) without a specific reason. However, accessing mutable arrays can lead to missed optimization opportunities. For instance, loop-invariant code motion (LICM) won't be able to move certain parts of interface dispatching operations out of loops: When we invoke a method using interface table dispatching, it requires retrieving an itable from the class's itable array, which is considered non-pure. In fact, reading from an array is marked as unsafe to move in wasm-opt's LICM implementation: https://github.com/WebAssembly/binaryen/blob/34ad6a7598e662e9ff357987f2c81fde1e05c522/src/ir/effects.h#L207-L210 https://github.com/WebAssembly/binaryen/blob/34ad6a7598e662e9ff357987f2c81fde1e05c522/src/passes/LoopInvariantCodeMotion.cpp#L125-L128 (Note: We may not run wasm-opt's LICM in favor of the VM's LICM, as discussed in WebAssembly/binaryen#6924) This commit changes itables to be immutable structs, making itable retrieval a pure operation. Here's a comparison of benchmark results between commit 4c9494e and this change: | | array itables | struct itables | | ----------- | ------------- | -------------- | | sha512 | 12128.73213 | 12107.94142 | | queens | 3.632068223 | 3.655815179 | | list | 84.61223471 | 65.69730564 | | richards | 92.12833022 | 91.19028905 | | cd | 46267.10318 | 46574.81215 | | gcbench | 116540.568 | 112594.324 | | tracerFloat | 2569.675034 | 2551.227714 | | tracer | 1725.591249 | 1697.907466 | | sudoku | 14863.33909 | 12066.23882 | | nbody | 366388.3292 | 199625.971 | | permute | 1006.215249 | 983.9858283 | | deltaBlue | 2205.002528 | 2320.038418 | | kmeans | 1752457.829 | 1754425.973 | | brainfuck | 36983.97583 | 36093.7837 | | json | 1205.92764 | 1191.768986 | | bounce | 21.14001572 | 20.8851082 | While most benchmark results remain same, some tests such as `list`, `sudoku`, and `nbody` show improved performance compared to the array-based itables (may be thanks to the VM's optimization capabilities). **immutable array?** In this change, we implement itables using immutable structs However, we can use immutable arrays as an alternative. We implemented and benchmarked itables using immutable arrays, and the results show nearly same performance as the immutable struct-based implementation. WebAssembly/gc#250 v8/v8@6e36e3e Let's proceed with the immutable struct-based implementation for now because - While V8 seems to optimize immutable array-related operations, other VMs may not offer similar optimizations (?) - Currently, wasm-opt doesn't take into account the mutability of arrays in its optimization process.
Thanks, that’s interesting. It seems like it's better to leave LICM optimizations to the VMs. We recently moved the I'll close the issue for now. Thank you very much, @kripken, for answering my questions! |
WasmGC introduces new instructions: br_on_cast, br_on_cast_fail, br_on_non_null. These instructions branch to a specified label, passing the operand with a target type.
For example, in the following code,
(br_on_non_null $labelNotNull (local.get $x))
branches to the (outside of)block $labelNotNull
with the value(ref $obj)
on the stack if$x
is non-null.When we apply the
flatten
pass to the code contains those instructions, the pass emits an invalid code:It is invalid because, while the break type of
br_on_non_null
isref $obj
, the block type is notype(?).related: #6724
Should we lower
br_on_non_null
intoref.is_non_null
and something, or can we somehow keep the block type through flatten pass?The text was updated successfully, but these errors were encountered: