Skip to content

Conversation

@rmnattas
Copy link
Contributor

@rmnattas rmnattas commented Nov 27, 2024

First commit fixes creating invariant initialization trees with array address calculation to insert loading the dataAddrPtr

BEFORE
n690n     astore  <internal pointer temp slot 8>[#450  Auto] [flags 0x40007 0x0 ]             [0x7815e188c760] bci=[-1,5,35] rc=0 vc=0 vn=- li=-1 udi=- nc=1
n689n       aladd (internalPtr )                                                              [0x7815e188c710] bci=[-1,5,35] rc=1 vc=0 vn=- li=- udi=- nc=2 flg=0x8000
n688n         aload  <pinning array temp slot 6>[#448  Auto] [flags 0x10000007 0x0 ]          [0x7815e188c6c0] bci=[-1,5,35] rc=1 vc=0 vn=- li=- udi=- nc=0
n687n         lconst 4 (highWordZero X!=0 X>=0 )                                              [0x7815e188c670] bci=[-1,5,35] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x4104

AFTER
n692n     astore  <internal pointer temp slot 8>[#450  Auto] [flags 0x40007 0x0 ]             [0x70e90b08c800] bci=[-1,5,35] rc=0 vc=0 vn=- li=-1 udi=- nc=1
n691n       aladd (internalPtr )                                                              [0x70e90b08c7b0] bci=[-1,5,35] rc=1 vc=0 vn=- li=- udi=- nc=2 flg=0x8000
n690n         aloadi  <contiguousArrayDataAddrField>[#373  final Shadow +8] [flags 0x20604 0x0 ] (internalPtr )  [0x70e90b08c760] bci=[-1,5,35] rc=1 vc=0 vn=- li=- udi=->
n689n           aload  <pinning array temp slot 6>[#448  Auto] [flags 0x10000007 0x0 ]        [0x70e90b08c710] bci=[-1,5,35] rc=1 vc=0 vn=- li=- udi=- nc=0
n688n         lconst 4 (highWordZero X!=0 X>=0 )                                              [0x70e90b08c6c0] bci=[-1,5,35] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x4104 

Second commit disables remaining code that stores dataAddr pointers and calculations using it in temp variables (addressed in #7560).

It disables it in 2 areas:

  1. Disable auto reassociation of dataAddrPtrs
BEFORE
n348n     astore  <internal pointer temp slot 7>[#427  Auto] [flags 0x40007 0x0 ]             [0x7009f90a5c80] bci=[-1,3,11] rc=0 vc=0 vn=- li=-1 udi=- nc=1
n347n       aloadi  <contiguousArrayDataAddrField>[#373  final Shadow +8] [flags 0x20604 0x0 ] (internalPtr )  [0x7009f90a5c30] bci=[-1,3,11] rc=1 vc=0 vn=- li=- udi=- nc=1 flg=0x8020
n346n         aload  <pinning array temp slot 3>[#423  Auto] [flags 0x10000007 0x0 ]          [0x7009f90a5be0] bci=[-1,3,11] rc=1 vc=0 vn=- li=- udi=- nc=0

n31n      istore  <auto slot 0>[#417  Auto] [flags 0x3 0x0 ]                                  [0x7009f9004ff0] bci=[-1,20,12] rc=0 vc=606 vn=- li=6 udi=4 nc=1
n30n        iadd                                                                              [0x7009f9004fa0] bci=[-1,19,12] rc=1 vc=606 vn=- li=- udi=- nc=2
n29n          iloadi  <array-shadow>[#224  Shadow] [flags 0x80000603 0x0 ] (cannotOverflow )  [0x7009f9004f50] bci=[-1,18,12] rc=1 vc=606 vn=- li=- udi=- nc=1 flg=0x1000
n28n            aladd (X>=0 internalPtr )                                                     [0x7009f9004f00] bci=[-1,18,12] rc=1 vc=606 vn=- li=- udi=- nc=2 flg=0x8100
n345n             aload  <internal pointer temp slot 7>[#427  Auto] [flags 0x40007 0x0 ]      [0x7009f90a5b90] bci=[-1,18,12] rc=1 vc=0 vn=- li=-1 udi=- nc=0
n27n              lload  <temp slot 6>[#426  Auto] [flags 0x4 0x0 ] (highWordZero X>=0 cannotOverflow )  [0x7009f9004eb0] bci=[-1,18,12] rc=2 vc=606 vn=- li=- udi=- nc=0 flg=0x5100
n16n          iload  <auto slot 0>[#417  Auto] [flags 0x3 0x0 ] (cannotOverflow )             [0x7009f9004b40] bci=[-1,10,12] rc=1 vc=606 vn=- li=- udi=11 nc=0 flg=0x1000


AFTER
n31n      istore  <auto slot 0>[#417  Auto] [flags 0x3 0x0 ]                                  [0x7c11e5a04ff0] bci=[-1,20,12] rc=0 vc=663 vn=- li=12 udi=7 nc=1
n30n        iadd                                                                              [0x7c11e5a04fa0] bci=[-1,19,12] rc=1 vc=663 vn=- li=- udi=- nc=2
n29n          iloadi  <array-shadow>[#224  Shadow] [flags 0x80000603 0x0 ] (cannotOverflow )  [0x7c11e5a04f50] bci=[-1,18,12] rc=1 vc=663 vn=- li=- udi=- nc=1 flg=0x1000
n28n            aladd (X>=0 internalPtr )                                                     [0x7c11e5a04f00] bci=[-1,18,12] rc=1 vc=663 vn=- li=- udi=- nc=2 flg=0x8100
n24n              aloadi  <contiguousArrayDataAddrField>[#373  final Shadow +8] [flags 0x20604 0x0 ] (internalPtr )  [0x7c11e5a04dc0] bci=[-1,18,12] rc=1 vc=661 vn=- li=- udi=- nc=1 flg=0x8000
n17n                aload  <temp slot 3>[#423  Auto] [flags 0x7 0x0 ] (X!=0 X>=0 )            [0x7c11e5a04b90] bci=[-1,11,12] rc=1 vc=661 vn=- li=- udi=18 nc=0 flg=0x104
n27n              lload  <temp slot 10>[#430  Auto] [flags 0x4 0x0 ] (X>=0 cannotOverflow )   [0x7c11e5a04eb0] bci=[-1,18,12] rc=2 vc=663 vn=- li=- udi=- nc=0 flg=0x1100
n16n          iload  <auto slot 0>[#417  Auto] [flags 0x3 0x0 ] (cannotOverflow )             [0x7c11e5a04b40] bci=[-1,10,12] rc=1 vc=663 vn=- li=- udi=20 nc=0 flg=0x1000
  1. Disable hoisting invariant array address calculation that includes loading dataAddrPtr similar to
n691n     astore  <internal pointer temp slot 8>[#450  Auto] [flags 0x40007 0x0 ]             [0x7f2ea708c7b0] bci=[-1,5,35] rc=0 vc=0 vn=- li=-1 udi=- nc=1
n690n       aladd (internalPtr )                                                              [0x7f2ea708c760] bci=[-1,5,35] rc=1 vc=0 vn=- li=- udi=- nc=2 flg=0x8000
n689n         aloadi  <contiguousArrayDataAddrField>[#373  final Shadow +8] [flags 0x20604 0x0 ] (internalPtr )  [0x7f2ea708c710] bci=[-1,5,35] rc=1 vc=0 vn=- li=- udi=->
n688n           aload  <pinning array temp slot 6>[#448  Auto] [flags 0x10000007 0x0 ]        [0x7f2ea708c6c0] bci=[-1,5,35] rc=1 vc=0 vn=- li=- udi=- nc=0
n687n         lconst 4 (highWordZero X!=0 X>=0 )                                              [0x7f2ea708c670] bci=[-1,5,35] rc=1 vc=0 vn=- li=- udi=- nc=0 flg=0x4104

Keeping in note that hoisting loop induction variable related calculations is still enabled by hoisting the baseObj and calculation separately

n444n     astore  <temp slot 7>
n32n        aloadi  Loop.x [I[  // baseArray
n613n     lstore  <temp slot 11> // offset
n612n       ladd                                                                              [     0x3fcaeb8af00] bci=[-1,27,24] rc=1 vc=0 vn=- li=-1 udi=- nc=2
n610n         lmul                                                                            [     0x3fcaeb8ae60] bci=[-1,27,24] rc=1 vc=0 vn=- li=-1 udi=- nc=2
n608n           lload  <temp slot 10> // i
n609n           lconst 4 (highWordZero X!=0 X>=0 )                                            [     0x3fcaeb8ae10] bci=[-1,27,24] rc=1 vc=701 vn=- li=-1 udi=- nc=0 flg=0x4104
n611n         lconst 128 (highWordZero X!=0 X>=0 )                                            [     0x3fcaeb8aeb0] bci=[-1,27,24] rc=1 vc=0 vn=- li=-1 udi=- nc=0 flg=0x4104


n52n      BBStart <block_6> (freq 10000) (in loop 6)                                          [     0x3fcaeb80000] bci=[-1,29,24] rc=0 vc=703 vn=- li=- udi=- nc=0
n75n      istore  <auto slot 2>[#406  Auto] [flags 0x3 0x0 ]                                  [     0x3fcaeb80730] bci=[-1,48,25] rc=0 vc=703 vn=- li=8 udi=5 nc=1
n74n        iadd                                                                              [     0x3fcaeb806e0] bci=[-1,47,25] rc=1 vc=703 vn=- li=- udi=- nc=2
n73n          iloadi  <array-shadow>[#254  Shadow] [flags 0x80000603 0x0 ] (cannotOverflow )  [     0x3fcaeb80690] bci=[-1,46,25] rc=1 vc=703 vn=- li=- udi=- nc=1 flg=0x1000
n72n            aladd (X>=0 internalPtr )                                                     [     0x3fcaeb80640] bci=[-1,46,25] rc=1 vc=703 vn=- li=- udi=- nc=2 flg=0x8100
n68n              aloadi  <contiguousArrayDataAddrFieldSymbol>[#355  final Shadow +8] [flags 0x20604 0x0 ] (internalPtr )  [     0x3fcaeb80500] bci=[-1,46,25] rc=1 vc=703 vn=- li=- udi=- nc=1 flg=0x8000
n57n                aload  <temp slot 7>[#432  Auto] [flags 0x7 0x0 ] (X!=0 X>=0 )            [     0x3fcaeb80190] bci=[-1,38,25] rc=1 vc=703 vn=- li=- udi=12 nc=0 flg=0x104
n71n              lload  <temp slot 11>[#436  Auto] [flags 0x4 0x0 ] (highWordZero X>=0 cannotOverflow )  [     0x3fcaeb805f0] bci=[-1,46,25] rc=2 vc=703 vn=- li=- udi=- nc=0 flg=0x5100
n55n          iload  <auto slot 2>[#406  Auto] [flags 0x3 0x0 ] (cannotOverflow )             [     0x3fcaeb800f0] bci=[-1,36,25] rc=1 vc=703 vn=- li=- udi=14 nc=0 flg=0x1000

Signed-off-by: Abdulrahman Alattas <rmnattas@gmail.com>
@rmnattas
Copy link
Contributor Author

rmnattas commented Dec 2, 2024

PR Tested and ready for merge @zl-wang

@zl-wang
Copy link
Contributor

zl-wang commented Dec 2, 2024

@hzongaro please review/approve/merge ...

@rmnattas rmnattas marked this pull request as draft December 3, 2024 15:46
@rmnattas rmnattas marked this pull request as ready for review December 4, 2024 14:29
@rmnattas
Copy link
Contributor Author

rmnattas commented Dec 4, 2024

Earlier code changed auto reassociation to store base object instead of dataAddrPtr and load the dataAddrPtr of the object when using the temp, that caused issues one being symbols and temps were still marked as internal-pointers causing further opts to break. A better fix is to disable auto reassociation when its targeting dataAddrPtrs.

PR ready for merge.

@hzongaro hzongaro self-requested a review December 5, 2024 19:56
@hzongaro hzongaro self-assigned this Dec 5, 2024
Copy link
Contributor

@hzongaro hzongaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed a comment in TR_LoopStrider::reassociateAndHoistComputations that indicates trees like the following can be hoisted, but if I understand correctly, that's no longer true after these changes are applied.

      off-heap mode:
         aladd (internal pointer)
            contiguousArrayDataAddrFieldSymbol (dataAddrPointer, internal pointer)
               array_obj (pinning array pointer)
            mul/shift/integer offset

If I've understood that correctly, I would suggest adjusting the comment to indicate that that's no longer the case, at least temporarily.

May I also ask you to add a little more detail to the message for the commit that changes TR_LoopStrider::reassociateAndHoistComputations to describe the motivation for those changes?

Otherwise, I think the changes look correct.

Given issues when storing dataAddrPtr in temps and the dual purpose of
using that as the destination address (correct) and destination object
(incorrect) from some evaluators, the store of dataAddrPtr in temps
is disabled until its fixed (Issue: OMR eclipse-omr#7560)

Signed-off-by: Abdulrahman Alattas <rmnattas@gmail.com>
@rmnattas
Copy link
Contributor Author

rmnattas commented Dec 6, 2024

Update commit from feedback

Copy link
Contributor

@hzongaro hzongaro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Thanks!

@hzongaro
Copy link
Contributor

hzongaro commented Dec 9, 2024

Jenkins build all

@rmnattas
Copy link
Contributor Author

Windows and RISC failures are CI related

@hzongaro hzongaro merged commit ad265f2 into eclipse-omr:master Dec 10, 2024
11 of 13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants