C++/C#: Reuse some SSA def/use info from previous iteration #3340

dbartol · 2020-04-23T17:31:38Z

Leaving as a draft until I have performance numbers from a CPP-Differences job

This PR supercedes #3235. The necessary changes were extensive enough that it was too risky for rc/1.24, so this PR is against master.

We build SSA twice. The first iteration, "unaliased SSA", considers only those memory locations that are not aliased, do not have their address escape, and are always accessed in their entirety and as their underlying declared type. The second iteration, "aliased SSA", considers all memory locations. However, whatever defs and uses we computed for unaliased SSA are still valid for aliased SSA, because they never overlap with the aliased memory locations that aliased SSA adds into the mix. If we can reuse the unaliased SSA information directly, we can potentially save significant cost in building aliased SSA.

With this change, the SSA configuration can specify which definitions can be reused from the previous iteration of the IR. Unaliased SSA does not reuse anything from Raw IR, but Aliased SSA reuses any definition of a variable that did not escape. In addition, Unaliased SSA tells Aliased SSA which variables were did not escape, so that Aliased SSA can skip all alias analysis for those variables.

During SSA construction, instead of throwing away all Phi instructions and def/use information from the previous IR iteration, we reuse whatever we were told that we could. This is slightly complicated by the possibility of degenerate (single-operand) Phi instructions due to unreachable code being eliminated between iterations. If we would have wound up with a degenerate Phi instruction, we recurse to the definition of that Phi instruction's sole reachable input operand. See the new test cases for a couple examples.

In aliased SSA's AliasConfiguration.qll, I stopped creating allocations for variables that were already modeled in unaliased SSA. This in turn prevents us from creating memory locations for those variables and their defs and uses, which is where we hope to reduce evaluation time.

I also tweaked the getInstructionUniqueId() predicate to reuse the unique ID from the previous stage, which preserves ordering of Phi instructions in a block to minimize test output diffs.

The points_to test had to be updated to no longer expect points-to analysis on unaliased SSA to report results that were already reported when running on raw IR.

Finally, I added PhiInstruction.getInputOperand(). I'm surprised we didn't have it already.

We build SSA twice. The first iteration, "unaliased SSA", considers only those memory locations that are not aliased, do not have their address escape, and are always accessed in their entirety and as their underlying declared type. The second iteration, "aliased SSA", considers all memory locations. However, whatever defs and uses we computed for unaliased SSA are still valid for aliased SSA, because they never overlap with the aliased memory locations that aliased SSA adds into the mix. If we can reuse the unaliased SSA information directly, we can potentially save significant cost in building aliased SSA. With this change, the SSA configuration can specify which definitions can be reused from the previous iteration of the IR. Unaliased SSA does not reuse anything from Raw IR, but Aliased SSA reuses any definition of a variable that did not escape. In addition, Unaliased SSA tells Aliased SSA which variables were did not escape, so that Aliased SSA can skip all alias analysis for those variables. During SSA construction, instead of throwing away all Phi instructions and def/use information from the previous IR iteration, we reuse whatever we were told that we could. This is slightly complicated by the possibility of degenerate (single-operand) Phi instructions due to unreachable code being eliminated between iterations. If we would have wound up with a degenerate Phi instruction, we recurse to the definition of that Phi instruction's sole reachable input operand. See the new test cases for a couple examples. In aliased SSA's AliasConfiguration.qll, I stopped creating allocations for variables that were already modeled in unaliased SSA. This in turn prevents us from creating memory locations for those variables and their defs and uses, which is where we hope to reduce evaluation time. I also tweaked the getInstructionUniqueId() predicate to reuse the unique ID from the previous stage, which preserves ordering of Phi instructions in a block to minimize test output diffs. The points_to test had to be updated to no longer expect points-to analysis on unaliased SSA to report results that were already reported when running on raw IR. Finally, I added PhiInstruction.getInputOperand(). I'm surprised we didn't have it already.

jbj

Has anything major changed compared to #3235? Or is this just a rebase to master with the review comments from me and @rdmarsh2 addressed?

It's great to see that nothing at all changes in the test output. I look forward to seeing some CPP-Differences results.

jbj · 2020-04-27T08:03:22Z

cpp/ql/src/semmle/code/cpp/ir/implementation/aliased_ssa/internal/AliasAnalysis.qll

-  exists(IREscapeAnalysisConfiguration config |
-    config.useSoundEscapeAnalysis() and resultEscapesNonReturn(allocation.getABaseInstruction())
-  )
+  resultEscapesNonReturn(allocation.getABaseInstruction())


I don't understand why this predicate changed and whether the QLDoc on it is still accurate.

jbj · 2020-04-27T08:39:43Z

cpp/ql/src/semmle/code/cpp/ir/implementation/unaliased_ssa/internal/SimpleSSA.qll

 /**
 * Holds if the specified variable should be modeled in SSA form. For unaliased SSA, we only model a
 * variable if its address never escapes and all reads and writes of that variable access the entire
 * variable using the original type of the variable.
 */
 private predicate isVariableModeled(Allocation var) {
-  not allocationEscapes(var) and
+  not treatAllocationAsEscaped(var) and


This change means (when unsound) we're now modelling some variables in unaliased SSA that were previously modelled in aliased SSA, right? Why?

jbj · 2020-04-27T08:59:35Z

cpp/ql/src/semmle/code/cpp/ir/implementation/unaliased_ssa/internal/SSAConstruction.qll

@@ -199,9 +258,25 @@ private module Cached {
    )
  }

+  /**
+   * Gets the new definition instruction for the operand of `instr` that flows from the block


Suggested change

* Gets the new definition instruction for the operand of `instr` that flows from the block

* Gets the new definition instruction for the phi operand of `instr` that flows from the block

This would improve readability for me.

dbartol added 3 commits April 23, 2020 12:39

C++/C#: Add PhiInstruction::getInputOperand()

bf09258

C++/C#: Add combineOverlap() predicate

5463ce9

dbartol added C# C++ labels Apr 23, 2020

dbartol requested review from jbj, rdmarsh2 and MathiasVP April 23, 2020 17:31

dbartol mentioned this pull request Apr 23, 2020

C++/C#: Reuse unaliased SSA when building aliased SSA #3235

Closed

jbj reviewed Apr 27, 2020

View reviewed changes

adityasharad changed the base branch from master to main August 14, 2020 18:34

dbartol requested review from jbj and removed request for rdmarsh2 and MathiasVP January 25, 2021 14:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

C++/C#: Reuse some SSA def/use info from previous iteration #3340

C++/C#: Reuse some SSA def/use info from previous iteration #3340

dbartol commented Apr 23, 2020

jbj left a comment

jbj Apr 27, 2020

jbj Apr 27, 2020

jbj Apr 27, 2020

	* Gets the new definition instruction for the operand of `instr` that flows from the block
	* Gets the new definition instruction for the phi operand of `instr` that flows from the block

C++/C#: Reuse some SSA def/use info from previous iteration #3340

Are you sure you want to change the base?

C++/C#: Reuse some SSA def/use info from previous iteration #3340

Conversation

dbartol commented Apr 23, 2020

jbj left a comment

Choose a reason for hiding this comment

jbj Apr 27, 2020

Choose a reason for hiding this comment

jbj Apr 27, 2020

Choose a reason for hiding this comment

jbj Apr 27, 2020

Choose a reason for hiding this comment