Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C++/C#: Reuse some SSA def/use info from previous iteration #3340

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

dbartol
Copy link
Contributor

@dbartol dbartol commented Apr 23, 2020

Leaving as a draft until I have performance numbers from a CPP-Differences job

This PR supercedes #3235. The necessary changes were extensive enough that it was too risky for rc/1.24, so this PR is against master.

We build SSA twice. The first iteration, "unaliased SSA", considers only those memory locations that are not aliased, do not have their address escape, and are always accessed in their entirety and as their underlying declared type. The second iteration, "aliased SSA", considers all memory locations. However, whatever defs and uses we computed for unaliased SSA are still valid for aliased SSA, because they never overlap with the aliased memory locations that aliased SSA adds into the mix. If we can reuse the unaliased SSA information directly, we can potentially save significant cost in building aliased SSA.

With this change, the SSA configuration can specify which definitions can be reused from the previous iteration of the IR. Unaliased SSA does not reuse anything from Raw IR, but Aliased SSA reuses any definition of a variable that did not escape. In addition, Unaliased SSA tells Aliased SSA which variables were did not escape, so that Aliased SSA can skip all alias analysis for those variables.

During SSA construction, instead of throwing away all Phi instructions and def/use information from the previous IR iteration, we reuse whatever we were told that we could. This is slightly complicated by the possibility of degenerate (single-operand) Phi instructions due to unreachable code being eliminated between iterations. If we would have wound up with a degenerate Phi instruction, we recurse to the definition of that Phi instruction's sole reachable input operand. See the new test cases for a couple examples.

In aliased SSA's AliasConfiguration.qll, I stopped creating allocations for variables that were already modeled in unaliased SSA. This in turn prevents us from creating memory locations for those variables and their defs and uses, which is where we hope to reduce evaluation time.

I also tweaked the getInstructionUniqueId() predicate to reuse the unique ID from the previous stage, which preserves ordering of Phi instructions in a block to minimize test output diffs.

The points_to test had to be updated to no longer expect points-to analysis on unaliased SSA to report results that were already reported when running on raw IR.

Finally, I added PhiInstruction.getInputOperand(). I'm surprised we didn't have it already.

We build SSA twice. The first iteration, "unaliased SSA", considers only those memory locations that are not aliased, do not have their address escape, and are always accessed in their entirety and as their underlying declared type. The second iteration, "aliased SSA", considers all memory locations. However, whatever defs and uses we computed for unaliased SSA are still valid for aliased SSA, because they never overlap with the aliased memory locations that aliased SSA adds into the mix. If we can reuse the unaliased SSA information directly, we can potentially save significant cost in building aliased SSA.

With this change, the SSA configuration can specify which definitions can be reused from the previous iteration of the IR. Unaliased SSA does not reuse anything from Raw IR, but Aliased SSA reuses any definition of a variable that did not escape. In addition, Unaliased SSA tells Aliased SSA which variables were did not escape, so that Aliased SSA can skip all alias analysis for those variables.

During SSA construction, instead of throwing away all Phi instructions and def/use information from the previous IR iteration, we reuse whatever we were told that we could. This is slightly complicated by the possibility of degenerate (single-operand) Phi instructions due to unreachable code being eliminated between iterations. If we would have wound up with a degenerate Phi instruction, we recurse to the definition of that Phi instruction's sole reachable input operand. See the new test cases for a couple examples.

In aliased SSA's AliasConfiguration.qll, I stopped creating allocations for variables that were already modeled in unaliased SSA. This in turn prevents us from creating memory locations for those variables and their defs and uses, which is where we hope to reduce evaluation time.

I also tweaked the getInstructionUniqueId() predicate to reuse the unique ID from the previous stage, which preserves ordering of Phi instructions in a block to minimize test output diffs.

The points_to test had to be updated to no longer expect points-to analysis on unaliased SSA to report results that were already reported when running on raw IR.

Finally, I added PhiInstruction.getInputOperand(). I'm surprised we didn't have it already.
Copy link
Contributor

@jbj jbj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has anything major changed compared to #3235? Or is this just a rebase to master with the review comments from me and @rdmarsh2 addressed?

It's great to see that nothing at all changes in the test output. I look forward to seeing some CPP-Differences results.

exists(IREscapeAnalysisConfiguration config |
config.useSoundEscapeAnalysis() and resultEscapesNonReturn(allocation.getABaseInstruction())
)
resultEscapesNonReturn(allocation.getABaseInstruction())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why this predicate changed and whether the QLDoc on it is still accurate.

/**
* Holds if the specified variable should be modeled in SSA form. For unaliased SSA, we only model a
* variable if its address never escapes and all reads and writes of that variable access the entire
* variable using the original type of the variable.
*/
private predicate isVariableModeled(Allocation var) {
not allocationEscapes(var) and
not treatAllocationAsEscaped(var) and
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change means (when unsound) we're now modelling some variables in unaliased SSA that were previously modelled in aliased SSA, right? Why?

@@ -199,9 +258,25 @@ private module Cached {
)
}

/**
* Gets the new definition instruction for the operand of `instr` that flows from the block
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Gets the new definition instruction for the operand of `instr` that flows from the block
* Gets the new definition instruction for the phi operand of `instr` that flows from the block

This would improve readability for me.

@adityasharad adityasharad changed the base branch from master to main August 14, 2020 18:34
@dbartol dbartol requested review from jbj and removed request for rdmarsh2 and MathiasVP January 25, 2021 14:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants