0% found this document useful (0 votes)
10 views7 pages

Compiler Design

The document discusses Data Flow Analysis (DFA) in compiler design, focusing on how programs manipulate data and the importance of understanding program behavior. It outlines various types of analysis, such as reaching definitions and iterative data flow analysis, and introduces concepts like lattices and constant propagation. The goal is to create a conservative yet aggressive analysis to accurately represent program definitions and their reach within the code structure.

Uploaded by

Bijay Nag
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
10 views7 pages

Compiler Design

The document discusses Data Flow Analysis (DFA) in compiler design, focusing on how programs manipulate data and the importance of understanding program behavior. It outlines various types of analysis, such as reaching definitions and iterative data flow analysis, and introduces concepts like lattices and constant propagation. The goal is to create a conservative yet aggressive analysis to accurately represent program definitions and their reach within the code structure.

Uploaded by

Bijay Nag
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 7

Data Flow Analysis

Why:
CS591-5 - Selected topics in Compiler Design Provide information about a program manipulates its data.
Data Flow Analysis
Study functions behavior.
To help build control flow information.
V. Krishna Nandivada Program understanding (a function sorts an array!).
Generating a model of the original program and verify the model.
IIT Madras
The DFA should give information about that program that does not
misrepresent what the procedure being analyzed does.
Program validation.

V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 2/1

Reaching Definitions Different types of analysis

A particular definition of a variable is said to reach a given point if


there is an execution path from the definition to that point
the variable might may have the value assigned by the definition.
In general undecidable. Intra procedural analysis.
Whole program (inter-procedural) analysis.
Our goal:
Generate intra procedural analysis and extend it to whole
The analysis must be conservative – the analysis should not tell program.
us that a particular definition does not reach a particular use, if it
may reach. We will study an iterative mechanism to perform such analyses.
A ‘may’ conservative analysis gives us a larger set of reaching
definitions than it might, if it could produce the minimal result.
To make maximum benefit from our analysis, we want the analysis to
be conservative, but be as aggressive as possible.
* *

V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 3/1 V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 4/1
Iterative Dataflow Analysis Example program

1 int g(int m, inti);


Build a collection of data flow equations – specifying which data
2 int f(int n)
may flow to which variable. 3 {
Solve it iteratively. 4 int i = 0, j = 1;
Start from a conservative set of initial values – and continuously 5 if (n == 1) i = 2;
6 while (n > 0) {
improve the precision.
7 j = i + 1;
Disadvantage: We may be handling large data sets. 8 n = g (n, i);
Start from an aggressive set of initial values – and continuously 9 }
improve the precision. 10 return j;
Advantage: Datasets are small to start with. 11 }

Choice – depends on the problem at hand.


Does def of i in line 4 reach the uses in line 7 and 8?
Does def of j in line 7 reach the use in line 10?
* *

V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 5/1 V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 6/1

Definitions Representation and Initialization

Bit Pos Definition Basic Block


1 m in node 1 B1
GEN : GEN(b) returns the set of definitions generated in the basic 2 f0 in node 2
block b; assigned values in the block and not subsequently killed 3 f1 in node 3
in it. 4 i in node 5 B3
KILL : KILL(b) returns the set of definitions killed in the basic block 5 f2 in node 8 B6
b. 6 f0 in node 9
7 f1 in node 10
IN : IN(b) returns the set of definitions reaching the basic block b. 8 i in node 11
OUT : OUT(b) returns the set of definitions going out of basic Set rep Bit vector
block b. GEN(B1) = {1, 2, 3} h11100000i
PRSV : Negation of KILL GEN(B3) = {4} h00010000i
GEN(B6) = {5, 6, 7, 8} h00001111i
GEN(.) = {} h00000000i

* *

V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 7/1 V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 8/1
Populating PRSV, OUT and IN Dataflow equations
A definition may reach the end of a basic block i:

OUT(i) = GEN(i) ∪ (IN(i) ∩ PRSV(i))

or with bit vectors:


Set rep Bit vector
PRSV(B1) = {4, 5, 8} h00011001i OUT(i) = GEN(i) ∨ (IN(i) ∧ PRSV(i))
PRSV(B3) = {1, 2, 3, 5, 6, 7} h11101110i
A definition may reach the beginning of a basicblock i:
PRSV(B6) = {1} h10000000i
PRSV(.) = {1, 2, 3, 4, 5, 6, 7, 8} h11111111i
[
IN(i) = OUT(j)
j∈Pred(i)

GEN, PRSV and OUT are created in each basic block.


OUT(i) = {} // initialization
But IN needs to be initialized to something safe.
* IN(entry) = {} *

V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 9/1 V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 10 / 1

Solving the Dataflow equations: example Dataflow equations: behavior

Itr 1:
OUT(entry) = h00000000i IN(entry) = h00000000i
OUT(B1) = h11100000i IN(B1) = h00000000i
We specify the relationship between the data-flow values before
OUT(B2) = h11100000i IN(B2) = h11100000i
OUT(B3) = h11110000i IN(B3) = h11100000i
and after a block – transfer or flow equations.
OUT(B4) = h11110000i IN(B4) = h11110000i Forward: OUT(s) = f (IN(s), · · · )
OUT(B5) = h11110000i IN(B5) = h11110000i Backward: IN(s) = f (OUT(s), · · · )
OUT(B6) = h00001111i IN(B6) = h11110000i The rules never change a 1 to 0. They may only change a 0 to a 1.
OUT(entry) = h11110000i IN(exit) = h11110000i
Itr 2:
They are monotone.
OUT(entry) = h00000000i IN(entry) = h00000000i Implication – the iteration process will terminate.
OUT(B1) = h11100000i IN(B1) = h00000000i
OUT(B2) = h11100000i IN(B2) = h11100000i Q: What good is reaching definitions? undefined variables.
OUT(B3) = h11110000i IN(B3) = h11100000i Q: Why do the iterations produce an acceptable solution to the set
OUT(B4) = h11111111i IN(B4) = h11111111i
OUT(B5) = h11111111i IN(B5) = h11111111i
of equations? – lattices and fixed points.
OUT(B6) = h10001111i IN(B6) = h11111111i
OUT(entry) = h11111111i IN(exit) = h11111111i

* *

V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 11 / 1 V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 12 / 1
Lattice Lattice definition

A lattice L consists of a set of values, and two operations called meet


(t) and join (u). Satisfies properties:
What : Lattice is an algebraic structure closure: For all x, y ∈ L, ∃ a unique z and w ∈ L, such that x u y = z
Why : To represent abstract properties of variables, expressions, and x t y = w – each pair of elements have a unique lub and glb.
functions, etc etc. commutative: For all x, y ∈ L, x u y = y u x, and x t y = y t x.
Values
associative: For all x, y, z ∈ L, (x u y) u z = x u (y u z), and
Attributes
... (x t y) t z = x t (y t z)
Why “abstract? Exact interpretation (execution) gives exact There exists two special elements of L called bottom (⊥), and top
values, abstract interpretation gives abstract values. (>).
∀x ∈ L, x u ⊥ = ⊥ and x t > = >.
distributive : (optional). ∀x, y, z ∈ L, x t (y u z) = (x t y) u (x t z), and
x u (y t z) = (x u y) t (x u z)

* *

V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 13 / 1 V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 14 / 1

Lattice properties Monotones and fixed point

Meet (and join) induce a partial order (v):


∀x, y ∈ L, x v y, iff x u y = x. A function f : L → L, is a monotone, if for all x, y ∈ L,
x v y ⇒ f (x) v f (y).
Transitive, antisymmetry and reflexive.
Example: bit-vector lattice:
Example Lattices: f (x1 x2 x3 ) = hx1 1x2 i
f (x1 x2 x3 ) = hx2 x3 x1 i
A flow function models the effect of a programming language
construct. as a mapping from the lattice for that particular analysis
to itself.
We want the flow functions to be monotones. Why?
A fixed point of a function f : L → L is an element z ∈ L, such that
f (z) = z.
For a set of data-flow equations, a fixed-point is a solution of the
set of equations – cannot generate any further refinement.
* *

V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 15 / 1 V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 16 / 1
Meet Over All Paths solutions A worklist based implementation (a forward analysis)

The value we wish to compute in solving data-flow equations is –


meet over all paths (MOP) solution.
Start with some prescribed information at the entry (or exit
depending on forward or backward).
Repeatedly apply the composition of the appropriate flow
functions.
For each node form the meet of the results.

* *

V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 17 / 1 V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 18 / 1

Example: Constant Propagation Constant Propagation lattice meet rules

Goal: Discover values that are constants on all possible executions of


a program and to propagate these constant values as far forward
through the program as possible ⊥ = Constant value cannot be guaranteed.
Conservative: Can discover only a subset of all the possible > = May be a constant, not yet determined.
constants. ∀x
Lattice:
xu> = x
xu⊥ = ⊥
c1 u c1 = c1
c2 u c1 = ⊥

* *

V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 19 / 1 V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 20 / 1
Simple constant propagation Kildall’s algorithm

Gary A. Kildall: A Unified Approach to Global Program Start with an entry node in the program graph.
Optimization - POPL 1973.
Process the entry node, and produce the constant propagation
Reif, Lewis: Symbolic evaluation and the global value graph - information. Send it to all the immediate successors of the entry
POPL 1977. node.
Simple constant Constants that can be proved to be constant At a merge point, get an intersection of the information.
provided,
no information is assumed about which direction branches will take.
If at any successor node, if for any variable the value is “reduced,
Only one value of each variable is maintained along each path in the process the successor, similar to the processing done for entry
the program. node.

* *

V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 21 / 1 V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 22 / 1

Constant propagation - equations Constant propagation: equations (contd)

Start with the entry node.


If s is not an assignment statement, then fs is simply the identity
function.
Let us assume that one basic block per statement.
If s is an assignment statement to variable v, then fs (m) = m0 ,
Transfer functions set F - a set of transfer functions. where:
fs ∈ F is the transfer function for statement s. For all v0 6= v, m0 (v0 ) = m(v0 ).
The dataflow values are given by a map: m: Vars → ConstantVal If the RHS of the statement is a constant c, then m0 (v) = c.
If the RHS is an expression (say y op z),
If m is the set of input dataflow values, then m0 = fs (m) gives the

output dataflow values.  m(y) op m(z) if m(y) and m(z) are constant values
Generate equations like before. m0 (v) = ⊥ if either of m(y) and m(z) is ⊥
> Otherwise

If the RHS is an expression that cannot be evaluated, then


m0 (v) = ⊥.
At a merge point, get a meet of the flow maps.
* *

V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 23 / 1 V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 24 / 1
Constant Propagation - example I Constant Propagation - example II

x = 10;
y = 1; x = 10;
z = 5; y = 1;
if (cond) { z = 1;
y = y / x; while (x > 1) {
x = x - 1; y = x * y;
z = z + 1; x = x - 1;
} else { z = z * z;
z = z + y; }
y = 0; A[x] = y + z;
}
print x + y + z;

* *

V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 25 / 1 V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 26 / 1

constant propagation: Non distributive

Say f1 , f2 , and f3 represent the


transfer functions of B1 , B2 and B3 ,
respectively.
f3 (f1 (m0 ) ∧ f2 (m0 )) v
f3 (f1 (m0 )) ∧ f3 (f2 (m0 ))

V.Krishna Nandivada (IIT Madras) CS591-5 - May 2018 27 / 1

You might also like