0% found this document useful (0 votes)
14 views31 pages

Foundation of Data Flow Analysis

Uploaded by

kevin5027boy
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
14 views31 pages

Foundation of Data Flow Analysis

Uploaded by

kevin5027boy
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 31

Lecture 3

Foundation of Data Flow Analysis

I Semi-lattice (set of values, meet operator)


II Transfer functions
III Correctness, precision and convergence
IV Meaning of data flow solution

Reading: Chapter 9.3


Recapping Lecture 2: Data Flow Framework

Reaching Definitions Live Variables


Domain Sets of definitions Sets of variables
Direction forward: backward:
out[b] = fb(in[b]) in[b] = fb(out[b])
in[b] = Ù out[pred(b)] out[b] = Ù in[succ(b)]
Transfer function fb(x) = Genb È (x –Killb) fb(x) = Useb È (x -Defb)
Meet Operation (Ù) È È
Boundary Condition out[entry] = Æ in[exit] = Æ
Initial interior points out[b] = Æ in[b] = Æ

CS243: Foundation of Data Flow 2 M. Lam


Thought Problem 1. “Must-Reach” Definitions

• A definition D (a = b+c) must reach point P iff


– D appears at least once along on all paths leading to P
– a is not redefined along any path after last appearance of D
and before P
• How do we formulate the data flow algorithm for this problem?

MAY Reach MUST Reach


Domain Sets of definitions Sets of definitions
Direction forward: forward:
out[b] = fb(in[b]) out[b] = fb(in[b])
in[b] = Ù out[pred(b)] in[b] = Ù out[pred(b)]
Transfer function fb(x) = Genb È (x –Killb) fb(x) = Genb È (x –Killb)
Meet Operation (Ù) È ∩
Boundary Condition out[entry] = Æ out[entry] = Æ
Initial interior pts out[b] = Æ out[b] = U

CS243: Foundation of Data Flow 3 M. Lam


Problem 1: Example

d1: b = 1

• Quiz 1: Does the precise answer satisfy the equations?


• Quiz 2: Are there alternative solutions that satisfy the equations?
• Quiz 3: Are the alternative solutions incorrect?
• Quiz 4: How do you get the precise answer?
• Quiz 5: What is a wrong answer?
CS243: Foundation of Data Flow 4 M. Lam
Problem 2: A legal solution to (May) Reaching Def?

• Quiz 1: Is this a solution to the equation?


• Quiz 2: Will the worklist algorithm generate this answer?
• Quiz 3: Is this solution safe or incorrect?
CS243: Foundation of Data Flow 6 M. Lam
Summary
• There may be multiple (fixed-point) solutions to a set of equations
– Non-solutions are incorrect
– All solutions to the equations are considered safe, but maybe imprecise
– Initialization of the interior points gives different solutions
• There is a direction to the solutions
• Answer is precise only when the initialization of interior points is
– U for must-reach
– {} for may-reach

Precise Answer

Must Reach {} U Definitions


Safe Incorrect
Imprecise
Precise Answer

May Reach {} U Definitions


Incorrect Safe
Imprecise

CS243: Foundation of Data Flow 7 M. Lam


Problem 3. What are the algorithm properties?

• Correctness

• Precision: how good is the answer?

• Convergence: will the analysis terminate?

• Speed: how long does it take?

CS243: Foundation of Data Flow 8 M. Lam


I. Purpose of a Framework

• Purpose 1
– Prove properties of entire family of problems once and for all

• Will the program converge?


• What does the solution to the set of equations mean?

• Purpose 2:
– Aid in software engineering: re-use code

CS243: Foundation of Data Flow 9 M. Lam


The Data-Flow Framework
• Data-flow problems (F, V, Ù) are defined by
– A semi-lattice
• domain of values V
• meet operator Ù: V x V à V
– A family of transfer functions F: V à V

CS243: Foundation of Data Flow 10 M. Lam


Lecture Outline

I Semi-lattice (set of values, meet operator)


II Transfer functions
III Correctness, precision and convergence
IV Meaning of data flow solution

CS243: Foundation of Data Flow 11 M. Lam


Semi-lattice: Structure of the Domain of Values

• A semi-lattice S = <a set of values V, a meet operator Ù>

• Properties of the meet operator


– idempotent: x Ù x = x
– commutative: x Ù y = y Ù x
– associative: x Ù (y Ù z) = (x Ù y) Ù z

• Examples of meet operators ?


• Non-examples ?

CS243: Foundation of Data Flow 12 M. Lam


Example of a Semi-Lattice Diagram
• (V, Ù ) : V = {x | such that x Í {d1,d2,d3}}, Ù = U

{} (T)

{d1} {d2} {d3}

{d1,d2} {d1,d3} {d2,d3}

{d1,d2,d3} (^)

• x Ù y = first common descendant of x & y important


• A meet semi-lattice is bounded if there exists a top element T,
such that x Ù T = x for all x.
• A bottom element ^ exists, if x Ù ^ = ^ for all x.

CS243: Foundation of Data Flow 13 M. Lam


A Meet Operator Defines a Partial Order
y
• Partial order of a meet semi-lattice
(x Ù y = x) (x≤ y)

path
≡ ≡
≤ : x ≤ y if and only if x Ù y = x x

{} (T) (max)

• Meet operator: U
{d1} {d2} {d3}

Partial order ≤ :
{d1,d2} {d1,d3} {d2,d3}

{d1,d2,d3} (^) (min)


• Properties of meet operator guarantee that ≤ is a partial order
– Reflexive: x ≤ x
– Antisymmetric: if x ≤ y and y ≤ x then x = y
– Transitive: if x ≤ y and y ≤ z then x ≤ z

CS243: Foundation of Data Flow 14 M. Lam


Another Example

• Semi-lattice
– V = {x | such that x Í {d1, d2, d3}}
– Ù=∩
{d1,d2,d3} (T)

{d1,d2} {d1,d3} {d2,d3}

{d1} {d2} {d3}

{} (^)

– ≤ is

CS243: Foundation of Data Flow 15 M. Lam


Meet Semi-Lattices vs Partially Ordered Sets
• A meet-semilattice is a partially ordered set which
has a meet (or greatest lower bound) for any nonempty finite subset.

{} (T)

{d1} {d2} {d3}

{d1,d2} {d1,d3} {d2,d3}

{d1,d2,d3} (^)

• Greatest lower bound: x Ù y = First common descendant of x & y


• Largest: top element T, if x Ù T = x for all x.
• Smallest: bottom element ^, if x Ù ^ = ^ for all x.

CS243: Foundation of Data Flow 16 M. Lam


Drawing a Semi-Lattice Diagram
• (x < y) ≡ (x ≤ y) Ù (x ≠ y)

• A semi-lattice diagram:
– Set of nodes: set of values
– Set of edges {(y, x): x < y and ¬ $z s.t. (x < z) Ù (z < y)}

CS243: Foundation of Data Flow 17 M. Lam


Summary
Three ways to define a semi-lattice:
• Set of values + meet operator
– idempotent: x Ù x = x
– commutative: x Ù y = y Ù x
– associative: x Ù (y Ù z) = (x Ù y) Ù z

• Set of values
+ partial order with a greatest lower bound for any nonempty subset
– Reflexive: x ≤ x
– Antisymmetric: if x ≤ y and y ≤ x then x = y
– Transitive: if x ≤ y and y ≤ z then x ≤ z

• A semi-lattice diagram

CS243: Foundation of Data Flow 18 M. Lam


One Element at a Time

• A semi-lattice for data flow problems can get quite large:


2n elements for n var/definition
• A useful technique:
– define semi-lattice for 1 element
– product of semi-lattices for all elements
– <x1, x2> ≤ <y1, y2> iff x1 ≤ y1 and x2 ≤ y2
• Example: Union of definitions
– For each element
def1 def2 def1 x def2

{} {} {},{}

{d1} {d2}
{d1},{} {},{d2}

{d1},{d2}

CS243: Foundation of Data Flow 19 M. Lam


Descending Chain
• Definition
– The height of a lattice is the largest number of >
relations that will fit in a descending chain.
x 0 > x1 > …

• Height of values in reaching definitions?

• Important property: finite descending chains

CS243: Foundation of Data Flow 20 M. Lam


II. Transfer Functions

• A family of transfer functions F


• Basic Properties f : V à V

– Has an identity function


• $f such that f(x) = x, for all x.

– Closed under composition


• if f1,f2Î F, f1•f2Î F

CS243: Foundation of Data Flow 21 M. Lam


Monotonicity: 2 Equivalent Definitions
• A framework (F, V, Ù) is monotone iff
– x ≤ y implies f(x) ≤ f(y)

• Equivalently,
a framework (F, V, Ù) is monotone iff
– f(x Ù y) ≤ f(x) Ù f(y),
– meet inputs, then apply f

apply f individually to inputs, then meet results

CS243: Foundation of Data Flow 22 M. Lam


Example

• Reaching definitions: f(x) = Gen U (x - Kill), Ù = U


– Definition 1:
• Let x1 ≤ x2,
f(x1): Gen U (x1 - Kill)
f(x2): Gen U (x2 - Kill)

– Definition 2:
• f(x1 Ù x2) = (Gen U ((x1 U x2) - Kill))
f(x1) Ù f(x2) = (Gen U (x1 - Kill) ) U (Gen U (x2 - Kill) )

CS243: Foundation of Data Flow 23 M. Lam


Distributivity
• A framework (F, V, Ù) is distributive if and only if
f(x Ù y)= f(x) Ù f(y),

meet input, then apply f is equal to


apply the transfer function individually then merge result

CS243: Foundation of Data Flow 24 M. Lam


Important Note

• Monotone framework does not mean that f(x) ≤ x


– e.g. Reaching definition for two definitions in program
– suppose: f: Gen = {d1} ; Kill = {d2}
– Quiz: What are the inputs and outputs of f?

CS243: Foundation of Data Flow 25 M. Lam


III. Properties of Iterative Algorithm

• Given
A monotone data flow framework
With finite descending chains

• The iterative algorithm where all interior points are initialized to T


– Converges
– To the Maximum Fixed Point (MFP) solution of equations

CS243: Foundation of Data Flow 26 M. Lam


Key Concept

• The answer is a set of values for all basic block boundaries:


{ in[b], out[b] | b in the program}

• We need to prove the invariant:


– Values assigned to the same in[b] or out[b]
cannot increase in each iteration of the algorithm

• The algorithm converges


if the semilattice has finite descending chains

• Given an initialization of T,
the answer is the MFP (Maximum Fixed Point),
because any larger value is not a solution.

CS243: Foundation of Data Flow 27 M. Lam


Sketch of Inductive Proof
The boundary is set to the correct answer and never changed

For each IN/OUT of an interior program point:


• Invariant: new value ≤ old value in any step
• Initialize interior points with T (largest value)
• Proof by induction
– Base case:
1st transfer function or meet operator: new value ≤ old value (T)

– Meet operation:
• Assume new inputs ≤ old inputs, new output ≤ old output
– Transfer function (in a monotone framework):
• Assume new inputs ≤ old inputs, new output ≤ old output

CS243: Foundation of Data Flow 28 M. Lam


IV. What Does the Solution Mean?
• IDEAL data flow solution
– Let f1, ..., fm : Î F, fi is the transfer function for node i

fp = fnk•… • fn1, p is a path through nodes n1, ..., nk

fp = identify function, if p is an empty path

– For each node n: Ùfpi (boundary value),


for all possibly executed paths pi reaching n
– Example
if sqr(y) >= 0

false true
x=0 x=1

• Determining all possibly executed paths is undecidable

CS243: Foundation of Data Flow 29 M. Lam


Meet-Over-Paths MOP
• Err in the conservative direction

• Meet-Over-Paths MOP
– Assume every edge is traversed
– For each node n:

– MOP(n) = Ùfpi (boundary value), for all paths pi reaching n

• Compare MOP with IDEAL


– MOP includes more paths than IDEAL
– MOP = IDEAL Ù Result(Unexecuted-Paths)
– MOP ≤ IDEAL
– MOP is a “smaller” solution, more conservative, safe

• MOP ≤ IDEAL
– Goal: as close to MOP from below as possible

CS243: Foundation of Data Flow 30 M. Lam


Solving Data Flow Equations
• What is the difference between MOP and MFP of data flow
equations?

F1 F2

MOP: f3 ( f1 (x)) Ù f2 ( f1 (x))


F3
Worklist algorithm: f3 ( f1(x) Ù f2 (x))

• Therefore
– FP ≤ MFP ≤ MOP ≤ IDEAL
– FP, MFP, MOP are safe
– If framework is distributive, FP ≤ MFP = MOP ≤ IDEAL

CS243: Foundation of Data Flow 31 M. Lam


Summary
• A data flow framework
– Semi-lattice
• set of values (top)
• meet operator
• finite descending chains?
– Transfer functions
• summarizes each basic block
• boundary conditions
• Properties of data flow framework:
– Monotone framework and finite descending chains

⇒ iterative algorithm converges


⇒ finds maximum fixed point (MFP)
⇒ FP ≤ MFP ≤ MOP ≤ IDEAL

– Distributive framework
⇒ FP ≤ MFP = MOP ≤ IDEAL

CS243: Foundation of Data Flow 32 M. Lam

You might also like