   4. Live Variables Analysis (2.1.4)

     a. idea and goals

------------------------------------------
            LIVE VARIABLES

def (p. 47): A variable x is *live*
  at exit from label l
  iff there is a path from l
      to a possible use of x (in an expression)
      along which x is not redefined.

This assumes that no variables are live
at the end of the program.

Which variables are live at exit from 1?

(A) [x := 3]^1;
    if [z > 0]^2
    then [y := x+2]^3
    else [q := q+1]^4

(B) [x := 3]^1; [y := x+2]^2; [y := y+1]^3

(C) [x := 3]^1; [z := 4]^2; [x := z+2]^3;
    while [z > 0]^4
    do ([y := x+2]^5; [z := z-1]^6)
------------------------------------------

------------------------------------------
       LIVE VARIABLES (LV) ANALYSIS

Analysis question (p. 47):

   "For each program point,
    which variables may be live
    at the exit from the point."

Example:

   [x := 3]^1; [z := 4]^2; [x := z+2]^3
------------------------------------------
        Which variables are live at exit of label 1? Label 2?
        What can we use this for?
        What is the bad outcome to avoid?
        What kinds of analysis information should we track in the analysis?
        Should this analysis go backwards or forwards?

     b. definitions and formalization

------------------------------------------
           FORMAL DEFINITION

LVexit(l) = if l \in final(S*)
            then {}
            else

LVentry(l) =

killLV: Blocks* -> Powerset(Var*)
killLV([x := a]^l) =
killLV([skip]^l) = {}
killLV([b]^l) = {}

genLV: Blocks* -> Powerset(Var*)
genLV([x := a]^l) =
genLV([skip]^l) = {}
genLV([b]^l) =
------------------------------------------
        Does this analysis need isolated exits? If so, why?
        Why is there a union for LVexit?
------------------------------------------
               EXAMPLE

[x := 3]^1; [z := 4]^2; [x := z+2]^3

  l    killLV(l)    genLV(l)
 ==========================
  1
  2
  3

 LVentry(1) =
 LVexit(1)  =
 LVentry(2) =
 LVexit(2)  =
 LVentry(3) =
 LVexit(3)  =
------------------------------------------

------------------------------------------
            THE FUNCTIONAL

F1(LV1,...,LV6)   // entry(1)
   = (LV2 - killLV([x := 3]^1)) \cup genLV([x := 3]^1)
   = (LV2 - {x}) \cup {}
   = LV2 - {x}
F2(LV1,...,LV6)   // exit(1)
   = LV3
F3(LV1,...,LV6)   // entry(2)
   = (LV4 - killLV([z := 4]^2)) \cup genLV([z := 4]^2)
   = (LV4 - {z}) \cup {}
   = LV4 - {z}
F4(LV1,...,LV6)   // exit(2)
   = LV5
F5(LV1,...,LV6)   // entry(3)
   = (LV6 - killLV([x := z+2]^3)) \cup genLV([x := z+2]^3)
   = (LV6 - {x}) \cup {z}
F6(LV1,...,LV6)   // exit(3)
   = {}
------------------------------------------
        What makes the analysis information unsafe?
        What makes the analysis imprecise?
        Do we want the largest or the smallest solution?
        So, what would be the initial values for LV1,...,LV6?

   5. Derived Data Flow Information (2.1.5)

------------------------------------------
      LINKING DEFINITIONS AND USES

Use-definition (ud) chain:
   links a use of a variable (in an expression)
   to the assignments that may have last defined it

Definition-use (du) chain:
   links an assignment of a variable
   to the uses (in expressions) that it may reach
------------------------------------------
        What might this be useful for?

     a. formal definitions

------------------------------------------
         DEFINITIONS AND USES

def (p. 50): (l1, ..., ln) is a
   *definition clear path for x* iff
   1. no block labeled l1, ..., l(n-1)
      assigns a value to x, and
   2. the block labeled ln uses x (as an expression)

clear(x, l, l') =
   (\exists l1, ..., ln ::
        l = l1 & l' = ln & n > 0
      & (\forall i : 1 <= i < n : (li, l(i+1)) \in flow(S*))
      & (\forall i : 1 <= i < n : not(def(x, li)))
      & use(x, ln))

def(x, l) = (\exists B : [B]^l \in blocks(S*) : x \in killLV([B]^l))

use(x, l) = (\exists B : [B]^l \in blocks(S*) : x \in genLV([B]^l))
------------------------------------------
        How do you interpret the notion of a definition clear path for x?
        Why are the def and use functions correct?
        Does clear(y, 3, 7) tell you anything about the use of y?

------------------------------------------
          UD AND DU ANALYSIS

ud: Var* x Lab* -> Powerset(Lab*?)
ud(x, l') = {l | def(x, l),
                 (\exists l2 : (l, l2) \in flow(S*) : clear(x, l2, l'))}
            \cup {? | clear(x, init(S*), l')}

du: Var* x Lab*? -> Powerset(Lab*)
du(x, l) = if l != ?
           then {l' | def(x, l),
                      (\exists l2 : (l, l2) \in flow(S*) : clear(x, l2, l'))}
           else {l' | clear(x, init(S*), l')}
------------------------------------------
        What does ud(x, l') = {l1, l2, l3} mean?
        What does du(x, l) = {l1, l2, l3} mean?
        Can ud(x,l') be empty?
        What would it mean if du(x,l) is empty?
        What is the analysis domain?
        Do these require isolated entries?
        Are these must or may analyses?
        So what would a bad outcome be for a UD or DU analysis?
        Would we want the largest or smallest solution for a UD or DU analysis?
        Can we define du in terms of ud?

     b. example

------------------------------------------
               EXAMPLE

[z := 3]^1;
if [y > 0]^2
then [y := z+2]^3
else [y := y+1]^4

ud(x, l)

  l \ x |   y      z
 ========================
    1
    2
    3
    4

du(x, l)

  l \ x |   y      z
 ========================
    1
    2
    3
    4
------------------------------------------

     c. computation

        How could we use RD and LV to compute ud chains?

II. Monotone Frameworks (2.3)

        Can we identify some commonalities between different analyses?
        Would doing that help implement them?

 A. general pattern

------------------------------------------
           GENERAL PATTERN

A_o(l) = if l \in E
         then i
         else

A_.(l) =

where
   \bigsqcup is either \bigcup or \bigcap
   F is either flow(S*) or flow^R(S*)
   E is {init(S*)} or final(S*)
   i is initial/final information
   f(l) is the transfer function
        for blocks B^l \in blocks(S*)

For a may analysis:  \bigsqcup is
For a must analysis: \bigsqcup is

For a forward analysis:
   F is flow(S*)
   E is
   A_o gives the entry conditions
   A_. gives the exit conditions

For a backward analysis:
   F is flow^R(S*)
   E is
   A_o gives the exit conditions
   A_. gives the entry conditions
------------------------------------------

 B. basic definitions (2.3.1)

   1. property space

------------------------------------------
           PROPERTY SPACES

def (p. 65): a *property space*,
   L = (L, \bigsqcup), is a set L with a join operation
       \bigsqcup: Powerset(L) -> L
   (binary form: \sqcup) that makes it a complete lattice.

Thus:
   l1 \sqcup l2 = \bigsqcup { l1, l2 }
   \bot = \bigsqcup {}
   l1 \sqsubseteq l2 = (l1 \sqcup l2 = l2)

Examples:

For reaching definitions:
   L = Powerset(Var* x Lab^?_*)
   \sqcup = \cup
   \sqsubseteq = \subseteq

For available expressions:
   L = Powerset(AExp*)
   \sqcup = \cap
   \sqsubseteq = \supseteq
------------------------------------------

     a. lattice theory

       i. partial orders, lub, joins, lattices

------------------------------------------
 POSETS, LUBS, AND LATTICES (APPENDIX A)

def: A *partially ordered set (poset)* is
   a pair (L, <=) of a set, L, together with
   a binary relation, <=, on L
   such that for all x,y,z in L:
     (reflexive) x <= x,
     (antisymmetric) x <= y and y <= x implies x = y,
     (transitive) x <= y and y <= z implies x <= z.

def: for all x,y,u in L,
   u is an *upper bound of x and y*
   iff x <= u and y <= u.

def: for all x,y,u in L,
   u is the *least upper bound of x and y*,
   written lub(x,y) = u,
   iff u is an upper bound of x and y
   and for all z such that z is an upper bound
   of x and y, u <= z.

def: For all x, y in L,
   the *join of x and y* is
      x \sqcup y = lub(x,y).
   This is also written as sup(x,y).
------------------------------------------
        Does the lub of two elements x,y in L need to be in the set L?

------------------------------------------
 LATTICES, COMPLETE LATTICES (2.3.1, A.1-2)

def: A *lattice* is a poset in which
   each pair of elements has
   a least upper bound.

def: A *complete lattice* is a lattice L
   such that every subset of L
   (including the empty set)
   has a least upper bound (in L).
------------------------------------------
        So, what is a property space?

       ii. meets, glbs

------------------------------------------
 GLBS and COMPLETE LATTICES (APPENDIX A)

Assume (L, <=) is a poset.

def: for all x,y,n in L,
   n is a *lower bound of x and y*
   iff n <= x and n <= y.

def: for all x,y,n in L,
   n is the *greatest lower bound of x and y*,
   written glb(x,y) = n,
   iff n is a lower bound of x and y
   and for all z such that z is a lower bound
   of x and y, z <= n.

def: For all x, y in L,
   the *meet of x and y* is
      x \sqcap y = glb(x,y).
   This is also written as inf(x,y).

Thm: If (L, \sqsubseteq) is a lattice, then
   the following hold for all x, y, z in L:
     (commutative) x \sqcup y = y \sqcup x,
     (associative) x \sqcup (y \sqcup z)
                      = (x \sqcup y) \sqcup z,
     (absorption)  x \sqcup (y \sqcap x)
                      = (x \sqcup y) \sqcap x
                      = x
------------------------------------------

     b. chains and the ascending chain condition

------------------------------------------
 CHAINS, ASCENDING CHAIN CONDITION (A.3)

def: A subset Y of a poset (L,\sqsubseteq)
   is a *chain* iff for all l_1, l_2 in Y,
      l_1 \sqsubseteq l_2 or l_2 \sqsubseteq l_1

def: An *ascending chain* in L is a sequence
   (l_n)_{n \in N} such that each l_i is in L
   and for all n, m \in N,
      n <= m ==> l_n \sqsubseteq l_m.

def: a sequence (l_n)_{n \in N} *stabilizes*
   iff there is some number n0 such that
   for all n >= n0, l_n = l_n0.

def: L *satisfies the ascending chain condition*
   iff every ascending chain in L
   eventually stabilizes.

def: a *domain* is a lattice that satisfies
   the ascending chain condition.
------------------------------------------
        If a chain is finite, does it stabilize?
        Is it necessary for a lattice to be finite
           in order to satisfy the ascending chain condition?

   2. transfer functions

------------------------------------------
        TRANSFER FUNCTION SPACE

def: Let L be a partially-ordered set.
   Then Funs is a
   *transfer function space for L* iff
      f \in Funs ==> f : L -> L and f is monotone,
      f,g \in Funs ==> f o g \in Funs, and
      id_L \in Funs.
------------------------------------------

   3. monotone framework

------------------------------------------
          MONOTONE FRAMEWORK

def: (L, Funs) is a *monotone framework*
   iff L is a property space and
       Funs is a transfer function space for L.

def: (L, Funs, F, E, i, f_.) is an
   *instance of a monotone framework*
   if and only if:
    - (L, Funs) is a monotone framework,
    - F is a finite set of pairs (of flows),
    - E is a finite set of extremal labels,
    - i \in L is an extremal value,
    - f_. : (dom(F) \cup E) -> (L -> L)
      s.t. for all l in (dom(F) \cup E),
           f_l \in Funs
------------------------------------------

 C. examples (2.3.2)

 D. predicate abstraction (new topic)

------------------------------------------
        PREDICATE ABSTRACTION

Goal: verify program properties

Idea: Use property space of the form

   L = Powerset(Preds)
   Preds = {P1, ..., Pn}
   where each Pi is a nullary predicate

Interpretation:
   {P3,P5} means P3 and P5 may (must) be true
   (depending on kind of analysis)

   \sqcup is \cup (or \cap)
   \sqsubseteq is \subseteq (or \supseteq)

   Funs = monotonic (in \sqsubseteq) functions on L
------------------------------------------
        For a may analysis, what's the bottom element? The top?
        How can you represent states?

------------------------------------------
     PREDICATE ABSTRACTION EXAMPLE

IsZero Analysis:
   At a given program point,
   which variables may be 0?

"Abstract States"
   s \in L = Powerset(Preds)
   where Preds = {IsZero_x | x in Var*}

   IsZero_y means y may be 0

F is flow(S*)
E is {init(S*)}
i is Preds

fIZ(l) : L -> L, for l in Lab*
fIZ(l)(s) = (s \ kill_IZ(B^l)(s)) \cup gen_IZ(B^l)(s)
   where B^l in blocks(S*)

kill_IZ([x := a]^l)(s) = {IsZero_x}
kill_IZ([skip]^l)(s) = {}
kill_IZ([b]^l)(s) = {}

gen_IZ([x := a]^l)(s)
   = {IsZero_x | (\exists cs \in \gamma(s) :: A[[a]]cs == 0)}
gen_IZ([skip]^l)(s) = {}
gen_IZ([b]^l)(s) = {}

\gamma: L -> Powerset(Store)
\gamma(s) = {cs | cs: Var* -> Int,
                  IsZero_x \in s ==> cs(x) == 0}
------------------------------------------
        What kind of analysis is this?
        Why this initial value i?
        What do the gen and kill functions do?
        What are the equations for IZ_entry(l) and IZ_exit(l)?

------------------------------------------
               EXAMPLE

[y := 3]^1;
while [y>0]^2
do ([q := y-2]^3; [y := y-1]^4);
[q := q div y]^5

Var* = {y, q}
Preds = {IsZero_y, IsZero_q}

IZ_entry(1) =
IZ_exit(1)  =
IZ_entry(2) =
IZ_exit(2)  =
IZ_entry(3) =
IZ_exit(3)  =
IZ_entry(4) =
IZ_exit(4)  =
IZ_entry(5) =
IZ_exit(5)  =
------------------------------------------

 E. equation solving (2.4)

   1. MFP (Maximal Fixed Point) solution (2.4.1)

   2. MOP solution (2.4.2) (skip)