I. Motivation for abstract interpretation (transition from shape analysis) What's a representation function? What's a Galois connection? What's a Galois insertion? II. A Mundane Approach to Correctness (4.1) A. semantics and its relation to abstract interpretation ------------------------------------------ SEMANTICS AND ABSTRACT INTERPRETATION Language semantics (big step) -->* s' P |- s ~~> s' Abstract interpretatrion P |- l_1 |> l_2 f_p(l_1) = l_2 ------------------------------------------ ------------------------------------------ MUNDANE Means first-order properties describe sets of values e.g., shape analysis, constant propagation ------------------------------------------ What classical analyses are not mundane? ------------------------------------------ EXAMPLE 4.1 For constant propagation: Semantics is: p |- s1 ~~> s2 means -->* s2 Analysis is: p |- \hat{s1} |> \hat{s2} means i = \hat{s1} /\ s2 = \bigsqcup {CP.(l) | l in final(S*)} ------------------------------------------ What is the set of values for this example? What is the property space for constant propagation? What would this be for shape analysis? B. correctness relations (4.1.1) ------------------------------------------ CORRECTNESS RELATIONS (4.1.1) def: a *correctness relation* has type V x L -> Boolean It says what properties safely describe a given value, and must be preserved by computation: (v1 R l1 /\ p |- v1 ~~> v2 /\ p |- l1 |> l2) ==> v2 R l2 (4.3) Picture: p |- l1 |> l2 R ==> R p |- v1 ~~> v2 ------------------------------------------ ------------------------------------------ CORRECTNESS FOR ORDERED PROPERTY SPACES Suppose L = (L, <=) is a complete lattice, Then we require: v R l1 /\ l1 <= l2 ==> v R l2 (4.4) (\forall l \in L' <= L :: v R l) ==> v R (\bigmeet L') (4.5) ------------------------------------------ What do these mean? ------------------------------------------ CONSTANT PROPAGATION (EXAMPLE 4.3) s R_CP \hat{s} iff (\forall x \in Var* :: s(x) = T \/ s(x) = \hat{s}(x)) ------------------------------------------ What does that mean? Why do the properties (4.4) and (4.5) hold? What is correctness for shape analysis? C. representation functions (4.1.2) ------------------------------------------ REPRESENTATION FUNCTIONS (4.1.2) def: a *representation function* maps a value to the best property describing it. It must be preserved by computation in the following sense: (b(v1) <= l1 /\ p |- v1 ~~> v2 /\ p |- l1 |> l2) ==> b(v2) <= l2 (4.6) Picture: p |- l1 |> l2 ^ ^ b| ==> |b | | p |- v1 ~~> v2 ------------------------------------------ What does this property mean? Can we define a correctness relation, R, using b? And vice versa? ------------------------------------------ CORRECTNESS VIA REPRESENTATION AND VICE VERSA def: R_b is the correctness relation generated by b: v R_b l <==> b(v) <= l def: b_R is the representation function generated by R: b_R(v) = \bigmeet { l | v R l } Lemma 4.5 (i) R_b satisfies (4.4) and (4.5), and b_{R_b} = b (ii) if R satisfies (4.4) and (4.5), then b_R is well-defined and R_{b_R} = R ------------------------------------------ How would you prove this? ------------------------------------------ CONSTANT PROPAGATION (EXAMPLE 4.6) b_CP: State -> \hat{State_CP} b_CP(s) = s So R_CP is defined by: ------------------------------------------ What does that mean? What's the relationship between b_SA and R_SA for shape analysis? D. generalization (4.1.3) ------------------------------------------ GENERALIZATION (4.1.3) In p |- v1 ~~> v2 allow v1 in V1, v2 in V2, and V1 <> V2 In f_p(l1) = l2 allow l1 in L1, l2 in L2, and L1 <> L2 So get 2 correctness relations: R1: V1 x L1 -> Boolean generated by b1: V1 -> L1 R2: V2 x L2 -> Boolean generated by b2: V2 -> L2 Logical relationship: f_p l1 --> l2 R1 ==> R2 p |- v1 ~~> v2 def: (R1 ->> R2) is a relation defined by (p |- . ~~> .) (R1 ->> R2) f_p <==> (\forall v1, v2, l1 :: (p |- v1 ~~> v2) /\ v1 R1 l1 ==> v2 R2 f_p(l1)) ------------------------------------------ III. Approximation of Fixed Points (4.2) A. Example lattice (4.10) ------------------------------------------ INTERVAL LATTICE (EXAMPLE 4.10) Interval = { _|_ } \cup {[z1,z2] | z1 <= z2, z1 in Z-, z2 in Z+} Z- = Z \cup {-\infty} Z+ = Z \cup {\infty} _|_ denotes the empty interval <= ordering on Interval is: where inf(_|_) = \infty int([z1,z2]) = z1 sup(_|_) = -\infty sup([z1,z2]) = z2 ------------------------------------------ Why is Interval a lattice? How to define |_| ? What is T in this lattice? B. Why fixed points? ------------------------------------------ WHY FIXED POINTS? Analysis transforms properties: f: L -> L where f is monotone. E.g., for reaching definitions: F(RD_1,...,RD_n) = (F_1(RD_1,...,RD_n), F_n(RD_1...,RD_n)) Want least fixed point, lfp(f) for: - recursive programs - programs with loops But iterating doesn't necessarily: - reach a fixed point (stabilize) - stabalize at the least fixed point ------------------------------------------ Why not? C. Widening Operators (4.2.1) 1. idea ------------------------------------------ IDEA How to approximate lfp(f)? use sequence (f^n_V)n - which must stabalize - which will safely approximate lfp(f) The V (\nabla) is a widening operator ------------------------------------------ 2. upper bound operators ------------------------------------------ UPPER BOUND OPERATORS def: Suppose L is a complete lattice. Then an operation ub: L x L -> L is an upper bound operator iff for all l1, l2 in L, l1 <= ub(l1,l2) and l2 <= ub(l1,l2). Example (4.12): Let int be a fixed interval e.g., int = [0,2] define: ub^int(int1, int2) = if int1 <= int or int2 <= int1 then int1 |_| int2 else [-\infty, \infty] ------------------------------------------ Is an upper bound operator monotone? commutative? associative? Is ub^int symmetric? Why is ub^int an upper bound operator? ------------------------------------------ MAKING ASCENDING CHAINS def: Let (l_n)n = (l_0, l_1, ...) be a sequence of elements in L. Let phi: L -> L be a total function. Then bapply(phi, (l_n)n) = (m_n)n where m_0 = l_0 m_n = phi(m_{n-1}, l_n), for n > 0 Notation: (bapply(phi, (l_n)n) is written (l^{phi}_n)n Fact 4.11 If (l_n)n is a sequence and ub is an upper bound operator, then (bapply(ub, (l_n)n) is an ascending chain. ------------------------------------------ What happens if we bapply an an upper bound operator to a sequence? Does that chain eventually stabalize? 3. widening operators ------------------------------------------ WIDENING OPERATORS def: Let L be a complete lattice. Then V: L x L -> L is a widening operator iff: - V is an upper bound operator, and - for all ascending chains (l_n)n, the chain bapply(V, (l_n)n) eventually stabilizes ------------------------------------------ Is bapply(V, (l_n)n) an ascending chain? ------------------------------------------ USING WIDENING TO SAFELY APPROXIMATE LFP Given: monotone f: L -> L widening operator V: L x L -> L Goal: find lfp_V(f), such that: (a) f(lfp_V(f)) <= lfp_V(f), and (b) lfp_V(f) >= lfp(f) Define lfp_V(f) = f_V^m, where m >= 0 is the least number such that: f(f_V^m) <= f_V^m where for all n >= 0 f_V^0 = _|_ f_V^{n+1} = f_V^{n}, if f(f_V^{n}) <= f_V^{n} f_V^{n+1} = f_V^{n} V f(f_V^{n}), otherwise ------------------------------------------ Why does f_V^n eventually become reductive? ------------------------------------------ EXAMPLE 4.15 Consider lattice Interval. For K a finite set of integers, widening operator V_K defined by: _|_ V_K int2 = _|_ int1 V_K _|_ = _|_ int1 V_K int2 = [LB_K(inf(int1), inf(int2)), UB_K(sup(int1), sup(int2))] where LB_K(z1,z3) = z1, if z1 <= z3 k, if z3 < z1 /\ k = max{k \in K | k <= z3} -\infty, if z3 < z1 /\ (k \in K ==> z3 < k) UB_K(z2,z4) = z2, if z4 <= z2 k, if z2 < z4 /\ k = min{k \in K | z4 <= k} \infty, if z2 < z4 /\ (k \in K ==> k < z4) E.g., suppose K = {5, 4, 1}, and consider (int_n)n defined by [0,1],[0,2],[0,3],... then (int^V_n)n is: ------------------------------------------ What set of integers would work? Why is V_K an upper bound operator? Why is V_K a widening operator? D. narrowing operators (4.2.2) ------------------------------------------ NARROWING OPERATORS (4.2.2) Widening operator V gives an m such that f(f_V^m) <= f_V^m Note that - f_V^m may not be a fixed point of f - f_V^m >= lfp(f) Goal: get better approx to lfp(f) Idea: f_V^m in Red(f) So search by computing f(f_V^m) f(f(f_V^m)) ... f^n(f_V^m) ------------------------------------------ Will this stabilize? When can we stop? ------------------------------------------ NARROWING OPERATOR def: D: L x L -> L is a narrowing operator iff: - for all l1, l2 in L, l2 <= l1 ==> l2 <= (l1 D l2) and (l1 D l2) <= l1 - for all descending chains (l_n)n, the sequence bapply(D, (l_n)n) eventually stabalizes. ------------------------------------------ IV. Galois Connections (4.3) A. motivation ------------------------------------------ GALOIS CONNECTIONS (4.3) Motivation: Collecting semantics: - obviously correct, - view as a lattice, L but it's - costly, and/or - nonterminating So do analysis in another lattice, M Relationship: abstraction function a: L -> M concretization function g: M -> L ------------------------------------------ 1. definition ------------------------------------------ DEFINITION Def: Let L and M be complete lattices. Then (L, a, g, M) is a Galois connection iff a: L -> M and g: M -> L are monotone and g o a >= id_L (4.8) a o g <= id_M (4.9) ------------------------------------------ 2. adjunctions ------------------------------------------ ADJUNCTIONS Def: Let L and M be complete lattices. Then (L, a, g, M) is an adjunction iff a: L -> M and g: M -> L are total and for all l in L and m in M: a(l) <= m iff l <= g(m). Prop 4.20. (L, a, g, M) is a Galois connection iff it is an adjunction. ------------------------------------------ 3. galois connections defined by extraction functions ------------------------------------------ GALOIS CONNECTIONS DEFINED BY EXTRACTION FUNCTIONS Fact: Suppose b: V -> L is a representation function. Then (Powerset(V), a, g, L) is a Galois connection between Powerset(V) and L, where for all V' <= V and l \in L: a(V') = \join {b(v) | v \in V'} g(l) = {v \in V | b(v) <= l} def: Suppose L = (Powerset(D), <=) and eta: V -> D. We define b_{eta}: V -> Powerset(D) by b_{eta}(v) = {eta(v)} Fact: (Powerset(V),a_{eta},g_{eta},Powerset(D)) is a Galois connection, where a_{eta}(V') = {eta(v) | v \in V'} g_{eta}(D') = {v | eta(v) \in D'} ------------------------------------------ B. properties (4.3.1) ------------------------------------------ PROPERTIES Lemma 4.22: If (L, a, g, M) is a Galois connection, then: (i) a uniquely determines g by g(m) = |_| { l | a(l) <= m } and g uniquely determines a by _ a(l) = | | { m | l <= g(m) } (ii) a is completely additive and g is completely multiplicative Lemma 4.23: If a: L -> M is completely additive, then there exists g: M -> L such that (L, a, g, M) is a Galois connection Fact 4.24: If (L, a, g, M) is a Galois connection then a o g o a = a and g o a o g = g ------------------------------------------ Does a correctness relation from V to L extend to M? C. Galois Insertions (4.3.2) ------------------------------------------ GALOIS INSERTIONS (4.3.2) In a Galois connection (L, a, g, M) the set M may contain "junk" elements. def: Let L and M be complete lattices. Then (L, a, g, M) is a Galois insertion iff it is a Galois connection and a o g = id_M ------------------------------------------ So what happens if we first concretize then abstract? In a Galois insertion (L, a, g, M), is a surjective? injective? In a Galois insertion (L, a, g, M), is g surjective? injective?