I. Overview of Semantics (based on the book's section 2.2) Why are formal correctness proofs useful for program analysis? A. kinds of semantics ------------------------------------------ KINDS OF SEMANTICS Axiomatic Semantics as declarative specifications predicates + predicate transformers Denotational Semantics as functional programming domains + meaning functions Operational Semantics as logic programming configurations + rewrite rules ------------------------------------------ II. Denotational Semantics (2.2) 1. denotational example ------------------------------------------ DENOTATIONAL SEMANTICS OF EXPRESSIONS FOR WHILE LANGUAGE Syntax (1.2): x,y \in Var n \in NumericLiteral a \in AExp b \in BExp op_a \in Op_a op_b \in Op_b op_r \in Op_r a ::= x | n | a1 op_a a2 b ::= true | false | not b | b1 opb b2 | a1 op_r a2 Domains: Z = { ..., -1, 0, 1, ... } T = { true, false } s \in State = Var -> Z Meaning Functions: A : Aexp -> State -> Z A[[x]]s = s(x) A[[n]]s = N(n) A[[a1 op_a a2]]s = (O_a[[op_a]]) (A[[a1]]s) (A[[a2]]s) where N : NumericLiteral -> Z N[[n]] = the value of literal n O_a : Op_a -> (Z x Z -> Z) O_a[[+]] = \n1 n2 . n1 + n2 ... ------------------------------------------ Why is the state a parameter to the meaning of arithmetic expressions? How would you define B : Bexp -> (State -> T)? How would you define a tagged union type with operations left, right, and case? III. Structural Operational Semantics (2.2.1) 1. varieties ------------------------------------------ VARIATIONS ON STRUCTURAL OPERATIONAL SEMANTICS Big step = relation between initial and final state Small step = relation between a state and next state ------------------------------------------ 2. terminal transition system, little step, computation semantics a. idea ------------------------------------------ COMPUTATION (LITTLE STEP) SEMANTICS Meaning Programs <-------> Answers | ^ input | | output | | v reducesto* | Gamma <-----------> T def: a in Meaning[[P]] iff ------------------------------------------ b. terminal transition systems ------------------------------------------ TERMINAL TRANSITION SYSTEM (TTS) (Gamma, -->, Terminal) Gamma : --> : Terminal : reducesto* reflexive, transitive closure ------------------------------------------ 3. semantics of the while language (2.2.1) ------------------------------------------ CONFIGURATIONS AND TRANSITIONS Configurations (Gamma): Gamma = (Stmt x State) + State s in State = Var -> Z Terminal Configurations: Terminal = State ------------------------------------------ What does a configuration of the form (S, s) mean? What does a configuration of the form s mean? Must every execution reach a final state? What part of the state does an expression depend on? ------------------------------------------ TRANSITIONS (Table 2.6) [ass] ([x := a]^l, s) --> s[x |-> A[[a]]s] [skip] ([skip]^l, s) --> s (S1, s) --> (S1', s') [seq1] -------------------------- (S1;S2, s) --> (S1';S2, s') (S1, s) --> s' [seq2] -------------------------- (S1;S2, s) --> (S2, s') [if1] (if [b]^l then S1 else S2, s) --> (S1, s) if B[[b]]s = true [if2] (if [b]^l then S1 else S2, s) --> (S2, s) if B[[b]]s = false [wh1] (while [b]^l do S, s) --> (S; while [b]^l do S, s) if B[[b]]s = true [wh2] (while [b]^l do S, s) --> s if B[[b]]s = false ------------------------------------------ What do the sequence rules do? Do the rules allow evaluation of the true or false part of an if-statement before evaluating the condition? ------------------------------------------ EXAMPLE Program [q := 0]^1; [r := x]^2; while [r >= y]^3 do ([r := r-y]^4; [q := q+1]^5) Name each statement S_i if it has label i Let S_123 be the whole program Let S_3 be the while loop Let S_23 be the sequential composition of S_2 and S_3 Let S_45 be the body of the while loop Let sqrxy be the state in which q has value q, ..., y has value y E.g, s7062(q) = 7, s0062(x) = 6. --> {by [seq2]} * ([q := 0]^1, s7062> --> {by [ass]} s0062 . --> {by [seq2]} * <[r := x]^2, s0062> --> {by [ass]} s0662 . --> {by [wh1], since A[[r >= y]](s0662) = true} --> {by [seq1]} * --> {by [seq2]} * <[r := r-y]^4, s0662> --> {by [ass]} s0462 . . --> ------------------------------------------ How are the labels used in the semantics? What would be the rule(s) for a one-armed if-statement? How would you add rule(s) for a for loop? What would be the rule(s) for a exitblock statement, with block syntax begin S end? How would you add rule(s) for a case statement that observes a tagged union with left and right expressions? 4. properties of WHILE's semantics As the configurations evolve, does the flow graph for the statement in the configuration change? If so, how? ------------------------------------------ PROPERTIES OF WHILE's SEMANTICS How does the flow graph change as the configurations change? Case 1: --> Compare vs. final(S) final(S') flow(S) flow(S') blocks(S) blocks(S') Case 2: --> s' what can we say about the graph of S? ------------------------------------------ IV. Correctness of the Live Variables Analysis (2.2.2) A. failed attempt 1. characterizing the dataflow equations as generators ------------------------------------------ EQUATION SYSTEM LV^=(S*) defined by: LVexit(l) = if l \in final(S*) then {} else \bigcup { LVentry(l') | (l', l) \in flow^R(S*) } LVentry(l) = (LVexit(l) \ killLV(B^l)) \cup genLV(B^l) where B^l \in blocks(S*) Functional form, for a given S*: F^S*_LV: ({entry,exit} -> Lab* -> P(Var*)) -> ({entry,exit} -> Lab* -> P(Var*)) F^S*_LV(F)(exit)(l) = if l \in final(S*) then {} else \bigcup { F(entry)(l') | (l', l) \in flow^R(S*) } F^S*_LV(F)(entry)(l) = (F(exit)(l) \ killLV(B^l)) \cup genLV(B^l) where B^l \in blocks(S*) Solutions: live: {entry,exit} -> Lab* -> P(Var*) def: live solves LV^=(S*), written live |= LV^=(S*), iff live is a fixpoint of F^S*_LV. ------------------------------------------ Why use functions instead of tuples for solutions? ------------------------------------------ EXAMPLE Q: What is F^S*_LV for: [x := 3]^1; [y := x+2]^2; [y := y+1]^3 ------------------------------------------ What do we want to prove for correctness? 2. Specification of Live Variables Analysis ------------------------------------------ DEFINING LIVE VARIABLES SEMANTICALLY What should a solution to LV mean? At a given point, only the live variables matter. def: States s1 and s2 are *similar with respect to a set of variables V*, written s1 ~_V s2, iff (\forall x \in V :: s1(x) = s2(x)). Conjecture: Let S be label consistent. Suppose live |= LV^=(S) and s1 ~_{live(entry)(init(S))} s2. Then: (1) if (S, s1) --> (S', s1'), then there is some s2' such that (S, s2) --> (S', s2') and s1' ~_{live(entry)(init(S'))} s2'. (2) if (S, s1) --> s1' then there is some s2' such that (S, s2) --> s2' and s1' ~_{live(exit)(init(S))} s2'. ------------------------------------------ What happens to ~_V as V shrinks? a. failed proof How would we prove the theorem? B. fix using constraint system 1. Constraint characterization of live variables ------------------------------------------ CONSTRAINT SYSTEM LV^{\subseteq}(S*) defined by: LVexit(l) \supseteq if l \in final(S*) then {} else \bigcup { LVentry(l') | (l', l) \in flow^R(S*) } LVentry(l) \supseteq (LVexit(l) \ killLV(B^l)) \cup genLV(B^l) where B^l \in blocks(S*) Functional form, for a given S*, as above F^S*_LV: ({entry,exit} -> Lab* -> P(Var*)) -> ({entry,exit} -> Lab* -> P(Var*)) F^S*_LV(F)(exit)(l) = if l \in final(S*) then {} else \bigcup { F(entry)(l') | (l', l) \in flow^R(S*) } F^S*_LV(F)(entry)(l) = (F(exit)(l) \ killLV(B^l)) \cup genLV(B^l) where B^l \in blocks(S*) ------------------------------------------ 2. strengthening the induction ------------------------------------------ Solutions: live: {entry,exit} -> Lab* -> P(Var*) def: live solves LV^{\subseteq}(S*), written live |= LV^{\subseteq}(S*), iff live \sqsupseteq F^S*_LV(live). def: lv1 \sqsupseteq lv2 iff (\forall e \in {entry,exit} :: (\forall l in Lab* :: lv1(e)(l) \supseteq lv2(e)(l))) Lemma 2.15: If S* is label consistent, then live |= LV^=(S*) implies live |= LV^{\subseteq}(S*). Furthermore, the least solutions coincide. ------------------------------------------ What lemma would give us the property where we got stuck? ------------------------------------------ SOLUTIONS WORK FOR SUBSTATEMENTS Lemma 2.16: Suppose S1 is label consistent, and live |= LV^{\subseteq}(S1). If flow(S1) \supseteq flow(S2), blocks(S1) \supseteq blocks(S2), then S2 is label consistent and live |= LV^{\subseteq}(S2). ------------------------------------------ Why is this true? ------------------------------------------ SOLUTIONS PRESERVED Corollary 2.17: Suppose S is label consistent, and live |= LV^{\subseteq}(S). If (S,s) --> (S',s'), then live |= LV^{\subseteq}(S'). ------------------------------------------ Why does that follow? ------------------------------------------ SOLUTION CAN ONLY SHRINK FORWARD Lemma 2.18: Suppose S is label consistent, and live |= LV^{\subseteq}(S). Then for all (l,l') \in flow(S), live(exit)(l) \supseteq live(entry)(l'). Proof: construction of LV^{\subseteq}(S). ------------------------------------------ 3. A proper proof ------------------------------------------ GETTING CORRECTNESS RIGHT Theorem 2.21: Let S be label consistent. Suppose live |= LV^{\subseteq}(S) and s1 ~_{live(entry)(init(S))} s2. Then: (1) if (S, s1) --> (S', s1'), then there is some s2' such that (S, s2) --> (S', s2') and s1' ~_{live(entry)(init(S'))} s2'. (2) if (S, s1) --> s1' then there is some s2' such that (S, s2) --> s2' and s1' ~_{live(exit)(init(S))} s2'. ------------------------------------------ What does this theorem tell us about execution sequences?