I. Overview of Semantics (based on the book's section 2.2) Why are formal correctness proofs useful for program analysis? A. kinds of semantics ------------------------------------------ KINDS OF SEMANTICS Axiomatic Semantics as declarative specifications - predicates + predicate transformers Denotational Semantics as functional program domains + meaning functions Operational Semantics as logic programming configurations + rewrite rules ------------------------------------------ II. Denotational Semantics (2.2) 1. denotational example ------------------------------------------ SYNTAX OF WHILE LANGUAGE EXPRESSIONS Abstract Syntax (1.2): x,y \in Var n \in NumericLiteral a \in AExp b \in BExp op_a \in Op_a op_b \in Op_b op_r \in Op_r a ::= x | n | a1 op_a a2 b ::= true | false | not b | b1 opb b2 | a1 op_r a2 Lexical syntax: n \in NumericLiteral dgs \in Digits d \in Digit n ::= d | dgs d dgs ::= d | dgs d d ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |8 | 9 ------------------------------------------ ------------------------------------------ DENOTATIONAL SEMANTICS OF EXPRESSIONS Domains: Z = { ..., -1, 0, 1, ... } T = { true, false } s \in State = Var -> Z Meaning Functions: A : Aexp -> State -> Z A[[x]]s = s(x) A[[n]]s = N(n) A[[a1 op_a a2]]s = (O_a[[op_a]]) (A[[a1]]s) (A[[a2]]s)) where N : NumericLiteral -> Z D : Digit -> Z D[[0]] = 0, D[[1]] = 1, D[[2]] = 2, D[[3]] = 3, D[[4]] = 4, D[[5]] = 5, D[[6]] = 6, D[[7]] = 7, D[[8]] = 8, D[[9]] = 9, N[[n]] = Nval[[n]](1) Nval: Digits -> Z -> Z Nval[[d]](p) = p*D[[d]] Nval[[dgs d]](p) = Nval[[dgs]](10*p) + Nval[[d]](p) O_a : Op_a -> (Z x Z -> Z) O_a[[+]] = \n1 n2 . n1 + n2 O_a[[-]] = \n1 n2 . n1 - n2 ... ------------------------------------------ Why is the state a parameter to the meaning of arithmetic expressions? How would you define B : Bexp -> (State -> T)? III. Structural Operational Semantics (2.2.1) 1. varieties ------------------------------------------ KINDS OF STRUCTURAL OPERATIONAL SEMANTICS Big step: specifies relation between initial and final state Small step: specifies relation between a state and next state ------------------------------------------ How would infinite loops be treated in each type of semantics? 2. terminal transition system, little step, computation semantics a. idea ------------------------------------------ COMPUTATION (LITTLE STEP) SEMANTICS Meaning Programs <- -- - -> Answers | ^ input | | output | | v reducesto* | Gamma <-----------> T def: a in Meaning[[P]] iff ------------------------------------------ b. terminal transition systems ------------------------------------------ TERMINAL TRANSITION SYSTEM (TTS) (Gamma, -->, Terminal) Gamma : --> : Terminal : reducesto* reflexive, transitive closure ------------------------------------------ 3. semantics of the while language (2.2.1) ------------------------------------------ CONFIGURATIONS AND TRANSITIONS Configurations (Gamma): Gamma = (Stmt x State) + State s in State = Var -> Z Terminal Configurations: Terminal = State ------------------------------------------ What does a configuration of the form (S, s) mean? What does a configuration of the form s mean? Must every execution reach a final state? What part of the state does an expression depend on? ------------------------------------------ TRANSITIONS (Table 2.6) [asg] ([x := a]^l, s) --> s[x |-> A[[a]]s] [skip] ([skip]^l, s) --> s (S1, s) --> (S1', s') [seq1] -------------------------- (S1;S2, s) --> (S1';S2, s') (S1, s) --> s' [seq2] -------------------------- (S1;S2, s) --> (S2, s') [if1] (if [b]^l then S1 else S2, s) --> (S1, s) if B[[b]]s = true [if2] (if [b]^l then S1 else S2, s) --> (S2, s) if B[[b]]s = false [wh1] (while [b]^l do S, s) --> (S; while [b]^l do S, s) if B[[b]]s = true [wh2] (while [b]^l do S, s) --> s if B[[b]]s = false ------------------------------------------ What do the sequence rules do? Do the rules allow evaluation of the true or false part of an if-statement before evaluating the condition? How do the wh1 and wh2 rules work? ------------------------------------------ EXAMPLE Program [q := 0]^1; [r := x]^2; while [r >= y]^3 do ([r := r-y]^4; [q := q+1]^5) Name each statement S_i if it has label i Let S_123 be the whole program (a sequence) Let S_3 be the while loop Let S_23 be the sequential composition of S_2 and S_3 Let S_45 be the body of the while loop Let sqrxy be the state in which q has value q, ..., y has value y E.g, s7062(q) = 7, s0062(x) = 6. (S_123, s7062) --> {by [seq2]} * ([q := 0]^1, s7062) --> {by [asg]} s0062 . (S_23, s0062) --> {by [seq2]} * <[r := x]^2, s0062) --> {by [asg]} s0662 . (S_3, s0662) --> {by [wh1], since B[[r >= y]](s0662) = true} (S_45; S_3, s0662) --> {by [seq1]} * (S_45, s0662) --> {by [seq2]} * <[r := r-y]^4, s0662) --> {by [asg]} s0462 . (S_5, s0462) . (S_5; S_3, s0462) --> ------------------------------------------ How are the labels used in the semantics? What would be the rule(s) for a one-armed if-statement? How would you add rule(s) for a for loop of the form for x := a1..a2 do S ? What would be the rule(s) for a exitblock statement, where blocks have the syntax begin S end and the exitblock statement has the syntax exitblock ? What would need to be done to have a semantics of assert statements with syntax: assert [b]^l ? What would need to be done to have a semantics of case statements with syntax case e of left(x) then S1 [] right(y) of S2 end that observes a tagged union expression (e) ? 4. properties of WHILE's semantics As the configurations evolve, does the CFG for the statement in the configuration change? If so, how? ------------------------------------------ PROPERTIES OF WHILE's SEMANTICS How does the flow graph change as the configurations change? Case 1: (S,s) --> (S',s') Compare vs. final(S) final(S') flow(S) flow(S') blocks(S) blocks(S') Case 2: (S,s) --> s' what can we say about the graph of S? ------------------------------------------ IV. Correctness of the Live Variables Analysis (2.2.2) A. failed attempt 1. characterizing the dataflow equations as generators ------------------------------------------ EQUATION SYSTEM LV^=(S*) defined by: LVexit(l) = if l \in final(S*) then {} else \bigcup { LVentry(l') | (l', l) \in flow^R(S*) } LVentry(l) = (LVexit(l) \ killLV(B^l)) \cup genLV(B^l) where B^l \in blocks(S*) Functional form, for a given S*: F^S*_LV: ({entry,exit} -> Lab* -> P(Var*)) -> ({entry,exit} -> Lab* -> P(Var*)) F^S*_LV(F)(exit)(l) = if l \in final(S*) then {} else \bigcup { F(entry)(l') | (l', l) \in flow^R(S*) } F^S*_LV(F)(entry)(l) = (F(exit)(l) - killLV(B^l)) \cup genLV(B^l) where B^l \in blocks(S*) Solutions: live: {entry,exit} -> Lab* -> P(Var*) def: live solves LV^=(S*), written live |= LV^=(S*), iff live is a fixpoint of F^S*_LV. ------------------------------------------ Why use functions instead of tuples for solutions? ------------------------------------------ EXAMPLE Q: What is F^S*_LV for: [x := 3]^1; [y := x+2]^2; [y := y+1]^3 ------------------------------------------ What do we want to prove for correctness? 2. Specification of Live Variables Analysis ------------------------------------------ DEFINING LIVE VARIABLES SEMANTICALLY What should a solution to LV mean? At a given point, only the live variables can affect the computation def: States s1 and s2 are *similar with respect to a set of variables V *, written s1 ~_V s2, iff (\forall x \in V :: s1(x) = s2(x)). def: functional liv: {entry,exit} -> Lab -> State *solves a set LV^=(S)* iff for all d in {entry,exit}, l in Lab: if V = liv(d)(l), then liv(d)(l) satisfies LV^=(S)(d)(l) Conjecture: Let S be label consistent. Suppose live |= LV^=(S) and s1 ~_{live(entry)(init(S))} s2. Then: (1) if (S, s1) --> (S', s1'), then there is some s2' such that (S, s2) --> (S', s2') and s1' ~_{live(entry)(init(S'))} s2'. (2) if (S, s1) --> s1' then there is some s2' such that (S, s2) --> s2' and s1' ~_{live(exit)(init(S))} s2'. ------------------------------------------ What happens to ~_V as V shrinks? a. failed proof How would we prove the conjecture? B. fix using constraint system 1. Constraint characterization of live variables ------------------------------------------ CONSTRAINT SYSTEM LV^{\subseteq}(S*) defined by: LVexit(l) \supseteq if l \in final(S*) then {} else \bigcup { LVentry(l') | (l', l) \in flow^R(S*) } LVentry(l) \supseteq (LVexit(l) \ killLV(B^l)) \cup genLV(B^l) where B^l \in blocks(S*) Functional form, for a given S*, as above F^S*_LV: ({entry,exit} -> Lab* -> P(Var*)) -> ({entry,exit} -> Lab* -> P(Var*)) F^S*_LV(F)(exit)(l) = if l \in final(S*) then {} else \bigcup { F(entry)(l') | (l', l) \in flow^R(S*) } F^S*_LV(F)(entry)(l) = (F(exit)(l) \ killLV(B^l)) \cup genLV(B^l) where B^l \in blocks(S*) ------------------------------------------ 2. strengthening the induction ------------------------------------------ Solutions: live: {entry,exit} -> Lab* -> P(Var*) def: live solves LV^{\subseteq}(S*), written live |= LV^{\subseteq}(S*), iff live \sqsupseteq F^S*_LV(live). def: lv1 \sqsupseteq lv2 iff (\forall e \in {entry,exit} :: (\forall l in Lab* :: lv1(e)(l) \supseteq lv2(e)(l))) Lemma 2.15: If S* is label consistent, then live |= LV^=(S*) implies live |= LV^{\subseteq}(S*). Furthermore, the least solutions coincide. ------------------------------------------ What lemma would give us the property where we got stuck? ------------------------------------------ SOLUTIONS WORK FOR SUBSTATEMENTS Lemma 2.16: Suppose S1 is label consistent, and live |= LV^{\subseteq}(S1). If flow(S1) \supseteq flow(S2) and blocks(S1) \supseteq blocks(S2), then S2 is label consistent and live |= LV^{\subseteq}(S2). ------------------------------------------ Why is this true? ------------------------------------------ SOLUTIONS PRESERVED Corollary 2.17: Suppose S is label consistent, and live |= LV^{\subseteq}(S). If (S,s) --> (S',s'), then live |= LV^{\subseteq}(S'). ------------------------------------------ Why does that follow? ------------------------------------------ SOLUTION CAN ONLY SHRINK FORWARD Lemma 2.18: Suppose S is label consistent, and live |= LV^{\subseteq}(S). Then for all (l,l') \in flow(S), live(exit)(l) \supseteq live(entry)(l'). Proof: construction of LV^{\subseteq}(S). ------------------------------------------ 3. A proper proof ------------------------------------------ GETTING CORRECTNESS RIGHT Theorem 2.21: Let S be label consistent. Suppose live |= LV^{\subseteq}(S) and s1 ~_{live(entry)(init(S))} s2. Then: (1) if (S, s1) --> (S', s1'), then there is some s2' such that (S, s2) --> (S', s2') and s1' ~_{live(entry)(init(S'))} s2'. (2) if (S, s1) --> s1' then there is some s2' such that (S, s2) --> s2' and s1' ~_{live(exit)(init(S))} s2'. ------------------------------------------ What does this theorem tell us about execution sequences?