I. Intraprocedural Analysis (2.1)
 A. definitions and notation
  1. initial and final labels
------------------------------------------
       INITIAL LABEL

init: Stmt -> Lab

  init([x := a]^l) = l
  init([skip]^l) = l
  init(S1; S2) =
  init(if [b]^l then S1 else S2) =
  init (while [b]^l do S) =

------------------------------------------
      What would the initial label of a nondeterministic choice
         statement be?  Of a parallel composition statement?
      How would we generalize the formalism to handle such statements?
------------------------------------------
      FINAL LABELS

final: Stmt -> Powerset(Lab)

  final([x := a]^l) = {l}
  final([skip]^l) = {l}
  final(S1; S2) =
  final(if [b]^l then S1 else S2) =
  final (while [b]^l do S) =

------------------------------------------
      What would the final label set of a nondeterministic choice
         statement be?  Of a parallel composition statement?
  2. blocks and labels
------------------------------------------
      ELEMENTARY BLOCKS

blocks: Stmt -> Powerset(Block)


------------------------------------------
      How would you define the set of elementary blocks in a statement?
      What would the set of elementary blocks in an assert statement be?
      What would be set of elementary blocks in a nondeterministic choice
          statement be?
      How would you define the set of labels in a statement?
  3. flows and reverse flows
------------------------------------------
      FLOWS

flow: Stmt -> Powerset(Lab x Lab)

  flow([x := a]^l) = {}
  flow([skip]^l) = {}
  flow(S1; S2) =

  flow(if [b]^l then S1 else S2) =

  flow(while [b]^l do S) =

------------------------------------------
    How would we use these functions to represent the nodes and
   edges of a control flow graph?
   What are the nodes of the control flow graph?
   What are labels and edges of
         if [x > 3]^1 then [y:=2]^2 else [z:=3]^3
      ?
    How you formulate a set of reverse flows?
   What is flow^R of
         if [x > 3]^1 then [y:=2]^2 else [z:=3]^3
      ?
   What is flow^R of
         (while [x <= 4]^1 do [x := x+1]^2); [assert x == 4]^3
      ?
   What are the initial nodes of flow^R?
  4. program of interest
------------------------------------------
           PROGRAM OF INTEREST

 S* = the top level statement

 Lab* = labels(S*)

 Var* = FV(S*)

 Blocks* = blocks(S*)

------------------------------------------
------------------------------------------
      ISOLATED ENTRIES AND EXITS

def: S* has isolated entries iff


def: S* has isolated exits iff


------------------------------------------
   What kind of programs would not have isolated entries?
   What kind of programs would not have isolated exits?
   Could we convert programs to have both isolated entries and exits?
------------------------------------------
           LABEL CONSISTENT

def: S is label consistent if and only if
     no two blocks in S


Formally:


------------------------------------------
    How would you formalize that?
    Is there any reason not to have label consistent programs?
 B. Example analyses
  1. Available Expressions (AE) analysis (2.1.1)
   a. definitions: trivial and non-trivial expressions
       What's a trivial expression?
------------------------------------------
       SUBEXPRESSIONS

def: An expression is *trivial* iff


 Aexp(a) = non-trivial arithmetic
           subexpressions of expression, a

E.g., Aexp(-x + -y) =


 Aexp* = nontrivial arithmetic expressions
             in S*

------------------------------------------
   b. idea, goal
------------------------------------------
    AVAILABLE EXPRESSIONS (AE) ANALYSIS

Analysis question (p. 37):

 "For each program point,
  which [non-trivial] expressions
  must have already been
  computed, and not later modified,
  on all paths to that program point."

Example:

   [k := i*j-1]^1;
   while [i*j-1 < n]^2
   do {  [t := a+k]^3;
         [j := j+1]^4;
         [k := i*j-1]^5  }
------------------------------------------
    Why would a compiler writer care about non-trivial expressions?
    Why would trivial expressions not matter?
    What non-trivial expressions are available at entry to block 2?
   c. observations
    What would be the bad outcome, which should be avoided?
    What makes a "solution" unsafe?
    What makes it imprecise?
    This is a forward analysis, Why?
    Note the use of the word "must", what impact does that have?
   d. formalization
------------------------------------------
        FORMAL DEFINITION

AEentry(l) =
   if l = init(S*) then {}
   else \bigcap {AEexit(l')
                   | (l',l) \in flow(S*)}

AEexit(l) =
(AEentry(l) - killAE(B^l)) \cup genAE(B^l)
   where B^l \in blocks(S*)


killAE: Blocks* -> Powerset(Aexp*)

killAE([x:= a]^l) =

killAE([skip]^l) = {}
killAE([b]^l) = {}


genAE: Blocks* -> Powerset(Aexp*)

genAE([x:= a]^l) =

genAE([skip]^l) = {}
genAE([b]^l) = 
------------------------------------------
    What does the kill function mean?
    What does the gen function mean?
    Why don't we have to define the analysis for while loops and if
       statements?
   e. example revisited
------------------------------------------
        EXAMPLE

   [k := i*j-1]^1;
   while [i*j-1 < n]^2
   do ([t := a+k]^3;
       [j := j+1]^4;
       [k := i*j-1]^5)

What's the control flow graph?


------------------------------------------
    What is Aexp* in this example?
    What's the role of the control flow graph here?
------------------------------------------
       KILL AND GEN FOR THE EXAMPLE

What are killAE and genAE in the example?

l  killAE(l)      genAE(l)
============================
1
2
3
4
5
------------------------------------------
    Do the kill and gen functions depend on what happens at other nodes?
------------------------------------------
        EXAMPLE'S AE EQUATIONS

AEentry(1) = 
AEentry(2) =
AEentry(3) =
AEentry(4) =
AEentry(5) =

AEexit(1) =

AEexit(2) =

AEexit(3) =

AEexit(4) =

AEexit(5) =

------------------------------------------
    What are we assuming with this formalism?
    How would we adjust this if we didn't have isolated entries?
    What's the bad outcome we want to avoid?
    What makes the analysis information more imprecise?
    What makes the analysis unsafe?
    So what do we want for a solution the largest or smallest?
    i. Finding a Solution using Chaotic Iteration (1.7)
    How can we get rid of the recursion in the dataflow equations?
    What sets do use to start iterating from to find a solution?
------------------------------------------
 FORMALIZING THE ANALYSIS AS AN ITERATION

Let \vec{AE} be (AE1, AE2, ..., AE10)
   where:
    AE1 is the entry info for label 1
        from the previous iteration
    AE2 is the exit info for label 1
        from the previous iteration
    AE3 is entry info. for label 2
        from the previous iteration
    AE4 is exit info for label 2
        from the previous iteration
    AE5 is entry info for label 3
    AE6 is exit info for label 3
    ...
    AE10 is exit info for label 5
    
define F(v) = (F1(v), F2(v), ..., F10(v))
  where:
    F1(AE1, AE2, ..., AE10) // entry(1)
     = {}                   
    F2(AE1, AE2, ..., AE10) // exit(1)
     = (AE1 - killAE([k:=i*j-1]^1))
       \cup genAE([k:=i*j-1]^1)
     = (AE1 - {a+k}) \cup {i*j,i*j-1}
    F3(AE1, AE2, ..., AE10) // entry(2)
     = AE2 \cap AE10
    F4(AE1, AE2, ..., AE10) // exit(2)
     = (AE3 - killAE([i*j-1 < n]^2))
       \cup genAE([i*j-1 < n]^2)
     = (AE3 - {}) \cup {i*j,i*j-1}
    F5(AE1, AE2, ..., AE10) // entry(3)
     = AE4
    F6(AE1, AE2, ..., AE10) // exit(3)
     = (AE5 - killAE([t := a+k]^3))
       \cup genAE([t := a+k]^3)
     = (AE5 - {}) \cup {a+k}
    F7(AE1, AE2, ..., AE10) // entry(4)
     = AE6
    F8(AE1, AE2, ..., AE10) // exit(4)
     = (AE7 - killAE([j := j+1]^4))
       \cup genAE([j := j+1]^4)
     = (AE7 - {i*j,i*j-1,j+1}) \cup {}
    F9(AE1, AE2, ..., AE10) // entry(5)
     = AE8
    F10(AE1, AE2, ..., AE10) // exit(5)
     = (AE9 - killAE([k := i*j-1]^5))
       \cup genAE([k := i*j-1]^5)
     = (AE9 - {a+k}) \cup {i*j, i*j-1}

Then a fixed-point of F is a solution.
------------------------------------------
------------------------------------------
         CORRECTNESS OF ITERATION

Def 0 (\sqsubseteq):
  tuples of available expressions are
  ordered pointwise by superset inclusion:
i.e.,
 (v1,...,v10) \sqsubseteq (v1',...,v10')
   iff v1 \supseteq v1' and ...
       and v10 \supseteq v10'

Exercise 1: F is monotonic
       (in \sqsubseteq):
 i.e., if v \sqsubseteq v'
       then F(v) \sqsubseteq F(v').

Corollary 2:
Since (Aexp*,...,Aexp*)
      \sqsubseteq (Aexp*,...,Aexp*),
    F(Aexp*,...,Aexp*)
      \sqsubseteq F(Aexp*,...,Aexp*)

Corollary 3:
    (Aexp*,...,Aexp*)
      \sqsubseteq F(Aexp*,...,Aexp*)

Exercise 4: For all natural numbers n:
    (Aexp*,...,Aexp*)
      \sqsubseteq F^n(Aexp*,...,Aexp*)

Exercise 5: there is some natural number n
    such that 
      F^n(Aexp*,...,Aexp*)
       = F(F^n(Aexp*,...,Aexp*))
    (so that F^n(Aexp*,...,Aexp*) is a
     fixed point of F).
------------------------------------------
      Why do these hold?
------------------------------------------
       USING THE ITERATION DIRECTLY

Iterate F, starting with
  (Aexp*, Aexp*, ..., Aexp*)
 at each iteration get a better
   approximation (smaller set) until reach
   a fixed point

First iteration:
F(Aexp*,Aexp*,...,Aexp*)
 = (F1(Aexp*), F2(Aexp*), ..., F10(Aexp*))

F1(Aexp*,Aexp*,...,Aexp*)
 = {}
F2(Aexp*,Aexp*,...,Aexp*)
 = (Aexp* - {a+k}) \cup {i*j, i*j-1}
 = {i*j, i*j-1, j+1}
F3(Aexp*,Aexp*,...,Aexp*)
 = Aexp* \cap Aexp*
 = Aexp*
F4(Aexp*,Aexp*,...,Aexp*)
 = (Aexp* - {}) \cup {i*j,i*j-1}
 = Aexp*
F5(Aexp*,Aexp*,...,Aexp*)
 = Aexp*
F6(Aexp*,Aexp*,...,Aexp*)
 = (Aexp* - {}) \cup {a+k}
 = Aexp*
F7(Aexp*,Aexp*,...,Aexp*)
 = Aexp*
F8(Aexp*,Aexp*,...,Aexp*)
 = (Aexp* - {i*j,i*j-1,j+1}) \cup {}
 = {a+k}
F9(Aexp*,Aexp*,...,Aexp*)
 = Aexp*
F10(Aexp*,Aexp*,...,Aexp*)
 = (Aexp* - {a+k}) \cup {i*j, i*j-1}
 = ({i*j, i*j-1, j+1}) \cup {i*j, i*j-1}
 = {i*j, i*j-1, j+1}
------------------------------------------
------------------------------------------
       SECOND ITERATION

Let \vec{v_1} // output of first iteration
  = ({}, {i*j, i*j-1, j+1}, Aexp*, Aexp*,
     Aexp*, Aexp*, Aexp*, {a+k}, Aexp*,
     {i*j, i*j-1, j+1})

The second iteration computes:
F1(\vec{v_1})
 = {}
F2(\vec{v_1})
 = ({} - {a+k}) \cup {i*j, i*j-1}
 = {i*j, i*j-1}
F3(\vec{v_1})
 = {i*j, i*j-1, j+1}
   \cap {i*j, i*j-1, j+1}
 = {i*j, i*j-1, j+1}
F4(\vec{v_1})
 = (Aexp* - {}) \cup {i*j,i*j-1}
 = Aexp*
F5(\vec{v_1})
 = Aexp*
F6(\vec{v_1})
 = (Aexp* - {}) \cup {a+k}
 = Aexp*
F7(\vec{v_1})
 = Aexp*
F8(\vec{v_1})
 = (Aexp* - {i*j,i*j-1,j+1}) \cup {}
 = {a+k}
F9(\vec{v_1})
 = {a+k}
F10(\vec{v_1})
 = (Aexp* - {a+k}) \cup {i*j, i*j-1}
 = ({i*j, i*j-1, j+1}) \cup {i*j, i*j-1}
 = {i*j, i*j-1, j+1}
------------------------------------------
   Compared to the inputs to the second iteration, did any outputs change?
------------------------------------------
         THIRD ITERATION

Let \vec{v_2} // output of 2nd iteration
  = ({}, {i*j,i*j-1}, {i*j,i*j-1,j+1},
     Aexp*, Aexp*, Aexp*, Aexp*, {a+k},
     {a+k}, {i*j, i*j-1, j+1})

The third iteration computes:
F1(\vec{v_2})
 = {}
F2(\vec{v_2})
 = ({} - {a+k}) \cup {i*j, i*j-1}
 = {i*j, i*j-1}
F3(\vec{v_2})
 = {i*j, i*j-1} \cap {i*j, i*j-1, j+1}
 = {i*j, i*j-1}
F4(\vec{v_2})
 = ({i*j, i*j-1, j+1} - {})
   \cup {i*j,i*j-1}
 = {i*j, i*j-1, j+1}
F5(\vec{v_2})
 = Aexp*
F6(\vec{v_2})
 = (Aexp* - {}) \cup {a+k}
 = Aexp*
F7(\vec{v_2})
 = Aexp*
F8(\vec{v_2})
 = (Aexp* - {i*j,i*j-1,j+1}) \cup {}
 = {a+k}
F9(\vec{v_2})
 = {a+k}
F10(\vec{v_2})
 = ({a+k} - {a+k}) \cup {i*j, i*j-1}
 = {i*j, i*j-1}
------------------------------------------
    What changed in the third iteration?
    Is this a lot of bookkeeping?
    Is there a more efficient way to do this?
------------------------------------------
        CHAOTIC ITERATION ALGORITHM (1.7)

Goal: compute the desired fixed point
      of dataflow equations (quickly)

Idea:
  - use a functional (F =(Fi), as before)
    to make the current approximation
    more and more precise
  - initialize the argument vector
    to be either:
       - a vector of empty sets, or
       - a vector of complete sets
    depending on if want to find the
       - least fixed point, or
       - greatest fixed point

For our example in the AE analysis,
   we want greatest fixed point, so:

Algorithm (Table 1.5):

Step 1: initialize
          AEi := Aexp* for i in {1,...,10}

Step 2:
       while \exists j such that
               AEj \neq Fj(AE1, ..., AE10)
       do
           let j be such that
               AEj \neq Fj(AE1, ..., AE10)
           in
               AEj := Fj(AE1, ..., AE10)

Step 3:
       return (AE1, ..., AE10)
------------------------------------------
      Why is this algorithm correct?
------------------------------------------
  RECALL THE FUNCTIONAL FOR THE EXAMPLE

Recall that F(v) = (F1(v),...,F10(v))
where

F1(AE1,AE2,...,AE10)
 = {}
F2(AE1,AE2,...,AE10)
 = (AE1 - killAE([k:= i*j-1]^1))
   \cup genAE([k:= i*j-1]^1)
 = (AE1 - {a+k}) \cup {i*j, i*j-1}
F3(AE1,AE2,...,AE10)
 = AE2 \cap AE10
F4(AE1,AE2,...,AE10)
 = (AE3 - killAE([i*j-1<n]^2))
   \cup genAE([i*j-1<n]^2))
 = (AE3 - {}) \cup {i*j,i*j-1}
 = AE3 \cup {i*j,i*j-1}
F5(AE1,AE2,...,AE10)
 = AE4
F6(AE1,AE2,...,AE10)
 = (AE5 - killAE([t:=a+k]^3))
   \cup genAE([t:=a+k]^3)
 = (AE5 - {}) \cup {a+k}
 = AE5 \cup {a+k}
F7(AE1,AE2,...,AE10)
 = AE6
F8(AE1,AE2,...,AE10)
 = (AE7 - killAE([j := j+1]^4))
   \cup genAE([j := j+1]^4)
 = (AE7 - {i*j,i*j-1,j+1}) \cup {}
F9(AE1,AE2,...,AE10)
 = AE8
F10(AE1,AE2,...,AE10)
 = (AE9 - killAE([k:=i*j-1]^5))
   \cup genAE([k:=i*j-1]^5))
 = (AE9 - {a+k}) \cup {i*j, i*j-1}
------------------------------------------
------------------------------------------
     USING CHAOTIC ITERATION
     
Initialize:
     AEi := Aexp* for i in {1,...,10}

(Possible) Steps
 AE1 \neq
   F1(AE1,...,AE10)
   = {}
 so AE1 := {}

(Now AE1 is different than initially)

 AE2 \neq
   F2(AE1,...,AE10)
   = (AE1 - {a+k}) \cup {i*j, i*j-1}
   = ({} - {a+k}) \cup {i*j, i*j-1}
   = {i*j, i*j-1}
 so AE2 := {i*j, i*j-1}

 AE3 \neq
   F3(AE1,...,AE10)
   = AE2 \cap AE10
   = {i*j, i*j-1} \cap Aexp*
   = {i*j, i*j-1}
 so AE3 := {i*j, i*j-1}

 AE4 \neq
   F4(AE1,...,AE10)
   = AE3 \cup {i*j,i*j-1}
   = {i*j, i*j-1} \cup {i*j,i*j-1}
   = \cup {i*j,i*j-1}
 so AE4 := {i*j,i*j-1}

 AE5 \neq
   F5(AE1,...,AE10)
   = AE4
   = {i*j,i*j-1}
 so AE5 := {i*j,i*j-1}

 AE6 \neq
   F6(AE1,...,AE10)
   = AE5 \cup {a+k}
   = {i*j,i*j-1} \cup {a+k}
   = {i*j,i*j-1,a+k}
 so AE6 := {i*j,i*j-1,a+k}

 AE7 \neq
   F7(AE1,...,AE10)
   = AE6
   = {i*j,i*j-1,a+k}
 so AE7 := {i*j,i*j-1,a+k}

 AE8 \neq
   F8(AE1,...,AE10)
   = (AE7 - {i*j,i*j-1,j+1}) \cup {}
   = ({i*j,i*j-1,a+k} - {i*j,i*j-1,j+1})
     \cup {}
   = {a+k}
 so AE8 := {a+k}

 AE9 \neq
   F9(AE1,...,AE10)
   = AE8
   = {a+k}
 so AE9 := {a+k}

 AE10 \neq
   F10(AE1,...,AE10)
   = (AE9 - {a+k}) \cup {i*j, i*j-1}
   = ({a+k} - {a+k}) \cup {i*j, i*j-1}
   = {i*j, i*j-1}
 so AE10 := {i*j, i*j-1}
------------------------------------------
   Could this algorithm take any more steps?
   f. Summary
  2. Reaching Definitions Analysis (2.1.2)
------------------------------------------
REACHING DEFINITIONS (RD) ANALYSIS (2.1.2)

def (p. 5): a *definition of x*,
    where x in Var is


def (p. 5): a definition of x at label l
     *may reach* label l' iff there is
     an execution of the program where


Analysis question (p. 41):

"For each program point,
 which assignments may have been made
 and not overwritten,
 when program execution reaches that point
 along some path?"


------------------------------------------
    What does "definition" mean for this analysis?
    What does it mean for a definition to reach a program point?
    Why would a compiler writer care about reaching definitions?
   a. observations
    What would be the bad outcome, which should be avoided?
    What makes a "solution" unsafe?
    What makes it imprecise?
   So what solution do we want?
    This is a forward analysis, Why?
   Note the use of the word "may" in the analysis statement,
      what impact does that have on the analysis?
   What sets would we use for the RD analysis?
------------------------------------------
     FORMAL DEFINITIIONS FOR RD ANALYSIS

RDentry(l) =
  if l = init(S*) then {(v,?) | v in Var*}
  else 


RDexit(l) =
(RDentry(l) - killRD(B^l)) \cup genRD(B^l)
    where B^l \in blocks(S*)


killRD: Blocks* -> Powerset(Var* x Lab^?*}

killRD([x:= a]^l) =

killRD([skip]^l) = {}
killRD([b]^l) = {}

genRD: Blocks* -> Powerset(Var* x Lab^?*}

genRD([x:= a]^l) =

genRD([skip]^l) = {}
genRD([b]^l) = {}
------------------------------------------
   b. example revisited
------------------------------------------
        EXAMPLE

   [k := i*j-1]^1;
   while [i*j-1 < n]^2
   do ([t := a+k]^3;
       [j := j+1]^4;
       [k := i*j-1]^5)
------------------------------------------
    What is Var* for this example?
    What is Lab*? for this example?
------------------------------------------
           KILL AND GEN FOR THE EXAMPLE

What are killRD and genRD in the example?

l  killRD(l)                genRD(l)
====================================
1
2
3
4
5
------------------------------------------
------------------------------------------
        EXAMPLE'S RD EQUATIONS

Recall that Var* is {a,i,j,k,n,t}

RDentry(1) =

RDexit(1)
 = 

RDentry(2) =

RDexit(2)
 =

RDentry(3) =

RDexit(3)
 =

RDentry(4) =

RDexit(4)
 =

RDentry(5) =

RDexit(5)
 = 


------------------------------------------
    Are larger or smaller sets more precise?
    What is the bad outcome to avoid?
    So what sets do we start with to find a solution?
     So what would the solution be?
------------------------------------------
     SOLUTION USING CHAOTIC ITERATION

let RD1 = {}, ..., RD10 = {};

RD1 \neq F1(RD1,...,RD10)  // entry(1)
   = {(a,?),(i,?),(j,?),(k,?),(n,?),(t,?)}
so RD1 := {(a,?),(i,?),(j,?),(k,?),
           (n,?),(t,?)}

RD2 \neq F2(RD1,...,RD10)  // exit(1)
 = (RD1 - killRD([k := i*j-1]^1))
   \cup genRD([k := i*j-1]^1)
 = ({(a,?),(i,?),(j,?),(k,?),
           (n,?),(t,?)}) - ({k} x Lab*?))
   \cup {(k,1)}
 = {(a,?),(i,?),(j,?),(k,1),(n,?),(t,?)}
so RD2 := {(a,?),(i,?),(j,?),(k,1),
           (n,?),(t,?)}

RD3 \neq F3(RD1,...,RD10)  // entry(2)
  = RD2 \cup RD10
  = {(a,?),(i,?),(j,?),(k,1),
           (n,?),(t,?)} \cup {}
so RD3 := {(a,?),(i,?),(j,?),(k,1),
           (n,?),(t,?)}

RD4 \neq F4(RD1,...,RD10)  // exit(2)
  = RD3
  = {(a,?),(i,?),(j,?),(k,1),(n,?),(t,?)}
so RD4 := {(a,?),(i,?),(j,?),(k,1),
           (n,?),(t,?)}

RD5 \neq F5(RD1,...,RD10)  // entry(3)
  = RD4
  = {(a,?),(i,?),(j,?),(k,1),(n,?),(t,?)}
so RD5 := {(a,?),(i,?),(j,?),(k,1),
           (n,?),(t,?)}

RD6 \neq F6(RD1,...,RD10)  // exit(3)
  = (RD5 - killRD([t := a+k]^3))
    \cup genRD([t := a+k]^3)
  = ({(a,?),(i,?),(j,?),(k,1),(n,?),(t,?)}
     - ({t} x Lab*?)) \cup {(t,3)}
  = {(a,?),(i,?),(j,?),(k,1),(n,?),(t,3)}
so RD6 := {(a,?),(i,?),(j,?),(k,1),
           (n,?),(t,3)}

RD7 \neq F7(RD1,...,RD10)  // entry(4)
  = RD5
  = {(a,?),(i,?),(j,?),(k,1),(n,?),(t,3)}
so RD7 := {(a,?),(i,?),(j,?),(k,1),
           (n,?),(t,3)}

RD8 \neq F8(RD1,...,RD10)  // exit(4)
  = (RD7 - killRD([j := j+1]^4))
    \cup genRD([j := j+1]^4)
  = ({(a,?),(i,?),(j,?),(k,1),(n,?),(t,3)}
     - ({j} x Lab*?)) \cup {(j,4)}
  = {(a,?),(i,?),(j,4),(k,1),(n,?),(t,3)}
so RD8 := {(a,?),(i,?),(j,4),(k,1),
           (n,?),(t,3)}

RD9 \neq F9(RD1,...,RD10)  // entry(5)
  = RD8
  = {(a,?),(i,?),(j,4),(k,1),(n,?),(t,3)}
so RD9 := {(a,?),(i,?),(j,4),(k,1),
           (n,?),(t,3)}

R10 \neq F10(RD1,...,RD10)  // exit(5)
  = (RD9 - killRD([k := i*j-1]^5))
    \cup gen([k := i*j-1]^5)
  = ({(a,?),(i,?),(j,4),(k,1),(n,?),(t,3)}
     - ({k} x Lab*?)) \cup {(k,5)}
  = {(a,?),(i,?),(j,4),(k,5),(n,?),(t,3)}
------------------------------------------
    Are there more steps to take?
------------------------------------------
       CHAOTIC ITERATION (CONTINUED)

RD3 \neq F3(RD1,...,RD10)  // entry(2)
  = RD2 \cup RD10
  = {(a,?),(i,?),(j,?),(k,1),
           (n,?),(t,?)}
    \cup {(a,?),(i,?),(j,4),(k,5),
          (n,?),(t,3)}
so RD3 := {(a,?),(i,?),(j,?),(j,4),(k,1),
           (k,5),(n,?),(t,?),(t,3)}

RD4 \neq F4(RD1,...,RD10)  // exit(2)
  = RD3
  = {(a,?),(i,?),(j,?),(j,4),(k,1),
     (k,5)(n,?),(t,?),(t,3)}
so RD4 := {(a,?),(i,?),(j,?),(j,4),(k,1),
           (k,5),(n,?),(t,?),(t,3)}

RD5 \neq F5(RD1,...,RD10)  // entry(3)
  = RD4
  = {(a,?),(i,?),(j,?),(j,4),(k,1),
           (k,5),(n,?),(t,?),(t,3)}
so RD5 := {(a,?),(i,?),(j,?),(j,4),(k,1),
           (k,5),(n,?),(t,?),(t,3)}

RD6 \neq F6(RD1,...,RD10)  // exit(3)
  = (RD5 - killRD([t := a+k]^3))
    \cup genRD([t := a+k]^3)
  = ({(a,?),(i,?),(j,?),(j,4),(k,1),
           (k,5),(n,?),(t,?),(t,3)}
     - ({t} x Lab*?)) \cup {(t,3)}
  = {(a,?),(i,?),(j,?),(j,4),(k,1),(k,5),
     (n,?),(t,3)}
so RD6 := {(a,?),(i,?),(j,?),(j,4),(k,1),
           (k,5),(n,?),(t,3)}

RD7 \neq F7(RD1,...,RD10)  // entry(4)
  = RD6
  = {(a,?),(i,?),(j,?),(j,4),(k,1),
           (k,5),(n,?),(t,3)}
so RD7 := {(a,?),(i,?),(j,?),(j,4),(k,1),
           (k,5),(n,?),(t,3)}

RD8 \neq F8(RD1,...,RD10)  // exit(4)
  = (RD7 - killRD([j := j+1]^4))
    \cup genRD([j := j+1]^4)
  = ({(a,?),(i,?),(j,?),(j,4),(k,1),
           (k,5),(n,?),(t,3)}
     - ({j} x Lab*?)) \cup {(j,4)}
  = {(a,?),(i,?),(j,4),(k,1),(k,5),
      (n,?),(t,3)}
so RD8 := {(a,?),(i,?),(j,4),(k,1),(k,5),
      (n,?),(t,3)}

RD9 \neq F9(RD1,...,RD10)  // entry(5)
  = RD8
  = {(a,?),(i,?),(j,4),(k,1),(k,5),
      (n,?),(t,3)}
so RD9 := {(a,?),(i,?),(j,4),(k,1),(k,5),
      (n,?),(t,3)}
------------------------------------------
        Now is it finished?
        So what is the fixed point?
  3. Very Busy Expressions Analysis (2.1.3)
   a. idea and goals
------------------------------------------
   VERY BUSY (VB) EXPRESSIONS ANALYSIS

def (p. 44): A non-trivial expression e
     is *very busy*
     at exit from a block labeled l if,
     e must always be used before any
     x \in FV(e) is assigned.

Examples:
At what points is a+b very busy in:

  [x := a+b]^1;
  [y := a+b]^2

  if [a-b > a+b]^3
  then [x := a+b]^4 else [y := a+b]^5

  [q := r]^7;
  [z := a+b]^8;

  if [a>b]^9
  then [x := a+b]^10 else [y := a+b]^11

  if [a>b]^12
  then [x := a+b]^13 else [y := 5021]^14

------------------------------------------
------------------------------------------
      VERY BUSY EXPRESSIONS ANALYSIS

Analysis question (p. 44):

 "For each program point,
  which [non-trivial] expressions
  must be very busy
  at the exit from that point."

------------------------------------------
  When should expressions be very busy at the entry to a block?
  Would it make sense to have the analysis go forwards or backwards?
  Is this a "may" or a "must" analysis? What does that mean?
  Where do control flows join in this analysis?
   b. formal definition
------------------------------------------
         FORMAL DEFINITION

VBexit(l) =
  if l \in final(S*) then {}
  else 


VBentry(l) =
 (VBexit(l) \ killVB(B^l)) \cup genVB(B^l)
   where B^l \in blocks(S*)


killVB: Blocks* -> Powerset(Aexp*)

killVB([x:= a]^l) =

killVB([skip]^l) = {}
killVB([b]^l) = {}

genVB: Blocks* -> Powerset(Aexp*)

genVB([x:= a]^l) = Aexp(a)
genVB([skip]^l) = {}
genVB([b]^l) = Aexp(b)

------------------------------------------
   Have we seen these kill and gen functions before?
   Does this analysis need isolated exits? If so, why?
   Do we want the largest or the smallest solution?
   c. example
------------------------------------------
              EXAMPLE

  if [a-b > a+b]^1
  then [x := a+b]^2
  else [y := a+b]^3;
  [z := a]^4

Aexp* is {

l  killVB(l)   genVB(l)
=======================
1
2
3
4

VBentry(1) =
VBexit(1) =
VBentry(2) =
VBexit(2) =
VBentry(3) =
VBexit(3) =
VBentry(4) =
VBexit(4) =
------------------------------------------
      What would change if the condition at label 1 did not use a+b?
------------------------------------------
  FUNCTIONAL REPRESENTING THE EQUATIONS

Recall Aexp* = {a+b}
F(v) = (F1(v), F2(v), ..., F8(v))

F1(VB1,...,VB8)       // entry(1)
 = (VB2 - killVB([a-b > a+b]^1)
   \cup gen([a-b > a+b]^1)
 = (VB2 - {}) \cup {a-b,a+b}
 = VB2 \cup {a-b,a+b}
F2(VB1,...,VB8)       // exit(1)
 = VB3 \cap VB5
 
F3(VB1,...,VB8)       // entry(2)
 = (VB4 - killVB([x:=a+b]^2))
   \cup genVB[x:=a+b]^2))
 = (VB4 - {}) \cup {a+b}
 = VB4 \cup {a+b}
F4(VB1,...,VB8)       // exit(2)
 = VB7

F5(VB1,...,VB8)       // entry(3)
 = (VB6 - killVB([y:=a+b]^3))
   \cup genVB[y:=a+b]^3))
 = (VB6 - {}) \cup {a+b}
 = VB6 \cup {a+b}
F6(VB1,...,VB8)       // exit(3)
 = VB7

F7(VB1,...,VB8)       // entry(4)
 = (VB8 - killVB([z:=a]^4))
   \cup genVB([z:=a]^4)
 = (VB8 - {}) \cup {}
 = VB8
F8(VB1,...,VB8) = {}  // exit(4)
------------------------------------------
    Are larger or smaller sets more precise?
    What is the bad outcome to avoid?
    So what sets do we start with to find a solution?
------------------------------------------
     SOLVING USING CHAOTIC ITERATION

Recall Aexp* = {a+b}

Let VB1 = Aexp*, ..., VB8 = Aexp*

VB8 \neq F8(VB1,...,VB8)     // exit(4)
  = {}
so VB8 := {}

VB7 \neq F7(VB1,...,VB8)     // entry(4)
  = VB8
  = {}
so VB7 := {}

VB6 \neq F6(VB1,...,VB8)     // exit(3)
  = VB7
  = {}
so VB6 := {}

VB5 \neq F5(VB1,...,VB8)     // entry(3)
  = VB6 \cup {a+b}
  = {} \cup {a+b}
  = {a+b}
so VB5 := {a+b}

VB4 \neq F4(VB1,...,VB8)     // exit(2)
  = VB7
  = {}
so VB4 := {}

VB3 \neq F3(VB1,...,VB8)     // entry(2)
  = VB4 \cup {a+b}
  = {} \cup {a+b}
  = {a+b}
so VB3 := {a+b}

VB2 \neq F2(VB1,...,VB8)     // exit(1)
  = VB3 \cap VB5
  = {a+b} \cap {a+b}
  = {a+b}
so VB2 := {a+b}

------------------------------------------
   Now, does VB1 change?
   Can anything else change?
  4. Live Variables Analysis (2.1.4)
   a. idea and goals
------------------------------------------
           LIVE VARIABLES

def (p. 47): A variable x is *live*
     at exit from label l
     iff there is a path from l
     to a possible use of x
     (in an expression).

     This assumes that no variables are
     live at the end of the program.
     
Which variables are live at exit from 1?

(A)
   [x := 3]^1;
   if [z > 0]^2
   then [y := x+2]^3
   else [q := q+1]^4

(B)
   [x := 3]^1;
   [y := x+2]^2;
   [y := y+1]^3

(C)
   [x := 3]^1;
   [z := 4]^2;
   [x := z+2]^3
   while [z > 0]^4
   do ([y := x+2]^5;
       [z := z-1]^6)

------------------------------------------
------------------------------------------
       LIVE VARIABLES (LV) ANALYSIS

Analysis question (p. 47):
 "For each program point,
  which variables may be live
  at the exit from the point."

Example:

   [x := 3]^1;
   [z := 4]^2;
   [x := z+2]^3

------------------------------------------
   Which variables are live at exit of label 1? Label 2?
    What can we use this for?
   What is the bad outcome to avoid?
   What kinds of analysis information should we track in the analysis?
   Should this analysis go backwards or forwards?
   b. definitions and formalization
------------------------------------------
         FORMAL DEFINITION

LVexit(l) =
  if l = final(S*) then {}
  else

LVentry(l) =


kill: Blocks* -> Powerset(Var*)

killLV([x:= a]^l) = 
killLV([skip]^l) = {}
killLV([b]^l) = {}

genLV: Blocks* -> Powerset(Var*)

genLV([x:= a]^l) = 
genLV([skip]^l) = {}
genLV([b]^l) =
------------------------------------------
   Does this analysis need isolated exits? If so, why?
    Why is there a union for LVexit?
------------------------------------------
            EXAMPLE

   [x := 3]^1;
   [z := 4]^2;
   [x := z+2]^3

l  killLV(l)    genLV(1)
==========================
1
2
3

LVentry(1) =
LVexit(1) =
LVentry(2) =
LVexit(2) =
LVentry(3) =
LVexit(3) =
------------------------------------------
------------------------------------------
         THE FUNCTIONAL

F1(LV1,...,LV6)        // entry(1)
 = (LV2 - killLV([x := 3]^1)
   \cup genLV([x := 3]^1)
 = (LV2 - {x}) \cup {}
 = LV2 - {x}
F2(LV1,...,LV6)        // exit(1)
  = LV3

F3(LV1,...,LV6)        // entry(2)
 = (LV4 - killLV([z := 4]^2)
   \cup genLV([z := 4]^3)
 = (LV4 - {z}) \cup {}
 = LV4 - {z}
F4(LV1,...,LV6)        // exit(2)
  = LV5

F5(LV1,...,LV6)        // entry(3)
 = (LV6 - killLV([x := z+2]^3)
   \cup genLV([x := z+2]^3)
 = (LV6 - {x}) \cup {z}
F6(LV1,...,LV6) = {}   // exit(3)
------------------------------------------
   What makes the analysis information unsafe?
   What makes the analysis imprecise?
   Do we want the largest or the smallest solution?
    So, what would be the initial values for LV1,...,LV6?
  5. Derived Data Flow Information (2.1.5)
------------------------------------------
  LINKING DEFINITIONS AND USES

Use-definition (ud) chain:
   links use of var (in an expression)
   to its last assignment

Definition-use (du) chain:
   links last assignment of var
   to a use (in an expression)
------------------------------------------
     What might this be useful for?
   a. formal definitions
------------------------------------------
           DEFINITIONS AND USES

def (p., 50):
  (l1, ..., ln) is a
  *definition clear path for x* iff
   1. no block labeled {l1, ..., l(n-1)}
      assigns a value to x, and
   2. the block labeled ln uses x
      (as an expression)

  clear(x, l, l') =
    (\exists l1, ..., ln ::
         l = l1 & l' = ln & n > 0
       & (\forall i : 1 <= i < n :
          (li, l(i+1)) \in flow(S*))
       & (\forall i : 1 <= i < n :
          not(def(x, li)))
       & use(x, ln))

  def(x, l) = (\exists B :
                 [B]^l \in blocks(S*) :
                  x \in killLV([B]^l))

  use(x, l) = (\exists B :
                 [B]^l \in blocks(S*) :
                  x \in genLV([B]^l))

------------------------------------------
     How do you interpret the notion of a definition clear path for x?
     Why are the def and use functions correct?
     Does clear(y, 3, 7) tell you anything about the use of y?
------------------------------------------
           UD AND DU ANALYSIS

ud: Var* x Lab* -> Powerset(Lab*?)
ud(x, l') =
   {l | def(x, l),
       (\exists l2 : (l, l2) \in flow(S*):
                     clear(x, l2, l'))}
  \cup
   {? | clear(x, init(S*), l')}

du: Var* x Lab*? -> Powerset(Lab*)
du(x, l) =
 if l != ?
 then
   {l'| def(x, l),
       (\exists l2 : (l, l2) \in flow(S*):
                     clear(x, l2, l'))}
 else
   {l'| clear(x, init(S*), l')}

------------------------------------------
     What does ud(x, l') = {l1, l2, l3} mean?
     What does du(x, l) = {l1, l2, l3} mean?
     Can ud(x,l') be empty?
     What would it mean if du(x,l) is empty?
     What is the analysis domain?
     Do these require isolated entries?
     Are these must or may analyses?
     So what would a bad outcome be for a UD or DU analysis?
     Would we want the largest or smallest solution for a UD or DU analysis?
     Can we define du in terms of ud?
   b. example
------------------------------------------
        EXAMPLE

   [z := 3]^1;
   if [y > 0]^2
   then [y := z+2]^3
   else [y := y+1]^4

    ud(x, l)
l \ x |   y        z
========================
1
2
3
4

    du(x, l)
 l \ x |   y    z
=====================
   1
   2
   3
   4
------------------------------------------
   c. computation
    How could we use RD and LV to compute ud chains?
II. Monotone Frameworks (2.3)
  Can we identify some commonalities between different analyses?
  Would doing that help implement them?
 A. general pattern
------------------------------------------
          GENERAL PATTERN

A_o(l) = if l \in E then i
         else
         

A_.(l) = 

where

  \bigsqcup is either \bigcup or \bigcap

  F is either flow(S*) or flow^R(S*)

  E is {init(S*)} or final(S*)

  i is initial/final information

  f(l) is the transfer function for
       blocks B^l \in blocks(S*)


For a may analysis:
  \bigsqcup is
For a must analysis:
  \bigsqcup is

For a forward analysis:

  F is flow(S*)
  E is 
  A_o gives the entry conditions
  A_. gives the exit conditions

For a backward analysis:

  F is flow^R(S*)
  E is 
  A_o gives the exit conditions
  A_. gives the entry conditions

------------------------------------------
 B. basic definitions (2.3.1)
  1. property space
------------------------------------------
         PROPERTY SPACES

def (p. 65): a *property space*,
     L = (L, \bigsqcup), is a set with
     \bigsqcup: Powerset(L) -> L
     a join operation (\sqcup)
     that makes it a complete lattice.

Thus:

  l1 \sqcup l2 = \bigsqcup { l1, l2 }

  \bot = \bigsqcup {}

  l1 \sqsubseteq l2 = (l1 \sqcup l2 = l2)

Examples:

 For reaching definitions:

     L = Powerset(Var* x Lab^?_*)
     \sqcup = \cup
     \sqsubseteq = \subseteq

 For available expressions:

     L = Powerset(AExp*)
     \sqcup = \cap
     \sqsubseteq = \supseteq

------------------------------------------
   a. lattice theory
    i. partial orders, lub, joins, lattices
------------------------------------------
  POSETS, LUBS, AND LATTICES (APPENDIX A)

def: A *partially ordered set (poset)*
     is a pair (L, <=) of a set, L,
     together and a binary relation, <=,
     on L such that for all x,y,z in L:
  (reflexive)     x <= x,
  (antisymmetric) x <= y and y <= x
                    implies x = y,
  (transitive)    x <= y and y <= z
                    implies x <= z.

def: for all x,y,u in L,
     u is an *upper bound of x and y* iff
     x <= u and y <= u.
     
def: for all x,y,u in L,
     u is the *least upper bound
     of x and y* written lub(x,y) = u
     iff u is an upper bound of x and y
     and z such that z is an upper bound
                     of x and y, u <= z.

def: For all X, Y in L,
     the *join of X and Y*,
     is X \sqcup Y = lub(X,Y).
     This is also written as sup(X,Y).

------------------------------------------
     does the lub of two elements x,y in L need to be in the set L?
------------------------------------------
LATTICES, COMPLETE LATTICES (2.3.1, A.1-2)

def: A *lattice* is a poset in which
     each pair of elements
     has a least upper bound.

def: A *complete lattice* is a lattice L
     such that every nonempty subset of L
     has a least upper bound (in L).
------------------------------------------
   So, what is a property space?
    ii. meets, glbs
------------------------------------------
  GLBS and COMPLETE LATTICES (APPENDIX A)

Assume (L, <=) is a poset.

def: for all x,y,n in L,
     n is a *lower bound of x and y* iff
     n <= x and n <= y.
     
def: for all x,y,n in L,
     n is the *greatest lower bound
     of x and y* written glb(x,y) = n
     iff n is a lower bound of x and y
     and for all z
     such that z is a lower bound
                      of x and y, z <= n.

def: For all X, Y in L,
     the *meet of X and Y*,
     is X \sqcap Y = glb(X,Y).
     This is also written as inf(X,Y).

Thm: If (L, \sqsubseteq) is a lattice,
     then the following hold
     for all x, y, z in L:
      (commutative)
                  x \sqcup y = y \sqcup x,
      (associative)
         x \sqcup (y \sqcup z)
           = (x \sqcup y) \sqcup z,
      (absoption)
         x \sqcup (y \sqcap x)
           = (x \sqcup y) \sqcap x
           = x
------------------------------------------
   b. chains and the ascending chain condition
------------------------------------------
  CHAINS, ASCENDING CHAIN CONDITION (A.3)

def: A subset Y of a poset (L,\sqsubseteq)
     is a *chain* iff
        for all l_1, l_2 in Y,
           l_1 \sqsubseteq l_2
           or l_2 \sqsubseteq l_1
           
def: An *ascending chain* in L
     is a sequence (l_n)_{n \in N}
 such that each l_i in L
 and for all n,m <= N,
          n <= m ==> l_n \sqsubseteq l_m.

def: a sequence (l_n)_{n \in N}
     *stabilizes* iff there is some
     number n0 such that for all n >= n0,
             l_n = l_n0.
             
def: L *satisfies the ascending chain
       condition* iff
       every ascending chain in L
       eventually stabilizes.

def: a *domain* is a lattice
     that satisfies the
     ascending chain condition. 

------------------------------------------
   If a chain is finite, does it stabilize?
   Is it necessary for lattice to be finite
      in order to satisfy the ascending chain condition?
  2. transfer functions
------------------------------------------
          TRANSFER FUNCTION SPACE

def: Let L be a partially-ordered set.
     Then Funs is a 
     *transfer function space for L* iff

     f \in Funs
         ==> f : L -> L and f is monotone,

     f,g \in Funs ==> f o g \in Funs, and

     id_L \in Funs.
------------------------------------------
  3. monotone framework
------------------------------------------
         MONOTONE FRAMEWORK

def: (L, Funs) is a monotone framework
     iff L is a property space and Funs
     is a transfer function space for L.

def: (L, Funs, F, E, i, f_.) is an
 *instance of a monotone framework*
 if and only if:

 - (L, Funs) is a monotone framework,

 - F is a finite set of pairs (of flows),

 - E is a finite set of extremal labels,

 - i \in L is an extremal value,

 - f : (dom(F) \cup E) -> (L -> L)
        s.t. for l in (dom(F) \cup E)
              f_l \in Funs
------------------------------------------
 C. examples (2.3.2)
 D. predicate abstraction (new topic)
------------------------------------------
          PREDICATE ABSTRACTION

Goal: verify program properties
Idea: Use property space of the form

  L = Powerset(Preds)

  Preds = {P1, ..., Pn}
     where each Pi is a nullary predicate

Interpretation:

  {P3,P5} means P3 and P5 may (must) be true
      (depending on kind of analysis)

  \sqcup is \cup (or \cap)

  \sqsubseteq is \subseteq (or \supseteq)

  Funs = monotonic (in \sqsubseteq)
          functions on L
------------------------------------------
  For a may analysis, what's the bottom element?  The top?
  How can you represent states?
------------------------------------------
       PREDICATE ABSTRACTION EXAMPLE

IsZero Analysis:

    At a given program point,
    which variables may be 0.

"Abstract States"
  s \in L = Powerset(Preds)

  where
   Preds = {IsZero_x | x in Vars*}

   IsZero_y means y may be 0
   
 F is flow(S*)
 E is {init(S*)}
 i is Preds

 fIZ(l) : L -> L, for l in Lab*

 fIZ(l)(s) = (s \ kill_IZ(B^l)(s)) \cup gen_IZ(B^l)(s)
                where B^l in blocks(S*)

  kill_IZ([x := a]^l)(s) = {IsZero_x}
  kill_IZ([skip]^l)(s) = {}
  kill_IZ([b]^l)(s) = {}

  gen_IZ([x := a]^l)(s) = 
        {IsZero_x | (\exists cs \in \gamma(s) :: A[[a]]cs == 0)}
  gen_IZ([skip]^l)(s) = {}
  gen_IZ([b]^l)(s) = {}

  \gamma: L -> Store
  \gamma(s) = {cs | cs: Var* -> Int, IsZero_x \in s ==> cs(x) == 0}
 
------------------------------------------
        What kind of analysis is this?
        Why this initial value i?
        What do the gen and kill functions do?
        What are the equations for IZ_entry(l) and IZ_exit(l)?
------------------------------------------
          EXAMPLE

  [y := 3]^1;
  while [y>0]^2
  do ([q := y-2]^3;
      [y := y-1]^4);
  [q := q div y]^5

Var* = {y, q}
Preds = {IsZero_y, IsZero_q}

IZ_entry(1) = 
IZ_exit(1) = 
IZ_entry(2) =
IZ_exit(2) = 
IZ_entry(3) = 
IZ_exit(3) = 
IZ_entry(4) = 
IZ_exit(4) = 
IZ_entry(5) = 
IZ_exit(5) = 

------------------------------------------
 E. equation solving (2.4)
  1. MFP (Maximal Fixed Point) solution (2.4.1)
  2. MOP solution (2.4.2) (skip)