COP 5021 Lecture -*- Outline -*-

	meeting 1

* A View of Type Checking as Abstract Interpretation

    This lecture is from...
------------------------------------------
            SOURCE

Patrick Cousot.
Types as abstract interpretations.
In POPL'97, pp. 316-331, 1997.
https://doi.org/10.1145/263699.263744


------------------------------------------
   The Idea is to 
      Use what we understand
      about type checking
      to understand abstract interpretation

** Introduction
------------------------------------------
              INTRODUCTION

Idea of abstract interpretation:

  Program semantics,
  Program proof methods,
  Static analysis techniques,

all have a common structure
that


Slogan:


------------------------------------------
   ... abstracts the structure of a computation.

   ... It's all "semantic approximation".

** Example: Call-by-Value Lambda Calculus
    It's call-by-value because it evaluates arguments before calls,
    so it's "eager".

*** Denotational Semantics
    Q: What does a denotational semantics look like?
       It has abstract syntax, domains, and semantic functions,
       so it's like an interpreter written in a functional language.

    Q: Why not use operational semantics?
       - Denotational semantics was used in Milner's classic paper
       - Denotational semantics is an initial abstraction of the
          computation, where individual steps are ignored.
       - The denotational semantics is itself an application
         of abstract interpretation to the operational semantics.
       
**** abstract syntax
------------------------------------------
     ABSTRACT SYNTAX OF LAMBDA CALCULUS

  x, f, ... \in X    "program variables"

          e \in E          "expressions"

      e ::= x | fn x ==> e | e1(e2)
          | rec f . fn x ==> e
          | 1 | e1 - e2 | (e1 ? e2 : e3)
------------------------------------------

        where "in rec f . fn x ==> e the function f with formal
        parameter x is defined recursively. [And] (e1 ? e2 : e3) is
        the test for [the integer] zero"

**** semantic domains
------------------------------------------
         NOTATION FOR SEMANTIC DOMAINS

\bot  denotes non-termination (looping)
wrong denotes a type error

D_{\bot} is the domain D \cup {\bot}
  with up: D -> D_{\bot} defined by
       up(d) = d
     down: D_{\bot} -> D defined by
     down(up(d)) = d
     down(\bot) is undefined

\Omega = up(wrong)

+ forms a "Coalesced sum of domains"
 (. :: D1)  : D1 -> D1 + D2
 (. :: D2)  : D2 -> D1 + D2

[. -> .] forms "continuous"
   \bot-strict and wrong-strict
   functions

   So if f \in [D1 -> D2],
   then f(\bot) = \bot
   and  f(wrong) = wrong
------------------------------------------

        Q: What does strictness mean?
           It means that functions propagate these outcomes: if the
           argument is \bot (or wrong), then so is the result
        Q: Is it possible to tell if a computation produces an infinite loop?
           No, but we describe that in the math
           
------------------------------------------
               SEMANTIC DOMAINS

Following Gunter and Scott:

  wrong \in W                "type errors"
      z \in Z                   "integers"
       
   u, f \in U =iso=               "values"
     W_{\bot} + Z_{\bot} + [U -> U]_{\bot}

     r \in |R = X -> U      "environments"
   \phi \in S = |R -> U  "semantic domain"
------------------------------------------

     the notation =iso= means "is isomorphic to".
     Cousot uses a fancy typeset notation for this,
        because the domain is recursively defined.
     
**** denotational semantics with call-by-value, runtime type checking

------------------------------------------
    NOTATION FOR DENOTATIONAL SEMANTICS

 r[x <- u](y) = if y == x
                then u
                else r(y)

 lfp : (L -> L) -> L
   is least fixed point operator
   on domain L
------------------------------------------
   lfp(F) is the least fixed point in L's ordering,
      which is defined when F is monotone

   This is what we want for recursive definitions of functions
      (following Scott's work)
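
   A minimal sketch of how lfp can be computed by iterating from the
   bottom element. The names lfp, fact_functional, and LIMIT are
   hypothetical (they are not in Cousot's paper), partial functions are
   modeled as Python dicts ordered by inclusion, and the argument bound
   LIMIT is an assumption that keeps the chain finite.

```python
def lfp(F, bottom):
    """Iterate F from bottom until the value stops changing."""
    x = bottom
    while True:
        nxt = F(x)
        if nxt == x:
            return x
        x = nxt

LIMIT = 6  # hypothetical bound on arguments, so the iteration terminates

def fact_functional(phi):
    """One unfolding of fact(n) = (n ? n * fact(n - 1) : 1), zero test reversed for brevity.

    phi is a partial function (a dict); the empty dict plays the role of
    the everywhere-undefined function, i.e. the bottom element.
    """
    out = {0: 1}
    for n in range(1, LIMIT):
        if n - 1 in phi:
            out[n] = n * phi[n - 1]
    return out

FACT = lfp(fact_functional, {})
print(FACT[5])  # 120
```

   Each iteration defines the function on one more argument, which is
   the usual picture of a recursive definition as the limit of its
   finite unfoldings.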
      
------------------------------------------
    (DENOTATIONAL) SEMANTIC FUNCTIONS

 S: E -> S

 S[[x]](r) = r(x)
 
 S[[fn x ==> e]](r)
   =


 S[[e1(e2)]](r)
   = let f = S[[e1]](r)
     in if f = \bot or f = \Omega
        then f
        else (down(f))(S[[e2]](r))

 S[[rec f . fn x ==> e]](r)
   = 


 S[[1]](r) = up(1)::Z_{\bot}

 S[[e1 - e2]](r)
   = let z1 = S[[e1]](r) in
     let z2 = S[[e2]](r) in
        if z1 = \bot or z2 = \bot
        then \bot
        else if z1 = \Omega or z2 = \Omega
             then \Omega
             else up(down(z1) - down(z2))
                    :: Z_{\bot}

 S[[e1 ? e2 : e3]](r)
   = let u1 = S[[e1]](r)
     in if u1 = \bot
        then \bot
        else if u1 = \Omega
                or down(u1) \not\in Z
             then \Omega
             else if down(u1) = 0
                  then S[[e2]](r)
                  else S[[e3]](r)
------------------------------------------
  ... up(lambda u .
        if u = \bot or u = \Omega
        then u
        else S[[e]](r[x <- u]))
          :: [U -> U]_{\bot}

        Why the call to up in the semantics of (fn x ==> e)?
            Because we need to inject this into a lifted domain (with \bot)
        Why is ::[U -> U]_{\bot} at the end?
            To inject the lifted value into the given summand of U.
        These questions are the same as for the semantics of 1.

  ... lfp(lambda \phi . S[[fn x ==> e]]
                        (r[f <- \phi]))

      Q: If you were implementing this, how would you implement tests
         such as z1 = \bot, found in the semantics of e1 - e2 ?
         You would just evaluate the semantics of e1,
         and if it goes into an infinite loop,
         then the interpreter goes into an infinite loop...
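
      The semantic functions can be read as an interpreter. The following
      is only a sketch under stated assumptions: the tuple encoding of
      expressions and the names ev, WRONG, COUNTDOWN, and TWO are
      hypothetical; \bot is modeled by Python itself not terminating;
      rec uses Python recursion instead of an explicit lfp; and, unlike
      the slide, applying a non-function or subtracting non-integers
      yields wrong.

```python
class Wrong:
    """The single runtime type error value (written \\Omega in the notes)."""
    def __repr__(self):
        return "wrong"

WRONG = Wrong()

def ev(e, r):
    """Evaluate expression e in environment r (a dict from names to values)."""
    tag = e[0]
    if tag == "var":                       # ("var", x)
        return r[e[1]]
    if tag == "one":                       # ("one",)
        return 1
    if tag == "fn":                        # ("fn", x, body)
        _, x, body = e
        # wrong-strict: a wrong argument is returned without running the body
        return lambda u: u if u is WRONG else ev(body, {**r, x: u})
    if tag == "app":                       # ("app", e1, e2)
        f = ev(e[1], r)
        if f is WRONG or not callable(f):  # assumption: non-functions go wrong
            return WRONG
        return f(ev(e[2], r))              # call-by-value: argument first
    if tag == "rec":                       # ("rec", f, x, body)
        _, f, x, body = e
        def phi(u):
            return u if u is WRONG else ev(body, {**r, f: phi, x: u})
        return phi
    if tag == "sub":                       # ("sub", e1, e2)
        z1, z2 = ev(e[1], r), ev(e[2], r)
        if isinstance(z1, int) and isinstance(z2, int):
            return z1 - z2
        return WRONG
    if tag == "if0":                       # ("if0", e1, e2, e3)
        u1 = ev(e[1], r)
        if not isinstance(u1, int):
            return WRONG
        return ev(e[2], r) if u1 == 0 else ev(e[3], r)
    raise ValueError(tag)

ONE = ("one",)
ZERO = ("sub", ONE, ONE)
TWO = ("sub", ONE, ("sub", ZERO, ONE))
# rec f . fn x ==> (x ? 1 : f(x - 1))
COUNTDOWN = ("rec", "f", "x",
             ("if0", ("var", "x"), ONE,
              ("app", ("var", "f"), ("sub", ("var", "x"), ONE))))

print(ev(("app", COUNTDOWN, TWO), {}))  # 1
print(ev(("app", ONE, ONE), {}))        # wrong
```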

*** Collecting Semantics
------------------------------------------
   ELEMENTS OF ABSTRACT INTERPRETATION

An abstract interpretation has:

  - a collecting semantics, used for


  - an approximation, used for

  
  - relations between these, used for


The analysis is an approximation
   of the collecting semantics
   
------------------------------------------
      ... (clearly) specifying the semantics

      ... the analysis, computation

      ... correctness arguments
          and to derive the analysis

------------------------------------------
        SEMANTIC PROPERTIES

def: A *semantic property* is a set of
     semantic values

     P \in Prop = PowerSet(S)

It is a complete lattice
  (partially ordered by \subseteq)

------------------------------------------
   Note that \subseteq on properties plays the role of logical implication:
   P \subseteq Q means that every value with property P also has property Q

------------------------------------------
    COLLECTING SEMANTICS FOR THE EXAMPLE

Standard collecting semantics:

   C : E -> Prop
   C[[e]] = {S[[e]]}


------------------------------------------
        Note that S[[e]] is a function
        that takes the environment as an argument

        The standard collecting semantics
        gives the strongest program property.

** Church/Curry Monotype Semantics

    This is a simple example
    to help us focus on the abstract interpretation part

*** Type Expressions and Semantic Domains
------------------------------------------
   TYPE EXPRESSIONS AND SEMANTIC DOMAINS

     m \in M                    "monotype"
     H \in |H           "type environment"
             = X ->_{fin} M   
\theta \in I = |H x M             "typing"
    T \in PT = PowerSet(I)  "program type"

m ::= int | m1 -> m2
------------------------------------------
         X ->_{fin} M is the set of finite (partial) functions from X to M
         
       Q: What's the opposite of a monotype?
           A polymorphic type
       Q: Does this syntax allow types like (int->int) -> (int->int)?
           Yes, so they can be higher order
       Q: What's not allowed?
           polymorphic types like all 'a . 'a -> 'a

*** Meanings of Types, Concretization
    Outside of abstract interpretation,
    the following would be called "the semantics of types"
------------------------------------------
         MEANINGS OF TYPES

concretization function for monotypes:

          g1: M -> PowerSet(U)

g1(int) = {up(z)::Z_{\bot} | z \in Z}
             \cup {\bot}

g1(m1 -> m2)
     = {up(\phi)::[U->U]_{\bot}
          | \phi \in [U -> U],
              (\forall u \in g1(m1) ::
                    \phi(u) \in g1(m2))}
                \cup {\bot}


for type environments:

          g2: |H -> PowerSet(|R)

g2(H) = {r \in |R | (\forall x \in X ::
                      r(x) \in g1(H(x)))}


for typings:

          g3 : I -> Prop

g3((H,m))
  = {\phi \in S |
     (\forall r \in g2(H)
                    :: \phi(r) \in g1(m))}


for program types:

            g : PT -> Prop

g(T) = \bigcap_{\theta \in T} g3(\theta)

g(\emptyset) = S

------------------------------------------
        The last one is the concretization function
        that is used for type checking,
        the others are auxiliary functions

*** Galois connection

    Named after Evariste Galois,
    these generalize the fundamental theorem of Galois theory
    (that relates groups and fields).

    Galois connections are used to prove correctness
       (for abstractions, like abstract interpretations)
    and to derive computations.
    
------------------------------------------
        DEFINITION OF GALOIS CONNECTION

Def: Suppose that L is
       partially ordered by <=,
   and M is
       partially ordered by \sqsubseteq,
   a : L -> M, and
   g : M -> L,
then (a,g) is a *Galois connection* iff
   for all l \in L, m \in M ::
      a(l) \sqsubseteq m
    iff
      l <= g(m)

Corollary (p. 232 in the textbook):
    Suppose that (a,g) is a
    Galois connection between (L,<=)
    and (M,\sqsubseteq),
    then for all l \in L, m \in M::
       l <= g(a(l))                 (4.8)
    and
       a(g(m)) \sqsubseteq m        (4.9)

Corollary:
   If (a,g) is a Galois connection,
   then a preserves existing lubs
   and g preserves existing glbs
------------------------------------------
        a and g are usually typeset as \alpha and \gamma

        Q: What's an "existing lub"?
           A lub that actually exists in the poset; in a general poset
           not every subset has a least upper bound

        Conditions 4.8 and 4.9 say that one does not lose safety
        by changing domains, although one may lose precision.
        4.8 says that if we abstract by a, to get a(l),
        then we know that g(a(l)) is a safe approximation of l
        (via <=), although it may be less precise.
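
        A small brute-force check of the definition, as a sketch only.
        The five-element sign lattice, the universe {-2,...,2}, and the
        names GAMMA, leq, a, g, and subsets are all assumptions made for
        illustration; they are not the type abstraction of the paper.

```python
from itertools import combinations

UNIVERSE = (-2, -1, 0, 1, 2)       # hypothetical finite stand-in for Z
SIGNS = ("bot", "neg", "zero", "pos", "top")
GAMMA = {
    "bot": frozenset(),
    "neg": frozenset({-2, -1}),
    "zero": frozenset({0}),
    "pos": frozenset({1, 2}),
    "top": frozenset(UNIVERSE),
}

def leq(m1, m2):
    """The abstract ordering: bot below everything, top above everything."""
    return m1 == "bot" or m2 == "top" or m1 == m2

def g(m):
    """Concretization: the set of integers that a sign describes."""
    return GAMMA[m]

def a(l):
    """Abstraction: the least sign whose concretization contains l."""
    for m in SIGNS:                # bot first, top last
        if l <= g(m):
            return m

def subsets(xs):
    return [frozenset(c) for k in range(len(xs) + 1)
            for c in combinations(xs, k)]

ALL_L = subsets(UNIVERSE)
ADJOINT_OK = all(leq(a(l), m) == (l <= g(m)) for l in ALL_L for m in SIGNS)
EXTENSIVE = all(l <= g(a(l)) for l in ALL_L)        # property (4.8)
REDUCTIVE = all(leq(a(g(m)), m) for m in SIGNS)     # property (4.9)
print(ADJOINT_OK, EXTENSIVE, REDUCTIVE)  # True True True
```

        For instance a({-1, 1}) is "top", so g(a({-1, 1})) is the whole
        universe: still a safe description of {-1, 1}, but less precise.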
           
------------------------------------------
       GALOIS CONNECTION PROPERTIES

Lemma:
    g(\bigcup_{i \in \Delta} T_i)
  = \bigcap_{i \in \Delta} g(T_i).

Corollary.
 If g is such that
    g(\bigcup_{i \in \Delta} T_i)
  = \bigcap_{i \in \Delta} g(T_i),
 then there is a unique (lower) adjoint

     a : Prop -> PT

such that (a, g) is a Galois connection.
------------------------------------------
        The lemma means that the concretization of a union of
        program types is the intersection of their concretizations:
        adding typings to a program type can only shrink the set of
        semantic values that it describes

        In our example:
            a maps sets of (semantic) values to types
        and g maps types to sets of values

        So in our example, the lemma says that
        if one takes the union of several types
        then the meaning of that is the set of values that have
        each of those types

        The following may help to understand this...
------------------------------------------
            CONSEQUENCES

If P is a property (set of values),
then a(P) is the most precise type
     of programs with property P

Implication between program properties
is abstracted by superset inclusion:

 E.g.,  let oddInt = {1, 3, 5, ...}
            int = {1,2,3,4, ...}

   then oddInt \subseteq int
   and g(int) \supseteq g(oddInt)

------------------------------------------
   Note that oddInt \subseteq int
      means that oddInt is a subtype (subset) of int
   Note that the last means g(oddInt) \subseteq g(int)
       (which also follows from the Galois connection)

*** Monotype semantics
     This is on p. 318 of Cousot's paper
------------------------------------------
          MONOTYPE SEMANTICS

  T : E -> PT

T[[x]] = {(H,H(x)) | H \in |H}

T[[fn x ==> e]]
    =


T[[e1(e2)]]
    = {(H, m2) | (H, m1->m2) \in T[[e1]],
                 (H, m1) \in T[[e2]]}

T[[rec f . fn x ==> e]]
    = 


T[[1]] = {(H,int) | H \in |H }

T[[e1 - e2]]
    = {(H,int) |
         (H,int) \in T[[e1]] \cap T[[e2]]}

T[[(e1 ? e2 : e3)]]
    = {(H,m) | (H,int) \in T[[e1]],
           (H,m) \in T[[e2]] \cap T[[e3]]}
                
------------------------------------------
    ...  {(H, m1->m2) |
            (H[x<-m1],m2) \in T[[e]]}

    ...  {(H,m) | (H[f<-m],m)
                \in T[[fn x ==> e]]}

        Q: What do the typings for e1-e2 and (e1?e2:e3) mean?
           That it's int (for e1-e2) and m (for (e1?e2:e3))
           for all type environments that make e1 and e2 have type int
            (for e1-e2)
            and e2 and e3 have the same type m (for (e1?e2:e3)).

        (Cousot's paper shows how to interpret this as a set of
        natural deduction rules)

        Could compare to the rules, e.g.,

          H |- x => H(x)  which means T[[x]] = {(H,H(x))}

          H |- e1 => m1 -> m2,
          H |- e2 => m1
    [app] ----------------------- 
          H |- e1(e2) => m2

           which means T[[e1(e2)]] = {(H,m2) | (H,m1->m2) \in T[[e1]],
                                               (H,m1) \in T[[e2]]}

          etc.

          So the typings are the pairs of the type environment and
          (mono)type from the corresponding judgment,
          and the hypotheses are the conditions in the set comprehension.
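
          The set comprehensions can be read directly as a membership
          test for (H,m) \in T[[e]]. This is a sketch, not Cousot's
          algorithm: the tuple encoding of expressions and the names
          has_type, monotypes, and CANDIDATES are hypothetical, and the
          argument type m1 in the application rule is guessed only from
          a finite slice of M (an assumption, so the test is incomplete
          for deeper types).

```python
from itertools import product

INT = "int"

def arrow(m1, m2):
    return ("->", m1, m2)

def monotypes(depth):
    """All monotypes with at most `depth` nested arrows (a finite slice of M)."""
    if depth == 0:
        return [INT]
    smaller = monotypes(depth - 1)
    return [INT] + [arrow(a, b) for a, b in product(smaller, repeat=2)]

CANDIDATES = monotypes(2)

def has_type(e, H, m):
    """Decide (H, m) in T[[e]], guessing argument types from CANDIDATES only."""
    tag = e[0]
    if tag == "var":                        # T[[x]] = {(H, H(x))}
        return H.get(e[1]) == m
    if tag == "one":
        return m == INT
    if tag == "fn":                         # (H[x<-m1], m2) in T[[e]]
        _, x, body = e
        return m != INT and has_type(body, {**H, x: m[1]}, m[2])
    if tag == "app":                        # exists m1 with both hypotheses
        return any(has_type(e[1], H, arrow(m1, m)) and has_type(e[2], H, m1)
                   for m1 in CANDIDATES)
    if tag == "rec":                        # (H[f<-m], m) in T[[fn x ==> e]]
        _, f, x, body = e
        return has_type(("fn", x, body), {**H, f: m}, m)
    if tag == "sub":
        return m == INT and has_type(e[1], H, INT) and has_type(e[2], H, INT)
    if tag == "if0":
        return (has_type(e[1], H, INT) and has_type(e[2], H, m)
                and has_type(e[3], H, m))
    raise ValueError(tag)

ONE = ("one",)
IDENT = ("fn", "x", ("var", "x"))
COUNTDOWN = ("rec", "f", "x",
             ("if0", ("var", "x"), ONE,
              ("app", ("var", "f"), ("sub", ("var", "x"), ONE))))

print(has_type(IDENT, {}, arrow(INT, INT)))        # True
print(has_type(COUNTDOWN, {}, arrow(INT, INT)))    # True
print(any(has_type(("app", ONE, ONE), {}, m) for m in CANDIDATES))  # False
```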
        
------------------------------------------
           SOUNDNESS

def:
 The type semantics T[[.]] is *sound* iff
 for all programs e \in E ::
    S[[e]] \in g(T[[e]])

Corollary: (With the conventions above)
    S[[e]] \in g(T[[e]])
iff a(C[[e]]) \supseteq T[[e]]
iff C[[e]] \subseteq g(T[[e]]).

Corollary (soundness):
   Let e \in E be a program,
   and let H be a type environment.
   Then typeable programs cannot go wrong
   in the sense that:
(H,m) \in T[[e]] and r \in g2(H)
                 and S[[e]](r) \neq \bot
implies
  S[[e]](r) \neq \Omega

------------------------------------------
        Q: Where are the runtime type errors in the statement of soundness?
           They would be possible in S[[e]](r)

        Q: How could this theorem be proved using the Galois connection?
        Proof: S[[e]](r) \in g1(m) and \Omega \not\in g1(m).

** Example 2: Polymorphism
   Cousot calls this the "Church/Curry Polytype Semantics"

   goal: allow for recursive calls to be assigned different monotypes,
          as in SML (i.e., as in Milner's work).

*** Polytype Syntax
------------------------------------------
         POLYTYPE SYNTAX

    m \in M                     "monotype"
    m ::= int | m1 -> m2

    p \in |P = PowerSet(M)      "polytype"

    H \in |H            "type environment"
            = X ->_{fin} |P

\theta \in I = |H x M             "typing"

   T \in PT = PowerSet(I)   "program type"


------------------------------------------
        The sets of monotypes in a polytype are thought of as
        conjunctions. I.e., if f has polytype
               {int -> int, (int->int) -> (int->int)}
               then it has both of those types.

        (This makes sense with the Galois connection, as the meaning
        of a union of types is the intersection of those meanings.)

*** Meanings of Types, Concretization
------------------------------------------
         MEANINGS OF TYPES

concretization function for monotypes:

          g1: M -> PowerSet(U)

g1(int) = {up(z)::Z_{\bot} | z \in Z}
             \cup {\bot}

g1(m1 -> m2)
     = {up(\phi)::[U->U]_{\bot}
          | \phi \in [U -> U],
              (\forall u \in g1(m1) ::
                    \phi(u) \in g1(m2))}
            \cup {\bot}


for polytypes:

          g2: |P -> PowerSet(U)

 g2(p) = \bigcap_{m \in p} g1(m)

 g2(\emptyset) = U
 

for type environments:

          g3: |H -> PowerSet(|R)

g3(H) = {r \in |R | (\forall x \in X ::
                      r(x) \in g2(H(x)))}


for typings:

          g4 : I -> Prop

g4((H,m))
  = {\phi \in S |
     (\forall r \in g3(H)
                    :: \phi(r) \in g1(m))}


for program types:

            g : PT -> Prop

g(T) = \bigcap {g4(\theta) | \theta \in T}

g(\emptyset) = S

------------------------------------------
          Recall that Prop = PowerSet(S), so properties are sets of
          values and the meaning of a program typing is thus a set of
          values (as before)

          Q: Do we get a Galois connection?
             Yes, with a(P) being the most precise type of programs
             with property P.

*** Polymorphic Typing
------------------------------------------
          POLYMORPHIC TYPING

   T : E -> PT

T[[x]] = {(H,m) | m \in H(x)}

T[[fn x ==> e]]
    = {(H, m1->m2)
        | (H[x<-{m1}],m2) \in T[[e]]}

T[[e1(e2)]]
    = {(H,m2) | (H,m1->m2) \in T[[e1]],
                (H,m1) \in T[[e2]]}

T[[let x = e1 in e2]]
    = {(H,m2) |
         (\exists p1 \neq {} ::
          (\forall m1 \in p1 ::
             (H,m1) \in T[[e1]]) and
          (H[x<-p1],m2) \in T[[e2]])}

T[[rec f . fn x ==> e]]
    = 


T[[1]] = {(H,int) | H \in |H }

T[[e1 - e2]]
    = {(H,int) |
         (H,int) \in T[[e1]] \cap T[[e2]]}

T[[(e1 ? e2 : e3)]]
    = {(H,m) | (H,int) \in T[[e1]],
           (H,m) \in T[[e2]] \cap T[[e3]]}

------------------------------------------

    ...  {(H,m) | m \in gfp \Psi }
          where
            \Psi(p)
            = {m'| (H[f<-p],m')
                    \in T[[fn x ==> e]]}

          What does the gfp mean? The greatest fixed point,
          so the whole thing is:

T[[rec f . fn x ==> e]]
    = {(H,m) |
        (\exists p \subseteq M -> M
         :: m \in p and
         (\forall m' \in p ::
              (H[f<-p],m')
                \in T[[fn x ==> e]]))}


    Q: Why require that p1 \neq {} in the typing of let?
       If we allowed p1 = {}, then
            let x = e1 in 1 would always be typeable,
            since x is not used in 1, but this is not sound if
            we want let x = e1 in 1 to be equal to (fn x ==> 1)(e1),
            which will go wrong if e1 goes wrong (in call-by-value).
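
    A sketch of the polytype rules as a membership test, under stated
    assumptions: the encoding and the names has_type, types_of, and
    CANDIDATES are hypothetical; polytypes are restricted to subsets of
    a finite slice of M; for let, the largest admissible p1 within that
    slice is used (enough because typing is monotone in H); and for rec,
    gfp \Psi is computed by iterating \Psi downwards from the whole slice.

```python
from itertools import product

INT = "int"

def arrow(m1, m2):
    return ("->", m1, m2)

def monotypes(depth):
    if depth == 0:
        return [INT]
    smaller = monotypes(depth - 1)
    return [INT] + [arrow(a, b) for a, b in product(smaller, repeat=2)]

CANDIDATES = monotypes(2)   # hypothetical finite slice of M

def types_of(e, H):
    """The candidate monotypes m such that (H, m) in T[[e]]."""
    return frozenset(m for m in CANDIDATES if has_type(e, H, m))

def has_type(e, H, m):
    tag = e[0]
    if tag == "var":                       # m \in H(x)
        return m in H.get(e[1], frozenset())
    if tag == "one":
        return m == INT
    if tag == "fn":                        # fn-bound x gets a singleton polytype
        _, x, body = e
        return m != INT and has_type(body, {**H, x: frozenset({m[1]})}, m[2])
    if tag == "app":
        return any(has_type(e[1], H, arrow(m1, m)) and has_type(e[2], H, m1)
                   for m1 in CANDIDATES)
    if tag == "let":                       # ("let", x, e1, e2)
        _, x, e1, e2 = e
        p1 = types_of(e1, H)               # must be non-empty
        return bool(p1) and has_type(e2, {**H, x: p1}, m)
    if tag == "rec":                       # m \in gfp \Psi
        _, f, x, body = e
        p = frozenset(CANDIDATES)
        while True:
            nxt = types_of(("fn", x, body), {**H, f: p})
            if nxt == p:
                return m in p
            p = nxt
    if tag == "sub":
        return m == INT and has_type(e[1], H, INT) and has_type(e[2], H, INT)
    if tag == "if0":
        return (has_type(e[1], H, INT) and has_type(e[2], H, m)
                and has_type(e[3], H, m))
    raise ValueError(tag)

ONE = ("one",)
IDENT = ("fn", "x", ("var", "x"))
USE = ("app", ("app", ("var", "id"), ("var", "id")), ONE)   # id(id)(1)
POLY = ("let", "id", IDENT, USE)
MONO = ("app", ("fn", "id", USE), IDENT)

print(has_type(POLY, {}, INT))                               # True
print(any(has_type(MONO, {}, m) for m in CANDIDATES))        # False
```

    The let-bound id is used at two different monotypes, which the
    fn-bound id (a singleton polytype) cannot do.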

    Cousot shows (prop. 26) that the type system of SML is
    a (strict) abstraction of the above, as SML's type system does not
    type

          rec F f g n x = if n = 0 then g(x)
                          else F(f)(fun x ==> (fun h ==> g(h(x))))(n-1)(x)(f)

    but the above type system does type it.


    (Section 7 of the paper uses abstract interpretation techniques
     to derive the monotype system from an abstraction function,
     and a Galois connection.)

    Q: Is this type system sound?
       Yes, due to its construction as a Galois connection.
       Soundness means that:
         if (H,m) \in T[[e]] and r \in g3(H) and S[[e]](r) \neq \bot
         then S[[e]](r) \neq \Omega


** Abstract Semantics and Interpreters

*** compositional abstract semantics
      This is from section 8 of Cousot's paper

------------------------------------------
          ABSTRACT SEMANTICS

Def: an *abstract semantics* is specified by
     an abstract domain, which is
     a poset, (T#,<=#), and an
     abstract semantic function,
          T# : E -> T#.

Def: An abstract semantics (T#,<=#) and T#
     is *compositional* iff
   for all e \in E :
    e = [x1,...,xn](e1,...,en)
    where the xi \in X are locally bound,
      and the ei \in E are subexpressions :

   T#[[e]] = \Psi#_e(T#[[e1]],...,T#[[en]])

   so that T# is defined compositionally
   based on the primitives \Psi#_e.

Def: A compositional abstract semantics
   is monotone when all primitives are
   monotone, i.e.,
     (T1,...,Tn) <=#(T1',...,Tn')
  ==> (\Psi#_e(T1,...,Tn)
        <=# \Psi#_e(T1',...,Tn'))

Def: A monotone compositional abstract
  semantics is an *abstract interpreter*
  when the abstract domain (T#,<=#)
  is representable (in a computer).
------------------------------------------
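
   A sketch of these definitions on a toy example. Everything here is
   an assumption made for illustration: the abstract domain is a
   five-element sign lattice (not the type domain of the paper), only
   the fragment 1 | e1 - e2 is covered, and the names leq, sub_sign,
   and abs_sem are hypothetical.

```python
SIGNS = ("bot", "neg", "zero", "pos", "top")

def leq(s, t):
    """The abstract ordering <=# on signs."""
    return s == "bot" or t == "top" or s == t

def sub_sign(s, t):
    """The primitive \\Psi# for e1 - e2, written by hand."""
    if s == "bot" or t == "bot":
        return "bot"
    if t == "zero":
        return s
    if s == "zero":
        return {"neg": "pos", "pos": "neg", "top": "top"}[t]
    if (s, t) == ("pos", "neg"):
        return "pos"
    if (s, t) == ("neg", "pos"):
        return "neg"
    return "top"

def abs_sem(e):
    """Compositional: T#[[e]] depends only on T# of the subexpressions."""
    if e[0] == "one":
        return "pos"
    if e[0] == "sub":
        return sub_sign(abs_sem(e[1]), abs_sem(e[2]))
    raise ValueError(e[0])

# Monotonicity of the primitive, checked over the whole (finite) domain.
MONOTONE = all(
    leq(sub_sign(s1, t1), sub_sign(s2, t2))
    for s1 in SIGNS for s2 in SIGNS if leq(s1, s2)
    for t1 in SIGNS for t2 in SIGNS if leq(t1, t2)
)

ONE = ("one",)
print(MONOTONE)                      # True
print(abs_sem(("sub", ONE, ONE)))    # top
```

   Since the domain is finite and the primitive is monotone, this is an
   abstract interpreter in the sense of the last definition.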

*** abstraction and soundness
   This is section 9 of Cousot's paper

------------------------------------------
               ABSTRACTION

Def: An abstract semantics,
     with abstract domain (T#,<=#)
     and abstract semantic function T#
     is an *abstraction* of a (concrete)
     semantics (Tb,<=b)
     (with semantic function Tb)
     iff there is a concretization map
        g : T# -> Tb
     that is monotone and such that
     for all e \in E ::
        Tb[[e]] <=b g(T#[[e]]).

Def: An abstract semantics is *sound*
     if it is an abstraction of
     the collecting semantics.
------------------------------------------

        Q: What else can we say about g?
           It is often the upper adjoint of a Galois connection.
           (Monotonicity alone does not guarantee this, but when g
           preserves glbs, as in the corollary above, it uniquely
           determines a : Tb -> T#)

        Q: What needs to be checked about an abstraction if the
        semantics is compositional and monotone?
          Only that each semantic primitive preserves approximations

             (\Psib_e(g(T#[[e1]]),...,g(T#[[en]]))
                   <=b g(\Psi#_e(T#[[e1]],...,T#[[en]])))

          Since that implies the general compositional abstraction
          condition:
          
             (forall i : 1 <= i <= n : Tb[[ei]] <=b g(T#[[ei]]))
             ==> (\Psib_e(Tb[[e1]],...,Tb[[en]])
                      <=b g(\Psi#_e(T#[[e1]],...,T#[[en]])))

          so that the abstract semantics (#) is an abstraction of the
          collecting semantics.

        Q: Is an abstraction of a sound abstraction also sound?
           Yes.

        Q: Do Galois connections compose?
           Yes, they do.

*** design of abstract semantics

    When we have a compositional semantics, as we always do in a type
    system, we can calculate the abstract interpreter...
    
------------------------------------------
     DESIGN OF AN ABSTRACT INTERPRETER

A. Find a Galois connection (a,g)
   from the collecting semantics (Tb,<=b)
   to a suitable property space (T#,<=#)

B. Find a set of primitives, Pb and P#
   that are all monotone.

C. Check these make a monotone
   compositional abstraction:
    forall e \in E ::
     a(\Psib_e(Tb[[e1]],...,Tb[[en]]))
        <=# \Psi#_e(a(Tb[[e1]]),
                      ...,a(Tb[[en]]))
    
D. For each e \in E, calculate \Psi#_e,
   starting from
   \Psi#_e(a(Tb[[e1]]),
                      ...,a(Tb[[en]]))
   (approximating when necessary)
------------------------------------------
        Q: Why do we want to calculate the \Psi#_e?
           because those will be the transfer functions in our static analysis
            (i.e., the basis for the abstract interpreter)

        This methodology will make the abstract interpreter sound by construction
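
        Step D can be illustrated by calculating a primitive as
        a o \Psib o g. This is only a sketch under assumptions: the sign
        lattice, the finite sample of integers standing in for Z, and
        the names alpha, GAMMA, sub_concrete, and derive_sub are
        hypothetical and not taken from the paper.

```python
SAMPLE = range(-3, 4)      # hypothetical finite stand-in for Z
SIGNS = ("bot", "neg", "zero", "pos", "top")

GAMMA = {                  # concretization g, restricted to SAMPLE
    "bot": frozenset(),
    "neg": frozenset(z for z in SAMPLE if z < 0),
    "zero": frozenset({0}),
    "pos": frozenset(z for z in SAMPLE if z > 0),
    "top": frozenset(SAMPLE),
}

def alpha(zs):
    """Abstraction a: the most precise sign describing a set of integers."""
    if not zs:
        return "bot"
    if all(z < 0 for z in zs):
        return "neg"
    if all(z == 0 for z in zs):
        return "zero"
    if all(z > 0 for z in zs):
        return "pos"
    return "top"

def sub_concrete(zs1, zs2):
    """Concrete primitive \\Psib for e1 - e2, lifted to sets of integers."""
    return frozenset(z1 - z2 for z1 in zs1 for z2 in zs2)

def derive_sub(s, t):
    """Abstract primitive calculated as a(\\Psib(g(s), g(t)))."""
    return alpha(sub_concrete(GAMMA[s], GAMMA[t]))

TABLE = {(s, t): derive_sub(s, t) for s in SIGNS for t in SIGNS}
print(TABLE[("pos", "neg")])   # pos
print(TABLE[("pos", "pos")])   # top
print(TABLE[("zero", "pos")])  # neg
```

        The resulting table is the transfer function for subtraction;
        because it is obtained from a and g rather than written by hand,
        its soundness follows from the construction (for this sample).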