COP 5021 meeting -*- Outline -*-

* Setting the Scene (1.2)

** The While Language
*** concrete syntax

------------------------------------------
  CONCRETE SYNTAX FOR THE WHILE LANGUAGE

% ---- Regular (lexical) syntax ----

<LINETERMINATOR> ::= \r | \n | \r\n
<NONLINETERMINATOR> ::= "any character
                         except \r or \n"
<COMMENT> ::= % (<NONLINETERMINATOR>)*
                <LINETERMINATOR>
<WHITESPACE> ::= <LINETERMINATOR>
               | " " | \t | \f

<OPRELATIONAL> ::=  != | == | < | >
                 | <= | >=
<OPPLUS> ::= + | -
<OPMUL> ::= * | /

<IDCHAR> ::= "any letter" | ' | _
<DIGIT> ::= 0 | 1 | 2 | 3 | 4 | 5 | 6
          | 7 | 8 | 9
<IDENTIFIER> ::= <IDCHAR>
                  ( <IDCHAR> | <DIGIT> )*
<NUMBER> ::= (-)?(<DIGIT>)+

% ---- Context-Free syntax ----

<Program> ::= <Stmt>

<Stmt> ::= <IDENTIFIER> := <AExpression>
    | skip
    | <Block>
    | while <BExpression> do <Block>
    | if <BExpression> then <Block>
                       else <Block>

<Block> ::= '{' <BlockStmtList> '}'

<BlockStmtList> ::= <Stmt>
                | <BlockStmtList> ; <Stmt>

<AExpression> ::= <ATerm> 
         | <ATerm> <OPPLUS> <AExpression>
<ATerm> ::= <APrimary>
         | <APrimary> <OPMUL> <ATerm>
<APrimary> ::= <IDENTIFIER>
         | <NUMBER> | ( <AExpression> )

<BExpression> ::= <BTerm>
         | <BTerm> or <BExpression>

<BTerm> ::= <BRelExp>
         | <BRelExp> and <BTerm>

<BRelExp> ::= <BPrimary> 
         | not <BRelExp>
         | <AExpression>
           <OPRELATIONAL> <AExpression>
         | ( <BExpression> )

<BPrimary> ::= true | false
------------------------------------------

------------------------------------------
              EXAMPLE

    y := x;
    z := 1;
    while y>1
    do { z := z*y;
         y := y-1 };
    y := 0

------------------------------------------

        Note: operator precedence, parentheses, mandatory use of 
        { and } for blocks in bodies of if and while, semicolons
        separate statements in blocks, etc.

        All these are forgotten in the ...

*** abstract syntax
------------------------------------------
   ABSTRACT SYNTAX OF THE WHILE LANGUAGE

  a \in AExp   "arithmetic expressions"
  b \in BExp   "Boolean expressions"
  S \in Stmt   "statements"

x,y \in Var    "variables"
  n \in Num    "numeric literals"
  l \in Lab    "labels"
opa \in Op_a   "arithmetic operators"
opb \in Op_b   "Boolean operators"
opr \in Op_r   "relational operators"

  S ::= [x:= a]^l
      | [skip]^l
      | S1 ; S2
      | if [b]^l then S1 else S2
      | while [b]^l do S

  a ::= x
      | n
      | a1 opa a2

  b ::= true
      | false
      | not b
      | b1 opb b2
      | a1 opr a2
------------------------------------------

    There is an index of notations at the end of the book!

    Q:  What's the difference between abstract and concrete syntax?
    The abstract syntax only records the essential information;
       it describes the parse trees, 
       not the characters needed to reconstruct them

------------------------------------------
              EXAMPLE

    [y := x]^1;
    [z := 1]^2;
    while [y>1]^3
    do ([z := z*y]^4;
        [y := y-1]^5);
    [y := 0]^6

------------------------------------------

    Q:  Why are the labels only attached to certain places in the
        abstract syntax trees?

        These are the places in statements where execution actually
            happens, the rest is plumbing.

    Q:  Which blocks are elementary?
        the ones that can take labels: tests, assignments, skip

    Q:  What is used to disambiguate different parse trees?
        parentheses, i.e., ( and )
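
    A minimal sketch of the labeled abstract syntax as Python data
    types, to make "labels go on elementary blocks" concrete.  The
    class names, the helper seq, and keeping expressions as plain
    strings are assumptions made for illustration; they are not
    from the book.

```python
from dataclasses import dataclass
from typing import Union

# Hypothetical encoding: expressions are kept as plain strings for brevity.

@dataclass(frozen=True)
class Assign:            # [x := a]^l
    var: str
    expr: str
    label: int

@dataclass(frozen=True)
class Skip:              # [skip]^l
    label: int

@dataclass(frozen=True)
class Seq:               # S1 ; S2   (no label of its own)
    first: "Stmt"
    second: "Stmt"

@dataclass(frozen=True)
class If:                # if [b]^l then S1 else S2
    test: str
    label: int
    then: "Stmt"
    orelse: "Stmt"

@dataclass(frozen=True)
class While:             # while [b]^l do S
    test: str
    label: int
    body: "Stmt"

Stmt = Union[Assign, Skip, Seq, If, While]

def labels(s):
    """Return the labels of the elementary blocks of s, in textual order."""
    if isinstance(s, (Assign, Skip)):
        return [s.label]
    if isinstance(s, Seq):
        return labels(s.first) + labels(s.second)
    if isinstance(s, If):
        return [s.label] + labels(s.then) + labels(s.orelse)
    if isinstance(s, While):
        return [s.label] + labels(s.body)
    raise TypeError(f"not a statement: {s!r}")

def seq(*stmts):
    """Right-nest one or more statements into Seq nodes."""
    *init, last = stmts
    for s in reversed(init):
        last = Seq(s, last)
    return last

# The labeled example program from the slide above.
factorial = seq(
    Assign("y", "x", 1),
    Assign("z", "1", 2),
    While("y>1", 3, seq(Assign("z", "z*y", 4), Assign("y", "y-1", 5))),
    Assign("y", "0", 6),
)

print(labels(factorial))
```

    Only Assign, Skip, and the tests of If and While carry a label;
    Seq is pure plumbing, and the parentheses of the concrete
    syntax show up only as the nesting of the tree.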

*** semantics
    Q:  What does the example above do?
    Q:  What does skip do?
    Q:  What kind of numeric literals are in this language?
        integers
    Q:  What would be some reasonable arithmetic operators?
         +, -, *, /
    Q:  What would be some reasonable relational operators?
         <, >, <=, >=, ==, !=
    Q:  What would be some reasonable Boolean operators?
         and, or
    Q:  What's the type of the relational operators?
         they take two integers and return a Boolean
    Q:  Can the labels be repeated?
        no!

    Q:  What's missing?
         classes, objects, exceptions, for loops, parallelism
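
    One way to explore "what does the example do?" is a tiny
    interpreter.  The sketch below is an assumption-laden
    illustration, not the book's semantics: programs are nested
    tuples, states are dicts from variable names to Python
    integers, labels are dropped because they do not affect
    execution, and division is left out to avoid choosing a
    rounding rule.

```python
import operator

ARITH = {"+": operator.add, "-": operator.sub, "*": operator.mul}
REL = {"<": operator.lt, ">": operator.gt, "<=": operator.le,
       ">=": operator.ge, "==": operator.eq, "!=": operator.ne}

def aeval(a, state):
    """Evaluate an arithmetic expression: int | variable name | (op, a1, a2)."""
    if isinstance(a, int):
        return a
    if isinstance(a, str):
        return state[a]
    op, a1, a2 = a
    return ARITH[op](aeval(a1, state), aeval(a2, state))

def beval(b, state):
    """Evaluate a Boolean expression: bool | ('not', b) | (op, x, y)."""
    if isinstance(b, bool):
        return b
    op = b[0]
    if op == "not":
        return not beval(b[1], state)
    if op == "and":
        return beval(b[1], state) and beval(b[2], state)
    if op == "or":
        return beval(b[1], state) or beval(b[2], state)
    return REL[op](aeval(b[1], state), aeval(b[2], state))

def execute(s, state):
    """Run statement s from state and return the final state (a new dict)."""
    kind = s[0]
    if kind == "skip":
        return state
    if kind == "assign":
        _, x, a = s
        return {**state, x: aeval(a, state)}
    if kind == "seq":
        return execute(s[2], execute(s[1], state))
    if kind == "if":
        _, b, s1, s2 = s
        return execute(s1 if beval(b, state) else s2, state)
    if kind == "while":
        _, b, body = s
        while beval(b, state):
            state = execute(body, state)
        return state
    raise ValueError(f"unknown statement: {s!r}")

# y := x; z := 1; while y>1 do (z := z*y; y := y-1); y := 0
example = ("seq", ("assign", "y", "x"),
          ("seq", ("assign", "z", 1),
          ("seq", ("while", (">", "y", 1),
                      ("seq", ("assign", "z", ("*", "z", "y")),
                              ("assign", "y", ("-", "y", 1)))),
                  ("assign", "y", 0))))

print(execute(example, {"x": 5}))   # {'x': 5, 'y': 0, 'z': 120}
```

    Under these assumptions the example leaves the factorial of x
    in z (and 1 when x is at most 1), and skip returns the state
    unchanged.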
         
*** variations
    Some variations to consider in class ...
------------------------------------------
VARIATION 1: CONTROL FEATURES

 S ::= ...
    | break
    | for x in a1 .. a2 do S
    | throw 
    | try S1 catch S2


VARIATION 2: TAINTING and INFORMATION FLOW

  S ::= ...
     | read x
     | sanitize x
     | print a


VARIATION 3: SPECIFICATION FEATURES

 S ::= ...
    | assert b
    | assume b
    | choose S1 or S2


VARIATION 4: PARALLELISM

 S ::= ...
    | S1 `||' S2


VARIATION 5: FUNCTIONS

 a ::= ...
    | fn x => a
    | a1 a2
    | let d in a

 d ::= x = a
    | d1; d2
 
------------------------------------------

   Q:  Which of these need labels, and where do the labels go?
       1. on break and throw, and on a1 and a2 in for,
       2. on read, print
       3. on the b's in assert and assume
       4. no new ones
       5. on each subexpression?

We'll leave procedures, methods, and object-oriented stuff for
   later...

** reaching definitions analysis

   This is just an example of a data flow analysis used in Chapter 1.

*** definition
------------------------------------------
   REACHING DEFINITIONS (ASSIGNMENTS)

def: Let P be a program.
  An assignment [x := a]^l at label l
  *may reach* a program point in P
  if in some execution of P,
  when execution reaches that point,
  the last assignment to x was done at l.

RD(P, point) says what (labels of)
         assignments may reach point in P.

What can we know for certain?


What kind of errors can this detect?


------------------------------------------
        Q: How to describe program points?
           entry to or exit from a labeled block (statement)
        Q: What can we tell for certain from this analysis's results?
           - if some definition reaches that point? no, just maybe
           - if some definition can't reach that point? yes!
           - if no definitions reach that point? yes!
        Q: What kind of errors could this analysis be used to detect/prevent?
           - undefined variables
           - flow from secret to public variables
           - flow from inputs to outputs (if extended to read/print)

        An example of the general idea of 
        designing the analysis 
        so its *negation* is what you want to know for certain.

*** examples
------------------------------------------
      EXAMPLE

    [y := x]^1;
    [z := 1]^2;
    while [y>1]^3
    do ([z := z*y]^4;
        [y := y-1]^5);
    [y := 0]^6

 (y,1) reaches entry to 2
------------------------------------------
    Q:  What definitions reach the entry for label 3?
        (y,1), (y,5), (z,2), (z,4)
    Q:  What definitions can't reach the entry for label 3?
        (y,6)

    Q: How do we talk about uninitialized variables?
        use ? for the label, as in (x,?)

    See table 1.1 for RDentry and RDexit for this example

    Q:  In table 1.1, what if we add (y,1) to RDentry(5)?
    Q:  In table 1.1, what if we remove (y,5) from RDentry(6)?
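
    The entries for this example can be recomputed by iterating
    the entry/exit equations until nothing changes.  The sketch
    below is a naive fixed-point loop, not the book's algorithm;
    the encoding (a map from labels to the assigned variable, a
    list of flow edges, and the names blocks, flow, and
    reaching_definitions) is assumed for illustration.

```python
# Hypothetical encoding of the labeled example program.
# blocks[l] is the variable assigned at label l, or None for a test.
blocks = {1: "y", 2: "z", 3: None, 4: "z", 5: "y", 6: "y"}
# Control flow edges between labels; (5, 3) is the loop back edge.
flow = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 3), (3, 6)]
variables = {"x", "y", "z"}
init = 1   # label of the first block; assumed to have no predecessors

def reaching_definitions(blocks, flow, variables, init):
    """Least solution of the RD equations, by iterating up from empty sets."""
    entry = {l: set() for l in blocks}
    exit_ = {l: set() for l in blocks}
    changed = True
    while changed:
        changed = False
        for l in blocks:
            if l == init:
                # Every variable is uninitialized at program entry.
                new_entry = {(x, "?") for x in variables}
            else:
                # Union of what leaves each predecessor.
                new_entry = set().union(
                    *(exit_[p] for (p, q) in flow if q == l))
            x = blocks[l]
            if x is None:
                new_exit = set(new_entry)           # tests define nothing
            else:
                # Kill all other definitions of x, then add this one.
                new_exit = {(v, d) for (v, d) in new_entry if v != x}
                new_exit |= {(x, l)}
            if (new_entry, new_exit) != (entry[l], exit_[l]):
                entry[l], exit_[l] = new_entry, new_exit
                changed = True
    return entry, exit_

entry, exit_ = reaching_definitions(blocks, flow, variables, init)
print(sorted(entry[3], key=str))
```

    For label 3 this reports (y,1), (y,5), (z,2), (z,4), plus
    (x,?) because x is never assigned; (y,6) is absent, matching
    the answers above.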

------------------------------------------
      FOR YOU TO DO

Compute a table like table 1.1 for:

    [t := x]^1;
    [x := y]^2;
    [y := t]^3;
    if [y>x]^4
    then [r := y]^5
    else [r := x]^6;
    assert [r >= x and r >= y]^7


------------------------------------------
  ...
   l RDentry(l)                    RDexit(l)
   ============================================================
   1 (r,?),(t,?),(x,?),(y,?)       (r,?),(t,1),(x,?),(y,?)
   2 (r,?),(t,1),(x,?),(y,?)       (r,?),(t,1),(x,2),(y,?)
   3 (r,?),(t,1),(x,2),(y,?)       (r,?),(t,1),(x,2),(y,3)
   4 (r,?),(t,1),(x,2),(y,3)       (r,?),(t,1),(x,2),(y,3)
   5 (r,?),(t,1),(x,2),(y,3)       (r,5),(t,1),(x,2),(y,3)
   6 (r,?),(t,1),(x,2),(y,3)       (r,6),(t,1),(x,2),(y,3)
   7 (r,5),(r,6),(t,1),(x,2),(y,3) (r,5),(r,6),(t,1),(x,2),(y,3)
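
    The table can be put to work for the "undefined variables"
    check mentioned earlier: warn wherever a block reads x while
    (x,?) is in its RDentry.  A minimal sketch, with the RDentry
    column copied from the table and a hypothetical uses map read
    off the program text:

```python
# RDentry column of the table above; "?" marks "uninitialized".
rd_entry = {
    1: {("r", "?"), ("t", "?"), ("x", "?"), ("y", "?")},
    2: {("r", "?"), ("t", 1), ("x", "?"), ("y", "?")},
    3: {("r", "?"), ("t", 1), ("x", 2), ("y", "?")},
    4: {("r", "?"), ("t", 1), ("x", 2), ("y", 3)},
    5: {("r", "?"), ("t", 1), ("x", 2), ("y", 3)},
    6: {("r", "?"), ("t", 1), ("x", 2), ("y", 3)},
    7: {("r", 5), ("r", 6), ("t", 1), ("x", 2), ("y", 3)},
}

# Variables read by each labeled block (assumed helper data).
uses = {
    1: {"x"},            # t := x
    2: {"y"},            # x := y
    3: {"t"},            # y := t
    4: {"x", "y"},       # y > x
    5: {"y"},            # r := y
    6: {"x"},            # r := x
    7: {"r", "x", "y"},  # r >= x and r >= y
}

def possibly_uninitialized(rd_entry, uses):
    """Return (label, variable) pairs where an uninitialized read may occur."""
    return sorted((l, x) for l in uses for x in uses[l]
                  if (x, "?") in rd_entry[l])

print(possibly_uninitialized(rd_entry, uses))   # [(1, 'x'), (2, 'y')]
```

    The two warnings are for the program's inputs x and y.  The
    absence of a warning at label 7 is the certain part: (r,?)
    cannot reach the assert, so r is always assigned there.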

*** variations

    Q:  What would the analysis look like if we only wanted to keep
        track of which variables were assigned?
    Use a different domain, just variable names.
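
    A sketch of that simpler analysis on the exercise program,
    reading "assigned" as "may have been assigned on some path"
    (an assumption; a "must" version would intersect instead of
    union).  The encoding and the name assigned_variables are
    hypothetical.

```python
# assigned[l] is the variable assigned at label l, or None for tests/asserts.
assigned = {1: "t", 2: "x", 3: "y", 4: None, 5: "r", 6: "r", 7: None}
# Control flow edges of the exercise program (the if at 4 branches to 5 and 6).
flow = [(1, 2), (2, 3), (3, 4), (4, 5), (4, 6), (5, 7), (6, 7)]

def assigned_variables(assigned, flow):
    """For each label, the variables that may have been assigned so far."""
    entry = {l: set() for l in assigned}
    exit_ = {l: set() for l in assigned}
    changed = True
    while changed:
        changed = False
        for l in assigned:
            # Join: union over all predecessors (empty at program entry).
            new_entry = set().union(
                *(exit_[p] for (p, q) in flow if q == l))
            # Transfer: nothing is ever killed; an assignment adds its variable.
            new_exit = set(new_entry)
            if assigned[l] is not None:
                new_exit.add(assigned[l])
            if (new_entry, new_exit) != (entry[l], exit_[l]):
                entry[l], exit_[l] = new_entry, new_exit
                changed = True
    return entry, exit_

entry, exit_ = assigned_variables(assigned, flow)
for l in sorted(entry):
    print(l, sorted(entry[l]), sorted(exit_[l]))
```

    Compared with reaching definitions, the domain shrinks from
    sets of (variable, label) pairs to sets of variable names,
    and the transfer function no longer needs a kill step.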

    Q: What would the analysis look like if we want to keep track of
       what variables could influence the values of other variables?
    We would need to track where a Boolean test (if/while) affects a
       computation...