Type checking III
Lecture 9

Review

Questions about the last class?

Quiz

How do we define a type?

Quiz Discussion

Handling declarations

  • User may add new symbols to the language
  • Type-checking needs to prove type safety of these as well
  • Many languages require declarations

Growing a Language

Some languages can infer types from operators used.

Type-checker records type information about each symbol

  • Maps symbol name to type
  • Declarations populate the symbol table
  • Relevant to scoping rules
    • Same name, different memory location
    • Disambiguate by scope

Symbol tables

  • Maps symbol name to its type name

    int x;
    int y;
    string z;
    
    y = 1 + x;
    y = z;
    
symbol  type
x       int
y       int
z       string

A symbol table is also used during code generation to track (relative) memory locations of symbols.
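
As a minimal sketch of how the table above drives the check, assume the table is a Python dictionary from symbol name to type name. The names `symbols`, `type_of`, and `check_assign`, and the tuple encoding of expressions, are hypothetical and chosen only for illustration.

```python
# Hypothetical flat symbol table for the declarations above: name -> type name.
symbols = {"x": "int", "y": "int", "z": "string"}


def type_of(expr):
    """Return the type name of a tiny expression form.

    An expression is an int literal, a variable name (str), or a tuple
    ("+", left, right). This encoding is an assumption for illustration.
    """
    if isinstance(expr, int):
        return "int"
    if isinstance(expr, str):
        return symbols[expr]  # KeyError means the symbol was never declared
    op, left, right = expr
    if op == "+" and type_of(left) == "int" and type_of(right) == "int":
        return "int"
    raise TypeError(f"operator {op} applied to incompatible operands")


def check_assign(name, expr):
    """An assignment is well-typed when both sides have the same type."""
    return symbols[name] == type_of(expr)


print(check_assign("y", ("+", 1, "x")))  # y = 1 + x;
print(check_assign("y", "z"))            # y = z;
```

The first assignment is accepted and the second is rejected, because z is a string while y is an int.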

How do we handle scoping rules?

The same symbol can have different types and be accessible in different sections of the source code text.

Symbol table designs

  • Add a field for the scope
  • Chained symbol tables

    int globalvar;
    
    def f (arg1 : bool) -> int {
      int x;
      int y;
      string z;
    
      y = 1 + x;
      return y;
    }
    

Single symbol table

symbol     type           scope
globalvar  int            GLOBAL
f          (bool) -> int  GLOBAL
x          int            GLOBAL.f
y          int            GLOBAL.f
z          string         GLOBAL.f

Note that f is in the global scope, but its parameters are in f's scope. Constructing these scopes requires care to make sure that f is added to the global scope before entering f's scope.
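
A single table with a scope field could be sketched as follows, assuming scopes are written as dotted paths such as GLOBAL.f. The helpers `declare` and `lookup` are hypothetical, and arg1 is entered in f's scope here because parameters belong to the function's scope.

```python
# Hypothetical single symbol table: each entry carries its scope as a dotted path.
table = {}


def declare(scope, name, type_name):
    """Add (scope, name) -> type, rejecting duplicates within one scope."""
    if (scope, name) in table:
        raise KeyError(f"{name} already declared in {scope}")
    table[(scope, name)] = type_name


def lookup(scope, name):
    """Search the given scope, then each enclosing scope, innermost first."""
    while True:
        if (scope, name) in table:
            return table[(scope, name)]
        if "." not in scope:
            raise KeyError(f"{name} is not declared")
        scope = scope.rsplit(".", 1)[0]  # GLOBAL.f -> GLOBAL


declare("GLOBAL", "globalvar", "int")
declare("GLOBAL", "f", "(bool) -> int")  # f is added before entering its scope
declare("GLOBAL.f", "arg1", "bool")
declare("GLOBAL.f", "x", "int")
declare("GLOBAL.f", "y", "int")
declare("GLOBAL.f", "z", "string")

print(lookup("GLOBAL.f", "x"))          # found in f's own scope
print(lookup("GLOBAL.f", "globalvar"))  # found in the enclosing GLOBAL scope
```

Looking up x from the GLOBAL scope fails, since x exists only under GLOBAL.f.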

Chained symbol tables

GLOBAL (parent_scope=none)

symbol     type
globalvar  int
f          (bool) -> int

f (parent_scope=GLOBAL)

symbol  type
x       int
y       int
z       string

The project's skeleton provides a symbol table implementation using this chained technique.
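
For intuition only, the chained design might look like the sketch below. This is not the skeleton's implementation; the class `Scope` and its methods are hypothetical names, and each table simply holds a reference to its parent.

```python
class Scope:
    """One table in the chain; `parent` is None for the global scope."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.symbols = {}

    def declare(self, symbol, type_name):
        """Record a symbol in this table only, rejecting redeclaration."""
        if symbol in self.symbols:
            raise KeyError(f"{symbol} already declared in {self.name}")
        self.symbols[symbol] = type_name

    def lookup(self, symbol):
        """Walk from this table up through the parents until found."""
        scope = self
        while scope is not None:
            if symbol in scope.symbols:
                return scope.symbols[symbol]
            scope = scope.parent
        raise KeyError(f"{symbol} is not declared")


global_scope = Scope("GLOBAL")
global_scope.declare("globalvar", "int")
global_scope.declare("f", "(bool) -> int")

f_scope = Scope("f", parent=global_scope)
for symbol, type_name in [("x", "int"), ("y", "int"), ("z", "string")]:
    f_scope.declare(symbol, type_name)

print(f_scope.lookup("z"))          # found in f's table
print(f_scope.lookup("globalvar"))  # found by following parent_scope
```

Compared with the single-table design, leaving a scope here amounts to dropping the reference to its table, and shadowing falls out of the lookup order.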

Specifying type checkers

Typing judgments

  • Based on "proof rules"
    • Systematic notation for deriving proofs from logical rules
  • Popular notation in academia
    • Sort of an esoteric description of an evaluator
  • Described in Type Systems

2017 ACM PPoPP Keynote: It's Time for a New Old Language by Guy Steele

  • A discussion of the history and issues with the formal notation for programming languages.

The Cool Reference Manual by Alex Aiken

  • An example of defining type rules (and operational semantics) for an object-oriented language.

Syntax vs. semantics

  • I use angle brackets, e.g., <expression>, to denote the semantic value of syntax
    • The map is not the territory: ASCII numbers are not machine representations of numbers
  • Example
    • ASCII: 12+3
    • Semantic value: <12+3>

Notation

hypotheses
----------
conclusion

Examples

-----------
<1> : int


-----------
<2> : int


-----------
<+> : (int, int) -> int


<1> : int     <2> : int     <+> : (int, int) -> int
---------------------------------------------------
<1 + 2> : int


// using a metavariable (think nonterminal in syntax) to avoid writing rules for each possible element of the language
<n1> : int    <n2> : int      <op> : (int, int) -> int
------------------------------------------------------
<n1 op n2> : int

Deduction via proof rules

Prove that "1 + 2 * 3" is well-typed

                <2> : int   <3> : int   <*> : (int, int) -> int
               ------------------------------------------------
<1> : int      <2 * 3> : int                                       <+> : (int, int) -> int
------------------------------------------------------------------------------------------
<1 + 2 * 3> : int

Implied by this notation is an automated process that matches hypotheses with conclusions and matches metavariables to inputs (i.e., parsing).
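
That automated process can be sketched as a recursive function that records each judgment once its hypotheses are proven. The sketch assumes the input is already parsed into nested tuples; `OPERATOR_TYPES`, `show`, and `derive` are hypothetical names.

```python
# Axioms: every listed operator has type (int, int) -> int.
OPERATOR_TYPES = {"+": ("int", "int", "int"), "*": ("int", "int", "int")}


def show(expr):
    """Render a tree back to source text (no parentheses are added,
    which is sufficient for this example)."""
    if isinstance(expr, int):
        return str(expr)
    op, left, right = expr
    return f"{show(left)} {op} {show(right)}"


def derive(expr, steps):
    """Return expr's type, appending each proven judgment to `steps`."""
    if isinstance(expr, int):
        result = "int"  # axiom: <n> : int needs no hypotheses
    else:
        op, left, right = expr
        left_type = derive(left, steps)    # hypothesis <n1> : int
        right_type = derive(right, steps)  # hypothesis <n2> : int
        param1, param2, result = OPERATOR_TYPES[op]  # hypothesis <op> : ...
        if (left_type, right_type) != (param1, param2):
            raise TypeError(f"no rule applies to <{show(expr)}>")
    steps.append(f"<{show(expr)}> : {result}")
    return result


steps = []
derive(("+", 1, ("*", 2, 3)), steps)  # the parse of 1 + 2 * 3
print("\n".join(steps))
```

The printed judgments appear in the order the proof tree is completed: the leaves <1>, <2>, <3> first, then <2 * 3>, and finally <1 + 2 * 3>.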

Type rules for SimpleC

// abstract syntax: abstract syntax does not capture all of the concrete syntax's language restrictions.  the boolean-or operator is quoted as "||" to avoid conflict with the grammar's pipe symbol.

program ::= (d | f)+                   // a program 

f       ::= def f(fs) -> t { d* st* }  // function definition
fs      ::= (x : t (, x : t)*)?        // formal arguments

d       ::= x : t;                     // a declaration

t       ::= int                        // integer type
          | bool                       // boolean type
          | string                     // string type
          | (t+) -> t                  // function type

st      ::= x = e;                     // assignment statement
          | while (e) st               // while statement
          | if (e) st else st          // if-then-else statement
          | if (e) st                  // if-then statement
          | return e;                  // return statement
          | e;                         // expression statement
          | { st* }                    // compound statement

e       ::= f(actuals)                 // function call expression
actuals ::= (e (, e)*)?                // function args
e       ::= e op e | op e | (e)        // arithmetic expressions
op      ::= + | - | * | /              // numeric operators
op      ::= && | "||" | !              // boolean operators
op      ::= == | != | < | <= | > | >=  // relational operators
e       ::= x                          // variable usage
e       ::= n | b | s                  // literals

n       ::= [0-9]+                     // numeric values
b       ::= true | false               // boolean values
s       ::= " characters "             // string values
x       ::= [A-Za-z0-9]+               // identifiers


// environment (symbol table): this holds the type information for the symbols in scope

S
the environment is an ordered list of (name, type) pairs

S' = [(name, type)] + S
means S' is identical to S, but has the new mapping (name, type).

unique(S)
means S has only unique names

t = lookup(name, S)
means look up name's type in S, checking the parent scope as well if needed

parent(S) = S'
means S has parent scope S'

// notation

<e>
refers to e's semantic value, e.g., ASCII numbers vs. Java integers.

<e> : int
says that e's semantic value has the type int.

S |- <e> : int
says that, given the environment (symbol table) S, e has type int.

S |- <st> ~> S'
says that given the environment (symbol table) S, st produces a new environment S'.


hypotheses
----------
conclusion

means that if the hypotheses are true we can conclude that the conclusion is true.


// literals: the symbols mean their equivalent mathematical values for numbers and boolean true/false.

--------------
<n> : int

--------------
<b> : bool

--------------
<s> : string


// operators: the symbols for arithmetic, boolean, and relational operators each have a function type

S |- <e1> : int     S |- <e2> : int     op \in { "+", "-", "*", "/" }
---------------------------------------------------------------------
S |- <e1 op e2> : int

S |- <e1> : bool     S |- <e2> : bool     op \in { "&&", "||" }
----------------------------------------------------------------
S |- <e1 op e2> : bool

S |- <e1> : bool
--------------------
S |- <!e1> : bool

S |- <e1> : int
--------------------
S |- <-e1> : int
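
The literal and operator rules above could be transcribed roughly as follows. The tagged-tuple encoding and the name `type_of` are assumptions of this sketch; relational operators are left out because no rule for them is given here.

```python
ARITHMETIC = {"+", "-", "*", "/"}
BOOLEAN = {"&&", "||"}


def type_of(expr, env):
    """Apply the literal and operator rules; `env` plays the role of S.

    Hypothetical encoding chosen for this sketch:
    ("num", 3), ("bool", True), ("str", "hi"),
    ("binop", op, e1, e2), ("unop", op, e1).
    """
    tag = expr[0]
    if tag == "num":
        return "int"
    if tag == "bool":
        return "bool"
    if tag == "str":
        return "string"
    if tag == "binop":
        _, op, e1, e2 = expr
        operand_types = (type_of(e1, env), type_of(e2, env))
        if op in ARITHMETIC and operand_types == ("int", "int"):
            return "int"
        if op in BOOLEAN and operand_types == ("bool", "bool"):
            return "bool"
    if tag == "unop":
        _, op, e1 = expr
        operand_type = type_of(e1, env)
        if op == "!" and operand_type == "bool":
            return "bool"
        if op == "-" and operand_type == "int":
            return "int"
    # No rule's hypotheses were satisfied, so the expression is ill-typed.
    raise TypeError(f"no typing rule applies to {expr!r}")


print(type_of(("binop", "+", ("num", 1), ("unop", "-", ("num", 2))), {}))
```

The call prints int for 1 + -2. An expression such as true && 1 raises TypeError, because no rule has hypotheses that match a bool and an int operand.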


// variables: variable assignments evaluate their right-hand side at define-time and are stored and looked up in a storage context.

S' = [(x, t)] + S   unique(S')
------------------------------  [declaration]
S |- <x : t;> ~> S'

t = lookup(x, S)
---------------    [substitution]
S |- <x> : t

S |- <e> : t1     t2 = lookup(x, S)    t1 = t2
---------------------------------------------  [assignment]
S |- <x = e;> ~> S
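
The three variable rules can be sketched with the environment as an ordered list of (name, type) pairs. The function names are hypothetical, parent scopes are omitted for brevity, and the type of the right-hand side is assumed to be computed already.

```python
def declare(name, type_name, env):
    """[declaration]: return a new list with (x, t) prepended to S,
    requiring that no name ends up declared twice."""
    new_env = [(name, type_name)] + env
    names = [n for n, _ in new_env]
    if len(names) != len(set(names)):
        raise TypeError(f"{name} is declared twice")
    return new_env


def lookup(name, env):
    """[substitution]: the type of <x> is whatever S records for x."""
    for n, t in env:
        if n == name:
            return t
    raise TypeError(f"{name} is not declared")


def check_assignment(name, expr_type, env):
    """[assignment]: the type of e must equal lookup(x, S); S is unchanged."""
    if lookup(name, env) != expr_type:
        raise TypeError(f"cannot assign {expr_type} to {name}")
    return env


# int x; int y; string z;
env = declare("z", "string", declare("y", "int", declare("x", "int", [])))
check_assignment("y", "int", env)  # y = 1 + x; is accepted
print(env)
```

Because `declare` returns a new list instead of mutating its argument, the old environment S stays available, which mirrors how the rules name S and S' separately.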


// control-flow: conditionals and iteration are statements that update state but produce no value.

S |- <e> : bool     S |- <st1> ~> S'     S |- <st2> ~> S''   // type-check both branches
----------------------------------------------------------
S |- <if e st1 else st2> ~> S       // nested scope doesn't affect parent scope

S |- <e> : bool     S |- <st> ~> S'
----------------------------------
S |- <while e st> ~> S
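
A rough sketch of the two control-flow rules follows. It assumes a dictionary environment, a deliberately tiny expression typer, and a hypothetical tuple encoding of statements; only assignment, if-else, and while are handled.

```python
def type_of(expr, env):
    """Tiny expression typer: Python bools and ints are literals,
    strings are variable names looked up in env."""
    if isinstance(expr, bool):  # test bool first: bool is a subclass of int
        return "bool"
    if isinstance(expr, int):
        return "int"
    return env[expr]


def check(stmt, env):
    """Type-check a statement and return the environment it produces."""
    kind = stmt[0]
    if kind == "assign":
        _, name, expr = stmt
        if env[name] != type_of(expr, env):
            raise TypeError(f"bad assignment to {name}")
    elif kind == "if":
        _, cond, then_branch, else_branch = stmt
        if type_of(cond, env) != "bool":
            raise TypeError("if condition must be bool")
        check(then_branch, dict(env))  # first branch; its result is discarded
        check(else_branch, dict(env))  # second branch; its result is discarded
    elif kind == "while":
        _, cond, body = stmt
        if type_of(cond, env) != "bool":
            raise TypeError("while condition must be bool")
        check(body, dict(env))  # the body's result is discarded as well
    else:
        raise TypeError(f"unknown statement {kind}")
    return env  # the conclusion always yields the original S


env = {"x": "int", "flag": "bool"}
check(("if", "flag", ("assign", "x", 1), ("assign", "x", 2)), env)
```

Each nested statement is checked against a copy of the environment, so nothing a branch or loop body does can leak into the parent scope.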


// functions: functions are call-by-value, have a local storage context, and produce a return value


S' = [(f, fs -> t)] + S  // add function to the current scope
parent(S'') = S'         // define a new scope with the old scope as parent.  note that the
                         // parent scope has the function definition in it, so recursion is acceptable
S'' = [(fs[i].name, fs[i].type)] for all fs[i] in fs // add parameter types to symbol table
S'' |- d* ~> S'''        // add declarations to the symbol table
S''' |- st* ~> S'''      // evaluate statements under the updated symbol table.  note that
                         // statements do not update the symbol table, so it is S''' out as well
----------------------------------  [definition]
S |- <def f(fs) -> t { d* st* }> ~> S'

(t1, ..., tN) -> t = lookup(f, S)
S |- <actuals[i]> : ti for all i = 1 to N
---------------------------------------------  [call]
S |- <f(actuals)> : t
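
The definition and call rules might be sketched as below. The sketch assumes a dictionary environment, represents the body scope as a flattened copy of its parent rather than a chained table, and skips checking the body statements and duplicate local names; `define_function` and `check_call` are hypothetical names.

```python
def define_function(name, params, return_type, local_decls, env):
    """[definition] sketch: return (outer scope with f, body scope).

    `env` is a dict standing in for S; `params` and `local_decls` are lists
    of (name, type) pairs. Checking the body statements is left out.
    """
    signature = (tuple(t for _, t in params), return_type)
    outer = {**env, name: signature}  # f is added to the current scope
    body_env = dict(outer)            # body scope sees the outer scope (flattened copy)
    for local_name, local_type in params + local_decls:
        body_env[local_name] = local_type  # parameters, then declarations
    return outer, body_env


def check_call(name, actual_types, env):
    """[call]: look up f's type and compare each actual with its formal."""
    signature = env.get(name)
    if not isinstance(signature, tuple):
        raise TypeError(f"{name} is not a function")
    param_types, return_type = signature
    if tuple(actual_types) != param_types:
        raise TypeError(f"{name} expects {param_types}, got {tuple(actual_types)}")
    return return_type


outer, body_env = define_function(
    "f", [("arg1", "bool")], "int",
    [("x", "int"), ("y", "int"), ("z", "string")],
    {"globalvar": "int"},
)
print(check_call("f", ["bool"], body_env))  # a recursive call inside f's body
```

Since f is entered into the outer scope before the body scope is built, the call from inside the body resolves, while the locals x, y, and z never appear in the outer scope.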


// lookup

there exists (name, t) in S
------------
S |- lookup(name, S) ~> t

S' = parent(S)
there exists (name, t) in S'
------------
S |- lookup(name, S) ~> t

Author: Paul Gazzillo

Created: 2023-04-13 Thu 14:59