CS 641 Lecture -*- Outline -*-

	Lecture adapted from Mads Tofte's "Four lectures on Standard ML"
	Edinburgh ECS-LFCS-89-73

	(This description of the static semantics of modules is fundamentally
	unsatisfactory....)

* Static Semantics of Modules in SML
	signature matching and sharing

** Elaboration
	compare following signatures
	-------------------
	signature S =
	sig
	 type table
	 exception Lookup
	 val lookup: table * Identifier.sym -> Data.value
	 val update: table * Identifier.sym * Data.value -> table
	end;

	signature S' =
	sig
	 type table
	 exception Lookup
	 val lookup: table * string -> real
	 val update: table * string * real -> table
	end;
	-------------------
	differences:
		first depends on free structures Data and Identifier
		if Identifier.sym = string and Data.value = real
			then same semantics
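	A small check of this claim, as a sketch only. The definitions of
	Identifier and Data below, and the structure T with its names A, B
	and v, are assumptions made for the example; they are not given in
	these notes. With those definitions, one structure can be
	constrained by either signature and the results interoperate:

```sml
(* Assumed environment: the notes leave Identifier and Data free. *)
structure Identifier = struct type sym = string end
structure Data = struct type value = real end

signature S =
sig
  type table
  exception Lookup
  val lookup : table * Identifier.sym -> Data.value
  val update : table * Identifier.sym * Data.value -> table
end

signature S' =
sig
  type table
  exception Lookup
  val lookup : table * string -> real
  val update : table * string * real -> table
end

(* Hypothetical implementation: an association list. *)
structure T =
struct
  type table = (string * real) list
  exception Lookup
  fun lookup ([], _) = raise Lookup
    | lookup ((k, v) :: rest, key : string) =
        if k = key then v else lookup (rest, key)
  fun update (t, k, v) = (k, v) :: t
end

structure A : S = T
structure B : S' = T

(* A.table and B.table are the same type, so the two views mix freely. *)
val v = B.lookup (A.update ([], "pi", 3.14), "pi")
```

	In an environment where Identifier.sym or Data.value is bound
	differently, S elaborates to a different signature, while S' does not
	change.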
*** signature expression (syntax)
*** signature (meaning)
*** elaboration (translation)
	static evaluation
	depends on meaning of free identifiers in expression (environment)

	every signature expression elaborates to 0 or infinitely many
		signatures in a given context

	principal signature: one of which all other elaborations are instances;
		taken as the meaning

	also applies to structure expressions and functor declarations
		meanings are structures and functor signatures

	goal: explain principles that govern elaboration

** Problem: deciding sharing
	------------------
	structure Stack =
	struct
	  type elt = int
	  datatype stack = ST of elt list ref
	  val initStack = ST(ref[])
	end;
	structure StackUser1 =
	struct
	  structure Stack1 = Stack
	  ...
	end;
	structure StackUser2 =
	struct
	  structure Stack2 = Stack
	  ...
	  datatype stack = ST of elt list ref
	end;
	------------------
	note: valid sharing equations
		StackUser1.Stack1 = StackUser2.Stack2
		Stack.elt = int
		Stack.stack = StackUser1.Stack1.stack
	but
		StackUser1 <> StackUser2
		Stack.stack <> StackUser2.stack

	How to decide such equations?
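
	One way to probe such equations is to ask a type checker, using
	type constraints on values. The sketch below assumes the elided
	parts ("...") are empty and writes Stack2.elt for elt in StackUser2;
	the value names s1 and s2 are made up for the example.

```sml
structure Stack =
struct
  type elt = int
  datatype stack = ST of elt list ref
  val initStack = ST (ref [])
end

structure StackUser1 =
struct
  structure Stack1 = Stack
end

structure StackUser2 =
struct
  structure Stack2 = Stack
  datatype stack = ST of Stack2.elt list ref
end

(* Accepted: Stack.stack = StackUser1.Stack1.stack *)
val s1 : StackUser1.Stack1.stack = Stack.initStack

(* Accepted: Stack1 and Stack2 both denote the structure Stack *)
val s2 : StackUser2.Stack2.stack = s1

(* Rejected if uncommented: the datatype declared in StackUser2 is a
   new type, different from Stack.stack.
val bad : StackUser2.stack = Stack.initStack
*)
```

	This only tests type sharing; structure sharing has no such direct
	test at the value level.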

** Names
	idea: decorate expressions with "names" (semantic objects, not idents)

*** structure names
	n1, n2, ...
	m1, m2, ...
*** type names
	t1, t2, ...
	s1, s2, ...
	unit, int, bool, ->

*** sharing
	2 structure expressions share if decorated by the same structure name
	2 type expressions share if decorated with same type name

** Decorating
*** Decorating Structures
	- base types (bool, int, ...) have themselves as decoration
	- each elaboration of a "generative structure expression"
		struct ... end
	  yields a new structure, so is decorated by a new name.
	
	- each elaboration of a
		datatype ...
	  yields a new type, so is decorated by a new name

	- values decorated by their type

	- decoration of declared identifier copied from value (expression)

	In these notes, decoration written after two underbars attached
			to identifier
		e.g., Stack__n1

	e.g., val x = 3       ==>  val x__int = 3
	      type elt = int  ==>  type elt__int = int

	------------------
	structure Stack__n1 =
	struct
	  type elt__int = int
	  datatype stack__t1 = ST of elt list ref
	  val initStack__t1 = ST(ref[])
	end;
	structure StackUser1__n2 =
	struct
	  structure Stack1__n1 = Stack
	  ...
	end;
	structure StackUser2__n38 =
	struct
	  structure Stack2__n1 = Stack
	  ...
	  datatype stack__t23 = ST of elt list ref
	end;
	------------------

*** Decorating signatures
	signature expressions are not generative
		-compared structurally (so keep info on substructures)
		-template for comparison with structure,
			so some names "bound" not free (from environment)
	-------------------
	signature StackSig__(m1,s1,s2) =
	sig__m1
	  type elt__s1
	  type stack__s2
	  val new__(unit->s2) : unit -> stack
	end;

	signature TranspSig__(m1,s1) =
	sig__m1
	  type elt__s1
	  type stack__t1
	  sharing type stack__t1 = Stack.stack__t1
	  val new__(unit->t1) : unit -> stack
	end;
	-------------------

	Bound names collected at the signature identifier,
		merely place holders
		in StackSig bound names are m1, s1, s2;
			free names are unit and ->
		in TranspSig bound names are m1, s1
			free names are t1, unit and -> (why t1?)

	An identifier from the environment (e.g., Stack, hence Stack.stack)
		must get the decoration of that name, since it will share at run-time

	Substructures in signatures:
		need tree showing components of substructure and types,
			in decoration on the structure identifier
		Why? need to record denotation (components, types)
			specified by the subsignature, don't want to have to
			find the signature again later (it is not kept around)
		(alternative: during comparison for substructure,
			recursively compare against subsignature;
			keep "closure" to remember the subsignature name's
			denotation with the signature.)

	-------------
	signature SymSig__(m1,s1) =
	sig__m1
	  eqtype sym__s1
	  val hash__(s1->int) : sym -> int
	end;

	signature LexSig__(m2,m1,s1,...) =
	sig__m2
	  structure Sym__(m1(sym__s1,hash__(s1->int))) : SymSig__(m1,s1)
	  ...
	end;
	-------------
	note: can draw decoration of Sym substructure as a tree
		this is a rather ad hoc way of remembering the components
		declared in SymSig

	exercise: decorate the signatures from the previous lecture.

** Signature instantiation
	comparison of decoration of structure and signature
		(is this structure an instantiation of that signature?)
	------------------
	structure Stack__n1 =
	struct
	  type elt__int = int
	  datatype stack__t1 = ST of elt list ref
	  fun new__(unit->t1) () = ST(ref[])
	end;

	signature StackSigA__(m1,s1,s2) =
	sig__m1
	  type elt__s1
	  type stack__s2
	  val new__(unit->s2) : unit -> stack
	end;
	------------------

	substitute n1 for m1, int for s1, t1 for s2
		in decorations of StackSigA to get decorations of Stack
	so Stack is an instance of StackSigA

	realization: map from bound names of signature to names

	instance: structure ST is instance of signature SIG
		if there is a realization of SIG so that
		ST and the realized SIG have the same decorations for
		each component
		- order does not matter

	----------------
	signature StackSigB__(m1,s1) =
	sig__m1
	  type elt__s1
	  datatype stack__t1 = ST of elt list ref
	  sharing type stack__t1 = Stack.stack__t1
	  val new__(unit->t1) : unit -> stack
	end;
	----------------
	Stack is an instance of StackSigB by the realization
		{m1 |-> n1, s1 |-> int}

	----------------
	structure OldStr__n4 =
	struct
	  type elt__int = int
	  val test__bool = false
	end;

	signature WrongSig__(m1,s1) =
	sig__m1
	  type elt__s1
	  val test__s1 : elt
	end;
	----------------
	OldStr is not an instance of WrongSig: why?
		s1 would have to be realized by int
		but test would be decorated by int in the sig, bool in struct

** Signature matching
	Combination of: instantiation and abstraction (ignoring excess stuff)

	abstraction
		1. forgetting components
			struct type x = int type y = int end
			 ==> struct type x = int end
		2. forgetting polymorphism of type variables
			struct type 'a z = 'a list end
			 ==> struct type int z = int list end

	A structure matches a signature
		if it can be abstracted to an instance of the signature

	----------------
	structure Tree__n1 =
	struct
	  datatype 'a tree__t1 = LEAF of 'a | NODE of 'a tree * 'a tree
	  type intTree__(int t1) = int tree
	  fun max(a:int, b:int) = if a > b then a else b
	  fun depth__('a t1->int) (LEAF _) = 1
	    | depth (NODE(left,right)) = 1 + max(depth left, depth right)
	end;

	signature TreeSig__(m1,s1,s2) =
	sig__m1
	  type 'a tree__s1
	  type intTree__s2
	  val depth__(s2->int): intTree -> int
	end;
	----------------

	Tree matches TreeSig as follows
		forget the constructors LEAF and NODE, and the function max
		abstract 'a t1 to int t1 (in type of depth)
		use realization {m1 |-> n1, s1 |-> t1, s2 |-> int t1}
			on TreeSig

	exercise:
		does a datatype declaration match a type specification?
		does a type declaration match a datatype specification?

*** Signature constraints on structure declarations (abstraction)
	----------------
	structure Tree: TreeSig = struct ... end;
	----------------
	the signature of the structure Tree is given by TreeSig,
		hides NODE, LEAF, max
		intTree is decorated by int t1 (where t1 decorates 'a tree)
			since this is realization of s2
	Note: int t1 = Tree.intTree__(int t1) = int Tree.tree__t1
		sharing obtained through the realization,
				not explicit in TreeSig
		So signature constraints do not remove existing sharing.
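
	A sketch that shows the surviving sharing. The component leaf is a
	hypothetical addition (TreeSig in these notes has no way to build a
	tree once LEAF and NODE are hidden); t and d are also made-up names.

```sml
signature TreeSig =
sig
  type 'a tree
  type intTree
  val leaf : int -> intTree          (* hypothetical, not in the notes *)
  val depth : intTree -> int
end

structure Tree : TreeSig =
struct
  datatype 'a tree = LEAF of 'a | NODE of 'a tree * 'a tree
  type intTree = int tree
  fun max (a : int, b : int) = if a > b then a else b
  fun leaf (n : int) = LEAF n
  fun depth (LEAF _) = 1
    | depth (NODE (left, right)) = 1 + max (depth left, depth right)
end

(* Accepted: Tree.intTree is still known to be int Tree.tree,
   although TreeSig never says so. *)
val t : int Tree.tree = Tree.leaf 3
val d = Tree.depth t

(* Rejected if uncommented: the constructors were forgotten.
val u = Tree.LEAF 3
*)
```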

** Decorating Functors
	elaborate body of functor each time it is applied
	----------------
	functor StackFct() =
	struct
	  datatype stack = ST of int list ref
	  val data = ST(ref [])
	  ...
	end;

	structure Stack1 = StackFct();
	structure Stack2 = StackFct();
	----------------
	Stack1 does not share with Stack2
		e.g., create distinct types and references

	decoration: decorate the struct ... end expression
			but record bound names at the = in the first line
				the "generative names" of the functor

	for nullary functors, the result structure of an application
		decorated by replacing all the generative names with new names.

	---------------
	functor StackFct() =__(m1,s1)
	struct__m1
	  datatype stack__s1 = ST of int list ref
	  val data__s1 = ST(ref [])
	  ...
	end;
	structure Stack1__n7(stack__t10,data__t10) = StackFct();
	structure Stack2__n8(stack__t11,data__t11) = StackFct();
	---------------
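
	The effect of the fresh names can be observed directly. In this
	sketch push and size are hypothetical components standing in for
	the elided part of the functor body, and sizes is a made-up name.

```sml
functor StackFct () =
struct
  datatype stack = ST of int list ref
  val data = ST (ref [])
  fun push (x, ST r) = r := x :: !r     (* hypothetical *)
  fun size (ST r) = length (!r)         (* hypothetical *)
end

structure Stack1 = StackFct ()
structure Stack2 = StackFct ()

(* Each application created its own reference cell. *)
val () = Stack1.push (1, Stack1.data)
val sizes = (Stack1.size Stack1.data, Stack2.size Stack2.data)

(* Rejected if uncommented: Stack1.stack and Stack2.stack are
   different types.
val bad = Stack1.push (1, Stack2.data)
*)
```

	After the push, sizes is (1, 0).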

*** External sharing
	free identifiers of a functor may result in external sharing
		external names are left unchanged when functor is applied
	---------------
	structure MyPervasives__n9 =
	struct__n9
	  datatype num__t12 = NUM of int
	  ...
	end;

	functor StackFct'() =__(m2)
	struct__m2
	  structure MyPer__n9 = MyPervasives
	  type stack__(t12 list ref) = MyPer.num list ref
	  val data__(t12 list ref) : stack = ref []
	end;

	structure Stack1'__n10(MyPer__n9,stack__(t12 list ref),
				data__(t12 list ref)) = StackFct'();
	structure Stack2'__n11(MyPer__n9,stack__(t12 list ref),
				data__(t12 list ref)) = StackFct'();
	---------------

	exercise: which of the following sharing equations hold?
		Stack1' = Stack2'
		Stack1'.MyPer = Stack2'.MyPer
		type Stack1'.stack = Stack2'.stack

*** Functors with arguments
	----------
	signature SymSig__(m1,s1) =
	sig__m1
	  eqtype sym__s1
	end;

	functor SymDir(Sym: SymSig) =__(m2,s2)
	struct__m2
	  datatype dir__s2 = DIR of Sym.sym -> int
	  fun update ...
	end;
	----------
	only assume one has a structure (with formal argument name)
		that exactly matches the signature (no additional assumptions)
		i.e., name of Sym is m1, name of Sym.sym is s1
			don't reuse (to avoid capture)

	------------
	structure Actual__n12 = struct type sym__string = string end;

	structure Result__n13(dir__t13,update__...) = SymDir(Actual);
	------------
	note: Actual matches SymSig (why?)

*** Sharing between arguments and results
	----------
	functor SymDir'(Sym: SymSig__(m1,s1)) =__(m2)
	struct__m2
	  type dir__(s1->int) = Sym.sym -> int
	  fun update ...
	end;
	----------
	the type name s1 shared between argument and the body
		when functor is applied, this makes sharing between
			actual argument and result

	-------------
	structure Result2__n14(dir__(string->int),update__...) = SymDir'(Actual);
	-------------
	note: Result2.dir = string -> int = Actual.sym -> int
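
	A sketch of this sharing in plain SML. The body of update is an
	assumption (the notes elide it); d and one are made-up names.

```sml
signature SymSig =
sig
  eqtype sym
end

functor SymDir' (Sym : SymSig) =
struct
  type dir = Sym.sym -> int
  (* assumed body: override the directory at one symbol *)
  fun update (d : dir, s, n) : dir =
    fn s' => if s' = s then n else d s'
end

structure Actual = struct type sym = string end
structure Result2 = SymDir' (Actual)

(* Accepted: Result2.dir is known to be string -> int, because the
   argument's type name was realized by string. *)
val d : string -> int = Result2.update (fn _ => 0, "x", 1)
val one = d "x"
```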

*** Explicit result signatures
	If the result signature is explicit:
		1. decorate the functor without the result signature
		2. decorate the result signature, match against body
			(using abstraction as before)
	
*** summary: full decoration of result of functor application
	1. match actual arg against the formal signature
		get realization (map from bound names to actual names)
	2. apply the realization to the decoration of the functor body
	3. substitute fresh names for the generative names of the functor body
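
	Steps 2 and 3 can be sketched as two substitutions. This is a toy
	model under strong simplifications: each component is decorated by
	a single name (a string), step 1 is assumed to have produced the
	realization already, and all identifiers below are invented.

```sml
type subst = (string * string) list

fun rename (s : subst) n =
  case List.find (fn (a, _) => a = n) s of
      SOME (_, b) => b
    | NONE => n

(* A supply of fresh names n100, n101, ...; the numbering is arbitrary. *)
val next = ref 100
fun fresh () = ("n" ^ Int.toString (!next)) before next := !next + 1

fun applyFunctor {generative : string list, body : (string * string) list}
                 (realization : subst) =
  let
    (* step 2: apply the realization to the body's decoration *)
    val afterArg = map (fn (id, n) => (id, rename realization n)) body
    (* step 3: replace each generative name by a fresh one *)
    val freshen = map (fn g => (g, fresh ())) generative
  in
    map (fn (id, n) => (id, rename freshen n)) afterArg
  end

(* A body with one component sharing with the argument (s1) and one
   generative component (s2). *)
val result =
  applyFunctor {generative = ["s2"], body = [("sym", "s1"), ("dir", "s2")]}
               [("m1", "n12"), ("s1", "string")]
```

	Here result is [("sym", "string"), ("dir", "n100")]: the shared
	name follows the actual argument, the generative one is new.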

* Evaluation
	Is the above satisfactory as a formal explanation of the concepts?
		generative structure declarations, functors,
		coercive signature matching
		sharing
	Does it make the rules precise?
		e.g., does order matter in signature matching?
		what happens if we change a signature named in a subsignature?
	Does it really explain things or just name them?
		(I have reworded some things so that it explains more,
			e.g., "have to give things that will share same name"
				to "if id exists in environment,
					reuse its decoration")
		algorithm for decoration? (could it be formalized, how?)
		how does the algorithm "know" when to assign new names,
			and when to reuse old names?
	Is the description independent?  Is it circular?
	Does it ignore inessential aspects?