A TYPE NOTATION FOR SCHEME by Gary T. Leavens Department of Computer Science, Iowa State University $Date: 1999/04/19 17:24:43 $ This note defines some notation for Scheme types. This notation is identical to that used by the type checker that is used by default in the scm342 version of the interpreter. A type can be thought of as a set of values with certain associated procedures. For example we use ``number'' as the name of the type of all numbers (both exact and inexact, real and integer) in Scheme. The procedures that work on numbers include + and *. Other standard type names are "string", "char", "boolean", and "symbol". Following the Scheme prefix notation idea, we write ``(list number)'' for the type of lists of numbers. For example, (1 2 3) and (3.14 2.73) both have this type. Similarly, we write (vector number) for the type of vectors of numbers, and (pair number boolean) for the type of cons cells (pairs) such as (3 . #t) or (7 . #f). 1. BASIC NOTATION FOR PROCEDURE TYPES We write the types of procedures using a notation like (-> (S T U) V), which is the type of a procedure that takes 3 arguments (the first of type S, the second of type T, and the third of type U) and returns a result of type V. For example, substring is a procedure of type (-> (string number number) string). We use a colon (:) for the phrase ``is of type'' (or ``has type''). For example, we write substring : (-> (string number number) string) to mean that substring is a procedure that takes three arguments, the first a string, the second and third numbers, and which returns a string. You can derive such a type from a lambda expression systematically. Take the lambda expression, replace "lambda" by "->", replace the formal paremeter names by the types of the formals, and replace the body by its type. For example, the type of (lambda (n m) (+ 2 (+ n m))) is given by (-> (number number) number) As another example, the type of sum in the definition (define sum (lambda (ls) (if (null? ls) 0 (+ (car ls) (sum (cdr ls)))))) is given by (-> ((list number)) number). 2. ESSENTIAL ENHANCEMENTS 2.1 datum Since Scheme does not do type checking before a program is run, the elements of a list do not all have to be of the same type. Such a list is *heterogeneous*; a list, all of whose elements are of the same type is *homogeneous*. We use types of the form (list T) for homogeneous lists, whose elements all have type T; for example, we may write (list integer), (list number), or (list symbol). We use the type (list datum) for the type of heterogeneous lists, because we think of the type ``datum'' as the type of all the values in Scheme. For example, lists, procedures, symbols, and numbers all have type datum. As another example, we could write for cons: cons : (-> (datum (list datum)) (list datum)). The names of other types with components are written similarly; for example, we write (tree number) or (pair symbol symbol). 2.2 Polymorphism For higher-order functions, one often needs to describe procedure types that are not as specific as the ones used above, but which are also not so general as ``datum''. For this it is convenient to use type variables, which are capital letters such as: T, S, U, V, and W. These are consistently replaced in a type when that type is used. For example, Scheme's built in procedure map satisfies the following. map : (-> ((-> (S) T) (list S)) (list T)) This type has several instances, that is consistent replacements of actual types for the type variables. For example, some instances are: (-> ((-> (boolean) integer) (list boolean)) (list integer)) (-> ((-> (number) number) (list number)) (list number)) (-> ((-> ((list symbol)) (list datum)) (list (list symbol))) (list (list datum))) The first of the above examples is what one would use when checking the types of the application (map positive? '(3 -2 1)). The second of these examples would be used in checking the types of the application (map add1 '(3 -2 1)). Since ``datum'' is a type, if we write such a type for cons, cons : (-> (T (list T)) (list T)) then it has the type given earlier as an instance. Hence this type is a better type for cons. 3. EXTENSIONS In some situations, other notations are useful. 3.1 And types, Or types Some procedures, like cons, can be used in several ways. That is, cons can make both homogeneous lists, heterogeneous lists, and pairs. To record this, use an "and-type" (or "intersection type"). For example, the type of cons is more properly thought of as follows: cons : (and (-> (T (list T)) (list T)) (-> (S (list U)) (list datum)) (-> (V W) (pair V W))) The idea is that, if x : (and S T), then x has both type S and type T. In some cases you may also need to use "or-types" (or "union types"). The idea is that, if y : (or S T), then y is either of type S or type T. 3.2 void The type void, as in C or C++, is used to represent the type of a procedure that returns but doesn't return a useful result. For example, Scheme's procedure display satisfies the following. display : (-> (datum) void) 3.3 Variable argument procedure types (...) For variable argument procedures, write the type with ``...''. For example, the type of writeln is given by the following. writenln : (-> (datum ...) void) This means that writeln can take 0 or more arguments (of type datum). 3.4 poof The type poof is used for the return types of procedures that do not return to their caller. The standard example is the procedure error, which satisfies the following. error : (-> (datum ...) poof) ACKNOWLEDGEMENTS Thanks to Curtis Clifton for help with the current implementation of the type checker. Thanks to Steven Jenkins for work on an earlier version of the type checker and for discussions about these ideas. A more extensive (but slightly dated) discussion of the type checker is found in the following paper. Steven Jenkins and Gary T. Leavens. Polymorphic Type-Checking in Scheme. Computer Languages, 22(4):215-223. Also avilable from ftp://ftp.cs.iastate.edu/pub/techreports/TR95-21/TR.ps.Z