Larch/C++ Reference Manual - Larch-style Specifications

Go to the first, previous, next, last section, table of contents.

1.1 Larch-style Specifications

Larch/C++ is a Larch-style behavioral interface specification language [Guttag-Horning93]. In this style of specification, which might be called model-oriented [Wing90a], one specifies both the interface of a procedure or abstract data type (see section 2.2 Interfaces), and its behavior. The behavior of a procedure is specified by describing: for what states the procedure is defined, what the procedure is allowed to change, and the relation between the states before and after the procedure is invoked. The states for which the procedure is defined is described by a logical assertion, called the precondition; the allowed relationships between these states and the states that may result from a call are described with another logical assertion called the postcondition. The pre- and postconditions are expressed using a mathematical vocabulary, which has a formal meaning specified, in part by the user of Larch/C++, in the Larch Shared Language (LSL) [Guttag-Horning93].

The behavior of an abstract data type (ADT), which is implemented by a class in C++, is specified by describing a set of abstract values for its objects, and by specifying each of its operations (C++ member functions) as a procedure. (Unlike other specification languages, such as Z [Hayes93] [Spivey92], or Fresco [Wills92b] Larch/C++ does not require the user to model an instance as a collection of data members. However, that is allowed and, as will be explained below, is the default if the user does not specify some other model.)

For example, consider the following Larch/C++ specification of a simple C++ class IntHeap. This file could be used as a header file in a C++ program. (An explanation of the notation follows the specification.)

// @(#)$Id: IntHeap.lh,v 1.8 1997/06/03 20:29:38 leavens Exp $

//@ uses IntHeapTrait;                    //  3 connection to LSL

/*@ abstract @*/ class IntHeap {          //  5 no constructors, C++ interface
public:                                   //  6   
  virtual int largest() const throw() = 0;//  7 C++ interface
  //@ behavior {                          //  8 starts largest's specification
  //@   requires len(self^) >= 1;         //  9 precondition
  //@   modifies nothing;                 // 10 what can change
  //@   ensures result = head(self^);     // 11 postconditon
  //@ }
};

In C++, comments start with either //, and extend to the end of a line, or they start with /* and extend to the next */. In this specification the forbidding first line is a comment that is part of a version control system, and should be ignored. If, however, the first character of what looks to C++ like a comment is the at-sign character (@), then the comment start sequence, and the at-sign are all dropped by Larch/C++. For example, on line 3, the characters //@ are dropped by Larch/C++. Thus, when this line is used in a C++ program, the keyword uses is considered to be part of a comment, but it is significant when passed to Larch/C++. The comment sequences /*@ and @*/ are also treated in this way. Thus what Larch/C++ sees is the following input. (If one wishes to have specifications in separate files from header files, then one can also pass this form directly to Larch/C++.)

// @(#)$Id: IntHeap.lcc,v 1.11 1997/06/03 20:29:37 leavens Exp $

//@ uses IntHeapTrait;                    //  3 connection to LSL

abstract class IntHeap {                  //  5 no constructors, C++ interface
public:                                   //  6   
  virtual int largest() const throw() = 0;//  7 C++ interface
  behavior {                              //  8 starts largest's specification
    requires len(self^) >= 1;             //  9 precondition
    modifies nothing;                     // 10 what can change
    ensures result = head(self^);         // 11 postconditon
  }
};

Line 3 of this specification says that abstract values of IntHeap objects are specified in the LSL trait named IntHeapTrait; this gives a mathematical vocabulary for talking about such values. Line 5 of this specification states that the class being specified is an abstract class; that is, that it is a class for which no constructors are provided. This line is also part of the class's interface; it gives the name of the C++ class that implements this specification: IntHeap. Line 7 gives the C++ interface of a member function. Note that this includes C++ details such as that the member function is virtual and const, and that it throws no exceptions. (The C++ syntax = 0 says that derived classes must implement this function, see section 7.10 Abstract Classes for details.) Line 8 starts the specification of the behavior of the member function largest. Line 9 gives the precondition for the member function, and line 11 the postcondition; both of these are written using the vocabulary of IntHeapTrait, which gives a meaning to the trait functions len and head (and =). Line 10 specifies that the implementation of this member function cannot change any variables; in particular, the value of the default parameter, *this, cannot be changed.

LSL comes with a wide variety of useful built-in traits, which form a so-called LSL "Handbook" (see [Guttag-Horning93], appendix A). The abstract values of IntHeap can thus be described by including the trait PriorityQueue (found on page 175 of [Guttag-Horning93]) and renaming its sorts. The other trait included, NoContainedObjects, says that the abstract values of IntHeap objects do not contain embedded objects. See section 7.5 Contained Objects for details of the trait NoContainedObjects. In more realistic examples, writing the trait might not be so easy.

% @(#)$Id: IntHeapTrait.lsl,v 1.2 1994/12/08 23:09:29 leavens Exp $
% This is LSL, not Larch/C++
IntHeapTrait: trait
  includes PriorityQueue (int for E, IntHeap for C, int for Int),
           NoContainedObjects(IntHeap)

The Larch style of specification goes back to Hoare's work. Hoare used pre- and postconditions to describe the semantics of computer programs in his famous article [Hoare69]. Later Hoare adapted these axiomatic techniques to the specification and correctness proofs of abstract data types [Hoare72a]. To specify an abstract data type, Hoare described a set of abstract values for the type, and then specified pre- and postconditions for each of the operations of the type in terms of how the abstract values of objects were affected. For example, one might specify the type IntHeap using abstract values of the form empty and add(i,h), where i is an int and h is an IntHeap. (The detailed description of such a type's abstract values, could be recorded in an LSL trait, such as IntHeapTrait.)

There are three advantages to using abstract values instead of directly using C++ variables and data structures. The first is that such a specification is simpler, because it has fewer details. This makes specifications easier to read and understand, provided that one is comfortable with the mathematical vocabulary used to describe the abstraction. The second, and more important, reason is that by using a mathematical abstraction to specify behavior, the specification does not have to be changed when the particular data structure used in the program is changed. This permits different implementations of the same specification to use different data structures. Therefore the specification forms a contract between the rest of the program in the implementation, which ensures that the rest of the program is also independent of the particular data structures used [Parnas72] [Liskov-Guttag86] [Meyer88] [Meyer92]. Finally, it allows the specification to be written even when there are no implementation data structures, as is the case for IntHeap.

This idea of model-oriented specification has been followed in VDM [Jones86b], Z [Hayes93] [Spivey92], and Larch [Guttag-Horning93]. The essential elaboration of Hoare's original idea is that the abstract values also come with a set of operators, which we will refer to as trait functions, to avoid confusion with C++ operators. The trait functions are used to precisely describe the set of abstract values. In Z one builds abstract values using tuples, sets, relations, functions, sequences, and bags; these all come with pre-defined trait functions that can be used in assertions. In VDM one has a similar collection of mathematical tools to describe abstract values, and another set of pre-defined trait functions. In the Larch approach, there are some pre-defined kinds of abstract values (found in Guttag and Horning's LSL Handbook, Appendix A of [Guttag-Horning93]), but these are expected to be extended as needed. (The advantage of being able to extend the mathematical vocabulary is similar to one advantage of object-oriented programming: one can use a vocabulary that is close to the way one thinks about a problem.)

The extension mechanism in LSL is called a trait. A trait is like an equational specification, augmented with some additional constructs. These additional constructs allow one to state specific kinds of induction and deduction principles for reasoning about abstract values. (See section 4.2 in [Guttag-Horning93] for details.)

For historical reasons, the types in a trait are called sorts. Each abstract value thus has a sort. The trait specifies the sorts of arguments that each trait function takes, and the sort of its result. The axioms of a trait specify what can be proven about an expression involving trait functions. Each well-formed expression built from the trait functions has a sort, which is the result sort of the outermost trait function. The abstract values of a given sort can be thought of as equivalence classes of expressions having that sort. (One can use a partitioned by clause in LSL to make this equivalence the same as observable equivalence if desired. See pages 38-39 in [Guttag-Horning93] for details.)

The division between the two layers (or tiers) of a Larch/C++ specification is thus as follows. The shared (or "bottom") layer of a specification consists of LSL traits, which describe abstract values. (These traits could be shared among many specifications for different interface layers.) The abstract values are purely functional; that is, there is no concept of mutation, state, aliasing, storage, non-termination, or nondeterminism in a trait.

The interface layer consists of Larch/C++ specification modules, which specify pieces of C++ programs. This is where one deals with the concepts of mutation, state, aliasing, storage, termination, nondeterminism, finiteness, and so on.

The separation of these layers is enforced in Larch/C++. That is, C++ functions cannot be used to specify other C++ functions. (Also, of course, trait functions cannot be used in C++ programs.) See section 7 Class Specifications for further discussion on this point.

The picture just described is accurate in theory. However, in practice, users do not always write their own traits when specifying a concrete class in Larch/C++. They need not always write such a trait, since, by default, Larch/C++ will automatically construct a trait that models the abstract values of a class as a tuple of its data members (as objects). This default can be overridden by using a trait, such as IntHeapTrait that defines a sort with the same name as the class being defined, such as IntHeap. See section 7 Class Specifications for several examples. See section 5.4.4 Structure and Class Declarations for more details about the automatically-constructed abstract model.

Go to the first, previous, next, last section, table of contents.