Extending CORBA IDL to Specify Behavior with Larch by Gary T. Leavens and Yoonsik Cheon Department of Computer Science, Iowa State University 226 Atanasoff Hall, Ames Iowa 50011-1040 USA leavens@cs.iastate.edu, cheon@cs.iastate.edu 1. INTRODUCTION This position paper explores how model-oriented specification in the Larch style [Guttag-Horning93] could be adapted to specify the behavioral semantics for objects in distributed systems. Our position is that the model-oriented specifications [Wing90] are ideally suited to behavioral specification of distributed object interfaces, and that the Larch style of interface specification has some interesting advantages over other model-oriented specification formalisms. As an example we discuss how to formally specify interfaces given in the Interface Definition Language (IDL) for the distributed object management framework specified by the Object Management Group (OMG) for its Common Object Request Broker Architecture (CORBA). We believe that a Larch/CORBA interface language could extend IDL with precise behavioral semantics, that it would not be extraordinarily difficult to design, and that it would be easy to use. Businesses would like to enhance the value of their existing software by treating programs as objects in a logically-centralized, distributed system. Ideally, the people accessing the network would see a single system, where all data and programs of the enterprise work in harmony. To program the glue and wrappers that hold such dream systems together will require an understanding of the precise interface and behavior of each of the software components in the system. One way of specifying such systems is the Interface Definition Language (IDL) that is described in Chapter 4 of [OMG92]. This language specifies an object's interface, that is, how the object can be accessed by code running on the network through the Object Request Broker (ORB). However, there is no way to specify the behavior of an object in IDL: one can only specify its interface. There are many reasons that a precise specification of the behavior of an object is needed, in addition to its interface. Although source code is a precise, formal specification, it is not suitable because: * In a large enterprise, there is simply too much code for any one person to understand. * Existing software is often purchased from outside vendors, and these vendors are naturally hesitant to let the code be seen. * Even if the code could be seen, it would be inefficient to have programmers read it, because it contains too much detail. What is needed is a behavioral specification that is inherently less detailed than code. 2. THE LARCH APPROACH TO SPECIFYING BEHAVIOR The Larch approach to specifying an object is model-oriented. The specifier formally describes the ``abstract values'' of objects in the system, using the Larch Shared Language (LSL) [Guttag-Horning-Modet90]. The behavior of the object's operations is described as a change in its abstract value, and perhaps a change in the abstract values of other objects as well. Since the abstract value of an object may change, the specification makes use of a notion of ``state.'' In the Larch approach, there are thus two specification languages. LSL and an interface language tailored to some particular programming language. Unlike IDL, the Larch family of interface specification languages, which includes Larch/Ada, Larch/C, Larch/C++, Larch/Smalltalk, describes both how to call the module and its behavior. The interface is described in a suitable notation tailored to the particular programming language; in a Larch/CORBA, the existing IDL, perhaps with some enhancements, could be used for this purpose. The behavior is described using pre- and post-conditions, which are predicate-calculus formulas over the formal parameters of the operation. In addition one can use the functions and relations defined in LSL to manipulate the abstract values. One is not limited to a fixed vocabulary for describing abstract values, but is able to tune the level of abstraction as desired. Another typical feature of a Larch-style operation specification is to state what objects may (and may not) be modified by the operation. Work has also been done on specifying exceptions [Jones91] and concurrency [Lerner91]. We have experience in designing interface specification languages for C++ and Smalltalk, and thus do not anticipate too many potential problems from object-oriented features. 3. EXAMPLE The following shows a simple example in what would be ``Larch/CORBA''; it is a specification of a PrinterQueue. -------------------------------------------------------------------- typedef unsigned long JobID; exception QUEUE_FULL; interface PrinterQueue { uses Queue(PrinterQueue for C, JobID for E, unsigned long for Int); const unsigned long MAX_QUEUE_SIZE = 100; void enqueue (in JobID id) raises (QUEUE_FULL) { requires true; modifies self; ensures self' = append(self^,id) except when len(self') > MAX_QUEUE_SIZE => raise(QUEUE_FULL); } JobID dequeue () { requires ~isEmpty(self'); modifies self; ensures self' = tail(self^) /\ result = head(self^); } boolean has (in JobID id) { modifies nothing; ensures result = id \in self^; } unsigned long size () { ensures result = len(self^); } } --------------------------------------------------------------------- The mathematical model of type PrinterQueue is specified in the trait Queue (see the handbook, Appendix A of [Guttag-Horning93]); that is, the abstract values of objects of type PrinterQueue are bounded queues of JobIDs. Users can specify their own traits if a suitable one is not found in the handbook. Or a specialized handbook could be constructed to describe common mathematical models for CORBA. The operation ``enqueue'' takes an argument of type JobID and returns nothing (void) or raises an exception named QUEUE_FULL. Since the pre-condition (following ``requires'') is true, a client can invoke it in any state. In the next line, ``modifies self'' prohibits the ``enqueue'' from changing the state of any objects aside from the printer queue object itself. That is, ``self'' denotes the object to which the message ``enqueue'' is sent. The post-condition of ``enqueue'' (following ``ensures'') says that if the length of the queue is greater than MAX_QUEUE_SIZE, the call raises the exception QUEUE_FULL, otherwise (i.e., in the normal case) the argument is appended to the queue. The names ``append'' and ``len'' appearing in the post-condition are trait functions, and are formally defined in the used trait Queue. The notations self^ and self' denote the abstract values of self in the pre- and post-state respectively (i.e., just before and after the call). The operation ``dequeue'' removes and returns the head of the queue; the identifier ``result'' in its post-condition denotes the object to be returned. The operation ``has'' tests whether a JobID is in the queue or not, since it has no ``modifies'' clause, it cannot change the state of any object. The omitted pre-condition defaults to ``true''. The infix ``\in'' is a trait function that tests membership in a queue. The operation ``size'' returns the length of the queue. Again, ``len'' is a trait function specified in the trait Queue. 4. POTENTIAL PROBLEMS The main problems we see in designing a Larch-style interface specification for CORBA (Larch/CORBA) are the following. * We do not have much experience with the specification of concurrency and atomicity. * The abstract values needed to describe such application software as OMG and others hope to make available under CORBA would seem to be very complex. An ominous fact is that the size of a good user's manual is typically a small book. 5. PROSPECTS We believe that a Larch/CORBA could be designed based on the existing IDL syntax, by adding pre-conditions, post-conditions, and a way to state what objects are allowed to change. Our own Larch/C++ could be used as a guide, because IDL already uses much C++ syntax. The ultimate goal would be to design and document Larch/CORBA, and to provide a syntax and type checker for the CORBA specific part. Another potential goal would be to design a handbook of LSL traits to be used in Larch/CORBA specifications. Because LSL is becoming a mature system, there would be few worries about mathematical foundations. LSL is mathematically solid. LSL also comes with a theorem prover, LP, that can be used to analyze and debug the LSL part of a specification. Both LSL and LP are free software. One exciting advantage of Larch/CORBA would be that, because it uses LSL to specify abstract values, one could translate Larch/CORBA specifications into the Larch-style specification languages tailored to C, C++, Smalltalk, Modula-3, etc., without rewriting the LSL part. This would work because Larch/C, Larch/C++, etc. can all use the same LSL specifications. ACKNOWLEDGEMENTS The work of both authors is supported in part by the National Science Foundation under the Grant CCR-9108654. Thanks to Roger Burkhart for providing us with documentation on OMG and CORBA, and for discussions on how we could adapt Larch style specifications to CORBA. BIBLIOGRAPHY [Guttag-Horning93] John V. Guttag, James J. Horning, S.J. Garland, K.D. Jones, A. Modet, and J.M. Wing, ``Larch: Languages and Tools for Formal Specification'', Springer-Verlag, New York, 1993 [Guttag-Horning-Modet90] John V. Guttag and James J. Horning, and A. Modet, ``Report on the Larch Shared Language: Version 2.3'', DECSRC, Palo Alto, Calif., Apr, 1990 [Jones91] Kevin D. Jones, ``LM3: A Larch Interface Language for Modula-3: A Definition and Introduction: Version 1.0'', DECSRC, Palo Alto, Calif., Jun, 1990 [Lerner91] Allen Lerner, ``Objects of Concurrent Systems'', PhD Thesis, CMU-CS-91-131, CMU, May, 1991 [OMG92] ``The Common Object Request Broker: Architecture and Specification'', Object Management Group, Inc., 492 Old Connecticut Path, Framingham, MA 01701, 1992 [Wing90], Jeannette M. Wing, ``A Specifier's Introduction to Formal Methods'', Computer, 23:9, p. 8-24, Sep, 1990