
Database Packages

Base Table

A base table implementation using Scheme association lists is available as the value of the identifier alist-table after doing:

(require 'alist-table)

Association list base tables are suitable for small databases and support all Scheme types when temporary and readable/writeable Scheme types when saved. I hope support for other base table implementations will be added in the future.

The rest of this section documents the interface for a base table implementation from which the Relational Database package (see section Relational Database) constructs a relational system. It will be of interest primarily to those wishing to port or write new base-table implementations.

All of these functions are accessed through a single procedure by calling that procedure with the symbol name of the operation. A procedure will be returned if that operation is supported and #f otherwise. For example:

(require 'alist-table)
(define make-base (alist-table 'make-base))
make-base       => *a procedure*
(define foo (alist-table 'foo))
foo             => #f
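
Because an unsupported operation yields #f rather than signaling an error, a caller may want to check the result before applying it. The following is a minimal sketch of such a check. The names toy-table and get-operation are hypothetical, invented for this illustration; they are not part of SLIB, and toy-table only stands in for a real implementation such as alist-table.

```scheme
;; Hypothetical stand-in for a base table implementation: it maps
;; operation names to procedures and returns #f for anything else.
(define (toy-table operation-name)
  (case operation-name
    ((supported-type?)
     (lambda (symbol)
       (and (memq symbol '(integer symbol string boolean)) #t)))
    (else #f)))

;; Fetch an operation, substituting DEFAULT when it is unsupported.
(define (get-operation implementation operation-name default)
  (or (implementation operation-name) default))

;; A supported operation is returned as is.
(define supported-type?
  (get-operation toy-table 'supported-type? (lambda (symbol) #f)))

;; An unsupported operation falls back to a procedure returning #f.
(define sync-base
  (get-operation toy-table 'sync-base (lambda (lldb) #f)))
```

Substituting a default is only one possible policy; signaling an error at lookup time would be equally reasonable when the operation is required.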

Function: make-base filename key-dimension column-types
Returns a new, open, low-level database (collection of tables) associated with filename. This returned database has an empty table associated with catalog-id. The positive integer key-dimension is the number of keys composed to make a primary-key for the catalog table. The list of symbols column-types describes the types of each column for that table. If the database cannot be created as specified, #f is returned.

Calling the close-base method on this database and possibly other operations will cause filename to be written to. If filename is #f, a temporary, non-disk based database will be created if such can be supported by the base table implementation.

Function: open-base filename mutable?
Returns an open low-level database associated with filename. If mutable? is #t, this database will have methods capable of effecting change to the database. If mutable? is #f, only methods for inquiring the database will be available. If the database cannot be opened as specified #f is returned.

Calling the close-base (and possibly other) method on a mutable? database will cause filename to be written to.

Function: write-base lldb filename
Causes the low-level database lldb to be written to filename. If the write is successful, also causes lldb to henceforth be associated with filename. Calling the close-base (and possibly other) method on lldb may cause filename to be written to. If filename is #f, this database will be changed to a temporary, non-disk based database if such can be supported by the underlying base table implementation. If the operation completes successfully, #t is returned. Otherwise, #f is returned.

Function: sync-base lldb
Causes the file associated with the low-level database lldb to be updated to reflect its current state. If the associated filename is #f, no action is taken and #f is returned. If this operation completes successfully, #t is returned. Otherwise, #f is returned.

Function: close-base lldb
Causes the low-level database lldb to be written to its associated file (if any). If the write is successful, subsequent operations to lldb will signal an error. If the operations complete successfully, #t is returned. Otherwise, #f is returned.

Function: make-table lldb key-dimension column-types
Returns the base-id for a new base table, otherwise returns #f. The base table can then be opened using (open-table lldb base-id). The positive integer key-dimension is the number of keys composed to make a primary-key for this table. The list of symbols column-types describes the types of each column.

Constant: catalog-id
A constant base-id suitable for passing as a parameter to open-table. catalog-id will be used as the base table for the system catalog.

Function: open-table lldb base-id key-dimension column-types
Returns a handle for an existing base table in the low-level database lldb if that table exists and can be opened in the mode indicated by mutable?, otherwise returns #f.

As with make-table, the positive integer key-dimension is the number of keys composed to make a primary-key for this table. The list of symbols column-types describes the types of each column.

Function: kill-table lldb base-id key-dimension column-types
Returns #t if the base table associated with base-id was removed from the low level database lldb, and #f otherwise.

Function: make-keyifier-1 type
Returns a procedure which accepts a single argument which must be of type type. This returned procedure returns an object suitable for being a key argument in the functions whose descriptions follow.

Any 2 arguments of the supported type passed to the returned function which are not equal? must result in returned values which are not equal?.

Function: make-list-keyifier key-dimension types
The list of symbols types must have at least key-dimension elements. Returns a procedure which accepts a list of length key-dimension whose element types must correspond to the types named by types. This returned procedure combines the elements of its list argument into an object suitable for being a key argument in the functions whose descriptions follow.

Any 2 lists of supported types (which must at least include symbols and non-negative integers) passed to the returned function which are not equal? must result in returned values which are not equal?.

Function: make-key-extractor key-dimension types column-number
Returns a procedure which accepts objects produced by application of the result of (make-list-keyifier key-dimension types). This procedure returns a key which is equal? to the column-numberth element of the list which was passed to create the combined key. The list types must have at least key-dimension elements.

Function: make-key->list key-dimension types
Returns a procedure which accepts objects produced by application of the result of (make-list-keyifier key-dimension types). This procedure returns a list of keys which are elementwise equal? to the list which was passed to create the combined key.
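
The relationship between the keyifier, the key extractor, and key->list can be sketched with toy versions in which the combined key is simply a copy of the list of keys. This is an illustration under stated assumptions, not the SLIB source: the toy- names are hypothetical, the types argument is accepted but ignored, and column-number is assumed to count from 1.

```scheme
;; Toy keyifier: the combined key is a fresh copy of the key list, so
;; lists that are not equal? yield combined keys that are not equal?.
(define (toy-make-list-keyifier key-dimension types)
  (lambda (key-list)
    (if (not (= key-dimension (length key-list)))
        (error "wrong number of keys:" key-list))
    (map (lambda (k) k) key-list)))

;; Toy extractor: assumes column-number counts from 1.
(define (toy-make-key-extractor key-dimension types column-number)
  (lambda (combined-key)
    (list-ref combined-key (- column-number 1))))

;; Toy key->list: recovers a list elementwise equal? to the original.
(define (toy-make-key->list key-dimension types)
  (lambda (combined-key)
    (map (lambda (k) k) combined-key)))

;; Usage: build a two-part key, then take it apart again.
(define keyify (toy-make-list-keyifier 2 '(symbol integer)))
(define sample-key (keyify '(djgpp 3)))
(define second-part
  ((toy-make-key-extractor 2 '(symbol integer) 2) sample-key))
(define all-parts
  ((toy-make-key->list 2 '(symbol integer)) sample-key))
```

An implementation backed by disk files would more likely encode the keys into a string or byte sequence; the contract stated above is the same either way.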

In the following functions, the key argument can always be assumed to be the value returned by a call to a keyify routine.

In contrast, a match-key argument is a list of length equal to the number of primary keys. The match-key restricts the actions of the table command to those records whose primary keys all satisfy the corresponding element of the match-key list. The elements and their actions are:

#f
The false value matches any key in the corresponding position.
an object of type procedure
This procedure must take a single argument, the key in the corresponding position. Any key for which the procedure returns a non-false value is a match; any key for which the procedure returns #f is not.
other values
Any other value matches only those keys equal? to it.
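
The three matching rules above can be sketched in portable Scheme as follows. The procedure names element-matches? and keys-match? are hypothetical and chosen only for this illustration; a real base table implementation is free to organize its matching differently.

```scheme
;; Does one key satisfy one element of a match-key list?
(define (element-matches? match-element key)
  (cond ((not match-element) #t)            ; #f matches any key
        ((procedure? match-element)         ; a predicate decides
         (if (match-element key) #t #f))
        (else (equal? match-element key)))) ; literal comparison

;; Do all primary keys satisfy the corresponding match-key elements?
;; MATCH-KEY and KEYS are assumed to be lists of the same length.
(define (keys-match? match-key keys)
  (cond ((null? match-key) #t)
        ((element-matches? (car match-key) (car keys))
         (keys-match? (cdr match-key) (cdr keys)))
        (else #f)))
```

For instance, (keys-match? (list #f 3) '(a 3)) holds because #f accepts the first key and 3 is equal? to the second.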

Function: for-each-key handle procedure match-key
Calls procedure once with each key in the table opened in handle which satisfies match-key in an unspecified order. An unspecified value is returned.

Function: map-key handle procedure match-key
Returns a list of the values returned by calling procedure once with each key in the table opened in handle which satisfies match-key in an unspecified order.

Function: ordered-for-each-key handle procedure match-key
Calls procedure once with each key in the table opened in handle which satisfies match-key in the natural order for the types of the primary key fields of that table. An unspecified value is returned.

Function: delete* handle match-key
Removes all rows which satisfy match-key from the table opened in handle. An unspecified value is returned.

Function: present? handle key
Returns a non-#f value if there is a row associated with key in the table opened in handle and #f otherwise.

Function: delete handle key
Removes the row associated with key from the table opened in handle. An unspecified value is returned.

Function: make-getter key-dimension types
Returns a procedure which takes arguments handle and key. This procedure returns a list of the non-primary values of the relation (in the base table opened in handle) whose primary key is key if it exists, and #f otherwise.

Function: make-putter key-dimension types
Returns a procedure which takes arguments handle and key and value-list. This procedure associates the primary key key with the values in value-list (in the base table opened in handle) and returns an unspecified value.
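
The contract of make-getter and make-putter can be illustrated with a toy association-list table. This is a sketch only: the toy- names are hypothetical, the handle is assumed to be a mutable list whose cdr holds (key . value-list) entries, and the key-dimension and types arguments are accepted but ignored.

```scheme
;; A toy handle: a tagged list whose cdr is an association list.
(define (toy-make-handle) (list 'toy-table))

;; The putter associates KEY with VALUE-LIST, overwriting any
;; existing entry for the same key.
(define (toy-make-putter key-dimension types)
  (lambda (handle key value-list)
    (let ((entry (assoc key (cdr handle))))
      (if entry
          (set-cdr! entry value-list)
          (set-cdr! handle
                    (cons (cons key value-list) (cdr handle)))))))

;; The getter returns the non-primary values for KEY, or #f if no
;; row is associated with KEY.
(define (toy-make-getter key-dimension types)
  (lambda (handle key)
    (let ((entry (assoc key (cdr handle))))
      (and entry (cdr entry)))))

;; Usage: one primary key followed by two non-primary columns.
(define handle (toy-make-handle))
(define put (toy-make-putter 1 '(symbol symbol symbol)))
(define get (toy-make-getter 1 '(symbol symbol symbol)))
(put handle '(linux) '(i386 gcc))
```

Note that the getter returns only the non-primary values; the caller already holds the key.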

Function: supported-type? symbol
Returns #t if symbol names a type allowed as a column value by the implementation, and #f otherwise. At a minimum, an implementation must support the types integer, symbol, string, boolean, and base-id.

Function: supported-key-type? symbol
Returns #t if symbol names a type allowed as a key value by the implementation, and #f otherwise. At a minimum, an implementation must support the types integer and symbol.

integer
Scheme exact integer.
symbol
Scheme symbol.
boolean
#t or #f.
base-id
Objects suitable for passing as the base-id parameter to open-table. The value of catalog-id must be an acceptable base-id.

Relational Database

(require 'relational-database)

This package implements a database system inspired by the Relational Model (E. F. Codd, A Relational Model of Data for Large Shared Data Banks). An SLIB relational database implementation can be created from any base table implementation (see section Base Table).

Motivations

Most nontrivial programs contain databases: Makefiles, configure scripts, file backup, calendars, editors, source revision control, CAD systems, display managers, menu GUIs, games, parsers, debuggers, profilers, and even error reporting are all rife with databases. Coding databases is such a common activity in programming that many may not be aware of how often they do it.

A database often starts as a dispatch in a program. The author, perhaps because of the need to make the dispatch configurable, the need for correlating dispatch in other routines, or because of changes or growth, devises a data structure to contain the information, a routine for interpreting that data structure, and perhaps routines for augmenting and modifying the stored data. The dispatch must be converted into this form and tested.

The programmer may need to devise an interactive program for enabling easy examination and modification of the information contained in this database. Often, in an attempt to foster modularity and avoid delays in release, intermediate file formats for the database information are devised. It often turns out that users prefer modifying these intermediate files with a text editor to using the interactive program in order to do operations (such as global changes) not foreseen by the program's author.

In order to address this need, the conscientious software engineer may even provide a scripting language to allow users to make repetitive database changes. Users will grumble that they need to read a large manual and learn yet another programming language (even if it almost has language "xyz" syntax) in order to do simple configuration.

All of these facilities need to be designed, coded, debugged, documented, and supported, often causing what was very simple in concept to become a major development project.

This view of databases just outlined is somewhat the reverse of the view of the originators of the Relational Model of database abstraction. The relational model was devised to unify and allow interoperation of large multi-user databases running on diverse platforms. A fairly general purpose "Comprehensive Language" for database manipulations is mandated (but not specified) as part of the relational model for databases.

One aspect of the Relational Model of some importance is that the "Comprehensive Language" must be expressible in some form which can be stored in the database. This frees the programmer from having to make programs data-driven in order to use a database.

This package includes Scheme expressions as one of its basic supported types. This type allows expressions as defined by the Scheme standards to be stored in the database. Using slib:eval, retrieved expressions can be evaluated (in the top-level environment). Scheme's lambda facilitates closure of environments, modularity, etc., so that procedures (which could not be stored directly in most databases) can still be effectively retrieved. Since slib:eval evaluates expressions in the top-level environment, built-in and user-defined procedures can be easily accessed by name.
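
The idea of storing a procedure as an expression and recovering it by evaluation can be sketched as follows. To keep the sketch independent of SLIB, it uses the standard eval with (interaction-environment) in place of slib:eval; this assumes a Scheme implementation that provides interaction-environment. The names stored-expression and square are hypothetical.

```scheme
;; An expression held as plain list data, as it might sit in a
;; column of type expression.
(define stored-expression '(lambda (x) (* x x)))

;; Evaluating the data in the top-level environment yields a
;; procedure. With SLIB loaded, slib:eval would play this role.
(define square (eval stored-expression (interaction-environment)))

;; The recovered procedure is applied like any other.
(define result (square 7))
```

Because evaluation happens at top level, the stored expression may refer by name to any built-in or user-defined procedure visible there, such as * in this sketch.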

This package's purpose is to standardize (through a common interface) database creation and usage in Scheme programs. The relational model's provision for inclusion of language expressions as data as well as the description (in tables, of course) of all of its tables assures that relational databases are powerful enough to assume the roles currently played by thousands of ad-hoc routines and data formats.

Such standardization to a relational-like model brings many benefits:

Creating and Opening Relational Databases

Function: make-relational-system base-table-implementation

Returns a procedure implementing a relational database using the base-table-implementation.

All of the operations of a base table implementation are accessed through a procedure defined by requiring that implementation. Similarly, all of the operations of the relational database implementation are accessed through the procedure returned by make-relational-system. For instance, a new relational database could be created from the procedure returned by make-relational-system by:

(require 'relational-database)
(require 'alist-table)
(define relational-alist-system
        (make-relational-system alist-table))
(define create-alist-database
        (relational-alist-system 'create-database))
(define my-database
        (create-alist-database "mydata.db"))

What follows are the descriptions of the methods available from the relational system returned by a call to make-relational-system.

Function: create-database filename

Returns an open, nearly empty relational database associated with filename. The only tables defined are the system catalog and domain table. Calling the close-database method on this database and possibly other operations will cause filename to be written to. If filename is #f, a temporary, non-disk based database will be created if such can be supported by the underlying base table implementation. If the database cannot be created as specified, #f is returned. For the fields and layout of descriptor tables, see section Catalog Representation.

Function: open-database filename mutable?

Returns an open relational database associated with filename. If mutable? is #t, this database will have methods capable of effecting change to the database. If mutable? is #f, only methods for inquiring the database will be available. Calling the close-database (and possibly other) method on a mutable? database will cause filename to be written to. If the database cannot be opened as specified #f is returned.

Relational Database Operations

These are the descriptions of the methods available from an open relational database. A method is retrieved from a database by calling the database with the symbol name of the operation. For example:

(define my-database
        (create-alist-database "mydata.db"))
(define telephone-table-desc
        ((my-database 'create-table) 'telephone-table-desc))

Function: close-database
Causes the relational database to be written to its associated file (if any). If the write is successful, subsequent operations to this database will signal an error. If the operations completed successfully, #t is returned. Otherwise, #f is returned.

Function: write-database filename
Causes the relational database to be written to filename. If the write is successful, also causes the database to henceforth be associated with filename. Calling the close-database (and possibly other) method on this database will cause filename to be written to. If filename is #f, this database will be changed to a temporary, non-disk based database if such can be supported by the underlying base table implementation. If the operation completes successfully, #t is returned. Otherwise, #f is returned.

Function: table-exists? table-name
Returns #t if table-name exists in the system catalog, otherwise returns #f.

Function: open-table table-name mutable?
Returns a methods procedure for an existing relational table in this database if it exists and can be opened in the mode indicated by mutable?, otherwise returns #f.

The following methods will be present only in databases which are mutable?.

Function: delete-table table-name
Removes and returns the table-name row from the system catalog if the table or view associated with table-name gets removed from the database, and #f otherwise.

Function: create-table table-desc-name
Returns a methods procedure for a new (open) relational table for describing the columns of a new base table in this database, otherwise returns #f. For the fields and layout of descriptor tables, See section Catalog Representation.

Function: create-table table-name table-desc-name
Returns a methods procedure for a new (open) relational table with columns as described by table-desc-name, otherwise returns #f.

Function: create-view ??
Function: project-table ??
Function: restrict-table ??
Function: cart-prod-tables ??
Not yet implemented.

Table Operations

These are the descriptions of the methods available from an open relational table. A method is retrieved from a table by calling the table with the symbol name of the operation. For example:

(define telephone-table-desc
        ((my-database 'create-table) 'telephone-table-desc))
(require 'common-list-functions)
(define ndrp (telephone-table-desc 'row:insert))
(ndrp '(1 #t name #f string))
(ndrp '(2 #f telephone
          (lambda (d)
            (and (string? d) (> (string-length d) 2)
                 (every
                  (lambda (c)
                    (memv c '(#\0 #\1 #\2 #\3 #\4 #\5 #\6 #\7 #\8 #\9
                                  #\+ #\( #\  #\) #\-)))
                  (string->list d))))
          string))

Some operations described below require primary key arguments. Primary key arguments are denoted key1 key2 .... It is an error to call an operation for a table which takes primary key arguments with the wrong number of primary keys for that table.

The term row used below refers to a Scheme list of values (one for each column) in the order specified in the descriptor (table) for this table. Missing values appear as #f. Primary keys must not be missing.

Function: get column-name
Returns a procedure of arguments key1 key2 ... which returns the value for the column-name column of the row associated with primary keys key1, key2 ... if that row exists in the table, or #f otherwise.

((plat 'get 'processor) 'djgpp) => i386
((plat 'get 'processor) 'be-os) => #f

Function: get* column-name
Returns a procedure of optional arguments match-key1 ... which returns a list of the values for the specified column for all rows in this table. The optional match-key1 ... arguments restrict actions to a subset of the table. See the match-key description below for details.

((plat 'get* 'processor)) =>
(i386 8086 i386 8086 i386 i386 8086 m68000
 m68000 m68000 m68000 m68000 powerpc)

((plat 'get* 'processor) #f) =>
(i386 8086 i386 8086 i386 i386 8086 m68000
 m68000 m68000 m68000 m68000 powerpc)

(define (a-key? key)
   (char=? #\a (string-ref (symbol->string key) 0)))

((plat 'get* 'processor) a-key?) =>
(m68000 m68000 m68000 m68000 m68000 powerpc)

((plat 'get* 'name) a-key?) =>
(atari-st-turbo-c atari-st-gcc amiga-sas/c-5.10
 amiga-aztec amiga-dice-c aix)

Function: row:retrieve
Returns a procedure of arguments key1 key2 ... which returns the row associated with primary keys key1, key2 ... if it exists, or #f otherwise.

((plat 'row:retrieve) 'linux) => (linux i386 linux gcc)
((plat 'row:retrieve) 'multics) => #f

Function: row:retrieve*
Returns a procedure of optional arguments match-key1 ... which returns a list of all rows in this table. The optional match-key1 ... arguments restrict actions to a subset of the table. See the match-key description below for details.

((plat 'row:retrieve*) a-key?) =>
((atari-st-turbo-c m68000 atari turbo-c)
 (atari-st-gcc m68000 atari gcc)
 (amiga-sas/c-5.10 m68000 amiga sas/c)
 (amiga-aztec m68000 amiga aztec)
 (amiga-dice-c m68000 amiga dice-c)
 (aix powerpc aix -))

Function: row:remove
Returns a procedure of arguments key1 key2 ... which removes and returns the row associated with primary keys key1, key2 ... if it exists, or #f otherwise.

Function: row:remove*
Returns a procedure of optional arguments match-key1 ... which removes and returns a list of all rows in this table. The optional match-key1 ... arguments restrict actions to a subset of the table. See the match-key description below for details.

Function: row:delete
Returns a procedure of arguments key1 key2 ... which deletes the row associated with primary keys key1, key2 ... if it exists. The value returned is unspecified.

Function: row:delete*
Returns a procedure of optional arguments match-key1 ... which deletes all rows from this table. The optional match-key1 ... arguments restrict deletions to a subset of the table. See the match-key description below for details. The value returned is unspecified. The descriptor table and catalog entry for this table are not affected.

Function: row:update
Returns a procedure of one argument, row, which adds the row, row, to this table. If a row for the primary key(s) specified by row already exists in this table, it will be overwritten. The value returned is unspecified.

Function: row:update*
Returns a procedure of one argument, rows, which adds each row in the list of rows, rows, to this table. If a row for the primary key specified by an element of rows already exists in this table, it will be overwritten. The value returned is unspecified.

Function: row:insert
Returns a procedure of one argument, row, which adds the row, row, to this table. If a row for the primary key(s) specified by row already exists in this table, an error is signaled. The value returned is unspecified.

Function: row:insert*
Returns a procedure of one argument, rows, which adds each row in the list of rows, rows, to this table. If a row for the primary key specified by an element of rows already exists in this table, an error is signaled. The value returned is unspecified.

Function: for-each-row
Returns a procedure of arguments proc match-key1 ... which calls proc with each row in this table in the (implementation-dependent) natural ordering for rows. The optional match-key1 ... arguments restrict actions to a subset of the table. See the match-key description below for details.

Real relational programmers would use some least-upper-bound join for every row to get them in order; but we don't have joins yet.

The (optional) match-key1 ... arguments are used to restrict actions of a whole-table operation to a subset of that table. Those procedures (returned by methods) which accept match-key arguments will accept any number of match-key arguments between zero and the number of primary keys in the table. Any unspecified match-key arguments default to #f.

The match-key1 ... restrict the actions of the table command to those records whose primary keys each satisfy the corresponding match-key argument. The arguments and their actions are:

#f
The false value matches any key in the corresponding position.
an object of type procedure
This procedure must take a single argument, the key in the corresponding position. Any key for which the procedure returns a non-false value is a match; any key for which the procedure returns #f is not.
other values
Any other value matches only those keys equal? to it.
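
The rule that unspecified match-key arguments default to #f can be sketched as a padding step applied before matching. The name pad-match-keys is hypothetical, and primary-limit here is just a number supplied by the caller, standing for the number of primary key fields in the table.

```scheme
;; Extend MATCH-KEYS with #f until it has PRIMARY-LIMIT elements.
;; Supplying more match-keys than primary keys is treated as an error.
(define (pad-match-keys match-keys primary-limit)
  (cond ((> (length match-keys) primary-limit)
         (error "too many match-keys:" match-keys))
        ((= (length match-keys) primary-limit) match-keys)
        (else
         (pad-match-keys (append match-keys (list #f))
                         primary-limit))))
```

With this padding, calling a whole-table operation with no match-key arguments on a table with two primary keys behaves as if (#f #f) had been given, so every row matches.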

Function: close-table
Subsequent operations to this table will signal an error.

Constant: column-names
Constant: column-foreigns
Constant: column-domains
Constant: column-types
Return a list of the column names, foreign-key table names, domain names, or type names respectively for this table. These 4 methods are different from the others in that the list is returned, rather than a procedure to obtain the list.

Constant: primary-limit
Returns the number of primary key fields in the relations in this table.

Catalog Representation

Each database (in an implementation) has a system catalog which describes all the user accessible tables in that database (including itself).

The system catalog base table has the following fields. PRI indicates a primary key for that table.

PRI table-name
    column-limit            the highest column number
    coltab-name             descriptor table name
    bastab-id               data base table identifier
    user-integrity-rule
    view-procedure          A scheme thunk which, when called,
                            produces a handle for the view.  coltab
                            and bastab are specified if and only if
                            view-procedure is not.

Descriptors for base tables (not views) are tables (pointed to by system catalog). Descriptor (base) tables have the fields:

PRI column-number           sequential integers from 1
    primary-key?            boolean TRUE for primary key components
    column-name
    column-integrity-rule
    domain-name

A primary key is any column marked as primary-key? in the corresponding descriptor table. All the primary-key? columns must have lower column numbers than any non-primary-key? columns. Every table must have at least one primary key. Primary keys must be sufficient to distinguish all rows from each other in the table. All of the system defined tables have a single primary key.
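
Two of the constraints above (at least one primary key, and all primary-key? columns numbered before any non-primary column) can be checked mechanically. The following sketch assumes descriptor rows are given as lists in the field order shown above and sorted by column number; the procedure names are hypothetical.

```scheme
;; Each descriptor row has the form
;; (column-number primary-key? column-name
;;  column-integrity-rule domain-name).
(define (descriptor-primary-key? row) (cadr row))

;; Returns #t when ROWS has at least one primary key column and no
;; primary key column follows a non-primary one.
(define (valid-key-layout? rows)
  (let loop ((rows rows) (seen-primary? #f) (seen-non-primary? #f))
    (cond ((null? rows) seen-primary?)
          ((descriptor-primary-key? (car rows))
           (if seen-non-primary?
               #f
               (loop (cdr rows) #t seen-non-primary?)))
          (else (loop (cdr rows) seen-primary? #t)))))
```

The check does not cover the requirement that primary keys distinguish all rows, which depends on the table's contents rather than its descriptor.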

This package currently supports tables having from 1 to 4 primary keys if there are non-primary columns, and any (natural) number if all columns are primary keys. If you need more than 4 primary keys, I would like to hear what you are doing!

A domain is a category describing the allowable values to occur in a column. It is described by a (base) table with the fields:

PRI domain-name
    foreign-table
    domain-integrity-rule
    type-id
    type-param

The type-id field value is a symbol. This symbol may be used by the underlying base table implementation in storing that field.

If the foreign-table field is non-#f then that field names a table from the catalog. The values for that domain must match a primary key of the table referenced by the type-param (or #f, if allowed). This package currently does not support composite foreign-keys.

The types for which support is planned are:

    atom
    symbol
    string                  [<length>]
    number                  [<base>]
    money                   <currency>
    date-time
    boolean

    foreign-key             <table-name>
    expression
    virtual                 <expression>

Unresolved Issues

Although `rdms.scm' is not large, I found it very difficult to write (six rewrites). I am not aware of any other examples of a generalized relational system (although there is little new in CS). I left out several aspects of the Relational model in order to simplify the job. The major features lacking (which might be addressed portably) are views, transaction boundaries, and protection.

Protection needs a model for specifying privileges. Given how operations are accessed from handles, it should not be difficult to restrict table accesses to those allowed for that user.

The system catalog has a field called view-procedure. This should allow a purely functional implementation of views. This will work but is unsatisfying for views resulting from a selection (subset of rows); for whole table operations it will not be possible to reduce the number of keys scanned over when the selection is specified only by an opaque procedure.

Transaction boundaries present the most intriguing area. Transaction boundaries are actually a feature of the "Comprehensive Language" of the Relational database and not of the database. Scheme would seem to provide the opportunity for an extremely clean semantics for transaction boundaries since the builtin procedures with side effects are small in number and easily identified.

These side-effect builtin procedures might all be portably redefined to versions which properly handled transactions. Compiled library routines would need to be recompiled as well. Many system extensions (delete-file, system, etc.) would also need to be redefined.

There are 2 scope issues that must be resolved for multiprocess transaction boundaries:

Process scope
The actions captured by a transaction should be only for the process which invoked the start of transaction. Although standard Scheme does not provide process primitives as such, dynamic-wind would provide a workable hook into process switching for many implementations.
Shared utilities with state
Some shared utilities have state which should not be part of a transaction. An example would be calling a pseudo-random number generator. If the success of a transaction depended on the pseudo-random number and failed, the state of the generator would be set back. Subsequent calls would keep returning the same number and keep failing. Pseudo-random number generators are not reentrant; thus they would require locks in order to operate properly in a multiprocess environment. Are all examples of utilities whose state should not be part of transactions also non-reentrant? If so, perhaps suspending transaction capture for the duration of locks would solve this problem.

Database Utilities

(require 'database-utilities)

This enhancement wraps a utility layer on relational-database which provides:

Also included are utilities which provide:

for any SLIB relational database.

Function: create-database filename base-table-type
Returns an open, nearly empty enhanced (with *commands* table) relational database (with base-table type base-table-type) associated with filename.

Function: open-database filename
Function: open-database filename base-table-type
Returns an open enhanced relational database associated with filename. The database will be opened with base-table type base-table-type if supplied. If base-table-type is not supplied, open-database will attempt to deduce the correct base-table-type. If the database cannot be opened or if it lacks the *commands* table, #f is returned.

Function: open-database! filename
Function: open-database! filename base-table-type
Returns a mutable open enhanced relational database ...

The table *commands* in an enhanced relational-database has the fields (with domains):

PRI name        symbol
    parameters  parameter-list
    procedure   expression
    documentation string

The parameters field is a foreign key (domain parameter-list) of the *catalog-data* table and should have the value of a table described by *parameter-columns*. This parameter-list table describes the arguments suitable for passing to the associated command. The intent of this table is to be of a form such that different user-interfaces (for instance, pull-down menus or plain-text queries) can operate from the same table. A parameter-list table has the following fields:

PRI index       uint
    name        symbol
    arity       parameter-arity
    domain      domain
    defaulter   expression
    expander    expression
    documentation string

The arity field can take the values:

single
Requires a single parameter of the specified domain.
optional
A single parameter of the specified domain or zero parameters is acceptable.
boolean
A single boolean parameter or zero parameters (in which case #f is substituted) is acceptable.
nary
Any number of parameters of the specified domain are acceptable. The argument passed to the command function is always a list of the parameters.
nary1
One or more parameters of the specified domain are acceptable. The argument passed to the command function is always a list of the parameters.
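
The arity values above amount to constraints on how many parameters may be supplied. A minimal sketch of that check follows; arity-accepts? is a hypothetical name, and count is assumed to be a non-negative integer giving the number of parameters supplied.

```scheme
;; Returns #t when COUNT supplied parameters are acceptable for ARITY.
(define (arity-accepts? arity count)
  (case arity
    ((single) (= count 1))             ; exactly one
    ((optional boolean) (<= count 1))  ; zero or one
    ((nary) #t)                        ; any number
    ((nary1) (>= count 1))             ; one or more
    (else (error "unknown arity:" arity))))
```

The sketch covers only the count; checking each parameter against its domain, and substituting #f for an omitted boolean, would be separate steps.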

The domain field specifies the domain which a parameter or parameters in the indexth field must satisfy.

The defaulter field is an expression whose value is either #f or a procedure of one argument (the parameter-list) which returns a list of the default value or values as appropriate. Note that since the defaulter procedure is called every time a default parameter is needed for this column, sticky defaults can be implemented using shared state with the domain-integrity-rule.
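
The sticky-default idea can be sketched in plain Scheme as follows. The names last-accepted, sticky-defaulter and sticky-rule are invented for this example, and how the shared variable is made visible to the expressions stored in a real table is left open here; the sketch only shows the two procedures cooperating through shared state.

```scheme
;; Shared state: the most recently accepted value (initially "str").
(define last-accepted "str")

;; Defaulter: a procedure of one argument (the parameter-list) that
;; returns a list of default values.
(define (sticky-defaulter parameter-list)
  (list last-accepted))

;; Integrity rule: accepts any string and remembers it, so that the
;; next default is the last value the user supplied.
(define (sticky-rule value)
  (and (string? value)
       (begin (set! last-accepted value) #t)))

(sticky-defaulter '())   ; => ("str")
(sticky-rule "again")    ; => #t
(sticky-defaulter '())   ; => ("again")
```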

Invoking Commands

When an enhanced relational-database is called with a symbol which matches a name in the *commands* table, the associated procedure expression is evaluated and applied to the enhanced relational-database. A procedure should then be returned which the user can invoke on (optional) arguments.

The command *initialize* is special. If present in the *commands* table, open-database or open-database! will return the value of the *initialize* command. Notice that arbitrary code can be run when the *initialize* procedure is automatically applied to the enhanced relational-database.

Note also that to shadow or hide from the user the relational-database methods described in section Relational Database Operations, you can dispatch in the closure returned by the *initialize* expression rather than adding entries to the *commands* table. Done this way, the underlying methods remain accessible to code in the *commands* table.

Function: make-command-server rdb table-name
Returns a procedure of 2 arguments, a (symbol) command and a call-back procedure. When this returned procedure is called, it looks up command in table table-name and calls the call-back procedure with arguments:
command
The command
command-value
The result of evaluating the expression in the procedure field of table-name and calling it with rdb.
parameter-name
A list of the official name of each parameter. Corresponds to the name field of the command's parameter-table.
positions
A list of the positive integer index of each parameter. Corresponds to the index field of the command's parameter-table.
arities
A list of the arities of each parameter. Corresponds to the arity field of the command's parameter-table. For a description of arity see table above.
types
A list of the type name of each parameter. Corresponds to the type-id field of the contents of the domain of the command's parameter-table.
defaulters
A list of the defaulters for each parameter. Corresponds to the defaulter field of the command's parameter-table.
domain-integrity-rules
A list of procedures (one for each parameter) which test whether a value for a parameter is acceptable for that parameter. The procedure should be called with each datum in the list for nary arity parameters.
aliases
A list of lists of (alias parameter-name). There can be more than one alias per parameter-name.

For information about parameters, see section Parameter lists. Here is an example of setting up a command with arguments and parsing those arguments from a getopt style argument list (see section Getopt).

(require 'database-utilities)
(require 'fluid-let)
(require 'parameters)
(require 'getopt)

(define my-rdb (create-database #f 'alist-table))

(define-tables my-rdb
  '(foo-params
    *parameter-columns*
    *parameter-columns*
    ((1 single-string single string
        (lambda (pl) '("str")) #f "single string")
     (2 nary-symbols nary symbol
        (lambda (pl) '()) #f "zero or more symbols")
     (3 nary1-symbols nary1 symbol
        (lambda (pl) '(symb)) #f "one or more symbols")
     (4 optional-number optional uint
        (lambda (pl) '()) #f "zero or one number")
     (5 flag boolean boolean
        (lambda (pl) '(#f)) #f "a boolean flag")))
  '(foo-pnames
    ((name string))
    ((parameter-index uint))
    (("s" 1)
     ("single-string" 1)
     ("n" 2)
     ("nary-symbols" 2)
     ("N" 3)
     ("nary1-symbols" 3)
     ("o" 4)
     ("optional-number" 4)
     ("f" 5)
     ("flag" 5)))
  '(my-commands
    ((name symbol))
    ((parameters parameter-list)
     (parameter-names parameter-name-translation)
     (procedure expression)
     (documentation string))
    ((foo
      foo-params
      foo-pnames
      (lambda (rdb) (lambda args (print args)))
      "test command arguments"))))

(define (dbutil:serve-command-line rdb command-table
                                   command argc argv)
  (set! argv (if (vector? argv) (vector->list argv) argv))
  ((make-command-server rdb command-table)
   command
   (lambda (comname comval options positions
                    arities types defaulters dirs aliases) 
     (apply comval (getopt->arglist
                    argc argv options positions
                    arities types defaulters dirs aliases)))))

(define (cmd . opts)
  (fluid-let ((*optind* 1))
    (printf "%-34s => "
            (call-with-output-string
             (lambda (pt) (write (cons 'cmd opts) pt))))
    (set! opts (cons "cmd" opts))
    (force-output)
    (dbutil:serve-command-line
     my-rdb 'my-commands 'foo (length opts) opts)))

(cmd)                              => ("str" () (symb) () #f) 
(cmd "-f")                         => ("str" () (symb) () #t) 
(cmd "--flag")                     => ("str" () (symb) () #t) 
(cmd "-o177")                      => ("str" () (symb) (177) #f) 
(cmd "-o" "177")                   => ("str" () (symb) (177) #f) 
(cmd "--optional" "621")           => ("str" () (symb) (621) #f) 
(cmd "--optional=621")             => ("str" () (symb) (621) #f) 
(cmd "-s" "speciality")            => ("speciality" () (symb) () #f) 
(cmd "-sspeciality")               => ("speciality" () (symb) () #f) 
(cmd "--single" "serendipity")     => ("serendipity" () (symb) () #f) 
(cmd "--single=serendipity")       => ("serendipity" () (symb) () #f) 
(cmd "-n" "gravity" "piety")       => ("str" () (piety gravity) () #f) 
(cmd "-ngravity" "piety")          => ("str" () (piety gravity) () #f) 
(cmd "--nary" "chastity")          => ("str" () (chastity) () #f) 
(cmd "--nary=chastity" "")         => ("str" () ( chastity) () #f) 
(cmd "-N" "calamity")              => ("str" () (calamity) () #f) 
(cmd "-Ncalamity")                 => ("str" () (calamity) () #f) 
(cmd "--nary1" "surety")           => ("str" () (surety) () #f) 
(cmd "--nary1=surety")             => ("str" () (surety) () #f) 
(cmd "-N" "levity" "fealty")       => ("str" () (fealty levity) () #f) 
(cmd "-Nlevity" "fealty")          => ("str" () (fealty levity) () #f) 
(cmd "--nary1" "surety" "brevity") => ("str" () (brevity surety) () #f) 
(cmd "--nary1=surety" "brevity")   => ("str" () (brevity surety) () #f) 
(cmd "-?")
-| 
Usage: cmd [OPTION ARGUMENT ...] ...

  -f, --flag 
  -o, --optional[=]<number> 
  -n, --nary[=]<symbols> ...
  -N, --nary1[=]<symbols> ...
  -s, --single[=]<string> 

ERROR: getopt->parameter-list "unrecognized option" "-?"

Some commands are defined in all extended relational-databases. They are called just like the methods described in section Relational Database Operations.

Function: add-domain domain-row
Adds domain-row to the domains table if there is no row in the domains table associated with key (car domain-row) and returns #t. Otherwise returns #f.

For the fields and layout of the domain table, see section Catalog Representation. Currently, these fields are domain-name, foreign-table, domain-integrity-rule, type-id, and type-param.

The following example adds 3 domains to the `build' database. `Optstring' is either a string or #f. filename is a string and build-whats is a symbol.

(for-each (build 'add-domain)
          '((optstring #f
                       (lambda (x) (or (not x) (string? x)))
                       string
                       #f)
            (filename #f #f string #f)
            (build-whats #f #f symbol #f)))

Function: delete-domain domain-name
Removes and returns the domain-name row from the domains table.

Function: domain-checker domain
Returns a procedure to check an argument for conformance to domain domain.

Defining Tables

Procedure: define-tables rdb spec-0 ...
Adds tables as specified in spec-0 ... to the open relational-database rdb. Each spec has the form:

(<name> <descriptor-name> <descriptor-name> <rows>)

or

(<name> <primary-key-fields> <other-fields> <rows>)

where <name> is the table name, <descriptor-name> is the symbol name of a descriptor table, <primary-key-fields> and <other-fields> describe the primary keys and other fields respectively, and <rows> is a list of data rows to be added to the table.

<primary-key-fields> and <other-fields> are lists of field descriptors of the form:

(<column-name> <domain>)

or

(<column-name> <domain> <column-integrity-rule>)

where <column-name> is the column name, <domain> is the domain of the column, and <column-integrity-rule> is an expression whose value is a procedure of one argument (which returns #f to signal an error).

If <domain> is not a defined domain name and it matches the name of this table or an already defined (in one of spec-0 ...) single key field table, a foreign-key domain will be created for it.

The following example shows a new database with the name of `foo.db' being created with tables describing processor families and processor/os/compiler combinations.

The database command define-tables is defined to call define-tables with its arguments. The database is also configured to print `Welcome' when the database is opened. The database is then closed and reopened.

(require 'database-utilities)
(define my-rdb (create-database "foo.db" 'alist-table))

(define-tables my-rdb
  '(*commands*
    ((name symbol))
    ((parameters parameter-list)
     (parameter-names parameter-name-translation)
     (procedure expression)
     (documentation string))
    ((define-tables
      no-parameters
      no-parameter-names
      (lambda (rdb) (lambda specs (apply define-tables rdb specs)))
      "Create or Augment tables from list of specs")
     (*initialize*
      no-parameters
      no-parameter-names
      (lambda (rdb) (display "Welcome") (newline) rdb)
      "Print Welcome"))))

((my-rdb 'define-tables)
 '(processor-family
   ((family    atom))
   ((also-ran  processor-family))
   ((m68000           #f)
    (m68030           m68000)
    (i386             8086)
    (8086             #f)
    (powerpc          #f)))

 '(platform
   ((name      symbol))
   ((processor processor-family)
    (os        symbol)
    (compiler  symbol))
   ((aix              powerpc aix     -)
    (amiga-dice-c     m68000  amiga   dice-c)
    (amiga-aztec      m68000  amiga   aztec)
    (amiga-sas/c-5.10 m68000  amiga   sas/c)
    (atari-st-gcc     m68000  atari   gcc)
    (atari-st-turbo-c m68000  atari   turbo-c)
    (borland-c-3.1    8086    ms-dos  borland-c)
    (djgpp            i386    ms-dos  gcc)
    (linux            i386    linux   gcc)
    (microsoft-c      8086    ms-dos  microsoft-c)
    (os/2-emx         i386    os/2    gcc)
    (turbo-c-2        8086    ms-dos  turbo-c)
    (watcom-9.0       i386    ms-dos  watcom))))

((my-rdb 'close-database))

(set! my-rdb (open-database "foo.db" 'alist-table))
-|
Welcome

Database Reports

Code for generating database reports is in `report.scm'. After writing it using format, I discovered that Common-Lisp format is not usable for this application because there is no mechanism for truncating fields. `report.scm' needs to be rewritten using printf.

Procedure: create-report rdb destination report-name table
Procedure: create-report rdb destination report-name
The symbol report-name must be a primary key in the table named *reports* in the relational database rdb. destination is a port, string, or symbol. If destination is a:

port
The table is created as ascii text and written to that port.
string
The table is created as ascii text and written to the file named by destination.
symbol
destination is the primary key for a row in the table named *printers*.

Each row in the table *reports* has the fields:

name
The report name.
default-table
The table to report on if none is specified.
header, footer
A format string. At the beginning and end of each page respectively, format is called with this string and the (list of) column-names of this table.
reporter
A format string. For each row in the table, format is called with this string and the row.
minimum-break
The minimum number of lines into which the report lines for a row can be broken. Use 0 if a row's lines should not be broken over page boundaries.

Each row in the table *printers* has the fields:

name
The printer name.
print-procedure
The procedure to call to actually print.

The report is prepared as follows:

Database Browser

(require 'database-browse)

Procedure: browse database

Prints the names of all the tables in database and sets browse's default to database.

Procedure: browse

Prints the names of all the tables in the default database.

Procedure: browse table-name

For each record of the table named by the symbol table-name, prints a line composed of all the field values.

Procedure: browse pathname

Opens the database named by the string pathname, prints the names of all its tables, and sets browse's default to the database.

Procedure: browse database table-name

Sets browse's default to database and prints the records of the table named by the symbol table-name.

Procedure: browse pathname table-name

Opens the database named by the string pathname and sets browse's default to it; browse prints the records of the table named by the symbol table-name.

Weight-Balanced Trees

(require 'wt-tree)

Balanced binary trees are a useful data structure for maintaining large sets of ordered objects or sets of associations whose keys are ordered. MIT Scheme has a comprehensive implementation of weight-balanced binary trees which has several advantages over the other data structures for large aggregates:

These features make weight-balanced trees suitable for a wide range of applications, especially those that require large numbers of sets or discrete maps. Applications that have a few global databases and/or concentrate on element-level operations like insertion and lookup are probably better off using hash-tables or red-black trees.

The size of a tree is the number of associations that it contains. Weight balanced binary trees are balanced to keep the sizes of the subtrees of each node within a constant factor of each other. This ensures logarithmic times for single-path operations (like lookup and insertion). A weight balanced tree takes space that is proportional to the number of associations in the tree. For the current implementation, the constant of proportionality is six words per association.

Weight balanced trees can be used as an implementation for either discrete sets or discrete maps (associations). Sets are implemented by ignoring the datum that is associated with the key. Under this scheme, if an association exists in the tree, this indicates that the key of the association is a member of the set. Typically a value such as (), #t or #f is associated with the key.

Many operations can be viewed as computing a result that, depending on whether the tree arguments are thought of as sets or maps, is known by two different names. An example is wt-tree/member?, which, when regarding the tree argument as a set, computes the set membership operation, but, when regarding the tree as a discrete map, wt-tree/member? is the predicate testing if the map is defined at an element in its domain. Most names in this package have been chosen based on interpreting the trees as sets, hence the name wt-tree/member? rather than wt-tree/defined-at?.

The weight balanced tree implementation is a run-time-loadable option. To use weight balanced trees, execute

(load-option 'wt-tree)

once before calling any of the procedures defined here.

Construction of Weight-Balanced Trees

Binary trees require there to be a total order on the keys used to arrange the elements in the tree. Weight balanced trees are organized by types, where the type is an object encapsulating the ordering relation. Creating a tree is a two-stage process. First a tree type must be created from the predicate which gives the ordering. The tree type is then used for making trees, either empty or singleton trees or trees from other aggregate structures like association lists. Once created, a tree `knows' its type and the type is used to test compatibility between trees in operations taking two trees. Usually a small number of tree types are created at the beginning of a program and used many times throughout the program's execution.

procedure+: make-wt-tree-type key<?
This procedure creates and returns a new tree type based on the ordering predicate key<?. Key<? must be a total ordering, having the property that for all key values a, b and c:

(key<? a a)                         => #f
(and (key<? a b) (key<? b a))       => #f
(if (and (key<? a b) (key<? b c))
    (key<? a c)
    #t)                             => #t

Two key values are assumed to be equal if neither is less than the other by key<?.
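
The three properties above can be spot-checked on a sample of keys before a predicate is handed to make-wt-tree-type. The procedure ordering-ok? below is a hypothetical helper written for this illustration, not part of the wt-tree package; passing the check on a sample does not prove the predicate is a valid ordering for all keys.

```scheme
;; Hypothetical helper (not part of the wt-tree package): checks the
;; three ordering laws on every combination of the sample keys.
(define (ordering-ok? key<? samples)
  (define (every? pred lst)
    (or (null? lst)
        (and (pred (car lst)) (every? pred (cdr lst)))))
  (every? (lambda (a)
            (and (not (key<? a a))                        ; irreflexive
                 (every? (lambda (b)
                           (and (not (and (key<? a b)     ; asymmetric
                                          (key<? b a)))
                                (every? (lambda (c)       ; transitive
                                          (if (and (key<? a b) (key<? b c))
                                              (key<? a c)
                                              #t))
                                        samples)))
                         samples)))
          samples))

(ordering-ok? < '(1 2 3))    ; => #t
(ordering-ok? <= '(1 2 3))   ; => #f, since (<= 1 1) is true
```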

Each call to make-wt-tree-type returns a distinct value, and trees are only compatible if their tree types are eq?. A consequence is that trees that are intended to be used in binary tree operations must all be created with a tree type originating from the same call to make-wt-tree-type.

variable+: number-wt-type
A standard tree type for trees with numeric keys. Number-wt-type could have been defined by

(define number-wt-type (make-wt-tree-type  <))

variable+: string-wt-type
A standard tree type for trees with string keys. String-wt-type could have been defined by

(define string-wt-type (make-wt-tree-type  string<?))

procedure+: make-wt-tree wt-tree-type
This procedure creates and returns a newly allocated weight balanced tree. The tree is empty, i.e. it contains no associations. Wt-tree-type is a weight balanced tree type obtained by calling make-wt-tree-type; the returned tree has this type.

procedure+: singleton-wt-tree wt-tree-type key datum
This procedure creates and returns a newly allocated weight balanced tree. The tree contains a single association, that of datum with key. Wt-tree-type is a weight balanced tree type obtained by calling make-wt-tree-type; the returned tree has this type.

procedure+: alist->wt-tree tree-type alist
Returns a newly allocated weight-balanced tree that contains the same associations as alist. This procedure is equivalent to:

(lambda (type alist)
  (let ((tree (make-wt-tree type)))
    (for-each (lambda (association)
                (wt-tree/add! tree
                              (car association)
                              (cdr association)))
              alist)
    tree))

Basic Operations on Weight-Balanced Trees

This section describes the basic tree operations on weight balanced trees. These operations are the usual tree operations for insertion, deletion and lookup, some predicates and a procedure for determining the number of associations in a tree.

procedure+: wt-tree? object
Returns #t if object is a weight-balanced tree, otherwise returns #f.

procedure+: wt-tree/empty? wt-tree
Returns #t if wt-tree contains no associations, otherwise returns #f.

procedure+: wt-tree/size wt-tree
Returns the number of associations in wt-tree, an exact non-negative integer. This operation takes constant time.

procedure+: wt-tree/add wt-tree key datum
Returns a new tree containing all the associations in wt-tree and the association of datum with key. If wt-tree already had an association for key, the new association overrides the old. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.

procedure+: wt-tree/add! wt-tree key datum
Associates datum with key in wt-tree and returns an unspecified value. If wt-tree already has an association for key, that association is replaced. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.

procedure+: wt-tree/member? key wt-tree
Returns #t if wt-tree contains an association for key, otherwise returns #f. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.

procedure+: wt-tree/lookup wt-tree key default
Returns the datum associated with key in wt-tree. If wt-tree doesn't contain an association for key, default is returned. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.

procedure+: wt-tree/delete wt-tree key
Returns a new tree containing all the associations in wt-tree, except that if wt-tree contains an association for key, it is removed from the result. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.

procedure+: wt-tree/delete! wt-tree key
If wt-tree contains an association for key the association is removed. Returns an unspecified value. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in wt-tree.
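
A short session, using only the procedures documented above, may help contrast the functional operations (wt-tree/add, wt-tree/delete) with their side-effecting counterparts. The variable names tree and bigger are arbitrary, and the example assumes SLIB is loaded so that (require 'wt-tree) is available.

```scheme
(require 'wt-tree)

(define tree
  (alist->wt-tree number-wt-type '((3 . c) (1 . a) (2 . b))))

(wt-tree/size tree)              ; => 3
(wt-tree/lookup tree 2 'none)    ; => b
(wt-tree/lookup tree 9 'none)    ; => none

;; wt-tree/add returns a new tree; the original is unchanged.
(define bigger (wt-tree/add tree 4 'd))
(wt-tree/size bigger)            ; => 4
(wt-tree/size tree)              ; => 3

;; wt-tree/delete! modifies its argument in place.
(wt-tree/delete! tree 1)
(wt-tree/member? 1 tree)         ; => #f
(wt-tree/size tree)              ; => 2
```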

Advanced Operations on Weight-Balanced Trees

In the following the size of a tree is the number of associations that the tree contains, and a smaller tree contains fewer associations.

procedure+: wt-tree/split< wt-tree bound
Returns a new tree containing all and only the associations in wt-tree which have a key that is less than bound in the ordering relation of the tree type of wt-tree. The average and worst-case times required by this operation are proportional to the logarithm of the size of wt-tree.

procedure+: wt-tree/split> wt-tree bound
Returns a new tree containing all and only the associations in wt-tree which have a key that is greater than bound in the ordering relation of the tree type of wt-tree. The average and worst-case times required by this operation are proportional to the logarithm of the size of wt-tree.

procedure+: wt-tree/union wt-tree-1 wt-tree-2
Returns a new tree containing all the associations from both trees. This operation is asymmetric: when both trees have an association for the same key, the returned tree associates the datum from wt-tree-2 with the key. Thus if the trees are viewed as discrete maps then wt-tree/union computes the map override of wt-tree-1 by wt-tree-2. If the trees are viewed as sets the result is the set union of the arguments. The worst-case time required by this operation is proportional to the sum of the sizes of both trees. If the minimum key of one tree is greater than the maximum key of the other tree then the time required is at worst proportional to the logarithm of the size of the larger tree.

procedure+: wt-tree/intersection wt-tree-1 wt-tree-2
Returns a new tree containing all and only those associations from wt-tree-1 which have keys appearing as the key of an association in wt-tree-2. Thus the associated data in the result are those from wt-tree-1. If the trees are being used as sets the result is the set intersection of the arguments. As a discrete map operation, wt-tree/intersection computes the domain restriction of wt-tree-1 to (the domain of) wt-tree-2. The time required by this operation is never worse than proportional to the sum of the sizes of the trees.

procedure+: wt-tree/difference wt-tree-1 wt-tree-2
Returns a new tree containing all and only those associations from wt-tree-1 which have keys that do not appear as the key of an association in wt-tree-2. If the trees are viewed as sets the result is the asymmetric set difference of the arguments. As a discrete map operation, it computes the domain restriction of wt-tree-1 to the complement of (the domain of) wt-tree-2. The time required by this operation is never worse than proportional to the sum of the sizes of the trees.
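
The following sketch shows which datum survives in each of wt-tree/union, wt-tree/intersection and wt-tree/difference when both trees have an association for the same key. The trees a and b and their contents are made up for the example; SLIB is assumed to be loaded.

```scheme
(require 'wt-tree)

(define a (alist->wt-tree number-wt-type '((1 . a1) (2 . a2) (3 . a3))))
(define b (alist->wt-tree number-wt-type '((3 . b3) (4 . b4))))

;; Union: key 3 is in both trees; the datum from the second wins.
(define u (wt-tree/union a b))
(wt-tree/size u)               ; => 4
(wt-tree/lookup u 3 #f)        ; => b3

;; Intersection: only key 3 is shared; the datum comes from the first.
(define i (wt-tree/intersection a b))
(wt-tree/size i)               ; => 1
(wt-tree/lookup i 3 #f)        ; => a3

;; Difference: keys of a that do not occur in b.
(define d (wt-tree/difference a b))
(wt-tree/size d)               ; => 2
(wt-tree/member? 3 d)          ; => #f
```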

procedure+: wt-tree/subset? wt-tree-1 wt-tree-2
Returns #t iff the key of each association in wt-tree-1 is the key of some association in wt-tree-2, otherwise returns #f. Viewed as a set operation, wt-tree/subset? is the improper subset predicate. A proper subset predicate can be constructed:

(define (proper-subset? s1 s2)
  (and (wt-tree/subset? s1 s2)
       (< (wt-tree/size s1) (wt-tree/size s2))))

As a discrete map operation, wt-tree/subset? is the subset test on the domain(s) of the map(s). In the worst case, the time required by this operation is proportional to the size of wt-tree-1.

procedure+: wt-tree/set-equal? wt-tree-1 wt-tree-2
Returns #t iff for every association in wt-tree-1 there is an association in wt-tree-2 that has the same key, and vice versa.

Viewing the arguments as sets wt-tree/set-equal? is the set equality predicate. As a map operation it determines if two maps are defined on the same domain.

This procedure is equivalent to

(lambda (wt-tree-1 wt-tree-2)
  (and (wt-tree/subset? wt-tree-1 wt-tree-2)
       (wt-tree/subset? wt-tree-2 wt-tree-1)))

In the worst case, the time required by this operation is proportional to the size of the smaller tree.

procedure+: wt-tree/fold combiner initial wt-tree
This procedure reduces wt-tree by combining all the associations, using a reverse in-order traversal, so the associations are visited in reverse order. Combiner is a procedure of three arguments: a key, a datum and the accumulated result so far. Provided combiner takes time bounded by a constant, wt-tree/fold takes time proportional to the size of wt-tree.

A sorted association list can be derived simply:

(wt-tree/fold  (lambda (key datum list)
                 (cons (cons key datum) list))
               '()
               wt-tree)

The data in the associations can be summed like this:

(wt-tree/fold  (lambda (key datum sum) (+ sum datum))
               0
               wt-tree)

procedure+: wt-tree/for-each action wt-tree
This procedure traverses the tree in-order, applying action to each association. The associations are processed in increasing order of their keys. Action is a procedure of two arguments which takes the key and datum respectively of the association. Provided action takes time bounded by a constant, wt-tree/for-each takes time proportional to the size of wt-tree. The example prints the tree:

(wt-tree/for-each (lambda (key value)
                    (display (list key value)))
                  wt-tree)

Indexing Operations on Weight-Balanced Trees

Weight balanced trees support operations that view the tree as a sorted sequence of associations. Elements of the sequence can be accessed by position, and the position of an element in the sequence can be determined, both in logarithmic time.

procedure+: wt-tree/index wt-tree index
procedure+: wt-tree/index-datum wt-tree index
procedure+: wt-tree/index-pair wt-tree index
Returns the 0-based indexth association of wt-tree in the sorted sequence under the tree's ordering relation on the keys. wt-tree/index returns the indexth key, wt-tree/index-datum returns the datum associated with the indexth key and wt-tree/index-pair returns a new pair (key . datum) which is the cons of the indexth key and its datum. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in the tree.

These operations signal an error if the tree is empty, if index<0, or if index is greater than or equal to the number of associations in the tree.

Indexing can be used to find the median and maximum keys in the tree as follows:

median:   (wt-tree/index wt-tree
                         (quotient (wt-tree/size wt-tree) 2))

maximum:  (wt-tree/index wt-tree
                         (-1+ (wt-tree/size wt-tree)))

procedure+: wt-tree/rank wt-tree key
Determines the 0-based position of key in the sorted sequence of the keys under the tree's ordering relation, or #f if the tree has no association for key. This procedure returns either an exact non-negative integer or #f. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in the tree.
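
As a small illustration of the indexing operations and wt-tree/rank, the following uses a tree keyed by strings. The tree name fruit and its contents are invented for the example; SLIB is assumed to be loaded.

```scheme
(require 'wt-tree)

(define fruit
  (alist->wt-tree string-wt-type
                  '(("pear" . 3) ("apple" . 1) ("fig" . 2))))

;; Under string<? the keys are ordered "apple" < "fig" < "pear".
(wt-tree/index fruit 0)          ; => "apple"
(wt-tree/index-datum fruit 1)    ; => 2
(wt-tree/index-pair fruit 2)     ; => ("pear" . 3)

;; wt-tree/rank is the inverse mapping, from key to position.
(wt-tree/rank fruit "fig")       ; => 1
(wt-tree/rank fruit "kiwi")      ; => #f
```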

procedure+: wt-tree/min wt-tree
procedure+: wt-tree/min-datum wt-tree
procedure+: wt-tree/min-pair wt-tree
Returns the association of wt-tree that has the least key under the tree's ordering relation. wt-tree/min returns the least key, wt-tree/min-datum returns the datum associated with the least key and wt-tree/min-pair returns a new pair (key . datum) which is the cons of the minimum key and its datum. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in the tree.

These operations signal an error if the tree is empty. They could be written

(define (wt-tree/min tree)        (wt-tree/index tree 0))
(define (wt-tree/min-datum tree)  (wt-tree/index-datum tree 0))
(define (wt-tree/min-pair tree)   (wt-tree/index-pair tree 0))

procedure+: wt-tree/delete-min wt-tree
Returns a new tree containing all of the associations in wt-tree except the association with the least key under the wt-tree's ordering relation. An error is signalled if the tree is empty. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in the tree. This operation is equivalent to

(wt-tree/delete wt-tree (wt-tree/min wt-tree))

procedure+: wt-tree/delete-min! wt-tree
Removes the association with the least key under the wt-tree's ordering relation. An error is signalled if the tree is empty. The average and worst-case times required by this operation are proportional to the logarithm of the number of associations in the tree. This operation is equivalent to

(wt-tree/delete! wt-tree (wt-tree/min wt-tree))

