Virtual Tables, External Data, and Aggregates
Page 5 of 6

Aggregate Data

One of the classic reasons for using a DBMS to store and retrieve data is that it makes it very easy to work with aggregates. Aggregates, sometimes also described as set functions or column functions, provide a means for evaluating an expression over a set of rows. Aggregates can calculate the minimum, maximum, sum, count, and average of an expression over a set of values as well as count rows. Aggregates are what help you analyze and see trends in raw data.

Cloudscape also allows you to create your own aggregates if you want to do something more sophisticated or unusual with a set of data than what the built-in aggregates allow you to do.

The tasks in this section will be more fruitful if you have a bit more data in some of the tables. To this end, run the BuildATour application a few more times now so that you have more tour-related customer data to analyze.

Run JBMSTours.BuildATour a Few More Times

  1. Open a command window and change directories to your_tutorial_home.
  2. Run your setclasspath script (see Running the Script).
  3. Run BuildATour:

    java -Dcloudscape.system.home= your_tutorial_system JBMSTours.BuildATour

  4. Run it three or four times.

Start ij and Work with Aggregates

  1. Start ij and connect to toursDB, following the instructions in Start ij and Connect to toursDB.

    The COUNT aggregate function counts the number of rows for which a particular column is not NULL.

  2. Count the number of maps in the Maps table:

    SELECT COUNT(map) FROM Maps;

  3. Count the number of cities in Europe:

    SELECT COUNT(city) FROM Cities, Countries
    WHERE Cities.country_ISO_code = Countries.country_ISO_code
    AND Region = 'Europe';

    To count the number of rows in a table, use the COUNT(*) expression.

  4. Count the number of rows in the HotelAvailability table:

    SELECT COUNT(*) FROM HotelAvailability;

  5. MAX and MIN return the maximum and minimum value, respectively, of a set of rows. Find the length of the longest flight in the Flights table:

    SELECT MAX(miles) FROM Flights;

  6. Find the length of the shortest one:

    SELECT MIN(miles) FROM Flights;

    The aggregate functions work on Java methods, as long as those methods return built-in data types.

  7. Find the age of the oldest person in the People table:

    SELECT MAX(person.getAge()) FROM People;

  8. Find out the largest running total (amount spent so far) in the Groups table:

    SELECT MAX(running_total) FROM Groups;

  9. Find the average cost of a tour.

    AVG calculates the average value of a set of rows. The keyword DISTINCT eliminates duplicate values before calculating the average.

    SELECT AVG(customized_tour.getTotalCost()) FROM CustomizedTours;

    SELECT AVG(DISTINCT customized_tour.getTotalCost()) FROM CustomizedTours;

  10. Find the average total mileage of a tour:

    SELECT AVG(customized_tour.getTotalMilesTraveled()) FROM CustomizedTours;

  11. Find the shortest flight originating from San Francisco:

    SELECT MIN(miles) FROM Flights
    WHERE orig_airport = 'SFO'

Work with the User-Defined Aggregate MAXBUTONE

toursDB has one user-defined aggregate called MAXBUTONE. It is similar to MAX, except it returns the second highest value in a column instead of the highest value.

User-defined aggregates are backed up by user-supplied Java classes and in that sense are similar to aliases. MAXBUTONE is backed up by the Java class JBMSTours.aggregates.MaxButOneDef. Examine the source for JBMSTours.aggregates.MaxButOneDef.java to get a glimpse of how this works.

MAXBUTONE works only on columns of type Integer and type Double Precision.

  1. Try it on the miles column (type Integer) in the Flights table:

    SELECT MAXBUTONE(distinct miles) FROM Flights;

    Specifying distinct means that Cloudscape eliminates duplicate values. Otherwise, if there were two values tied for first place, one of those would be returned as MAXBUTONE.

  2. Compare that to MAX:

    SELECT MAX(miles) FROM Flights;

  3. Try it on the flying_time column (type double):

    SELECT MAXBUTONE(flying_time) FROM Flights;