Internal Language Transformations
Page 7 of 8

Sort Avoidance

Sorting is an expensive process. Cloudscape tries to eliminate unnecessary sorting steps where possible.

DISTINCT Elimination Based on a Uniqueness Condition

A DISTINCT (and the corresponding sort) can be eliminated from a query if a uniqueness condition exists that ensures that no duplicate values will be returned. If no duplicate values are returned, the DISTINCT node is superfluous, and Cloudscape transforms the statement internally into one without the DISTINCT keyword.

The requirements are:

  • Primary table

    There is at least one unique index on one table in the FROM list for which all the columns appear in one of the following:

  • Secondary table

    All the other tables in the FROM list also have at least one unique index for which all the columns appear in one of the following:

For example:

CREATE TABLE tab1 (c1 INT,
    c2 INT,
    c3 INT,
    c4 CHAR(2),
    PRIMARY KEY (c1, c2, c3))

CREATE TABLE tab2 (c1 INT,
    c2 INT,
    PRIMARY KEY (c1, c2))

INSERT INTO tab1 VALUES (1, 2, 3, 'WA'),
    (1, 2, 5, 'WA'),
    (1, 2, 4, 'CA'),
    (1, 3, 5, 'CA'),
    (2, 3, 1, 'CA')

INSERT INTO tab2 VALUES (1, 2),
    (1, 3),
    (2, 2),
    (2, 3)

-- all the columns in the index on the only table appear
-- in the way required for the Primary table
SELECT DISTINCT c1, c2, c3, c4
FROM tab1

-- all the columns in the index on the only table appear
-- in the way required for the Primary table
SELECT DISTINCT c3, c4
FROM tab1
WHERE c1 = 1
AND c2 = 2
AND c4 = 'WA'

-- all the columns in the index on tab1 appear
-- in the way required for the Primary table,
-- and all the columns in the
-- other tables appear in the way required
-- for a Secondary table
SELECT DISTINCT tab1.c1, tab1.c3, tab1.c4
FROM tab1, tab2
WHERE tab1.c2 = 2
AND tab2.c2 = tab1.c2
AND tab2.c1 = tab1.c1

Combining ORDER BY and DISTINCT

Without a transformation, a statement that contains both DISTINCT and ORDER BY would require two separate sorting steps--one to satisfy DISTINCT and one to satisfy ORDER BY. (Currently, Cloudscape uses sorting to evaluate DISTINCT. There are, in theory, other ways to accomplish this.) In some situations, Cloudscape can transform the statement internally into one that contains only one of these keywords. The requirements are:

  • The columns in the ORDER BY list must be a subset of the columns in the SELECT list.
  • All the columns in the ORDER BY list are sorted in ascending order.

A unique index is not required.

For example:

SELECT DISTINCT miles, meal
FROM Flights
ORDER BY meal

is transformed into

SELECT DISTINCT miles, meal
FROM Flights

Combining ORDER BY and UNION

Without a transformation, a statement that contains both ORDER BY and UNION would require two separate sorting steps--one to satisfy ORDER BY and one to satisfy UNION. (Currently Cloudscape uses sorting to eliminate duplicates from a UNION.)

In some situations, Cloudscape can transform the statement internally into one that contains only one of these keywords (the ORDER BY is thrown out). The requirements are:

  • The columns in the ORDER BY list must be a subset of the columns in the select list of the left side of the union.
  • All he columns in the ORDER BY list must be sorted in ascending order and they must be an in-order prefix of the columns in the target list of the left side of the UNION.

Cloudscape will be able to transform the following statements:

SELECT miles, meal
FROM Flights
UNION VALUES (1000, 'D')
ORDER BY miles

SELECT city_id, tour_level FROM Hotels
UNION
SELECT city_id, tour_level FROM Groups
ORDER BY Hotels.city_id, Hotels.tour_level

SELECT city_id, tour_level FROM Hotels
UNION
SELECT city_id, tour_level FROM Groups
ORDER BY city_id, tour_level

Cloudscape cannot avoid two sorting nodes in the following statement, because of the order of the columns in the ORDER BY clause:

SELECT city_id, tour_level FROM Hotels
UNION
SELECT city_id, tour_level FROM Groups
ORDER BY tour_level, city_id