![]() |
Tuning Databases and Applications
|
Reference Manual |
Application and Database Design IssuesThings that you can do to improve the performance of Cloudscape applications fall into three categories. In order of importance, these categories are:
Avoiding Table Scans of Large TablesCloudscape is fast and efficient, but when tables are huge, scanning tables may take longer than a user would expect. It's even worse if you then ask Cloudscape to sort this data. Things that you can do to avoid table scans fall into two categories. These categories are, in order of importance:
Index, Index, IndexHave you ever thought what it would be like to look up a phone number in the phone book of a major metropolitan city if the book were not indexed by name? For example, to look up the phone number for John Jones, you couldn't go straight to the J page. You'd have to read the entire book. That's what a table scan is like. Cloudscape has to read the entire table to retrieve what you're looking for unless you create useful indexes on your table. Create Useful IndexesIndexes are useful when a query contains a WHERE clause. Without a WHERE clause, Cloudscape is supposed to return all the data in the table, and so a table scan is the correct (if not desirable) behavior. (More about that in Prevent the User from Issuing Expensive Queries.) Cloudscape creates indexes on tables in the following situations:
For an index to be useful for a particular statement, one of the columns in the statement's WHERE clause must be the first column in the index's key. NOTE: For a complete discussion of how indexes work and when they are useful (including pictures), see What Is an Index? and Index Use and Access Paths. Indexes provide some other benefits as well:
NOTE: Don't index on large columns. Make Sure They Are Being Used, and Rebuild ThemIf an index is useful for a query, Cloudscape is probably using it. However, you need to make sure. Analyze the way Cloudscape is executing your application's queries. See Analyzing Statement Execution for information on how to do this. In addition, over time, index pages fragment. Rebuilding indexes improves performance significantly in these situations. To rebuild an index, drop it and then re-create it. Think About Join OrderFor some queries, join order can make the difference between a table scan (expensive) and an index scan (cheap). Here's an example:
SELECT hotel_name If Cloudscape chooses Hotels as the outer table, it can use the index on Hotels to retrieve qualifying rows. (Given the data in toursDB, it will return three rows; three hotels have a city_id of 10.) Then it need only look up data in HotelAvailability three times, once for each qualifying row. And to retrieve the appropriate rows from HotelAvailability, it can use an index for HotelAvailability's hotel_id column instead of scanning the entire table. If Cloudscape chooses the other order, with HotelAvailability as the outer table, it will have to probe the Hotels table for every row, not just three rows, because there are no qualifications on the HotelAvailability table. For more information about join order, see Joins and Performance. Cloudscape usually chooses a good join order. However, as with index use, you should make sure. Analyze the way Cloudscape is executing your application's queries. See Analyzing Statement Execution for information on how to do this. Prevent the User from Issuing Expensive QueriesSome applications have complete control over the queries that they issue; the queries are built into the applications. Other applications allow users to construct queries by filling in fields on a form. Any time you let users construct ad-hoc queries, you risk the possibility that the query a user constructs will be one like the following:
SELECT * FROM ExtremelyHugeTable This statement has no WHERE clause. It will require a full table scan. To make matters worse, Cloudscape will then have to order the data. It's likely that the user doesn't really want to browse through all 100,000 rows, and doesn't really care whether they're all in order. Do everything you can to avoid table scans and sorting of large results (such as table scans). Some things you can do to disallow such runaway queries:
Understand When Statements Go Stale
OverviewWhen Cloudscape compiles and optimizes a statement, it creates what is called a statement execution plan. As discussed in Avoiding Compiling SQL Statements, this is a time-consuming process that you typically want to avoid. With stored prepared statements or prepared statements, Cloudscape can execute the same statement multiple times using the same plan and thus very quickly. However, sometimes a statement's existing plan may no be longer the best plan for executing the statement. These situations are caused by significant changes in the number of rows in a table. In these situations, you should be willing to pay the price of recompilation, because it is cheaper than using the stale execution plan. For example, if you have an empty table or a table with only a few rows in it, Cloudscape's optimizer will probably decide that it is easier to do a table scan than to access the data through an index. After you have loaded several thousand rows, however, a different statement execution plan is in order. A statement can only go stale if it remains open or in the statement cache. Table 3-1, "When Statements Can Go Stale", shows the situations in which it is technically possible for a statement to go stale. Automatic Stale Plan InvalidationTo help you avoid stale statements, Cloudscape automatically checks whether a Statement's execution plan is still appropriate after a certain number of executions of that statement (by default, 100 executions) within a single Cloudscape session. Cloudscape determines if a statement's plan is appropriate by comparing the number of rows in the table with the number of rows in the table when the statement was first executed. If there has been a significant change in the number of the rows in the table, Cloudscape assumes the statement plan is no longer appropriate, and it invalidates and thus recompiles the statement. This situation typically only affects SELECTs, UPDATEs, INSERT SELECTS, and DELETEs with WHERE clauses (statements that benefit from use of an index). For example, imagine that you create a JDBC Statement (not a stored prepared statement). Its text follows:
SELECT * Suppose that when your application creates this statement, the table is empty, and so Cloudscape chooses to scan the table instead of using the index on the orig_airport column. During the course of a single Connection, the table gains a lot of data. The same or another user could insert thousands of rows. Unless Cloudscape notices the significant change in the size of the table, it continues to use the original--and now stale--plan. Given the system defaults, the hundredth time the application executes that statement, Cloudscape checks to see if the size of the table has changed significantly since the Statement was created. If it has, it marks the statement invalid and recompiles it before executing. You can configure how often Cloudscape checks the appropriateness of a statement's plan with the cloudscape.language.stalePlanCheckInterval property. Stale Plans and Stored Prepared StatementsCloudscape tracks a table's size only within a single Cloudscape session. This limitation is not a problem for standard JDBC Statements and PreparedStatements, since they exist only within the scope of a single Connection. However, this may be a problem for stored prepared statements. If you stop and re-start Cloudscape often, Cloudscape may not notice a gradual change to a table that affects a stored prepared statement. In that situation, you may want to force recompilation, or you may want to configure Cloudscape to check whether to invalidate statements much more often (by setting cloudscape.language.stalePlanCheckInterval to a low value). If you do not stop and re-start Cloudscape often (for example, in a server environment), Cloudscape's automatic invalidation and recompilation of stale plans works quite well even with the default value. When a Change in Table Makes a Plan StaleCloudscape uses the following rules to determine whether a change in a table's size is significant enough to force recompilation for a statement: For large tables, Cloudscape uses a simple percentage test. For smaller tables, Cloudscape uses a more complex, non-linear test. The smaller the table, the more sensitive Cloudscape is to changes.
NOTE: These rules may change in future releases. Consider the following example:
If a table originally had 8 rows, and the system discovers that it now has 15 rows, it recompiles the statement, because the square of the difference (49) is greater than four times the original size of the table (32). Similarly, if the table originally had 8 rows, and it now has 2 rows, it recompiles the statement, because the square of the difference (36) is greater than four times the original size of the table (32). On the other hand, if the table originally had 8 rows and now has 10 rows, it would not recompile the statement, because the square of the difference (4) is not greater than four times the original size of the table (32).
Avoiding Compiling SQL StatementsWhen you submit an SQL-J statement to Cloudscape, Cloudscape compiles and then executes the statement. Compilation is a time-consuming process that involves several steps, including optimization, the stage in which Cloudscape makes its statement execution plan. A statement execution plan includes whether to use an index, the join order, and so on. Unless there are significant changes in the amount of data in a table or new or deleted indexes, Cloudscape will probably come up with the same statement execution plan for the same statement if you submit it more than once. This means that the same statements should share the same plan, and Cloudscape should not bother to recompile them. Cloudscape allows you to ensure this in the following ways (in order of importance): Across Connections and SessionsStatements from any connection or session can share the same statement execution plan (and avoid compilation) by using stored prepared statements. Stored prepared statements are database objects that persist within and across sessions. For more information, see Storing Prepared Statements to Improve Performance. Stored prepared statements are not part of a standard but are unique to Cloudscape. Stored prepared statements are one of the most important features you can use to improve the performance of your application. Within a Single ConnectionAs explained above in Across Connections and Sessions, you can create statements that can share the same statement execution plan (and avoid compilation) by using stored prepared statements. Within a single connection, you can also do the following to avoid extra compilation when for some reason it is impractical to use stored prepared statements:
Figure 3-1 A connection need only compile a PreparedStatement once. Subsequent executions can use the same statement execution plan even if the parameter values are different. (PreparedStatements are not shared across connections.)
NOTE: The scroll type of the statement (forward-only vs. scrolling insensitive) of a statement must be identical, in addition to the SQL text, for there to be a match. Figure 3-2 A connection can reuse a Statement object when the SQL text matches a prior statement exactly. (PreparedStatements are much more efficient.) Shielding Users from Cloudscape Class-Loading EventsJVMs tend to load classes as they are needed, which means the first time you need a class in a piece of software, it takes longer to use. Cloudscape has three clear cases when a lot of class loading occurs:
For any of these events, you can control the impact they have on users by starting them in separate threads while other tasks are occurring. In addition, if you are using PreparedStatements instead of stored prepared statements, prepare them in a separate thread in the background while other tasks are occurring. Even if you are using stored prepared statements, you must still create a Statement or PreparedStatement that executes the stored prepared statement. Do that in a background thread as well. |
|
![]() Cloudscape Version 3.6 For information about Cloudscape technical support, go to: www.cloudscape.com/support/.Copyright © 1998, 1999, 2000 Informix Software, Inc. All rights reserved. |