Re: Optimizer Issues with 10.00.xC8
- From: "Art S. Kagel (Oninit)" <art@xxxxxxxxxx>
- Date: Fri, 29 Feb 2008 09:52:52 -0500
Thomas J. Girsch wrote:
Has anyone else been having optimizer issues with IDS 10.00.xC8?
We've got two applications on two different hardware platforms where the database backend was recently upgraded to IDS 10.00.FC8 (one DB is on Solaris 8, the other is on AIX 5.2), and in both cases, we've started getting queries that have long been running just fine that are now picking a bad path. These queries used to take seconds and now take minutes.
In the AIX case, we know that the statistics are less-than-current (the table is truncated and reloaded WITHOUT updating stats right before the query runs), but the point is that in IDS 10.00.FC6, these queries ran fine under these circumstances (and finished in about 6 seconds), but on FC8, under the exact same circumstances, it takes nearly 3 minutes. I've tested various scenarios, and the query actually runs better with NO STATS AT ALL than it does with the out-of-date stats it has (also not true in FC6).
In the Solaris case, I have a query that looks like a textbook case for distributions. It's a fairly sizable table, with millions of rows. There are two indexes: one is a composite, unique index on colA, colB (integer, smallint) and the other is a nonunique index on colC (integer). We have a query that selects against this table where colA = X and colC = Y. The data in colA has high selectivity, whereas the data in colC has very low selectivity (something like half the records in the table have colC = 0). If I run full statistics with distributions (including HIGH on colA and High on colC), the optimizer picks the index on colC every time, which is Very Bad, and takes forever. If I drop distributions, it (correctly) picks the composite index on (colA, colB) and runs very quickly. This is precisely the opposite of the behavior I would expect. For what it's worth, it doesn't seem to matter if OPTCOMPIND is 0 or 2 for this test, it still picks the wrong path when it has distributions.
Has anyone else had a similar experience? Any suggestions?
Thanks in advance,
You may indeed have found a bug in the 10.00.xC8 optimizer and should report this to IBM.
However, at least on the Solaris DB side, I would make two suggestions for quick fixes. First, if the selectivity of colA is so good, but with millions of rows the default resolution of the data distributions may be too low. Try running a HIGH on colA with double the default resolution. That may be the key to getting better results from the optimizer (FYI if the estimated #rows in a SET EXPLAIN output doesn't match the actual number of rows when you force the optimizer to use the index on colA & colB with a hint then this is the problem). Second, try appending colA (or even both colA and colB if you evern filter on all three columns) to the index on colC. That will resolve the problem altogether my making the colC index an even better and more selective index than the colA, colB index for this query and others like it.
On the AIX side, again, this is likely a bug, especially given that this used to work fine. That said, this is a common problem in IDS with tables that suffer major changes in contents immediately prior to being queried with no time to update stats properly in between. The best solution for this is to UPDATE STATISTICS LOW...DROP DISTRIBUTIONS immediately before the truncate/load operation, run that initial round of time sensitive queries without data distributions, and reinstate the distributions as soon after the initial round of queries as is practical. With no distributions, the older OL5.xx type optimizer will be invoked. It depends primarily on the depth of the indexes and the colmin and colmax columns of the syscolumns table to determine a query path. After the truncate/load operation these stats are likely to be sufficiently similar to the table contents immediately before the truncate/load operation that the decisions the optimizer makes in the absence of distributions will be reasonably good.
Another option, which some may find a bit shaky, is to pre-load the new data onto a second server with the same table/index schema, update stats there to the require level, then load up the data on the production server, manually delete the sysdistrib records for the table and then copy the distributions from the first server to the production server. You'll have up-to-date distributions without the post-load time and overhead needed to produce them on production. Sysdistrib is an ordinary table with no underlying hidden data related to it so it is safe to replace the distributions from another server as long as the server platform and version and the table schema, contents and indexes are identical. This is low risk, according to one of the original optimizer developers I spoke with several years ago, but as I said, an operation that some will deem shaky.
Art S. Kagel
Oninit
===========================================================================================
Please access the attached hyperlink for an important electronic communications disclaimer:
http://www.oninit.com/home/disclaimer.php
===========================================================================================
.
- Follow-Ups:
- Re: Optimizer Issues with 10.00.xC8
- From: david@xxxxxxxxxxxxx
- Re: Optimizer Issues with 10.00.xC8
- References:
- Optimizer Issues with 10.00.xC8
- From: Thomas J. Girsch
- Optimizer Issues with 10.00.xC8
- Prev by Date: Re: Optimizer Issues with 10.00.xC8
- Next by Date: Re: current informix white papers ?
- Previous by thread: Re: Optimizer Issues with 10.00.xC8
- Next by thread: Re: Optimizer Issues with 10.00.xC8
- Index(es):
Relevant Pages
|