Re: Optimizer Issues with 10.00.xC8
- From: "Art S. Kagel (Oninit)" <art@xxxxxxxxxx>
- Date: Sun, 02 Mar 2008 12:25:01 -0500
Thomas J. Girsch wrote:
Art S. Kagel (Oninit) wrote:
Well, unless you need colC to lead the index for sorting, how about an index on colA then colC? That would sidestep the low selectivity of the colC index. If you still need the colC only index for some queries, then keep both.We have some queries that search based on ColA, some that search based on ColA and ColC combined, and some that search on ColC alone (but only when ColC is nonzero). Presumably we _could_ add an additional index on (ColA, ColC), but we shouldn't have to, especially since ColA by itself gets you plenty close enough -- within 18 rows out of 200 million makes for a pretty quick scan.
Interesting. That used to be a big problem. Since the engine doesn't know the values that you will ultimately supply to the replaceable parameters, the optimizer used to do a dumbed down version of the optimization that exhibited just the problems you are seeing with the queries working fine if it had exact values at prepare time, but not with parameters. I, and several other customers, complained about the problem and one of the big advances in the optimizer in 7.31/9.30 was for the engine to delay the final calculations of cost for these queries until it had the replacement values at cursor open/execute time.That doesn't seem to be it. I'm still getting inconsistent results, but what I'm finding is that which path it takes depends on which values it gets the _first time_ that the cursor is opened. So if it gets a ColC=0 request first, it picks the ColA index (the correct one), and does so for subsequent executions as well. But if it gets a ColC=[nonzero] request first, it picks the index on ColC, and seems to stick with it.
This sounds like they somehow broke that in this release.
Again, this _only_ appears to be the case if there are distributions on the table. Without distributions, it's reliably picking the index on ColA, the correct one. Which is exactly backward from the behavior I'd expect to see. With the data skew on ColC, I would expect it to _never_ select that index when it has a value for ColA.
Makes sense to me. Without distributions it uses the older dumber optimizer algorithms which are dependend on the relative depth and number of unique values of the indexes (stored in sysindices) and the colmin & colmax values (stored in syscolumns) which are the second smallest and second greatest values for the column. Since zero is likely the smallest value of colC and the number of unique values for the colC only index is relatively low compared to the number of unique keys in the unique index on colA & colB (which is somewhere shy of 200million if I remember correctly) the colA, colB index is clearly the more selective according to the algorithms used in the absence of distributions.
Art S. Kagel
Oninit
===========================================================================================
Please access the attached hyperlink for an important electronic communications disclaimer:
http://www.oninit.com/home/disclaimer.php
===========================================================================================
.
- Follow-Ups:
- Re: Optimizer Issues with 10.00.xC8
- From: tgirsch
- Re: Optimizer Issues with 10.00.xC8
- References:
- Re: Optimizer Issues with 10.00.xC8
- From: Thomas J. Girsch
- Re: Optimizer Issues with 10.00.xC8
- From: Art S. Kagel (Oninit)
- Re: Optimizer Issues with 10.00.xC8
- From: Thomas J. Girsch
- Re: Optimizer Issues with 10.00.xC8
- Prev by Date: Re: Public paper on IDS
- Next by Date: Washington Area Informix User Group Meeting - April 3, 2008 - Application Development Tools for Informix Users
- Previous by thread: Re: Optimizer Issues with 10.00.xC8
- Next by thread: Re: Optimizer Issues with 10.00.xC8
- Index(es):
Relevant Pages
|