Re: Optimizer Issues with 10.00.xC8



Thomas J. Girsch wrote:
Art S. Kagel (Oninit) wrote:

You may indeed have found a bug in the 10.00.xC8 optimizer and should report this to IBM.

Already have, and am continuing to work on a test case.

However, at least on the Solaris DB side, I would make two suggestions for quick fixes. First, if the selectivity of colA is so good, but with millions of rows the default resolution of the data distributions may be too low. Try running a HIGH on colA with double the default resolution. That may be the key to getting better results from the optimizer

I hadn't thought to try that, but I wouldn't expect that to make that big a difference. I did some selectivity tests on the table. There are a little fewer than 200 million rows in the table. There are no more than 18 records for any one value of ColA. Whereas for ColC, nearly 60% (58.7%) of the rows have a value of ColC = 0. That, to my mind, makes ColC a bad index to begin with (and it would still be bad, even if you appended ColA to it, because you'd still have to trudge through all

Well, unless you need colC to lead the index for sorting, how about an index on colA then colC? That would sidestep the low selectivity of the colC index. If you still need the colC only index for some queries, then keep both.

those zero values). Actually, what I really need to do is fragment the index on ColC so that all the 0s go in one fragment, and all the nonzero values go into another fragment or set of fragments. This has worked well on another table; but I digress...

In further tests, what I've found is that queries run from dbaccess always choose the best index. But when you're using a cursor where colA = ? and colC = ?, it's prone to picking the wrong one (even though it used to work fine in FC6). I'm still trying to work on the particulars of that.

Interesting. That used to be a big problem. Since the engine doesn't know the values that you will ultimately supply to the replaceable parameters, the optimizer used to do a dumbed down version of the optimization that exhibited just the problems you are seeing with the queries working fine if it had exact values at prepare time, but not with parameters. I, and several other customers, complained about the problem and one of the big advances in the optimizer in 7.31/9.30 was for the engine to delay the final calculations of cost for these queries until it had the replacement values at cursor open/execute time.

This sounds like they somehow broke that in this release.

Art

On the AIX side, again, this is likely a bug, especially given that this used to work fine. That said, this is a common problem in IDS with tables that suffer major changes in contents immediately prior to being queried with no time to update stats properly in between. The best solution for this is to UPDATE STATISTICS LOW...DROP DISTRIBUTIONS immediately before the truncate/load operation, run that initial round of time sensitive queries without data distributions, and reinstate the distributions as soon after the initial round of queries as is practical.

Well, the real solution here is simply to run UPDATE STATISTICS after the truncate/reload, something we've been bugging the application team to do for nearly a year. However, we don't use distributions for this particular application. In fact, we almost never use distributions anywhere in the enterprise, because whenever we've tried it, we wind up having problems like what I describe above. Ditto for OPTCOMPIND=2, but again, I digress.

(Someone, I think it may have been Mark Scranton, used to bug us about this when we'd bring it up at conferences. "You should be running distributions and setting OPTCOMPIND=2." Not if we want our queries to come back today, we shouldn't...)

David:

Providing an optimizer directive does seem to resolve the issue, but we generally try to avoid putting them into our applications on a permanent basis. They work as a short-term fix, but that's about it.
_______________________________________________
Informix-list mailing list
Informix-list@xxxxxxxx
http://www.iiug.org/mailman/listinfo/informix-list

===========================================================================================
Please access the attached hyperlink for an important electronic communications disclaimer:

http://www.oninit.com/home/disclaimer.php

===========================================================================================





===========================================================================================
Please access the attached hyperlink for an important electronic communications disclaimer:

http://www.oninit.com/home/disclaimer.php

===========================================================================================

.



Relevant Pages

  • Re: Optimizer Issues with 10.00.xC8
    ... one is a composite, unique index on colA, colB and the other is a nonunique index on colC. ... If I run full statistics with distributions, the optimizer picks the index on colC every time, which is Very Bad, and takes forever. ...
    (comp.databases.informix)
  • Re: Optimizer Issues with 10.00.xC8
    ...  one is a composite, unique index on colA, colB ... and the other is a nonunique index on colC ...  If I run full statistics with distributions ... You may indeed have found a bug in the 10.00.xC8 optimizer and should ...
    (comp.databases.informix)
  • Re: PHP/MySQL - Splitting similar data to two tables and querying as a whole
    ... > data - previous years stuff, which is only used in queries <5% of the ... select colA, colB, colC ...
    (comp.lang.php)
  • Re: Optimizer Issues with 10.00.xC8
    ... query runs), but the point is that in IDS 10.00.FC6, these queries ran ...  one is a composite, unique index on colA, colB ... and the other is a nonunique index on colC ...  If I run full statistics with distributions ...
    (comp.databases.informix)
  • Re: Optimizer Issues with 10.00.xC8
    ... First, if the selectivity of colA is so good, but with millions of rows the default resolution of the data distributions may be too low. ... Actually, what I really need to do is fragment the index on ColC so that all the 0s go in one fragment, and all the nonzero values go into another fragment or set of fragments. ... The best solution for this is to UPDATE STATISTICS LOW...DROP DISTRIBUTIONS immediately before the truncate/load operation, run that initial round of time sensitive queries without data distributions, and reinstate the distributions as soon after the initial round of queries as is practical. ...
    (comp.databases.informix)