Re: index bloat?





sql_server_2000_user wrote:
I updated statistics on the loan table and it had no effect.


And now for the obvious/hard question: now that I know what the right execution plan is, how can I get the query optimizer to generate it - *without the hints*? Because some users of the DB are using report designing software that will just generate the join query with no hints, and I don't want them sitting there for hours when they don't have to. Does the optimizer not choose the merge join because it requires a bookmark lookup? Does it not choose my nested loop join because the statistics are wrong somewhere?

This can be hard, but one place to start that I don't see done in this thread is to find the bad estimate, if there is one causing the problem. Usually this will be a bad rowcount estimate (rowcount estimates cannot always be improved with better statistics, so this can happen even after FULLSCAN) - sometimes it is a row size estimate.

First, though, just one thought - You noted that the index is
very fragmented, though the density is good.  This could be
a problem costing the seek on time period 196.  The optimizer might
expect all the qualifying rows to be physically contiguous, but
they might be scattered all over the place.  If the optimizer
expects 40-50 rows per page, and perhaps 1,000,000 qualifying
rows, the I/O for the scan will be ~20,000 pages, or ~150MB.
If instead, the index is so fragmented that to follow the
logical order requires 10 times as many pages, you could move
from an "all in memory" query to a disk-grinder.

I'm not sure whether the optimizer considers fragmentation at
all in planning - I assume not, but could be wrong.

Can you find out how many pages contain rows with time period
196?  You could clear the buffers and run a simple select like
select <some columns only in the clustered index>
from loan_history
where time_period = 196

It would be good here also to see if the estimated and actual
row count for that value are close in this simple query.

Then look at estimates in the actual slow query and see if any
are bad.  If you run one of the too-slow queries (the slowest
you can stand to run) after typing CTRL-K for "show execution plan",
you will see the estimated execution plan with a couple of
extra actual values in addition to most of the estimates you can
get with CTRL-L.  In particular, you will see the actual number
of rows (Row Count) and number of executions (Number of Executes)
of each operator.

Compare (Estimated Rowcount)*(Estimated number of executes) with
(Row Count) for the key operators.  You will have to get the
estimated number of executes from CTRL-L, because it is not repeated
in the actual plan.  Note that the actual rowcount already accounts
for the number of executes, so don't multiply it in.

Also for the key operators, look at Estimated Row Size.
It should be the average row size of an index row, and if it is
very far off, it can cause the wrong plan to be chosen.  I
doubt this is your problem, but it can't hurt to look.0

If you see bad estimates, let us know.

If this is the problem, and it can't be fixed with
statistics, you might consider trying to change the estimate
with extra predicates that don't affect the query result.
Changing the estimate to a smaller number can be done by
repeating or adding predicates that have no real effect but
which the optimizer expects will restrict the number of rows.
Making estimates larger in the hope of leading the optimizer
away from an index that is not helping might be possible
by adding superfluous OR conditions, or with a non-SARGable
predicate that cannot use statistics and will use a generic
estimate.

Perhaps some of this will help, in addition to what Erland's
good advice is doing.

Steve Kass
Drew University



Thanks again, Seth

.



Relevant Pages

  • Re: Frustrating Execution Plan Analysis
    ... >I've always been under the impression that Query cost is directly related to ... But if the execution plan isn't making sense, ... SQL Server will make a plan based on assumptions about what will ...
    (microsoft.public.sqlserver.programming)
  • Re: A new proof of the superiority of set oriented approaches: numerical/time serie linear interpola
    ... reduces the number of passes through the data required to answer a query. ... What kind of lock? ... The reduction in execution time improves response time. ... by a "sufficiently intelligent" optimizer. ...
    (comp.databases.theory)
  • RE: JOIN v. Subquery Speed
    ... tries to generate an execution plan with the lowest estimated cost. ... the optimizer when it evaluates each table involved in the query. ... Please feel free to let my manager ...
    (microsoft.public.sqlserver.programming)
  • Re: Question for 10g enthusiasts
    ... force Oracle to use the better execution plan for all users. ... If the access is through private synonyms, the query will ...
    (comp.databases.oracle.server)
  • Re: Performance issues with stored procedure
    ... execution plan for the query therefore when you run SQL Server Profiler you ... should see 'Cach Insert' event and the 'Cache remove' or 'Cache missed' ... Have you seen any differences in execution plan each time of execution? ...
    (microsoft.public.sqlserver.server)