Re: Planning multiple queries



(jim_geissman@xxxxxxxxxxxxxxx) writes:
> I'm cleaning data which involves updating ~12 million rows with three
> different models, progressively. First clean values using the model
> with finest granularity, then the remainder with the next model,
> finally what's left using the last model. The first model sets ~1/2 of
> the rows, the second ~1/4, the third ~1/5, and the remaining 5% don't
> get updated.
>
> It's something like this:
>
> UPDATE t SET value=value*m.AdjustmentFactor, updateFlag='updated'
> FROM Table t JOIN Models m ON ....
> WHERE m.ModelID='first model' AND t.updateFlag IS NULL
>
> Start with 'first model' then 'second model' etc.
>
> I'm wondering what happens if I submit all three queries together, or
> as three separate submissions, waiting for the one before to complete.
> If I do them all as one group, the query planner might plan for the
> second and third updates based on the initial distribution of values.
> However, the first update removes half of the rows from consideration,
> so it seems to me a new plan should be prepared for the second query,
> based on the distribution at that time. If I highlight the queries in
> Query Analyzer and execute, are all three plans created at the
> beginning? Does putting GO between them (which I do) make any
> difference?

If you submit them all in one batch, SQL Server generates a query plan
for all three queries at that point. However, if the first update
causes suffciently many rows to be updated, this may trigger auto-
statistcs to set in, which can lead to the entire query batch to be
recompiled. So in this case, using separate batches is a little more
effective, as you avoid the recompilations. But I suspect that's a
negliglible part of the operation.

If no statistics get updated as result of the query, it will not matter
how you submit them.

The recompilation threshold is when 20% of the rows has changed, so
judging from your description, there is a fair chance that you will
get statistics updated. And, of course, if there were no statistics
when SQL Server started, it will create statistics automatically.


--
Erland Sommarskog, SQL Server MVP, esquel@xxxxxxxxxxxxx

Books Online for SQL Server 2005 at
http://www.microsoft.com/technet/prodtechnol/sql/2005/downloads/books.mspx
Books Online for SQL Server 2000 at
http://www.microsoft.com/sql/prodinfo/previousversions/books.mspx
.



Relevant Pages

  • Re: CONTAINS performance
    ... That said, and with the query plan, I can start to give you more ... relational join in the context of the free-text optimization, ... SQL Server tables. ...
    (microsoft.public.sqlserver.fulltext)
  • Re: Reindex doesnt work
    ... The answer is that the whole affair with maintenance plan and reindexing ... SQL Server builds a query plan for the first time, ... passing a high-water mark as parameter. ...
    (microsoft.public.sqlserver.tools)
  • Re: Reindex doesnt work
    ... The answer is that the whole affair with maintenance plan and reindexing ... SQL Server builds a query plan for the first time, ... passing a high-water mark as parameter. ...
    (microsoft.public.sqlserver.tools)
  • Re: index bloat?
    ... Did you also update statistics (WITH FULLSCAN) on loan_history? ... > Was that a typo and you meant to say it sounds like a good plan? ... how can I get the query optimizer to generate it - ...
    (comp.databases.ms-sqlserver)
  • Re: WHERE clause question
    ... > no reason that select * with where should run slower. ... There are two parts to executing a query. ... SQL Server can start spooling rows as soon as it finds matches and has the ... if the statistics is outdated, ...
    (microsoft.public.sqlserver.programming)