Re: dostats



On Sep 25, 9:50 pm, mohitanch...@xxxxxxxxx wrote:
On Sep 25, 4:23 pm, Fernando Nunes <s...@xxxxxxxxxxxxxxx> wrote:



mohitanch...@xxxxxxxxx wrote:
Version IDS 10

I downloaded dostats from iiug:

I am trying to understand how does dostat works.

1. I thought head of the index should be used as "high", other parts
of index as "medium" and rest of columns as low. But after I run
dostat I see that the all the keys of the index appear in "low" as
well as "high" part. I am not sure why and what the difference is.

2. Also, I noticed that same key is repeated mutliple times for same
"low/high/medium" type. I am not sure why and what the difference is.

3. I am not sure if repeating column in same type of level for eg:
low, have any impact on how long update stats take to run. Is there
any difference if we remove duplicates

I'm sure Art, the dostats author will answer you, but I can give some info:

1) Head of index should be high... I usually do the rest as medium. Sometimes
this can be unnecessary... LOW does not create distribution (histograms in
other RDBMS). LOW collects some other data and if the column belongs to an
index it gets specific index info and can probably "trigger" index cleaning.
"HIGH" will also do "LOW" unless you specify "distribution only"...

2 Possible because sometimes it uses "distribution only"?

3 It doesn't make sense to make low, medium and high or any combination of two
of these on the same column unless you specify distribution only in medium/high

Performance guide can help... Also look for "update statistics john miller".
You'll find a good article on developerworks...

Update statistics suffers from two issues:

- Lack of documentation
- Too much bad documentation

;)

Regards.

--
Fernando Nunes
Portugal

http://informix-technology.blogspot.com
My email works... but I don't check it frequently...- Hide quoted text -

- Show quoted text -

I'll give more example, I may sound confusing:

Version IDS 10

I downloaded dostats from iiug:

I am trying to understand how does dostat works.

1. I thought head of the index should be used as "high", other parts
of index as "medium" and rest of columns as low. But after I run
dostat I see that the all the keys of the index appear in "low" as
well as "high" part. I am not sure why and what the difference is.

-->for Example Column G in below example

2. Why high type has DISTRIBUTIONS ONLY

3. I am not sure if repeating column in same type of level for eg:
low, have any impact on how long update stats take to run. Is there
any difference if we remove duplicates

--> for example column G appears thrice in below example after running
dostats

EXAMPLE:

For Table T

Index Column
ix_1 A

ix_2 B,C,D

ix_3 E,F

ix_4 G,B,C,D

ix_5 G,E,C

ix_6 H

ix_7 I,B,J,C,D

-- OUTPUT FROM DOSTATS --

UPDATE STATISTICS LOW FOR TABLE T (G,B,C,D);
UPDATE STATISTICS LOW FOR TABLE T (B,C,D);
UPDATE STATISTICS LOW FOR TABLE T (A);
UPDATE STATISTICS LOW FOR TABLE T (H);
UPDATE STATISTICS LOW FOR TABLE T (I,B,J,C,D);
UPDATE STATISTICS LOW FOR TABLE T (G,E,C);
UPDATE STATISTICS LOW FOR TABLE T (E,F);
UPDATE STATISTICS HIGH FOR TABLE T (G,B,E,A,I,H) DISTRIBUTIONS ONLY;
UPDATE STATISTICS MEDIUM FOR TABLE T (D,NON-INDEX COLUMN, C, J, NON-
INDEX COLUMN, NON-INDEX COLUMN, NON-INDEX COLUMN, F);

-----
I started to look into update stats after I found that fragmenting
table in multiple dbspace actually has slowed down the queries. I am
not sure if tunning update statistics to generate more bins or samples
will help. Or perhaps reducing number of fragments from many to few.

My understanding about update statistics is:

1. Run UPDATE STATISTICS MEDIUM on all columns which are in an index,
but are not the first column of any index.
2. Run UPDATE STATISTICS HIGH on all columns which are the first
column in an index.

It looks like dostats is doing more than 1 and 2.

Yes it is. Here's what it does, as you've surmised, and why:

This is the method for newer 'optimized' versions of IDS (the older
server output is more complex because the older servers were not as
good at handling lots of columns in a single update statistics
statement - see John Miller's paper on the subject link below):

1. LOW on the full key of each index. This is neccessary to generate
values for statistical columns in the sysindexes or sysindices
(nunique, leaves, levels) table (without the fill key the update
statistics doesn't know that it should update the sysindices columns -
yes this can result in some columns being repeated in the LOW commands
but LOW is very cheap and fast and not dependent on the number of
columns being run. It also updates statistical columns in syscolumns
(colmin & colmax) and systables (nrows and npused).

2. HIGH on the lead columns of each index and on any columns following
the first if multiple indexes begin with the same first column(s).
This is done DISTRIBUTIONS ONLY because that skips the updating of
sysindices, systables and syscolumns which was already covered for
these columns in 1 above. If the statement will be too long for IDS
to process dostats will issue multiple statements, but it tries to
minimize the number as the newer versions of IDS execute the update
statistics more efficiently this way.

3. MEDIUM on any columns NOT listed in the HIGH statement(s). Again a
minimal number of commands are issued.

John Miller's paper covering these rules are a good adjunct to the
overview in the Performance Guide:

http://www.ibm.com/developerworks/db2/zones/informix/library/techarticle/miller/0203miller.html

Art S. Kagel

.



Relevant Pages

  • Re: dostats
    ... I'm sure Art, the dostats author will answer you, but I can give some info: ... I usually do the rest as medium. ... Possible because sometimes it uses "distribution only"? ... Also look for "update statistics john miller". ...
    (comp.databases.informix)
  • Re: dostats
    ... I downloaded dostats from iiug: ... I usually do the rest as medium. ... Also look for "update statistics john miller". ... UPDATE STATISTICS LOW FOR TABLE T; ...
    (comp.databases.informix)
  • Re: dostats
    ... I downloaded dostats from iiug: ... I usually do the rest as medium. ... Also look for "update statistics john miller". ... UPDATE STATISTICS LOW FOR TABLE T; ...
    (comp.databases.informix)
  • Re: dostats
    ... I downloaded dostats from iiug: ... I usually do the rest as medium. ... Also look for "update statistics john miller". ... UPDATE STATISTICS LOW FOR TABLE T; ...
    (comp.databases.informix)
  • Re: dostats
    ... I downloaded dostats from iiug: ... I usually do the rest as medium. ... Also look for "update statistics john miller". ... UPDATE STATISTICS LOW FOR TABLE T; ...
    (comp.databases.informix)