Re: SEQUENTIAL SCANS
- From: bozon <curtis@xxxxxxxxxxxx>
- Date: Mon, 10 Dec 2007 06:49:03 -0800 (PST)
On Dec 8, 12:00 pm, Richard Kofler <richard.kof...@xxxxxxxxx> wrote:
mohitanch...@xxxxxxxxx schrieb:
On Dec 6, 2:33 pm, Jonathan Leffler <jleff...@xxxxxxxxxxxxx> wrote:
mohitanch...@xxxxxxxxx wrote:
Version IDS 10How many rows in table A? What are the distributions like on s, t, u?
I am running a query something like this:
table A (column s, column t, column u, column v)
table A has individual index on column s.
table A has another individual index on column t.
table A has another individual index on column u.
Update stats have run as high on these columns
table B (column w, column x)How many rows in table B? What are the distributions like on w?
table B has individual index on column w.
table B has another individual index on column x.
Update stats have run as high on these columns
select * from A, B where A.s = 1200 and B.w = A.t;...on A.s, I assume.
Above query correctly takes INDEX PATH.
Now I change the query to:What cost did SET EXPLAIN give you?
select * from A, B where A.s = 1200 and (B.w = A.t or B.w = A.u);
set explain shows that it did SEQUENTIAL SCAN on table B.
It's perplexing why would optimizer behave like this. It should be
taking index path for the second query too.
Have you tried to force it to use an index scan with a hint?
What cost did SET EXPLAIN give you now?
Have you considered using:
SELECT * FROM A, B WHERE A.S = 1200 AND B.W = A.T
UNION -- ALL?
SELECT * FROM A, B WHERE A.S = 1200 AND B.W = A.U
It shouldn't be necessary - and the absence of 'ALL' might slow this down.
Depending on the statistics and distributions, it might be reasonable to
cost the sequential scan less than the indexed join - but I agree it is
not all that likely.
--
Jonathan Leffler #include <disclaimer.h>
Email: jleff...@xxxxxxxxxxxxx, jleff...@xxxxxxxxxx
Guardian of DBD::Informixv2007.0914 --http://dbi.perl.org/
publictimestamp.org/ptb/PTB-1963 tiger2 2007-12-06 21:00:03
29E0C81DABD186DDC84130B4E36174FFB2FCD866B598A5E7
Table A has around 20M rows and table B has around 130M rows, but
table B is fragmented accross 15 dbspaces. Column in table B have High
Mode, 0.500000 Resolution. It's really bizzare that optimizer is
choosing squential scan when "OR" is added on on the same column. I
ran update stats just before running the query too.
Hi
here are my 2 Eurocent:
What the optimizer prefers is:
scan B, indexread A - do a nested loop join
I assume (but do not know) that you can read
like 1000 pages belonging to table B from disk per second and fragment
(given that you have READAHEAD around 4-6, not higher)
I further assume, that RPP (rows per page) of table B is 20 (row size around 100 bytes)
Your nrows of table B is 130.000.000 (BTW: does the optimizer know this, i.e.
you did run update statistics low for table B, didn't you?)
Now for your 15 frags you read 15.000 pages or 300.000 rows of table B per second
So here we are: Pls test, if you can do a seqscan over table B in approx. 6,6 minutes.
And then run update statistics low on table B, as I am quite sure now, that
the poor optimizer does not know nrows of table B.
But pls, read on!
Now on the other hand we have:
Indexread B, Indexread A - do a nested loop join
Chances are (on modern HW, cheap (commodity HW) that is, no guaratnee that
this is also true on big iron) that you can attach 100.000 BUFFERS per second
I guess you indexes have 5 or 6 levels, including leaf level - let's go with 5
After a short while probability of buffer hit is
I further assume that you can do a direct read from disk in 1000 mysec or 1 millisecond
Level 0, 1, 2: 100% 3 x 1 mysec 3 mysec
Level 3: 50% 501 mysec 501 mysec
Level 4 (Leafes) 10% 901 mysec 901 mysec
total per indexread: 1.405 Mysec
The according avg access time you can see on right part of the line
--> you can do 700 indexreads per second
Therefore when you expect to hit 300.000 rows or more into result set, index based access
is not as fast as a seqscan.
Not for this situation, but for other queries this _is_ very important to keep in mind,
as I always win bets with everyone and her sister, because few ppl 'd assume that
the sweet spot is so low (300.000 is less than 0,3% of the table)
HTH
dic_k
--
Richard Kofler
SOLID STATE EDV
Dienstleistungen GmbH
Vienna/Austria/Europe
nice explanation. I have tried to explain this in the past before but
have failed miserably. Nice to see a good explanation.
.
- References:
- SEQUENTIAL SCANS
- From: mohitanchlia
- Re: SEQUENTIAL SCANS
- From: Jonathan Leffler
- Re: SEQUENTIAL SCANS
- From: mohitanchlia
- Re: SEQUENTIAL SCANS
- From: Richard Kofler
- SEQUENTIAL SCANS
- Prev by Date: Re: Where The *** is the Informix Developer Edition V11 Download?
- Next by Date: Re: SEQUENTIAL SCANS
- Previous by thread: Re: SEQUENTIAL SCANS
- Next by thread: Re: SEQUENTIAL SCANS
- Index(es):