Composite index and data distribution



Hi,

I've observed a behaviour that I cannot explain by myself
and I would be glad for help. Here it goes (10gXE
but tested on 9.2.0.5 EE too):


SIMPLE TABLE WITH TWO COLUMNS:

SQL> create table foobar(foo number not null, bar number not null);

Table created.


DATA DISTRIBUTION: IN 1% OF THE ROWS FOO=1, IN 99% FOO=99:

SQL> begin
2 for i in 1..1000000 loop
3 insert into foobar(foo, bar) values (case when i <= 10000 then 1 else
99 end, i);
4 end loop;
5 end;
6 /

PL/SQL procedure successfully completed.


COMPOSITE INDEX (BEST I CAN THINK OF FOR THE QUERY TO BE EXECUTED):

SQL> create index foobar_idx on foobar(foo, bar);

Index created.


DETAILED STATISTICS (SO IT'S NOT STATISTICS-RELATED ISSUE):

SQL> exec dbms_stats.gather_table_stats(user, 'FOOBAR', cascade=>true,
estimate_percent=>100, method_opt=>'for all columns size auto');

PL/SQL procedure successfully completed.


JUST TO BE SURE - DATA DISTRIBUTION:

SQL> select foo, count(*) from foobar group by foo;

FOO COUNT(*)
---------- ----------
1 10000
99 990000


TOOLS:

SQL> set timing on
SQL> set autotrace traceonly
SQL> alter session set sql_trace=true;

Session altered.

Elapsed: 00:00:00.08


AND HERE COMES THE QUERY FOR FOO=1 (1% OF THE ROWS):

SQL> select /*+ FIRST_ROWS(25) */ * from (select foo, rownum rn
2 from foobar where foo=1 order by bar) where rn <= 25;

25 rows selected.

Elapsed: 00:00:00.03

Execution Plan
----------------------------------------------------------
Plan hash value: 2407035716

---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 25 | 650 | 3 (0)| 00:00:01 |
|* 1 | VIEW | | 25 | 650 | 3 (0)| 00:00:01 |
| 2 | COUNT | | | | | |
|* 3 | INDEX RANGE SCAN| FOOBAR_IDX | 25 | 175 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("RN"<=25)
3 - access("FOO"=1)


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
30 consistent gets
0 physical reads
0 redo size
811 bytes sent via SQL*Net to client
395 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
25 rows processed


AND HERE IS THE SAME QUERY WITH FOO=99 (99% OF THE ROWS):

SQL> select /*+ FIRST_ROWS(25) */ * from (select foo, rownum rn
2 from foobar where foo=99 order by bar) where rn <= 25;

25 rows selected.

Elapsed: 00:00:00.78

Execution Plan
----------------------------------------------------------
Plan hash value: 2407035716

---------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 25 | 650 | 3 (0)| 00:00:01 |
|* 1 | VIEW | | 25 | 650 | 3 (0)| 00:00:01 |
| 2 | COUNT | | | | | |
|* 3 | INDEX RANGE SCAN| FOOBAR_IDX | 25 | 175 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

1 - filter("RN"<=25)
3 - access("FOO"=99)


Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2624 consistent gets
0 physical reads
0 redo size
811 bytes sent via SQL*Net to client
395 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
25 rows processed


In the first case there were 30 consistent gets and in the second - 2624!
Why?! Plan is the same. The index contains exactly the answer to the query
(both columns, sorted as requested), so I would expect it to just grab the
first 25 rows from the index and return them. I would ignore a few blocks
difference but it's not the case - Oracle is obviously doing something
different with those two queries.

Here is the output from tkprof:

********************************************************************************

select /*+ FIRST_ROWS(25) */ * from (select foo, rownum rn
from foobar where foo=1 order by bar) where rn <= 25

call count cpu elapsed disk query current rows
------- ------ ------ -------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 3 0.00 0.00 0 30 0 25
------- ------ ------ -------- ---------- ---------- ---------- ----------
total 5 0.00 0.00 0 30 0 25

Misses in library cache during parse: 1
Optimizer mode: FIRST_ROWS
Parsing user id: 36

Rows Row Source Operation
------- ---------------------------------------------------
25 VIEW (cr=30 pr=0 pw=0 time=70 us)
10000 COUNT (cr=30 pr=0 pw=0 time=60049 us)
10000 INDEX RANGE SCAN FOOBAR_IDX (cr=30 pr=0 pw=0 time=30038 us
(object id 14207)

********************************************************************************

select /*+ FIRST_ROWS(25) */ * from (select foo, rownum rn
from foobar where foo=99 order by bar) where rn <= 25

call count cpu elapsed disk query current rows
------- ------ ------ -------- ---------- ---------- ---------- ----------
Parse 1 0.00 0.00 0 0 0 0
Execute 1 0.00 0.00 0 0 0 0
Fetch 3 0.75 0.75 0 2624 0 25
------- ------ ------ -------- ---------- ---------- ---------- ----------
total 5 0.75 0.75 0 2624 0 25

Misses in library cache during parse: 1
Optimizer mode: FIRST_ROWS
Parsing user id: 36

Rows Row Source Operation
------- ---------------------------------------------------
25 VIEW (cr=2624 pr=0 pw=0 time=73 us)
990000 COUNT (cr=2624 pr=0 pw=0 time=5940050 us)
990000 INDEX RANGE SCAN FOOBAR_IDX (cr=2624 pr=0 pw=0 time=1980043 us
(object id 14207)


Seems like Oracle cannot optimize the query and really reads all those rows
instead of stopping after the first 25. Is it true or am I missing
something? I can't believe that with all those tricks that CBO is capable
of, it can't do such an obvious one. Any suggestions?

Thanks!
--
Michal Kuratczyk
.



Relevant Pages