Re: "'Resultcache" effect in 10.2.0.4 ??
- From: jlchiappa <jlchiappa@xxxxxxxxx>
- Date: Wed, 29 Jul 2009 08:01:03 -0700 (PDT)
cached in the SGA on the first execution and retrieved from memory on the
second execution?
Matthias
He did mention in the OP that the number of I/O's (PIO's, I assume...)
and the plans are the same, as well as ALTER SESSION FLUSH
BUFFER_CACHE and FLUSH SHARED POOL don't make it happen.
YES, this is exactly the current suspect...
Perhaps it is possible that two separate things are happening with the
same apparent effect: When he shuts the db, it's slow because it has
to load the SGA, when he does it later, a SAN cache or something helps
things along.
This is the only possible explanation until now, so I´m buying it,
too.
I would think that there would be some way to flood the
OS cache he is talking about to show it is responsible.
yeah, in the test machine thiscolud be done, maybe writing a C program
malloc´íng more and more RAM until the OS begins to remove pages from
the OS cache, yes...
It's just so
odd that the 9i didn't show the effect, and 10g does (if I'm rereading
it correctly).
Yes you read correctly, this was exactly what happened. Strange,
agreed, but some possible explanations :
a) as I said, the files are in cooked filesystems, maybe in Solaris
9i was doing direct I/Os (bypassing FS caches) and 10g not, due to
different code in the db or even due to different settings
b) this db attends an Oracle EBS 11i app, and when in 9i it was using
RULE optimization for sure (I know it was true in 9i epoch), so maybe
the RBO was generating always a lot of small I/Os and the OS caching
effect for it was small too, hardly detected, only now with CBO and
big hashes the effect is more present
but all of this is pure speculation, only...
That's why I suggested bind peeking or something to do
with histograms.
Nope, as I said even if we test with the exact SAME values 2x the
effect is the same, so no way for bvp, I think...
Hopefully the 10046 will show more exactly where the
slow is, over all the combinations of situations.
In truth, my expectations about the 10046 are : if I could get 2 10046
trace files about 2 executions of the same SQL with the same values
but flushing buffer pool and hard parsing the SQLs each time , imho
the plans and the I/Os will be more or less the same, with this if I
see in the 10046 the same I/Os being faster in the 2nd exec, I prooved
that some "external force" is acting, something outside the db, being
OS cache, hardware cache, something like that...
I've been trying to
figure out if MBRC could somehow have this effect (different between
versions), but I don't think so - but see
http://www.freelists.org/post/oracle-l/db-file-multiblock-read-count-...
(and read the whole thread). Then of course, it could be some solaris
specific thing, I haven't been on solaris in a long time.
Hmmm, different MBRCs ??? I haven´t thought about this, maybe this
could be a possibility...
Is there any possibility we might find out the storage hardware
architecture?
Sure : the storage is an IBM Shark (unsure about the exact model and
cache sizings), disks in RAID-5 groups, and the space is formatted in
native (cooked) Solaris filesystems, that´s it...
IMHO, you are just guessing until you can actually get a trace. Also,
I wouldn't change anything. Run it once with a trace, log out, log
in, and run it again with a trace.
Sure.
What if you do the following:
- restart the instance
- execute and time the query
- reboot the Solaris server and start the instance
- execute and time the query
If the second query is faster than the first one, the data is probably
cached by the SAN. If not, the OS cache is probably playing a role.
very good idea, an interesting test, will try it if possible...
Out of curiosity, it sounds like your colleague isn't very interested
in getting this fixed if he/she can't provide you with a trace?
How true : it´s the same old story, the behavior occurs only in the
big beast SQLs, that ones naturally slow and complex, so the end-users
are "used" to get bad response times, and anyway this kind of monsters
are "batch" things, they are not directly impacting day to day work,
so no much interest here, yes... In truth, my intention, more than
only helping my colleague, is to learn, receive different inputs, and
to be prepared if something like this occurs with me, only... In some
measure I already received this, received some hints and thoughts, and
the important, eliminated the chance for some obscure point (maybe a
'hidden' result cache) inside the db being present without my
knowledge, this was a main objective here, too, and this possibility
was eliminated , I wanted it...
Regards,
Chiappa
.
- Follow-Ups:
- Re: "'Resultcache" effect in 10.2.0.4 ??
- From: joel garry
- Re: "'Resultcache" effect in 10.2.0.4 ??
- References:
- "'Resultcache" effect in 10.2.0.4 ??
- From: jlchiappa
- Re: "'Resultcache" effect in 10.2.0.4 ??
- From: GD
- Re: "'Resultcache" effect in 10.2.0.4 ??
- From: jlchiappa
- Re: "'Resultcache" effect in 10.2.0.4 ??
- From: Matthias Hoys
- "'Resultcache" effect in 10.2.0.4 ??
- Prev by Date: Re: How to call a procedure,with RCT1
- Next by Date: Re: long to clob conversion redo size and new table size
- Previous by thread: Re: "'Resultcache" effect in 10.2.0.4 ??
- Next by thread: Re: "'Resultcache" effect in 10.2.0.4 ??
- Index(es):
Relevant Pages
|
Loading