Re: Variable confidence/urgency prefetch?



Paul Aaron Clayton <Dysthymicdolt@xxxxxxx> wrote:

Seongbae Park wrote:
[snip]
US4+ has weak and strong prefetches -
the processor is allowed to drop weak prefetches on various
conditions (typically prefetch queue full condition or TLB miss),
whereas strong prefetches are not dropped (almost) no matter what.
The intention is just as you said - one for potentially higher cost
but guaranteed prefetching, the other for guaranteed
small/no runtime cost but no guarantee of actual prefetching.
In contrast, US3 and US4 have only weak prefetches
which often forced the compiler to issue duplicate prefetches
to ensure all necessary data are really prefetched.

See 8.72.2 Weak versus Strong Prefetch of UA2005:

http://opensparc.sunsource.net/specs/UA2005-current-draft-P-EXT.pdf
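
The weak/strong distinction described above can be sketched as a toy model in C. This is only an illustration of the drop-versus-stall tradeoff, not how US4+ implements it: the queue depth, the one-cycle stall charge, and every name here (prefetch_model, model_prefetch, PREFETCH_QUEUE_SLOTS) are assumptions made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed queue depth for the toy model, not a real US4+ figure. */
enum { PREFETCH_QUEUE_SLOTS = 8 };

typedef enum { PREFETCH_WEAK, PREFETCH_STRONG } prefetch_kind;

typedef struct {
    size_t in_flight;     /* occupied prefetch queue slots            */
    size_t issued;        /* prefetches that were actually queued     */
    size_t dropped;       /* weak prefetches silently discarded       */
    size_t stall_cycles;  /* cost charged to strong prefetches        */
} prefetch_model;

/* Hypothetical model of one prefetch instruction.
 * Returns true if the prefetch was queued, false if it was dropped. */
static bool model_prefetch(prefetch_model *m, prefetch_kind kind, bool tlb_miss)
{
    bool queue_full = (m->in_flight == PREFETCH_QUEUE_SLOTS);

    if ((queue_full || tlb_miss) && kind == PREFETCH_WEAK) {
        /* Weak: no runtime cost, but no guarantee of prefetching. */
        m->dropped++;
        return false;
    }
    if (queue_full || tlb_miss) {
        /* Strong: pay a (here arbitrary) one-cycle cost instead of dropping. */
        m->stall_cycles++;
        if (queue_full)
            m->in_flight--;   /* pretend the oldest entry completed */
    }
    m->in_flight++;
    m->issued++;
    return true;
}
```

In this model a weak prefetch issued against a full queue costs nothing and does nothing, which is why a compiler limited to weak prefetches may have to issue the same prefetch more than once.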

A belated thank you for the pointer. (BTW, why is saving this public
document disallowed, especially when a little extra effort lets one
save it anyway?)

No idea.
http://blogs.sun.com/roller/page/dweaver
probably can tell you why.

It seems UltraSPARC does not have an equivalent to Alpha's
Write Hint 64 Bytes or PPC's Data Cache Block Zero (or Allocate).
Did the guardians of the ISA judge that the block write instruction
(which writes 8 FP registers to memory) was adequate for most
uses of WH64/DCB[ZA] or that such instructions provide little
benefit in real programs or what?

It seems that some memory
allocators (especially a binary buddy allocator?) could use such to
avoid accesses to main memory for blocks that initially contain no
meaningful data.

See 5.9 Block Initializing Store in

http://opensparc.sunsource.net/specs/UST1-UASuppl-current-draft-P-EXT.pdf

Please note that this is the T1 supplement to the UA2005 spec.
I guess "the guardians" (whoever they are) haven't decided on
what form of block init store is acceptable
for the implementation-independent portion of the ISA.
My guess is that some form of this will percolate up
to the general UA200x specification in the future.
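
The allocator use case raised above can be sketched in portable C. This is only a sketch under stated assumptions: line_init_zero is a hypothetical hook standing in for whatever line-initializing operation a target offers (a DCBZ-style instruction, or a block initializing store), the 64-byte line size is assumed, and the fallback shown here is a plain memset, which does not avoid the memory read by itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed cache line size for the sketch. */
enum { CACHE_LINE_BYTES = 64 };

/* Hypothetical hook: establish one aligned, zeroed line without first
 * fetching its old contents. The portable fallback is just memset. */
static void line_init_zero(void *line)
{
    memset(line, 0, CACHE_LINE_BYTES);
}

/* Zero a freshly allocated block whose old contents are meaningless.
 * Whole aligned lines go through line_init_zero; the unaligned head and
 * tail use ordinary stores. Returns the number of whole lines handled. */
static size_t block_init_zero(void *block, size_t bytes)
{
    unsigned char *p = block;
    unsigned char *end = p + bytes;
    size_t lines = 0;

    size_t misalign = (size_t)((uintptr_t)p % CACHE_LINE_BYTES);
    if (misalign != 0) {
        size_t head = CACHE_LINE_BYTES - misalign;
        if (head > bytes)
            head = bytes;
        memset(p, 0, head);          /* partial leading line */
        p += head;
    }
    while ((size_t)(end - p) >= CACHE_LINE_BYTES) {
        line_init_zero(p);           /* full line, old data never needed */
        p += CACHE_LINE_BYTES;
        lines++;
    }
    memset(p, 0, (size_t)(end - p)); /* partial trailing line */
    return lines;
}
```

A binary buddy allocator hands out power-of-two blocks that are naturally aligned, so for blocks of at least one line the head and tail paths would normally be empty and every line could take the hook.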

--
#pragma ident "Seongbae Park, compiler, http://blogs.sun.com/seongbae/";