Re: Memory fence instructions on x86

"PillMonsta" <chris@xxxxxxxxxxxxx> wrote in message
Dear all,

Does anyone have a readable document, or alternatively source which
demonstrates when and where sfence, lfence and mfence instructions are
required for programming atomic operations on P4 and 686?
I have heard conflicting reports that the fence instructions are not
required on SMP P4, but I doubt this. Information on this seems to be
pretty scarce and any help would be very welcome.



FWIW, here is my "current" take on the x86:

This brief description seems to cover both x86 and UltraSPARC T1 (TSO). That
is, every explicit memory barrier operation is a nop, except #StoreLoad... A
store followed by a load from a different location can be reordered on x86 or
SPARC V9 under TSO...
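
To make that concrete, here is a rough C++ sketch of how the four SPARC-style barrier flavors could map onto a TSO machine, plus the classic store-then-load test. The helper names (membar_*, side_a, side_b, flag_x, flag_y) are made up for illustration. I am assuming x86/TSO hardware and a compiler that lowers std::atomic_thread_fence(seq_cst) to mfence or a locked instruction; the three "nop" helpers are NOT portable barriers on weaker machines:

```cpp
#include <atomic>

// Hypothetical helper names, for illustration only.
// ASSUMPTION: TSO hardware (x86, SPARC in TSO mode). There the CPU already
// preserves these three orderings, so only the compiler must be restrained.
inline void membar_loadload()   { std::atomic_signal_fence(std::memory_order_seq_cst); }
inline void membar_loadstore()  { std::atomic_signal_fence(std::memory_order_seq_cst); }
inline void membar_storestore() { std::atomic_signal_fence(std::memory_order_seq_cst); }

// The one ordering TSO does not give for free: a real fence is needed.
inline void membar_storeload()  { std::atomic_thread_fence(std::memory_order_seq_cst); }

std::atomic<int> flag_x{0};
std::atomic<int> flag_y{0};

// Each side stores its own flag, then loads the other side's flag.
// With the #StoreLoad fence in place, two threads running side_a and side_b
// concurrently cannot both get 0 back.
int side_a() {
    flag_x.store(1, std::memory_order_relaxed);
    membar_storeload();
    return flag_y.load(std::memory_order_relaxed);
}

int side_b() {
    flag_y.store(1, std::memory_order_relaxed);
    membar_storeload();
    return flag_x.load(std::memory_order_relaxed);
}
```

If you take membar_storeload() out of both sides and run them on two threads, both returning 0 becomes a legal outcome on x86, which is exactly the store/load reordering described above.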

My experimental implementation of Peterson's Algorithm demonstrates the need
for a #StoreLoad barrier on x86:
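
A rough C++ sketch of the idea follows. It is not the experimental implementation itself, and the type and member names are made up. Instead of a hand-placed mfence it uses seq_cst atomics, on the assumption that your compiler lowers a seq_cst store on x86 to xchg or to mov + mfence, which is where the #StoreLoad barrier ends up:

```cpp
#include <atomic>
#include <cassert>

// Two-thread Peterson lock. Names are illustrative only.
struct PetersonLock {
    std::atomic<int> interested[2];
    std::atomic<int> turn;

    PetersonLock() : turn(0) {
        interested[0].store(0);
        interested[1].store(0);
    }

    void lock(int self) {
        assert(self == 0 || self == 1);
        const int other = 1 - self;
        // seq_cst stores: on x86 each one carries a #StoreLoad barrier
        // (xchg, or mov followed by mfence), so the loads in the loop below
        // cannot be satisfied while these stores still sit in the store buffer.
        interested[self].store(1, std::memory_order_seq_cst);
        turn.store(other, std::memory_order_seq_cst);
        while (interested[other].load(std::memory_order_seq_cst) == 1 &&
               turn.load(std::memory_order_seq_cst) == other) {
            // spin until the other thread leaves or yields its turn
        }
    }

    void unlock(int self) {
        // Store-release: a plain mov on x86, no barrier instruction.
        interested[self].store(0, std::memory_order_release);
    }
};
```

Each of the two threads calls lock/unlock with its own id (0 or 1). Downgrading the stores in lock() to relaxed or release removes the barrier, and the lock can then let both threads in on real x86 hardware.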

Notice how there is no explicit barrier for the "unlock" functions... Again,
this is because "current" x86 stores automatically take care of
#LoadStore|#StoreStore, i.e. release ordering...

That was a trick to exploit the fact that in the TSO model, stores are
"basically" equivalent to:

1. #LoadStore|#StoreStore -> release barrier
2. Perform the actual store
(read all of this)
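
The two steps can be sketched in C++ like this. The variable and function names are hypothetical; publish_explicit spells out "barrier, then store", and publish folds both into a single store-release, which I would expect to come out as a plain mov on x86:

```cpp
#include <atomic>

int payload = 0;               // ordinary, non-atomic data
std::atomic<int> ready{0};     // publication flag

// Step 1 and step 2 written out separately.
void publish_explicit(int value) {
    payload = value;
    std::atomic_thread_fence(std::memory_order_release); // #LoadStore|#StoreStore
    ready.store(1, std::memory_order_relaxed);            // the actual store
}

// Same intent as a single store-release.
void publish(int value) {
    payload = value;
    ready.store(1, std::memory_order_release);
}

// Returns true and fills `out` once something has been published.
bool try_consume(int& out) {
    if (ready.load(std::memory_order_acquire) == 0) {
        return false;
    }
    out = payload;
    return true;
}
```

An "unlock" is the same pattern: everything done inside the critical section plays the role of payload, and the store that clears the lock word plays the role of ready.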

!!> Please note that Intel explicitly states that these rules may not hold
true for "future" x86 memory models... So always have a "backup" plan that
uses the lfence, sfence, and mfence instructions in the "correct" places...
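
One way to keep such a "backup" plan around is to hide the fences behind small wrappers, so the explicit instructions can be switched on in one place. This is only a sketch under my own assumptions: the wrapper names (fence_loads, fence_stores, fence_full, produce, consume) are made up, the intrinsics are the standard SSE/SSE2 ones, and I pair them with C++ fences so the compiler also respects the ordering:

```cpp
#include <atomic>
#if defined(__SSE2__)
#include <emmintrin.h>  // _mm_lfence, _mm_mfence; also pulls in _mm_sfence
#endif

// Hypothetical wrapper names, for illustration only.
inline void fence_loads() {             // earlier loads before later loads
#if defined(__SSE2__)
    _mm_lfence();
#endif
    std::atomic_thread_fence(std::memory_order_acquire);
}

inline void fence_stores() {            // earlier stores before later stores
#if defined(__SSE2__)
    _mm_sfence();
#endif
    std::atomic_thread_fence(std::memory_order_release);
}

inline void fence_full() {              // everything, including #StoreLoad
#if defined(__SSE2__)
    _mm_mfence();
    std::atomic_signal_fence(std::memory_order_seq_cst); // compiler barrier
#else
    std::atomic_thread_fence(std::memory_order_seq_cst);
#endif
}

int shared_data = 0;
std::atomic<int> shared_flag{0};

void produce(int value) {
    shared_data = value;
    fence_stores();                     // data must be visible before the flag
    shared_flag.store(1, std::memory_order_relaxed);
}

bool consume(int& out) {
    if (shared_flag.load(std::memory_order_relaxed) == 0) {
        return false;
    }
    fence_loads();                      // flag must be read before the data
    out = shared_data;
    return true;
}
```

On "current" x86 the lfence and sfence here are redundant for ordinary loads and stores; the point is only that the call sites are already in the "correct" places if the rules ever change.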

Here is my implementation of a "simple" x86-assembly-based atomic operations
abstraction... This code uses mfence, so you may have to change this if your
processor doesn't support SSE2, IIRC...
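
For a feel of the shape of such an abstraction, here is a minimal C++ sketch rather than the assembly itself. The function names are made up for illustration. As far as I know, compilers targeting x86 typically turn these into lock cmpxchg, lock xadd and xchg, and choose between mfence and a locked instruction for the full fence depending on the target flags, so the SSE2 question moves into the compiler options:

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical names; not the API of the assembly implementation.
using atomic_word = std::atomic<std::int32_t>;

// Compare-and-swap: stores `desired` if the word equals `expected`.
// Returns true on success. Typically lock cmpxchg on x86.
inline bool atomic_cas(atomic_word& dest, std::int32_t expected,
                       std::int32_t desired) {
    return dest.compare_exchange_strong(expected, desired,
                                        std::memory_order_seq_cst);
}

// Fetch-and-add: returns the previous value. Typically lock xadd on x86.
inline std::int32_t atomic_xadd(atomic_word& dest, std::int32_t delta) {
    return dest.fetch_add(delta, std::memory_order_seq_cst);
}

// Swap: returns the previous value. Typically xchg on x86.
inline std::int32_t atomic_xchg(atomic_word& dest, std::int32_t value) {
    return dest.exchange(value, std::memory_order_seq_cst);
}

// Full barrier, including #StoreLoad.
inline void atomic_full_fence() {
    std::atomic_thread_fence(std::memory_order_seq_cst);
}
```

The locked read-modify-write operations already act as full barriers on x86, so atomic_full_fence() is only needed around plain loads and stores.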