Re: questions about memory_order_seq_cst fence

Anthony Williams wrote:

the "Incorrect implementation" above is actually correct (probably
because according to them, MFENCE has the same semantics as PPC's

That paper also says:

"Sequentially consistent atomics The proposal above includes
two implementations of sequentially consistent atomic reads and
writes; one with the x86 locked instructions, and the other with
fence instructions on both the reads and writes. However, we can
prove that it suffices either to place an mfence before every sc read,
or after every sc write, but that it is not necessary to do both. In
practice, placing the fence after the sc writes is expected to yield
higher performance."

i.e. they think that

Load SC: MOV

is a valid implementation.
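
For concreteness, here is a minimal C++ sketch of the store-buffering (Dekker-style) litmus test, which is the case the trailing fence on sc writes exists to handle. The function names are hypothetical, made up for this illustration. I am assuming the usual x86 lowering (sc load as a plain MOV, sc store as MOV followed by MFENCE, or an XCHG); that is a property of the compiler, not something the C++ standard promises. What the standard does promise is that with memory_order_seq_cst both threads cannot read 0.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Values observed by the two threads in one run of the litmus test.
struct SbResult {
    int r1;  // what thread 1 read from y
    int r2;  // what thread 2 read from x
};

// Hypothetical helper: one run of the store-buffering test.
// Each thread does an sc store to its own variable, then an sc load
// of the other thread's variable.
SbResult run_store_buffering_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int r1 = -1;
    int r2 = -1;

    std::thread t1([&] {
        x.store(1, std::memory_order_seq_cst);   // assumed: MOV + MFENCE (or XCHG)
        r1 = y.load(std::memory_order_seq_cst);  // assumed: plain MOV
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_seq_cst);
        r2 = x.load(std::memory_order_seq_cst);
    });

    t1.join();
    t2.join();
    return {r1, r2};
}

// Hypothetical helper: repeat the test and report whether the outcome
// r1 == 0 && r2 == 0 (forbidden for seq_cst) was never observed.
bool store_buffering_never_both_zero(int iterations) {
    assert(iterations > 0);
    for (int i = 0; i < iterations; ++i) {
        SbResult r = run_store_buffering_once();
        if (r.r1 == 0 && r.r2 == 0) {
            return false;
        }
    }
    return true;
}
```

Calling store_buffering_never_both_zero(1000) from main should return true on a conforming implementation. Weakening the stores to memory_order_release would remove the fence on x86 and allow both loads to see 0.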

They are talking about X86-TSO (not actual X86), which is equivalent to
SPARC-TSO (with #StoreLoad spelled as MFENCE).

Their X86-TSO paper says that it tries to model actual X86 CPUs and
to be consistent with the documented semantics, but it favours
actual CPU behaviour over the specs where the two differ.

I've just checked the recent Intel specs... and according to
(May 2011)

recent X86 hardware (P6 and More Recent Processor Families) *is* a step
back to TSO, see

"8.2.2 Memory Ordering in P6 and More Recent Processor Families

. . .

Any two stores are seen in a consistent order by processors other than
those performing the stores"


" Stores Are Seen in a Consistent Order by Other Processors"

(IRIW without any fencing or XCHG)
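
The IRIW (independent reads of independent writes) test mentioned here can be sketched in C++ as follows. The function names are hypothetical, chosen for this illustration. Note the difference in scope: the Intel text is about the hardware and plain MOVs, whereas at the C++ level the outcome below is only forbidden when all four accesses are memory_order_seq_cst, so that is what this sketch uses.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: one run of the IRIW litmus test.
// Two writers store to x and y independently; two readers read the
// variables in opposite orders. Returns true if the readers disagree
// about the order in which the two stores became visible.
bool iriw_readers_disagree_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    int a = 0, b = 0;  // reader 1: x then y
    int c = 0, d = 0;  // reader 2: y then x

    std::thread w1([&] { x.store(1, std::memory_order_seq_cst); });
    std::thread w2([&] { y.store(1, std::memory_order_seq_cst); });
    std::thread r1([&] {
        a = x.load(std::memory_order_seq_cst);
        b = y.load(std::memory_order_seq_cst);
    });
    std::thread r2([&] {
        c = y.load(std::memory_order_seq_cst);
        d = x.load(std::memory_order_seq_cst);
    });

    w1.join();
    w2.join();
    r1.join();
    r2.join();

    // Reader 1 saw the x store but not the y store, while reader 2 saw
    // the y store but not the x store: the stores were observed in
    // inconsistent orders.
    return a == 1 && b == 0 && c == 1 && d == 0;
}

// Hypothetical helper: repeat the test and report whether the two
// readers always agreed on the store order.
bool iriw_readers_always_agree(int iterations) {
    assert(iterations > 0);
    for (int i = 0; i < iterations; ++i) {
        if (iriw_readers_disagree_once()) {
            return false;
        }
    }
    return true;
}
```

On a conforming implementation iriw_readers_always_agree(1000) should return true. A passing run proves nothing about the hardware by itself; it only fails to find a counterexample.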

More googling...

David Dice reported it back in 2009:

"Intel has taken steps toward addressing some of the concerns about
potential platform memory model relaxations that I identified in a
previous blog entry. Specifically see section of their latest
Software Developer's Manual which now guarantees global store ordering."

The 2009 wording is still current.