Re: C++0x atomic and reference counter
- From: Anthony Williams <anthony.ajw@xxxxxxxxx>
- Date: Fri, 11 Dec 2009 21:18:05 +0000
Helge Bahmann <bahmann@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> writes:
Anthony Williams <anthony.ajw@xxxxxxxxx> wrote:
If you stick to memory_order_seq_cst (the default) then it's easy. If
you opt for low-level ordering constraints then you have to specify
exactly what you need.
I know, but seq_cst is just insanely expensive. The release/acquire
model is quite intuitive I think, but I guess I have a little bit of a
"reverse mapping" problem from the memory models of physically existing
CPUs that are much stricter than the C++0x one.
Yes, I've known others to have that problem too. It's particularly hard if
you have working code for a particular architecture and wish to replace
it with C++0x atomics. The "equivalent" C++ doesn't always compile back
to the same instructions.
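To make the two styles concrete, here is a minimal sketch that hands a value from one thread to another, first with the default memory_order_seq_cst and then with explicit release/acquire. The Mailbox type and both function names are hypothetical and are not taken from the code discussed in this thread.

```cpp
#include <atomic>
#include <thread>

// Hypothetical example type: a plain payload guarded by an atomic flag.
struct Mailbox {
    int payload = 0;                 // ordinary, non-atomic data
    std::atomic<bool> ready{false};  // set once the payload is written
};

// Default ordering: store() and load() are memory_order_seq_cst.
int pass_seq_cst(int value) {
    Mailbox box;
    std::thread producer([&box, value] {
        box.payload = value;
        box.ready.store(true);       // seq_cst store
    });
    while (!box.ready.load()) {      // seq_cst load
        std::this_thread::yield();
    }
    int seen = box.payload;          // ordered after the producer's write
    producer.join();
    return seen;
}

// Low-level ordering: the required constraints are spelled out.
int pass_release_acquire(int value) {
    Mailbox box;
    std::thread producer([&box, value] {
        box.payload = value;
        // Release: the payload write cannot move below this store.
        box.ready.store(true, std::memory_order_release);
    });
    // Acquire: the payload read cannot move above the load that sees true.
    while (!box.ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    int seen = box.payload;
    producer.join();
    return seen;
}
```

Both functions return the value that was passed in. What the compiler emits for each is target-dependent, which is why a port of existing architecture-specific code may not compile back to the original instructions.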
Helge Bahmann <hcb@xxxxxxxxxxxxxxx> writes:
I believe that fn_1 is sufficient on any machine that implements
"release" with a "LoadStore|StoreStore" memory fence just before (or
implicit with) the atomic decrement -- there is no need to "order" the
decrement to zero and the delete operation from the POV of other
threads, as they cannot legally access the memory locations. While this
might not be a necessity, I have great difficulties imagining a machine
for which this would not be true. What *might* be necessary is a barrier
preventing the compiler from speculatively deleting the object.
On SPARC in RMO mode, the CPU could hoist any loads needed for the
delete above the atomic decrement if you only had a LoadStore|StoreStore
barrier. I think that IA64 and Alpha can do similarly.
Does it matter if the destructor is empty as in the example provided?
Yes. No. Maybe. :-)
See below.
- the memory allocator has no business reading the object contents (and
even if it did, there is nothing it could do with the data), the lack of
StoreLoad fencing is therefore not problematic
True. If the object itself is never accessed in the "delete" branch
(e.g. because the destructor is trivial) then you don't need an acquire
fence for that.
If the destructor is not trivial then presumably it does read the data
in some fashion, and thus needs the acquire fence.
- the allocator must also protect its own internal data structures
independently, so these can't be at issue either
Ooh hang on there. Maybe the allocator stored info about the memory
block (such as its size) alongside the object when it was
constructed. It must read this data in order to free the memory
block. Unless there is a happens-before edge between the new and delete
then it won't necessarily read valid data. However, there must be such
an edge already in order to ensure that the initialization of the
reference count happens-before the decrement of that reference count.
So, if the destructor really is a no-op, you're OK without the acquire
fence. If it is NOT a no-op (i.e. in the general case) then you need the
fence.
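A minimal sketch of the general case, assuming a destructor that does read the object: the decrement uses release ordering, and only the thread that drops the last reference issues an acquire fence before the delete. Counted, add_ref, release_ref and run_demo are hypothetical names for illustration; this is not the fn_1 from earlier in the thread.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical reference-counted object whose destructor is NOT a no-op.
struct Counted {
    explicit Counted(std::atomic<int>& sink) : sink_(sink) {}
    ~Counted() { sink_.store(data); }  // reads the object's data
    std::atomic<int> refs{1};          // creator holds the first reference
    int data = 0;
private:
    std::atomic<int>& sink_;           // lets the demo observe the destructor
};

// Taking a reference needs no ordering: the caller already holds one.
void add_ref(Counted* p) {
    p->refs.fetch_add(1, std::memory_order_relaxed);
}

void release_ref(Counted* p) {
    // Release: this thread's earlier accesses to *p are ordered before
    // the decrement.
    if (p->refs.fetch_sub(1, std::memory_order_release) == 1) {
        // Acquire: the destructor's reads are ordered after every other
        // thread's decrement, and so after their accesses to *p.
        std::atomic_thread_fence(std::memory_order_acquire);
        delete p;
    }
}

// Shares one object across several threads and returns what the
// destructor observed in `data`.
int run_demo(int threads) {
    std::atomic<int> sink{-1};
    Counted* obj = new Counted(sink);
    obj->data = 42;
    std::vector<std::thread> workers;
    for (int i = 0; i < threads; ++i) {
        add_ref(obj);  // the owner takes the reference it hands out
        workers.emplace_back([obj] { release_ref(obj); });
    }
    release_ref(obj);  // drop the owner's own reference
    for (auto& t : workers) {
        t.join();
    }
    return sink.load();
}
```

Whichever thread performs the final decrement runs the destructor, so run_demo(4) returns 42. Putting the fence inside the branch keeps the acquire cost off every decrement that does not reach zero.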
Anthony
--
Author of C++ Concurrency in Action http://www.stdthread.co.uk/book/
just::thread C++0x thread library http://www.stdthread.co.uk
Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk
15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976