Re: C++0x atomic and reference counter



Helge Bahmann <bahmann@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> writes:

Anthony Williams <anthony.ajw@xxxxxxxxx> wrote:

If you stick to memory_order_seq_cst (the default) then it's easy. If
you opt for low-level ordering constraints then you have to specify
exactly what you need.

I know, but seq_cst is just insanely expensive. The release/acquire
model is quite intuitive I think, but I guess I have a little bit of a
"reverse mapping" problem from the memory models of physically existing
CPUs that are much stricter than the C++0x one.

Yes, I know others have that problem too. It's particularly hard if
you have working code for a particular architecture and wish to replace
it with C++0x atomics. The "equivalent" C++ doesn't always compile back
to the same instructions.
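
To make the two styles concrete, here is a minimal sketch of publishing a
value through a flag, once with the default memory_order_seq_cst and once
with an explicit release/acquire pair. The names (Mailbox, publish_*,
try_read_*) are made up for illustration and do not come from the code under
discussion. The sketch assumes each Mailbox is published at most once.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical one-shot mailbox: a plain payload guarded by an atomic flag.
struct Mailbox {
    int payload = 0;
    std::atomic<bool> ready{false};
};

// Default ordering: every atomic operation is memory_order_seq_cst.
void publish_default(Mailbox& m, int v) {
    assert(!m.ready.load());          // assumption: published at most once
    m.payload = v;
    m.ready.store(true);              // same as store(true, std::memory_order_seq_cst)
}

bool try_read_default(Mailbox& m, int& out) {
    if (!m.ready.load()) return false;  // seq_cst load
    out = m.payload;
    return true;
}

// Low-level ordering: the constraints are spelled out explicitly.
void publish_explicit(Mailbox& m, int v) {
    assert(!m.ready.load(std::memory_order_relaxed));
    m.payload = v;
    // Release: the write to payload is visible to any thread whose
    // acquire load observes ready == true.
    m.ready.store(true, std::memory_order_release);
}

bool try_read_explicit(Mailbox& m, int& out) {
    // Acquire: pairs with the release store in publish_explicit.
    if (!m.ready.load(std::memory_order_acquire)) return false;
    out = m.payload;
    return true;
}
```

Both versions are correct for this pattern. The difference is in what the
compiler may emit: on x86, for instance, a seq_cst store typically needs a
locked instruction or a full fence, while a release store can be a plain
store.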

Helge Bahmann <hcb@xxxxxxxxxxxxxxx> writes:
I believe that fn_1 is sufficient on any machine that implements
"release" with a "LoadStore|StoreStore" memory fence just before (or
implicit with) the atomic decrement -- there is no need to "order" the
decrement to zero and the delete operation from the POV of other
threads, as they cannot legally access the memory locations. While this
might not be a necessity, I have great difficulties imagining a machine
for which this would not be true. What *might* be necessary is a barrier
preventing the compiler from speculatively deleting the object.

On SPARC in RMO mode, the CPU could hoist any loads needed for the
delete above the atomic decrement if you only had a LoadStore|StoreStore
barrier. I think that IA64 and Alpha can do similarly.

Does it matter if the destructor is empty as in the example provided?

Yes. No. Maybe. :-)

See below.

- the memory allocator has no business reading the object contents (and
even if it did, there is nothing it could do with the data), the lack of
StoreLoad fencing is therefore not problematic

True. If the object itself is never accessed in the "delete" branch
(e.g. because the destructor is trivial) then you don't need an acquire
fence for that.

If the destructor is not trivial then presumably it does read the data
in some fashion, and thus needs the acquire fence.

- the allocator must also protect its own internal data structures
independently, so these can't be at issue either

Ooh hang on there. Maybe the allocator stored info about the memory
block (such as its size) alongside the object when it was
constructed. It must read this data in order to free the memory
block. Unless there is a happens-before edge between the new and delete
then it won't necessarily read valid data. However, there must be such
an edge already in order to ensure that the initialization of the
reference count happens-before the decrement of that reference count.

So, if the destructor really is a no-op, you're OK without the acquire
fence. If it is NOT a no-op (i.e. in the general case) then you need the
fence.
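
For the general case, a minimal sketch of a release operation that includes
the acquire fence might look like the following. This is not the fn_1 from
earlier in the thread; Counted, acquire_ref, release_ref and destroyed_count
are hypothetical names used only for illustration, and destroyed_count exists
purely so that the behaviour can be observed.

```cpp
#include <atomic>
#include <cassert>

// Demonstration only: counts how many destructors have run.
std::atomic<int> destroyed_count{0};

// Hypothetical intrusively reference-counted object with a non-trivial
// destructor, i.e. the case where the destructor touches the object's data.
struct Counted {
    std::atomic<int> refs{1};   // the creator holds the first reference
    int payload = 0;

    ~Counted() {
        payload = -1;           // stands in for "reads or writes the data"
        destroyed_count.fetch_add(1, std::memory_order_relaxed);
    }
};

void acquire_ref(Counted* p) {
    // The caller already holds a reference, so no ordering is needed here.
    p->refs.fetch_add(1, std::memory_order_relaxed);
}

// Drops one reference. Returns true if this call destroyed the object.
bool release_ref(Counted* p) {
    // Release: this thread's earlier accesses to *p are ordered before the
    // decrement as seen by whichever thread performs the final decrement.
    int old = p->refs.fetch_sub(1, std::memory_order_release);
    assert(old > 0);
    if (old == 1) {
        // Acquire fence: synchronizes with the release decrements done by
        // the other threads, so the destructor sees all of their writes.
        std::atomic_thread_fence(std::memory_order_acquire);
        delete p;
        return true;
    }
    return false;
}
```

Putting the fence inside the "delete" branch means only the thread that drops
the last reference pays for the acquire. Using memory_order_acq_rel on the
fetch_sub itself would also work, at the cost of an acquire on every
decrement.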

Anthony
--
Author of C++ Concurrency in Action http://www.stdthread.co.uk/book/
just::thread C++0x thread library http://www.stdthread.co.uk
Just Software Solutions Ltd http://www.justsoftwaresolutions.co.uk
15 Carrallack Mews, St Just, Cornwall, TR19 7UL, UK. Company No. 5478976