Re: Boolean Buyers Beware ... AIX compiler bug --- PMR 26241,756



***.dunbar <***.dunbar@xxxxxxxxx> wrote:

Boolean Buyers Beware ... AIX compiler bug --- PMR 26241,756.

Hint: _Tag _M_tag:8; is the line that causes the problem; an 8-bit
holder of enum data.

Test Case Essentials:
--------------------
class NQNString {
public:
enum _Tag {_S_leaf, _S_concat, _S_substringfn, _S_function};
volatile int _M_ref_count; // 32-bit int used in tcpip to
32-bit clients
_Tag _M_tag:8; // 8-bit char is as large as
we need
int _M_incr() {return fetch_and_add(&_M_ref_count, 1);}
int _M_decr() {return fetch_and_add(&_M_ref_count,-1);}
};

void *ThreadFunc(void *threadid) {
for (int i = 0; i < 100000000; ++i) {
gs._M_incr(); // atomicly increment
gs._M_tag=NQNString::_S_concat; // set it to one
if(gs._M_decr() == 0 ) ++*(int*)0; // ref_count should never go
to zero!
}
return 0;
}

First of all, please note that C++ makes no definitions about threads
at all, and provides no guarantees whatsoever concerning thread-safety.
Thus, it's purely an extension your vendor provides, reliable or unreliable.
Make sure you know what you're doing, but the C++ standard offers you nothing
here.

The Bug:
-------
1. ref_count is incremented/decremented consistently using AIX atomic
ops

Thus, test_and_set() works according to what was intented by the
vendor.

2. _M_tag assignment causes both ref_count and _tag to be loaded into
a 64-bit register, updated by twiddling bits, and stored non-
atomicly.

Well, if that is the only way how to access an 8 bit field on the
architecture, that's it. If you want thread-safe access on a
structure, use a mutex, hopefully provided by the implementation.

PMR 26241,756.
-------------
IBM identified a source work-around, and claimed the problem was our
source.
A TechNote was published but fails to relate to our 64-bit experience.

http://www-1.ibm.com/support/docview.wss?rs=2239&uid=swg21259159

Problem
-------
In a multithreaded program, updating a bitfield variable may affect
the values of other variables.

That is definitely a problem. It is also a problem the C++ standardization
committee is aware of, see the discussion in comp.lang.c++.moderated.
I think it is really that: On some architectures, it is unreasonable
to expect that you can access structure units atomically (without adding
much overhead). If you need to be thread safe, you should protect the whole
thing by a mutex.

I cited section 9.6 of the C++ standard as justification for our
coding.

I don't think the C++ standard has a word to say about threads. Note
that a C++ compiler is always free to operate on an "as if" basis, i.e.
its implementation might do whatever it wants as long as the observed
result is the same. And as C++ only addresses single-threaded programs,
observed behaviour is only defined in this sense. Thus, your program
has been proven to work correctly for single-threaded by the compiler,
and thus the compiler is fine.

IBM Response:
------------
Section 9.6 of ISO14882 does not specify the mandatory bit-field
implementation in multithreaded environment.

A bit "off" I would call that. The problem is that the standard does
not say anything about multithreading. Leave alone how bitfields are
supposed to work..

The standard expects you to do something rational on your hardware.
IBM can do anything they like to bit-fields, but to subsume
non-bitfield POD data into your BITCONTAINER is wrong.

No, why? You seem to imply that an "allocation unit" itself is something
that can be addressed in a thread-safe way. There is no such guarantee
in the standard. An allocation unit might well be a byte, even if the
processor has no provisions to access that unit atomically.

To avoid the compiler bug, we changed _M_tag to remove the
implication of bits,
and therefore stop IBM from "inventing this BITCONTAINER construct"
when none was
intended or implied in our code.

A "bitcontainer" is nothing specified in C++. Compilers may use that
*if* they can prove correct behaivor on the virtual machine C++ defines.
And that one is single-threaded. Thus, *formally* the code is running
into undefined-behaivour land. IBM provides additional guarantees to
you when allowing you to run multi-threaded code. As it seems, they don't
express them very clearly.

So what do you think? Our bug or theirs?

Your bug to depend on something that isn't specified. But unfortunately,
C++ doesn't specify anything here, and still one needs multithreading.

Thus, IBM may specify whatever they found reasonable. Aparently,
that's different from what you expected or think it's reasonable.
Thus, probably bad documentation.

So long,
Thomas
.