Re: pthread_join SEGFAULT



Dalbosco J-F wrote:
Hi,

In the man page about pthread_join(pthread_t thread, void **value_ptr); it is written that the function shall fail if the implementation has detected that the value specified by "thread" does not refer to a joinable thread.

On my system (Linux Fedora Core 6) if I call pthread_join(0,0); so with a non joinable thread "0", it triggers a SEGFAULT instead of safely returning with the error code EINVAL.

(Note that I don't know which thread library you're using, and I'm not really interested because I'm not trying to specifically address the error checking paths, if any, in pthread_join. The more general implicit questions are always more interesting.)

A literal "0" is not "a non joinable thread". On most systems it is not a valid thread ID at all, and passing it is never correct code.

You cannot legally assume anything about the representation of a pthread_t; it is an opaque value and might, for example, be a pointer to some internal implementation data.

The standard is in fact slightly ambiguous in this area: it does not clearly distinguish a valid pthread_t value that no longer names a live thread, or that names a thread which is not joinable, from a random garbage value that is not a valid pthread_t and can never possibly represent a thread.

It's nice and friendly for an implementation to err on the side of the "shall fail" (mandatory) ESRCH rather than the "may fail" EINVAL. ("May fail" is essentially optional; it is a modification of the original POSIX "if detected" wording, and that's really what we meant.) A bad value that misses the implementation's definition of ESRCH need not, in strict interpretation and intent, be reported as EINVAL. The point was to allow implementations to optimize for speed and omit checks for "programmer errors" like your fabricated call. And if speed is the goal, then a bad pointer should SIGSEGV rather than spending precious cycles (and, more importantly, memory references) validating the pointer.

Also, note that if pthread_t is a scalar integer AND 0 happens to be a valid ID value, then it very likely refers to the process's initial thread. If the call is made from that thread, the result would be another "may fail" error, EDEADLK... OR undefined behavior, usually a deadlock.
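
To make the self-join case concrete, here is a minimal sketch of a wrapper that refuses to join the calling thread instead of relying on the implementation to detect it. The names join_unless_self, return_arg and demo_join_other_thread are made up for illustration; only pthread_create, pthread_join, pthread_self and pthread_equal are real calls. It assumes the ID passed in came from pthread_create() or pthread_self() and is still valid; it does nothing to catch garbage values.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical wrapper: refuse to join the calling thread.
 * Returns EDEADLK for a self-join, otherwise whatever pthread_join returns.
 * `target` must be a valid ID obtained from pthread_create() or
 * pthread_self(); this wrapper cannot detect fabricated values. */
int join_unless_self(pthread_t target, void **result)
{
    /* pthread_t is opaque, so compare with pthread_equal(), never with ==. */
    if (pthread_equal(target, pthread_self()))
        return EDEADLK;
    return pthread_join(target, result);
}

/* Trivial thread body: hands its argument back as the thread's result. */
void *return_arg(void *arg)
{
    return arg;
}

/* Starts one thread, joins it through the wrapper, returns 0 on success. */
int demo_join_other_thread(void)
{
    pthread_t tid;
    void *result = NULL;
    int value = 7;

    int rc = pthread_create(&tid, NULL, return_arg, &value);
    if (rc != 0)
        return rc;
    rc = join_unless_self(tid, &result);
    if (rc != 0)
        return rc;
    return (result == &value) ? 0 : -1;
}
```

Calling join_unless_self(pthread_self(), NULL) returns EDEADLK on any implementation, because the check happens before pthread_join() is ever reached. Like all pthread functions, the wrapper reports errors through its return value, not through errno.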

Can anyone give me any clue about this?
Is there any better implementation that does not SEGFAULT in such a case?

There are many possible better implementations of your sample code that won't provoke a SIGSEGV, yes. There are also implementations that more thoroughly check pthread_t values, or where they are simply hashed integers in the first place, that are more likely to correctly diagnose this broken code. But in the end, no matter what, your pthread_join() call is broken and useless. While your hypothetical "better implementation" might be vaguely more resistant to some trivial types of memory corruption, for example, this robustness comes at a cost and can never be 100% protection against bad code.

Use real pthread_t values and watch your thread ID lifetimes.
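
One way to follow that advice, sketched under some assumptions: keep an explicit flag next to each pthread_t rather than treating 0 (or any other fabricated value) as "no thread". The thread_slot type and the slot_start, slot_join and slot_demo_worker functions below are hypothetical names, not part of any library, and the slot is assumed to be used by a single owning thread (it has no locking of its own). The EBUSY and EINVAL codes returned by the slot functions are this sketch's own choice, not something pthread_join() reported.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical handle: `joinable` records whether `tid` currently holds an
 * ID that this code created and has not joined yet. */
struct thread_slot {
    int joinable;
    pthread_t tid;   /* meaningful only while joinable != 0 */
};

/* Start a thread in the slot; refuse if an unjoined thread is still there. */
int slot_start(struct thread_slot *slot, void *(*fn)(void *), void *arg)
{
    if (slot->joinable)
        return EBUSY;
    int rc = pthread_create(&slot->tid, NULL, fn, arg);
    if (rc == 0)
        slot->joinable = 1;
    return rc;
}

/* Join the slot's thread at most once; never pass a dead or unset ID on. */
int slot_join(struct thread_slot *slot, void **result)
{
    if (!slot->joinable)
        return EINVAL;       /* our own check; pthread_join is not called */
    int rc = pthread_join(slot->tid, result);
    if (rc == 0)
        slot->joinable = 0;  /* the ID's lifetime has ended */
    return rc;
}

/* Trivial thread body for trying the slot out. */
void *slot_demo_worker(void *arg)
{
    return arg;
}
```

With a slot initialized as `struct thread_slot slot = { 0 };`, slot_join() on a slot that was never started, or a second slot_join() after a successful one, returns EINVAL from the flag check alone, so a stale thread ID never reaches pthread_join().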

Thanks by advance,
JF