Re: EPERM from pthread_mutex_unlock after fork using pthread_atfork()



Steve Watt wrote:
In article <S6WdnZw2IYqycCnbnZ2dnUVZ_oaonZ2d@xxxxxxxxxxx>,
Jason Roscoe <jason.roscoe@xxxxxxxxx> wrote:
David Schwartz wrote:
On Aug 4, 6:56 am, Jason Roscoe <jason.ros...@xxxxxxxxx> wrote:
I have a simple test program that I wrote in order to understand the
pthread_atfork() routine. The source code is attached.

The mutex is not locked until the "atForkPrepare" handler. In this test program, there is only one thread - the one that calls fork(). So, if the fork() creates a copy of the parent (single thread in this case), the child should get a copy of the mutex as well. The mutex should be in a locked state after the "atForkPrepare" routine exits. So, it should be locked in both the child process and the parent process once the fork() processing is complete. The child process has a copy of the thread that called fork(), which is the thread that locked the mutex. Thanks for your help, but that still does not explain the EPERM.
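
The attachment is not reproduced here. A minimal sketch of the setup as described, assuming a default-initialized mutex and hypothetical names (demo_lock, at_fork_*, child_unlock_result): the prepare handler locks the mutex, and the parent and child handlers each try to unlock it. The value the child gets back from pthread_mutex_unlock() is passed out through its exit status. Whether that value is 0, EPERM, or something else depends on the implementation.

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-ins for the poster's mutex and handlers. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static int child_unlock_rc = -1;

/* Runs in the forking thread before fork(): take the lock. */
static void at_fork_prepare(void) { pthread_mutex_lock(&demo_lock); }

/* Runs in the parent after fork(): release the lock taken above. */
static void at_fork_parent(void) { pthread_mutex_unlock(&demo_lock); }

/* Runs in the child after fork(): try to release the copied lock and
 * record what the implementation says. Not portable; this is the call
 * the thread is about. */
static void at_fork_child(void)
{
    child_unlock_rc = pthread_mutex_unlock(&demo_lock);
}

/* Forks once and returns the child's pthread_mutex_unlock() result
 * (0, EPERM, ...), or -1 if the fork or wait itself failed. */
int child_unlock_result(void)
{
    static int registered;

    if (!registered) {
        if (pthread_atfork(at_fork_prepare, at_fork_parent,
                           at_fork_child) != 0)
            return -1;
        registered = 1;
    }

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(child_unlock_rc & 0xff);  /* report via exit status */

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Comparing the return value against EPERM (from <errno.h>) shows which behavior a given implementation chose for the child's copy of the mutex.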

Technically, it's not a mutex any more. If it wasn't in shared address
space, it's a copy of a mutex, which is a rather unpleasant thing to
be. If it was in shared address space (sorry, I didn't read your code
sample because your prose seemed pretty clear), then the child thread
wasn't an owner, and thus couldn't unlock it.

Well, POSIX is squishy here, but definitely allows that (at the sole discretion of the implementation) it MAY BE a mutex. "Possibly including the states of mutexes", remember?

Sure, a portable APPLICATION cannot legally copy a pthread_mutex_t, but only because it doesn't know what a mutex means "inside". The implementation can, because it can copy what needs to be copied and reconstruct or ignore what needs to be reconstructed or ignored. (After all, it knows how to create a mutex from scratch, and can pick apart and analyze every aspect of the old one... how could it NOT be able to construct a duplicate with whatever semantic properties are considered desirable?) For example, copying the lock state, marking it owned by the child thread, and eliminating any waiters. If a mutex needs to work by pointing to some special type of memory (as some exotic architectures require), then the implementation can do that, too. But, more importantly, no implementation is REQUIRED to do any of this.

So, yeah; a portable conforming application must assume that the data in any pthread_mutex_t, within the child, is meaningless and unusable. Not that it matters, because such an application (or at least its developer) is aware that no pthread function can be called before wiping the address space completely via exec*(), anyway. ;-)

The best thing to do with private mutexes in child atfork handlers is to
initialize them.

Except you can't do that either, because (1) the mutex may be locked, depending on what the implementation actually does to these mutex copies, and you can't initialize a locked mutex; and (2) pthread_mutex_init() isn't any more async-signal-safe than pthread_mutex_unlock(), and therefore cannot be portably used in a child atfork handler either.

The fact is that atfork handlers were a necessary mechanism to ensure that the C runtime can preserve enough state across the fork that the child can actually back out into the application code and survive long enough to call exec*(). (If it hadn't been standardized, it would have had to be invented everywhere anyway.) This matters ESPECIALLY where fork() may be a pure syscall() (though it often isn't) and the C runtime data and locks may be in a completely unpredictable state due to other threads' activities at the time of the fork. The C runtime has a special contract and relationship with the kernel and knows how far it can push the rules; it need not be (and in fact can't be) "portable POSIX code" anyway.

For the application, the safe solution (the only conforming or portable solution) is to forget about the enticing illusion of atfork handlers entirely. Just exec*() in the child and start fresh. Of course it's perfectly legitimate to exec*() your original argv[0] and restart the same binary in the child, if that's what you want to do.
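
A minimal sketch of that pattern, assuming a hypothetical helper named run_fresh(): the child calls nothing but execv() and, if that fails, _exit(), both of which are async-signal-safe. No pthread function and no atfork handler is involved.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, exec `path` with `argv` in the child,
 * wait in the parent. Returns the child's exit status, or -1 if the
 * fork or wait failed or the child did not exit normally. */
int run_fresh(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: replace the copied address space immediately. */
        execv(path, argv);
        _exit(127);  /* reached only if execv() failed */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling run_fresh(argv[0], argv) from main() would restart the same binary, assuming argv[0] is a path execv() can resolve (otherwise execvp() may be the better fit). The restarted copy needs some way, such as an extra argument, to know it is the child so that it does not fork again.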