Re: accessing each pthread's context



I agree that sending a signal (via pthread_kill?) to all the other
threads that haven't asserted, and then appending a stack backtrace to
an open file would most likely work. My hesitation comes from the
nondeterministic nature of mixing threads and signals and adding the
messiness of having the app in a "crazy" state. For example, you might
get into a situation where you're not sure if the last thread has
checked in -- is it time to exit(), or should we give that last
recalcitrant thread a chance to perform its backtrace? The difference
to me is between getting a core dump immediately after an
exception/signal and building up a core dump over time by letting each
thread add its context to the snapshot.
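
To make the signal approach concrete, here is a rough sketch of what I
have in mind, assuming Linux/glibc (backtrace() and
backtrace_symbols_fd() come from <execinfo.h>). The names dump_threads,
dump_handler, dump_fd and checked_in are made up for illustration, as is
the choice of SIGUSR1. It signals one thread at a time and waits a
bounded time for that thread to check in, which is exactly where the
"has the last thread checked in?" question shows up.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <execinfo.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

static int   dump_fd = -1;  /* already-open file that receives the backtraces */
static sem_t checked_in;    /* posted by each thread once it has dumped */

/* Runs in whichever thread receives SIGUSR1. Note that backtrace() is not
 * formally async-signal-safe; this is a best-effort sketch. */
static void dump_handler(int sig)
{
    void *frames[64];
    int saved_errno = errno;
    int n = backtrace(frames, 64);
    ssize_t r;

    (void)sig;
    backtrace_symbols_fd(frames, n, dump_fd);  /* writes to fd, no malloc */
    r = write(dump_fd, "\n", 1);               /* blank line between threads */
    (void)r;
    sem_post(&checked_in);                     /* async-signal-safe */
    errno = saved_errno;
}

/* Hypothetical helper: ask each thread in tids[] for a backtrace, one at a
 * time, waiting up to timeout_sec for each. Returns how many threads checked
 * in, or -1 on setup failure. A thread that misses its deadline may still
 * post later and be counted against the next wait; that is the
 * nondeterminism I'm worried about. */
int dump_threads(int fd, const pthread_t *tids, size_t count, int timeout_sec)
{
    struct sigaction sa;
    void *warm[1];
    int done = 0;

    dump_fd = fd;
    if (sem_init(&checked_in, 0, 0) != 0)
        return -1;
    backtrace(warm, 1);  /* first call may load libgcc; do it outside the handler */

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = dump_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    for (size_t i = 0; i < count; i++) {
        struct timespec deadline;
        int rc;

        if (pthread_kill(tids[i], SIGUSR1) != 0)
            continue;  /* thread already gone; nothing to wait for */

        clock_gettime(CLOCK_REALTIME, &deadline);
        deadline.tv_sec += timeout_sec;
        do {
            rc = sem_timedwait(&checked_in, &deadline);
        } while (rc == -1 && errno == EINTR);
        if (rc == 0)
            done++;
    }
    return done;
}
```

The asserting thread would call something like dump_threads(fd, others,
n, 2) and then abort() or exit(). If the return value is less than n,
you are left deciding whether to wait longer or give up on the
stragglers.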

That's why Paul Pluzhnikov's suggestion of creating a monitor process
that has some of the features of a debugger is appealing to me. In
fact, that would probably be a useful open-source project ... call it
"overlord" as in "I for one welcome our new ..."
