unkillable process

Paul Hargrove
>Synopsis:      Unkillable pthreaded process when sig handler calls raise()
>Category:      kernel
>Environment:
        System      : OpenBSD 5.2
        Details     : OpenBSD 5.2 (GENERIC.MP) #339: Wed Aug  1 10:13:24 MDT 2012
                         [hidden email]:/usr/src/sys/arch/i386/compile/GENERIC.MP

        Architecture: OpenBSD.i386
        Machine     : i386
>Description:

The code (below) performs the following steps:
1) Register a handler for SIGSEGV
2) Spawn a pthread, which just sleeps
3) Generate SIGSEGV
4) The SEGV handler calls signal(SIGSEGV, SIG_DFL)
5) The SEGV handler calls raise(SIGSEGV)

The result is a hung process which cannot be killed, even with SIGKILL.
$ ps -lk -p 21960
  UID   PID  PPID CPU PRI  NI   VSZ   RSS WCHAN   STAT  TT       TIME COMMAND
 1000 21960     1   0  28   0   516  1080 thrdeat DE    p0-   0:00.06 (bug)

The code (below) is a reduced form of some error-handling code.
The full code worked fine with OpenBSD-5.1 (uthreads), but began failing
after I updated to OpenBSD-5.2 (rthreads) this week.

>How-To-Repeat:
$ cat >bug.c <<__EOF__
#include <signal.h>
#include <pthread.h>
#include <unistd.h>  // for sleep()

// Handler for SIGSEGV which will just re-raise it
void catcher(int sig) {
  signal(sig, SIG_DFL);
#if 1
  raise(sig);  // BUG!!
#else
  pthread_kill(pthread_self(),sig); // OK
#endif
}

// An idle thread
void *thr_main(void *arg) {
  sleep(40);
  return NULL;
}

int main(void) {
  pthread_t thread;
  pthread_attr_t attr;
  void *result;
  int rc;

  (void) signal(SIGSEGV, &catcher); // Register SEGV handler

  rc = pthread_attr_init(&attr);
  rc = pthread_create(&thread, &attr, &thr_main, NULL);

  rc = *(volatile int *)0xdeadbeaf; // Trigger SIGSEGV

  // Not reached!
  rc = pthread_join(thread, &result);

  return rc;
}
__EOF__
$ cc -pthread -o bug bug.c
$ ./bug
[now hung]

>Fix:

As noted in the code, use of pthread_kill() in place of raise() works OK.


dmesg:
OpenBSD 5.2 (GENERIC.MP) #339: Wed Aug  1 10:13:24 MDT 2012
    [hidden email]:/usr/src/sys/arch/i386/compile/GENERIC.MP
cpu0: AMD Opteron(tm) Processor 248 ("AuthenticAMD" 686-class, 1024KB L2 cache) 2.21 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,NXE,MMXX,LONG,3DNOW2,3DNOW
real mem  = 4159614976 (3966MB)
avail mem = 4080795648 (3891MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 06/26/08, BIOS32 rev. 0 @ 0xfd580, SMBIOS rev. 2.34 @ 0xf7f7c000 (33 entries)
bios0: vendor Phoenix Technologies Ltd. version "V2.18B RF4 for Rackable" date 06/26/2008
bios0: Rackable Systems Inc. C2004
acpi0 at bios0: rev 2
acpi0: sleep states S0 S1 S4 S5
acpi0: tables DSDT FACP SRAT HPET APIC SPCR
acpi0: wakeup devices TP2P(S1) USB0(S1) USB1(S1) G0PA(S1) LAN0(S1) LAN1(S1) G0PB(S1)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 14318179 Hz
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD errata 89, 97, 101 present, BIOS upgrade may be required
cpu0: apic clock running at 200MHz
cpu1 at mainbus0: apid 1 (application processor)
cpu1: AMD Opteron(tm) Processor 248 ("AuthenticAMD" 686-class, 1024KB L2 cache) 2.21 GHz
cpu1: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,NXE,MMXX,LONG,3DNOW2,3DNOW
ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 11, 24 pins
ioapic1 at mainbus0: apid 3 pa 0xfc000000, version 11, 4 pins
ioapic2 at mainbus0: apid 4 pa 0xfc001000, version 11, 4 pins
acpiprt0 at acpi0: bus 1 (TP2P)
acpiprt1 at acpi0: bus 2 (G0PA)
acpiprt2 at acpi0: bus 3 (G0PB)
acpicpu0 at acpi0
acpicpu1 at acpi0
acpibtn0 at acpi0: PWRB
bios0: ROM list: 0xc0000/0x8000 0xc8000/0x1800 0xc9800/0x1800
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
ppb0 at pci0 dev 6 function 0 "AMD 8111 PCI-PCI" rev 0x07
pci1 at ppb0 bus 1
ohci0 at pci1 dev 0 function 0 "AMD 8111 USB" rev 0x0b: apic 2 int 19, version 1.0, legacy support
ohci1 at pci1 dev 0 function 1 "AMD 8111 USB" rev 0x0b: apic 2 int 19, version 1.0, legacy support
vga1 at pci1 dev 6 function 0 "ATI Rage XL" rev 0x27
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
usb0 at ohci0: USB revision 1.0
uhub0 at usb0 "AMD OHCI root hub" rev 1.00/1.00 addr 1
usb1 at ohci1: USB revision 1.0
uhub1 at usb1 "AMD OHCI root hub" rev 1.00/1.00 addr 1
amdpcib0 at pci0 dev 7 function 0 "AMD 8111 LPC" rev 0x05
pciide0 at pci0 dev 7 function 1 "AMD 8111 IDE" rev 0x03: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
wd0 at pciide0 channel 0 drive 0: <ST340014A>
wd0: 16-sector PIO, LBA48, 38166MB, 78165360 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5
pciide0: channel 1 disabled (no drives)
amdpm0 at pci0 dev 7 function 3 "AMD 8111 Power" rev 0x05
ppb1 at pci0 dev 10 function 0 "AMD 8131 PCIX" rev 0x12
pci2 at ppb1 bus 2
bge0 at pci2 dev 3 function 0 "Broadcom BCM5702X" rev 0x02, BCM5702/5703 A2 (0x1002): apic 3 int 3, address 00:50:45:5c:07:ce
brgphy0 at bge0 phy 1: BCM5703 10/100/1000baseT PHY, rev. 2
bge1 at pci2 dev 4 function 0 "Broadcom BCM5702X" rev 0x02, BCM5702/5703 A2 (0x1002): apic 3 int 3, address 00:50:45:5c:07:cf
brgphy1 at bge1 phy 1: BCM5703 10/100/1000baseT PHY, rev. 2
"AMD 8131 PCIX IOAPIC" rev 0x01 at pci0 dev 10 function 1 not configured
ppb2 at pci0 dev 11 function 0 "AMD 8131 PCIX" rev 0x12
pci3 at ppb2 bus 3
"AMD 8131 PCIX IOAPIC" rev 0x01 at pci0 dev 11 function 1 not configured
pchb0 at pci0 dev 24 function 0 "AMD AMD64 0Fh HyperTransport" rev 0x00
pchb1 at pci0 dev 24 function 1 "AMD AMD64 0Fh Address Map" rev 0x00
pchb2 at pci0 dev 24 function 2 "AMD AMD64 0Fh DRAM Cfg" rev 0x00
kate0 at pci0 dev 24 function 3 "AMD AMD64 0Fh Misc Cfg" rev 0x00
pchb3 at pci0 dev 25 function 0 "AMD AMD64 0Fh HyperTransport" rev 0x00
pchb4 at pci0 dev 25 function 1 "AMD AMD64 0Fh Address Map" rev 0x00
pchb5 at pci0 dev 25 function 2 "AMD AMD64 0Fh DRAM Cfg" rev 0x00
kate1 at pci0 dev 25 function 3 "AMD AMD64 0Fh Misc Cfg" rev 0x00
isa0 at amdpcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com0: console
com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
lpt0 at isa0 port 0x378/4 irq 7
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
mtrr: Pentium Pro MTRR support
vscsi0 at root
scsibus0 at vscsi0: 256 targets
softraid0 at root
scsibus1 at softraid0: 256 targets
root on wd0a (126354e88d14de08.a) swap on wd0b dump on wd0b
cpu1: AMD errata 89, 97, 101 present, BIOS upgrade may be required

usbdevs:
Controller /dev/usb0:
addr 1: full speed, self powered, config 1, OHCI root hub(0x0000), AMD(0x1022), rev 1.00
 port 1 powered
 port 2 powered
 port 3 powered
Controller /dev/usb1:
addr 1: full speed, self powered, config 1, OHCI root hub(0x0000), AMD(0x1022), rev 1.00
 port 1 powered
 port 2 powered
 port 3 powered



--
Paul H. Hargrove                          [hidden email]
Future Technologies Group
Computer and Data Sciences Department     Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900

Re: unkillable process

Philip Guenther-5
On Fri, 11 Jan 2013, Paul Hargrove wrote:
> >Synopsis:      Unkillable pthreaded process when sig handler calls raise()
...
> The code (below) runs the following steps
> 1) Register a handler for SIGSEGV
> 2) Spawn a pthread, which just sleep(s)
> 3) Generate SIGSEGV
> 4) The SEGV handler calls signal(SIGSEGV, SIG_DFL)
> 5) The SEGV handler calls raise(SIGSEGV)
>
> The result is a hung process which cannot be killed, even with SIGKILL.

Already fixed in -current.  See my commits on 2012/10/17 and 2012/07/11.


Philip Guenther

Re: unkillable process

Paul Hargrove
Thanks for the good news, Philip.

Before posting the bug report I had looked in anonymous CVS at kern_exit.c
(where the WCHAN string occurs) and saw no changes since 5.2 that seemed to
correspond to my problem.  However, with no prior experience with BSD
kernel internals, I didn't attempt to look farther afield.

I've never built an OpenBSD kernel from source before, but now I have a
reason to.  Having done FreeBSD kernel builds, however, I have no fears.

Thanks,
-Paul