System freeze after zzz

System freeze after zzz

Donald Allen
I'm running current (as of the 11/14 snapshot) on a micro-itx box I
built around an Intel Atom d510mo motherboard. When I try to wake the
system after zzz, my X session comes alive and I can change workspaces
with the window manager (xmonad; no desktop system), but firefox is hung
(won't repaint its window), can't run top, can't run shutdown from
another console. I can ping the system from another, but can't ssh to
it. Guesswork on my part: the kernel is giant-locked, so no system calls?

The only way I've found to get out of this is to power-cycle. It's
happened a few times now, so I think I can reproduce it and am certainly
willing to try to help debug.

Dmesg:


OpenBSD 5.4-current (GENERIC.MP) #147: Tue Nov 12 16:37:15 MST 2013
     [hidden email]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
RTC BIOS diagnostic error 80<clock_battery>
real mem = 4260089856 (4062MB)
avail mem = 4138549248 (3946MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.5 @ 0xe43c0 (27 entries)
bios0: vendor Intel Corp. version "MOPNV10J.86A.0311.2010.0802.2346" date 08/02/2010
bios0: Intel Corporation D510MO
acpi0 at bios0: rev 2
acpi0: sleep states S0 S1 S3 S4 S5
acpi0: tables DSDT FACP APIC MCFG HPET SSDT
acpi0: wakeup devices SLPB(S4) PS2M(S4) PS2K(S4) UAR1(S4) UAR2(S4) P32_(S4) ILAN(S4) PEX0(S4) PEX1(S4) PEX2(S4) PEX3(S4) UHC1(S3) UHC2(S3) UHC3(S3) UHC4(S3) EHCI(S3) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Atom(TM) CPU D510 @ 1.66GHz, 1666.96 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE,NXE,LONG,LAHF,PERF
cpu0: 512KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
cpu0: apic clock running at 166MHz
cpu0: mwait min=64, max=64, C-substates=0.1.0.0.0, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Atom(TM) CPU D510 @ 1.66GHz, 1666.69 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE,NXE,LONG,LAHF,PERF
cpu1: 512KB 64b/line 8-way L2 cache
cpu1: smt 1, core 0, package 0
cpu2 at mainbus0: apid 2 (application processor)
cpu2: Intel(R) Atom(TM) CPU D510 @ 1.66GHz, 1666.69 MHz
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE,NXE,LONG,LAHF,PERF
cpu2: 512KB 64b/line 8-way L2 cache
cpu2: smt 0, core 1, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Atom(TM) CPU D510 @ 1.66GHz, 1666.69 MHz
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,DTES64,MWAIT,DS-CPL,TM2,SSSE3,CX16,xTPR,PDCM,MOVBE,NXE,LONG,LAHF,PERF
cpu3: 512KB 64b/line 8-way L2 cache
cpu3: smt 1, core 1, package 0
ioapic0 at mainbus0: apid 8 pa 0xfec00000, version 20, 24 pins
ioapic0: misconfigured as apic 0, remapped to apid 8
acpimcfg0 at acpi0 addr 0xf8000000, bus 0-63
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 5 (P32_)
acpiprt1 at acpi0: bus 0 (PCI0)
acpiprt2 at acpi0: bus 1 (PEX0)
acpiprt3 at acpi0: bus 2 (PEX1)
acpiprt4 at acpi0: bus 3 (PEX2)
acpiprt5 at acpi0: bus 4 (PEX3)
acpicpu0 at acpi0: C1
acpicpu1 at acpi0: C1
acpicpu2 at acpi0: C1
acpicpu3 at acpi0: C1
acpibtn0 at acpi0: SLPB
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel Pineview DMI" rev 0x02
vga1 at pci0 dev 2 function 0 "Intel Pineview Video" rev 0x02
intagp0 at vga1
agp0 at intagp0: aperture at 0xe0000000, size 0x10000000
inteldrm0 at vga1
drm0 at inteldrm0
intel_overlay_map_regs partial stub
inteldrm0: 1280x800
wsdisplay0 at vga1 mux 1: console (std, vt100 emulation)
wsdisplay0: screen 1-5 added (std, vt100 emulation)
azalia0 at pci0 dev 27 function 0 "Intel 82801GB HD Audio" rev 0x01: msi
azalia0: codecs: Realtek ALC662
audio0 at azalia0
ppb0 at pci0 dev 28 function 0 "Intel 82801GB PCIE" rev 0x01: msi
pci1 at ppb0 bus 1
re0 at pci1 dev 0 function 0 "Realtek 8168" rev 0x03: RTL8168D/8111D (0x2800), apic 8 int 16, address 00:27:0e:0e:ff:a9
rgephy0 at re0 phy 7: RTL8169S/8110S PHY, rev. 2
ppb1 at pci0 dev 28 function 1 "Intel 82801GB PCIE" rev 0x01: msi
pci2 at ppb1 bus 2
ppb2 at pci0 dev 28 function 2 "Intel 82801GB PCIE" rev 0x01: msi
pci3 at ppb2 bus 3
ppb3 at pci0 dev 28 function 3 "Intel 82801GB PCIE" rev 0x01: msi
pci4 at ppb3 bus 4
uhci0 at pci0 dev 29 function 0 "Intel 82801GB USB" rev 0x01: apic 8 int 23
uhci1 at pci0 dev 29 function 1 "Intel 82801GB USB" rev 0x01: apic 8 int 19
uhci2 at pci0 dev 29 function 2 "Intel 82801GB USB" rev 0x01: apic 8 int 18
uhci3 at pci0 dev 29 function 3 "Intel 82801GB USB" rev 0x01: apic 8 int 16
ehci0 at pci0 dev 29 function 7 "Intel 82801GB USB" rev 0x01: apic 8 int 23
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
ppb4 at pci0 dev 30 function 0 "Intel 82801BAM Hub-to-PCI" rev 0xe1
pci5 at ppb4 bus 5
pcib0 at pci0 dev 31 function 0 "Intel NM10 LPC" rev 0x01
ahci0 at pci0 dev 31 function 2 "Intel 82801GR AHCI" rev 0x01: msi, AHCI 1.1
scsibus0 at ahci0: 32 targets
cd0 at scsibus0 targ 0 lun 0: <ASUS, DRW-24B1ST c, 1.05> ATAPI 5/cdrom removable
sd0 at scsibus0 targ 1 lun 0: <ATA, WDC WD3200AAJS-5, 01.0> SCSI3 0/direct fixed naa.50014ee157ed27dc
sd0: 305245MB, 512 bytes/sector, 625142448 sectors
ichiic0 at pci0 dev 31 function 3 "Intel 82801GB SMBus" rev 0x01: apic 8 int 19
iic0 at ichiic0
spdmem0 at iic0 addr 0x50: 2GB DDR2 SDRAM non-parity PC2-6400CL5
spdmem1 at iic0 addr 0x51: 2GB DDR2 SDRAM non-parity PC2-6400CL5
usb1 at uhci0: USB revision 1.0
uhub1 at usb1 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb2 at uhci1: USB revision 1.0
uhub2 at usb2 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb3 at uhci2: USB revision 1.0
uhub3 at usb3 "Intel UHCI root hub" rev 1.00/1.00 addr 1
usb4 at uhci3: USB revision 1.0
uhub4 at usb4 "Intel UHCI root hub" rev 1.00/1.00 addr 1
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
lpt0 at isa0 port 0x378/4 irq 7
wbsio0 at isa0 port 0x4e/2: W83627THF rev 0x84
lm1 at wbsio0 port 0x290/8: W83627THF
mtrr: Pentium Pro MTRR support, 7 var ranges, 88 fixed ranges
uhub5 at uhub0 port 4 "NEC hub" rev 2.00/1.00 addr 2
uhub6 at uhub5 port 2 "D-Link product 0xf103" rev 2.00/1.00 addr 3
uhub7 at uhub5 port 3 "ATEN International product 0x7000" rev 1.10/1.00 addr 4
uhidev0 at uhub7 port 1 configuration 1 interface 0 "ATEN SW2-RA V1.0.072" rev 1.10/1.00 addr 5
uhidev0: iclass 3/1
ukbd0 at uhidev0: 8 variable keys, 6 key codes, country code 33
wskbd1 at ukbd0 mux 1
wskbd1: connecting to wsdisplay0
uhidev1 at uhub7 port 1 configuration 1 interface 1 "ATEN SW2-RA V1.0.072" rev 1.10/1.00 addr 5
uhidev1: iclass 3/1
ums0 at uhidev1: 5 buttons, Z dir
wsmouse0 at ums0 mux 0
vscsi0 at root
scsibus1 at vscsi0: 256 targets
softraid0 at root
scsibus2 at softraid0: 256 targets
root on sd0a (52b32a3ca15accec.a) swap on sd0b dump on sd0b


Re: System freeze after zzz

Mike Larkin
On Sun, Nov 24, 2013 at 10:42:45AM -0500, Don Allen wrote:

> I'm running current (as of the 11/14 snapshot) on a micro-itx box I
> built around an Intel Atom d510mo motherboard. When I try to wake
> the system after zzz, my X session comes alive and I can change
> workspaces with the window manager (xmonad; no desktop system), but
> firefox is hung (won't repaint its window), can't run top, can't run
> shutdown from another console. I can ping the system from another,
> but can't ssh to it. Guesswork on my part: the kernel is
> giant-locked, so no system calls?
>
> The only way I've found to get out of this is to power-cycle. It's
> happened a few times now, so I think I can reproduce it and am
> certainly willing to try to help debug.

Can you try to see if time is advancing after resume?

E.g., a sequence of 'date' commands from an open xterm?
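
One way to run that check, as a minimal sketch: the loop below prints the time once a second, so a stalled clock shows up as repeated timestamps. The function name watch_clock is made up for illustration, and it assumes only 'date' and 'sleep' from the base system.

```shell
# Print the current time once per second, "count" times in a row.
# If the clock is stuck after resume, every line shows the same time.
watch_clock() {
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        date
        sleep 1
        i=$((i + 1))
    done
}
```

Running 'watch_clock 5' in the xterm prints five timestamps about a second apart.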

-ml




Re: System freeze after zzz

Donald Allen
On Tue, Nov 26, 2013 at 2:15 AM, Mike Larkin <[hidden email]> wrote:

> On Sun, Nov 24, 2013 at 10:42:45AM -0500, Don Allen wrote:
>> I'm running current (as of the 11/14 snapshot) on a micro-itx box I
>> built around an Intel Atom d510mo motherboard. When I try to wake
>> the system after zzz, my X session comes alive and I can change
>> workspaces with the window manager (xmonad; no desktop system), but
>> firefox is hung (won't repaint its window), can't run top, can't run
>> shutdown from another console. I can ping the system from another,
>> but can't ssh to it. Guesswork on my part: the kernel is
>> giant-locked, so no system calls?
>>
>> The only way I've found to get out of this is to power-cycle. It's
>> happened a few times now, so I think I can reproduce it and am
>> certainly willing to try to help debug.
>
> Can you try to see if time is advancing after resume?
>
> E.g., a sequence of 'date' commands from an open xterm?
>

What I'm talking about here occurs after bringing up X with 'startx'
and then suspending with 'zzz'. The system is now up-to-date as of two
days ago (kernel, userland, xenocara).

After waking up:

Switching consoles with ctrl-alt F2, I was able to run the date
command repeatedly, and the time is advancing. 'ls' also worked
normally, but 'ls -l' hung. 'ps aux' hangs.  'shutdown' and 'reboot'
both hang.
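
A rough way to repeat this kind of test without losing the console to each hung command is to start the command in the background and look at it again after a short delay. This is only a sketch: probe is a hypothetical helper name, the delay is arbitrary, and a process stuck waiting on the disk will stay behind after the check.

```shell
# Run a command in the background and report whether it finished
# within the given number of seconds.
# Usage: probe SECONDS COMMAND [ARGS...]
probe() {
    secs=$1
    shift
    "$@" >/dev/null 2>&1 &
    pid=$!
    sleep "$secs"
    # kill -0 sends no signal; it only tests whether the process still exists.
    if kill -0 "$pid" 2>/dev/null; then
        echo "HUNG: $*"
    else
        echo "ok: $*"
    fi
}
```

For example, 'probe 2 ls -l /' reports whether that listing returned within two seconds.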

Switching consoles with ctrl-alt F1, I noticed the following chatter:
ahci0: device on port 0 didn't come ready TFD: 0x80<BSY>
ahci0: Stopping the port, soft reset slot 31 was still active
ahci0: unable to communicate with device on port 1

I don't know if the above is significant, but it isn't there on that
first console if I don't suspend and it struck me as suspicious.
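
To pull just those lines out of the kernel message buffer for a report, a filter along these lines should work. The name ahci_lines is illustrative, and the pattern assumes the controller attaches as ahci0, as in the dmesg above.

```shell
# Read kernel messages on stdin and keep only the lines that
# begin with "ahci0" (the attach line and any later complaints).
ahci_lines() {
    grep '^ahci0'
}
```

It would be used as 'dmesg | ahci_lines'.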

I also noticed that the disk-busy light on the front panel is on solid
after attempting to resume. In normal operation, when there is disk
activity and the light is on, I can hear the disk, presumably the
heads seeking. In this situation, I don't hear that. I realize that
doesn't mean there isn't disk activity, just not long enough head
excursions to be audible.

I have all filesystems mounted with softdep enabled, and after
power-cycling to reboot, there's usually a lot of chatter from fsck
about repairing things on various filesystems. One that usually turns
up needing repair is sd0d, which is /tmp. If the fsck output is logged
somewhere and it would be helpful, I can send it. I tried to find it
with

cd /var
find . -exec fgrep SALVAGED {} \; -print

which turned up nothing. Or I can try to photograph the screen as it's
happening.
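
A variant of the search above that prints only the names of matching files, and skips anything that is not a regular file, might look like the following. It is a sketch: find_salvaged is a made-up name, and nothing here implies that boot-time fsck output is logged under /var at all.

```shell
# List regular files under directory $1 that contain the literal
# string SALVAGED. Errors such as unreadable files are discarded.
find_salvaged() {
    find "$1" -type f -exec grep -l -F SALVAGED {} + 2>/dev/null
}
```

It would be called as 'find_salvaged /var'.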

I also tried suspending with 'zzz' right after booting and logging in,
no 'startx'. After attempting to resume, I got a stream of messages on
the first console, all the same:

ehci_idone: ex=0xffff8000001f3c00 is done!

The disk-busy light was not on. I could not switch consoles to try
commands and could not type at the console that was spewing these
messages. As with the above, I had to power-cycle to recover.

/Don




Re: System freeze after zzz

Mike Larkin
On Tue, Nov 26, 2013 at 07:44:32AM -0500, Donald Allen wrote:
> On Tue, Nov 26, 2013 at 2:15 AM, Mike Larkin <[hidden email]> wrote:
> > On Sun, Nov 24, 2013 at 10:42:45AM -0500, Don Allen wrote:

..snip..

>
> After waking up:
>
> Switching consoles with ctrl-alt F2, I was able to run the date
> command repeatedly, and the time is advancing. 'ls' also worked
> normally, but 'ls -l' hung. 'ps aux' hangs.  'shutdown' and 'reboot'
> both hang.
>
> Switching consoles with ctrl-alt F1, I noticed the following chatter:
> ahci0: device on port 0 didn't come ready TFD: 0x80<BSY>
> ahci0: Stopping the port, soft reset slot 31 was still active
> ahci0: unable to communicate with device on port 1

That's your problem: your disk didn't come back after resume. I'm not sure
why; this is the first time I've seen that. Maybe some ahci expert
can comment further. I've frequently seen the first ahci0: line above,
but my disks always come back online after that.

>
> I don't know if the above is significant, but it isn't there on that
> first console if I don't suspend and it struck me as suspicious.
>
> I also noticed that the disk-busy light on the front panel is on solid
> after attempting to resume. In normal operation, when there is disk
> activity and the light is on, I can hear the disk, presumably the
> heads seeking. In this situation, I don't hear that. I realize that
> doesn't mean there isn't disk activity, just not long enough head
> excursions to be audible.

The disk came back partly-resumed. Who knows what state it's in.

>
> I have all filesystems mounted with softdep enabled, and after
> power-cycling to reboot, there's usually a lot of chatter from fsck
> about repairing things on various filesystems. One that usually turns
> up needing repair is sd0d, which is /tmp. If the fsck output is logged
> somewhere and it would be helpful, I can send it. I tried to find it
> with
>
> cd /var
> find . -exec fgrep SALVAGED {} \; -print
>
> which turned up nothing. Or I can try to photograph the screen as it's
> happening.

Your filesystems were shut down uncleanly because the disk didn't resume, and
that's why fsck finds so much to repair.

>
> I also tried suspending with 'zzz' right after booting and logging in,
> no 'startx'. After attempting to resume, I got a stream of messages on
> the first console, all the same:
>
> ehci_idone: ex=0xffff8000001f3c00 is done!

That's irrelevant and may even be fixed by some recent commits. It's
because we basically need to tear down the USB device tree and reconnect
it on resume. There was probably an xfer in flight when you suspended and
the device with which it was associated disappeared (temporarily) on
resume.

>
> The disk-busy light was not on. I could not switch consoles to try
> commands and could not type at the console that was spewing these
> messages. As with the above, I had to power-cycle to recover.
>
> /Don

Your problem is that your disk didn't resume. There are some efforts going
on presently to improve some of the wakeup/resume codepaths, but those
diffs aren't in the tree yet. They may or may not help.