kernel/5036: sparc64 nfs server panics somewhat randomly with "mem address not aligned"

Nicholas Marriott
>Number:         5036
>Category:       kernel
>Synopsis:       sparc64 nfs server panics somewhat randomly with "mem address not aligned"
>Confidential:   yes
>Severity:       serious
>Priority:       medium
>Responsible:    bugs
>State:          open
>Quarter:        
>Keywords:      
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Sat Feb 25 21:50:01 GMT 2006
>Closed-Date:
>Last-Modified:
>Originator:     Nicholas Marriott <[hidden email]>
>Release:        3.9
>Organization:
net
>Environment:
        System      : OpenBSD 3.9
        Architecture: OpenBSD.sparc64
        Machine     : sparc64
>Description:
Ultra 10 nfs server exporting a few directories to an i386 nfs client. Both are running -current installed from ftp.openbsd.org today, but I've also tried 3.8 and an older -current on the server, and a different i386 client running 3.7, and had the same problem.

Everything initially works fine, but after some time the server panics when doing something on an nfs-mounted fs. The panic message is always the same and the panic is always in what looks like nfs code, although the exact point seems to vary. I can always trigger the panic immediately by untarring something large over nfs, for example by running "make extract" in /usr/ports/devel/jdk/1.3-linux from the client when /usr/ports is on nfs. Doing similar things locally on the server works fine, and neither it nor the client exhibits any other problems.

I can supply any further information, do tests, etc on request, at least for the next week.
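
For background on the panic message: sparc64 traps on loads and stores that are not naturally aligned, whereas i386 tolerates them, and NFS requests are XDR-encoded as big-endian 32-bit words. This report does not identify the faulting code, so the following is only a generic sketch of an alignment-safe read from a byte buffer, not the kernel's nfsrv_dorec code. The names is_aligned32 and load_be32 are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns nonzero if p sits on a 4-byte boundary,
 * i.e. a 32-bit load through p would be naturally aligned.
 */
static int
is_aligned32(const void *p)
{
    return ((uintptr_t)p % sizeof(uint32_t)) == 0;
}

/*
 * Hypothetical helper: read a big-endian 32-bit value starting at any
 * byte offset. The value is assembled one byte at a time, so no 32-bit
 * load is ever issued on a possibly misaligned address.
 */
static uint32_t
load_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
        ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Casting an odd address to a uint32_t pointer and dereferencing it would typically appear to work on i386 but trap on sparc64. A byte-wise read such as load_be32(buf + 1) behaves the same on both.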
------------------
Panic message, trace and ps:

trap type 0x34: pc=11be180 npc=11be184 pstate=820006<PRIV,IE>
panic: mem address not aligned
kdb breakpoint at 132d680
Stopped at Debugger+0x4: nop
RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC!
DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION!
ddb> trace
trap(e1ab620, 34, 11be180, 820006, 7, 1f) at trap+0x1dc
slowtrap(7439e46, 0, 744aa4c, 1, 139b1b0, 9) at slowtrap+0x17c
nfsrv_dorec(21a8000, 2183800, 7457c90, e1aba70, 8, 21a8390) at nfsrv_dorec+0x94
nfssvc_nfsd(0, 725f90, 7457c90, e1a8000, e1abb70, 725dcc0) at nfssvc_nfsd+0x3f0
sys_nfssvc(0, e1abdd0, e1abdc0, 3, 1, 180ded8) at sys_nfssvc+0x2ac
syscall(e1abed0, 9b, 101384, 101388, 0, 0) at syscall+0x280
softtrap(4, 725f90, 18, 31e7f0, 0, 71e000) at softtrap+0x184
ddb> ps
   PID   PPID   PGRP    UID  S       FLAGS  WAIT       COMMAND        
   317  15061  15061      0  3        0x86  netio      tcpdump        
 15061  32397  15061     76  3      0x4186  bpf        tcpdump        
  5731  22136  22136    518  3      0x4184  kqread     imap-login      
 23728  22136  22136    518  3      0x4184  kqread     imap-login      
 30531  22136  22136    518  3      0x4184  kqread     imap-login      
 13664  22136  22136      0  3      0x4084  kqread     dovecot-auth    
 32397      1  32397      0  3      0x4086  pause      ksh            
 21120      1  21120      0  3        0x84  select     cron            
 23237  20474  20474      0  3       0x185  pause      smbd            
 22136      1  22136      0  3        0x84  kqread     dovecot        
 20474      1  20474      0  3       0x185  select     smbd            
 22914      1  22914      0  3        0x85  select     nmbd            
 27665      1  27665      0  3        0x84  poll       logfmon        
  9012      1  30017      0  3        0x86  nanosleep  perl            
 24514      1  24514      0  3     0x40184  select     sendmail        
 13207      1  13207      0  3        0x84  select     sshd            
  6963      1   6963      0  3       0x184  select     inetd          
  9441      1   9441     71  3       0x184  kqread     ftp-proxy      
  5027      1   5027     77  3       0x184  poll       dhcpd          
 13234   9425   9425     83  3       0x184  poll       ntpd            
  9425      1   9425      0  3        0x84  poll       ntpd            
 11200      1  11200      0  3        0x84  poll       rpc.lockd      
   616  28982  28982      0  3        0x84  nfsd       nfsd            
 23386  28982  28982      0  3        0x84  nfsd       nfsd            
 16210  28982  28982      0  3        0x84  nfsd       nfsd            
 12516  28982  28982      0  3        0x84  nfsd       nfsd            
 11810  28982  28982      0  3        0x84  nfsd       nfsd            
* 2035  28982  28982      0  7         0x4             nfsd            
 28982      1  28982      0  3        0x84  netcon     nfsd            
 29005      1  29005      0  3        0x84  select     mountd          
  1975      1   1975      0  3        0x84  poll       rpc.yppasswdd  
  9464      1   9464      0  3        0x84  select     ypbind          
 28849      1  30017      0  3        0x84  poll       ypserv          
  5187      1   5187     28  3       0x184  poll       portmap        
  1698  14555  14555     70  3       0x184  select     named          
 14555      1  14555      0  3       0x184  netio      named          
 18509  21537  21537     74  2       0x584             pflogd          
 21537      1  21537      0  3        0x84  netio      pflogd          
 27545  11733  11733     73  2       0x184             syslogd        
 11733      1  11733      0  3        0x84  netio      syslogd        
 16543      1  16543     77  3       0x184  poll       dhclient        
 12595      1  30017      0  3        0x86  poll       dhclient        
     9      0      0      0  3    0x100204  crypto_wa  crypto          
     8      0      0      0  3    0x100204  aiodoned   aiodoned        
     7      0      0      0  3    0x100204  syncer     update          
     6      0      0      0  3    0x100204  cleaner    cleaner        
     5      0      0      0  3    0x100204  reaper     reaper          
     4      0      0      0  3    0x100204  pgdaemon   pagedaemon      
     3      0      0      0  3    0x100204  pftm       pfpurge        
     2      0      0      0  3    0x100204  kmalloc    kmthread        
     1      0      1      0  3      0x4084  wait       init            
     0     -1      0      0  3     0x80204  scheduler  swapper        
------------------
Server dmesg:

OpenBSD 3.9-beta (GENERIC) #754: Fri Feb 24 20:19:09 MST 2006
    [hidden email]:/usr/src/sys/arch/sparc64/compile/GENERIC
total memory = 268435456
avail memory = 234782720
using 1638 buffers containing 13418496 bytes of memory
bootpath: /pci@1f,0/pci@1,1/ide@3,0/disk@0,0
mainbus0 (root): Sun Ultra 5/10 UPA/PCI (UltraSPARC-IIi 333MHz)
cpu0 at mainbus0: SUNW,UltraSPARC-IIi @ 333 MHz, version 0 FPU
cpu0: physical 32K instruction (32 b/l), 16K data (32 b/l), 2048K external (64 b/l)
psycho0 at mainbus0 addr 0xfffc4000
SUNW,sabre: impl 0, version 0: ign 7c0 bus range 0 to 2; PCI bus 0
DVMA map: c0000000 to e0000000
IOTDB: 1362000 to 13e2000
pci0 at psycho0
ppb0 at pci0 dev 1 function 1 "Sun Simba PCI-PCI" rev 0x13
pci1 at ppb0 bus 1
ebus0 at pci1 dev 1 function 0 "Sun PCIO Ebus2" rev 0x01
auxio0 at ebus0 addr 726000-726003, 728000-728003, 72a000-72a003, 72c000-72c003, 72f000-72f003
power at ebus0 addr 724000-724003 ipl 37 not configured
SUNW,pll at ebus0 addr 504000-504002 not configured
sab0 at ebus0 addr 400000-40007f ipl 43: rev 3.2
sabtty0 at sab0 port 0
sabtty1 at sab0 port 1
comkbd0 at ebus0 addr 3083f8-3083ff ipl 41: layout 46
wskbd0 at comkbd0: console keyboard
com0 at ebus0 addr 3062f8-3062ff ipl 42: mouse: ns16550a, 16 byte fifo
lpt0 at ebus0 addr 3043bc-3043cb, 30015c-30015d, 700000-70000f ipl 34: polled
fdthree at ebus0 addr 3023f0-3023f7, 706000-70600f, 720000-720003 ipl 39 not configured
clock1 at ebus0 addr 0-1fff: mk48t59: hostid 80a219fa
flashprom at ebus0 addr 0-fffff not configured
audioce0 at ebus0 addr 200000-2000ff, 702000-70200f, 704000-70400f, 722000-722003 ipl 35 ipl 36: nvaddrs 0
audio0 at audioce0
hme0 at pci1 dev 1 function 1 "Sun HME" rev 0x01: ivec 3021, address 08:00:20:a2:19:fa
nsphy0 at hme0 phy 1: DP83840 10/100 PHY, rev. 1
vgafb0 at pci1 dev 2 function 0 "ATI Mach64 GP" rev 0x5c
wsdisplay0 at vgafb0: console (std, sun emulation), using wskbd0
pciide0 at pci1 dev 3 function 0 "CMD Technology PCI0646" rev 0x03: DMA, channel 0 configured to native-PCI, channel 1 configured to native-PCI
pciide0: using ivec 1820 for native-PCI interrupt
wd0 at pciide0 channel 0 drive 0: <Maxtor 6Y080L0>
wd0: 16-sector PIO, LBA, 78167MB, 160086528 sectors
wd0(pciide0:0:0): using PIO mode 4, DMA mode 2
atapiscsi0 at pciide0 channel 1 drive 0
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: <LG, CD-ROM CRD-8322B, 1.03> SCSI0 5/cdrom removable
cd0(pciide0:1:0): using PIO mode 4, DMA mode 2
ppb1 at pci0 dev 1 function 0 "Sun Simba PCI-PCI" rev 0x13
pci2 at ppb1 bus 2
rl0 at pci2 dev 1 function 0 "Realtek 8139" rev 0x10: ivec 10, address 00:a1:b0:00:80:50
rlphy0 at rl0 phy 0: RTL internal PHY
pciide1 at pci2 dev 2 function 0 "CMD Technology PCI0680" rev 0x02
pciide1: bus-master DMA support present
pciide1: channel 0 configured to native-PCI mode
pciide1: using ivec 14 for native-PCI interrupt
wd1 at pciide1 channel 0 drive 0: <Maxtor 6L200P0>
wd1: 16-sector PIO, LBA48, 194481MB, 398297088 sectors
wd1(pciide1:0:0): using PIO mode 4, Ultra-DMA mode 2
pciide1: channel 1 configured to native-PCI mode
pcons at mainbus0 not configured
No counter-timer -- using %tick at 333MHz as system clock.
root on wd0a
rootdev=0xc00 rrootdev=0x1a00 rawdev=0x1a02
WARNING: / was not properly unmounted
------------------
Server /exports:

# $OpenBSD: exports,v 1.2 2002/05/31 08:15:44 pjanzen Exp $

/home2 -maproot=root -network 192.168.0.0 -mask 255.255.255.0
/data -maproot=root -network 192.168.0.0 -mask 255.255.255.0
/export/src -maproot=root -alldirs -network 192.168.0.0 -mask 255.255.255.0
/export/ports -maproot=root -alldirs -network 192.168.0.0 -mask 255.255.255.0
/export/tmp -maproot=root -network 192.168.0.0 -mask 255.255.255.0
------------------
Server fstab:

/dev/wd0a / ffs rw,softdep 1 1
/dev/wd0e /tmp ffs rw,softdep,nodev,nosuid 1 2
/dev/wd0f /usr ffs rw,softdep,nodev 1 2
/dev/wd0d /var ffs rw,softdep,nodev,nosuid 1 2
/dev/wd0g /backup ffs rw,softdep,nodev,nosuid 1 2

/dev/wd1d /home/selah ffs rw,softdep,nodev,nosuid 1 2
/dev/wd1e /home2 ffs rw,softdep,nodev,nosuid 1 2
/dev/wd1f /home/backup ffs rw,softdep,nodev,nosuid 1 2
/dev/wd1g /export/src ffs rw,softdep,nodev,nosuid 1 2
/dev/wd1h /home/ftp ffs rw,softdep,nodev,nosuid 1 2
/dev/wd1i /data ffs rw,softdep,nodev,nosuid 1 2
/dev/wd1j /export/ports ffs rw,softdep,nodev 1 2
/dev/wd1k /home/cvs ffs rw,softdep,nodev,nosuid 1 2
/dev/wd1l /export/tmp ffs rw,softdep,nodev,nosuid 1 2
------------------
Client dmesg:

OpenBSD 3.9-beta (GENERIC) #607: Fri Feb 24 15:30:22 MST 2006
    [hidden email]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: AMD Athlon(tm)  ("AuthenticAMD" 686-class, 256KB L2 cache) 1.25 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE
cpu0: AMD Powernow: TS
real mem  = 402169856 (392744K)
avail mem = 359612416 (351184K)
using 4278 buffers containing 20209664 bytes (19736K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(00) BIOS, date 12/10/02, BIOS32 rev. 0 @ 0xfdac0
pcibios0 at bios0: rev 2.1 @ 0xf0000/0x10000
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xf8030/176 (9 entries)
pcibios0: PCI Interrupt Router at 000:17:0 ("VIA VT8235 ISA" rev 0x00)
pcibios0: PCI bus #1 is the last bus
bios0: ROM list: 0xc0000/0x10000
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 "VIA VT8366 PCI" rev 0x00
ppb0 at pci0 dev 1 function 0 "VIA VT8366 AGP" rev 0x00
pci1 at ppb0 bus 1
vga1 at pci1 dev 0 function 0 "3DFX Interactive Voodoo3" rev 0x01
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
vr0 at pci0 dev 11 function 0 "VIA VT6105 RhineIII" rev 0x86: irq 10, address 00:0d:88:f5:72:00
ukphy0 at vr0 phy 1: Generic IEEE 802.3u media interface, rev. 4: OUI 0x004063, model 0x0034
eap0 at pci0 dev 12 function 0 "Ensoniq AudioPCI97" rev 0x08: irq 3
ac97: codec id 0x43525913 (Cirrus Logic CS4297A rev 3)
ac97: codec features headphone, 20 bit DAC, 18 bit ADC, Crystal Semi 3D
audio0 at eap0
midi0 at eap0: <AudioPCI MIDI UART>
"SiS 300/305/630 VGA" rev 0x90 at pci0 dev 13 function 0 not configured
uhci0 at pci0 dev 16 function 0 "VIA VT83C572 USB" rev 0x80: irq 3
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1 at pci0 dev 16 function 1 "VIA VT83C572 USB" rev 0x80: irq 11
usb1 at uhci1: USB revision 1.0
uhub1 at usb1
uhub1: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2 at pci0 dev 16 function 2 "VIA VT83C572 USB" rev 0x80: irq 10
usb2 at uhci2: USB revision 1.0
uhub2 at usb2
uhub2: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
ehci0 at pci0 dev 16 function 3 "VIA VT6202 USB" rev 0x82: irq 10
usb3 at ehci0: USB revision 2.0
uhub3 at usb3
uhub3: VIA EHCI root hub, rev 2.00/1.00, addr 1
uhub3: 6 ports with 6 removable, self powered
viapm0 at pci0 dev 17 function 0 "VIA VT8235 ISA" rev 0x00
iic0 at viapm0
pciide0 at pci0 dev 17 function 1 "VIA VT82C571 IDE" rev 0x06: ATA133, channel 0 configured to compatibility, channel 1 configured to compatibility
wd0 at pciide0 channel 0 drive 0: <QUANTUM FIREBALL ST4.3A>
wd0: 16-sector PIO, LBA, 4110MB, 8418816 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2
atapiscsi0 at pciide0 channel 1 drive 0
scsibus0 at atapiscsi0: 2 targets
cd0 at scsibus0 targ 0 lun 0: <PIONEER, DVD-RW DVR-105, 1.33> SCSI0 5/cdrom removable
atapiscsi1 at pciide0 channel 1 drive 1
scsibus1 at atapiscsi1: 2 targets
cd1 at scsibus1 targ 0 lun 0: <SONY, CD-RW CRX225E, QYB2> SCSI0 5/cdrom removable
cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2
cd1(pciide0:1:1): using PIO mode 4, Ultra-DMA mode 2
isa0 at mainbus0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
midi1 at pcppi0: <PC speaker>
spkr0 at pcppi0
lpt0 at isa0 port 0x378/4 irq 7
lm0 at isa0 port 0x290/8: W83697HF
npx0 at isa0 port 0xf0/16: using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
fd0 at fdc0 drive 0: 1.44MB 80 cyl, 2 head, 18 sec
biomask ff6d netmask ff6d ttymask ffef
pctr: user-level cycle counter enabled
mtrr: Pentium Pro MTRR support
dkcsum: wd0 matches BIOS drive 0x80
root on wd0a
rootdev=0x0 rrootdev=0x300 rawdev=0x302
------------------
Client fstab:

/dev/wd0a / ffs rw,softdep 1 1
/dev/wd0d /tmp ffs rw,softdep,nodev,nosuid 1 2
/dev/wd0f /usr ffs rw,softdep,nodev 1 2
/dev/wd0e /var ffs rw,softdep,nodev,nosuid 1 2

/dev/cd0a /mnt/dvdrw cd9660 ro,noauto 0 0
/dev/cd1a /mnt/cdrw cd9660 ro,noauto 0 0
/dev/sd0c /mnt/mp3 msdos rw,noauto,long 0 0

nfs:/home2 /home2 nfs rw,nodev,nosuid,tcp,soft,intr 0 0
nfs:/data /data nfs rw,nodev,nosuid,tcp,soft,intr 0 0
nfs:/export/src/openbsd/current /usr/src nfs rw,nodev,nosuid,tcp,soft,intr 0 0
nfs:/export/ports/openbsd/current /usr/ports nfs rw,nodev,tcp,soft,intr 0 0

>How-To-Repeat:


>Fix:


>Release-Note:
>Audit-Trail:
>Unformatted:

Re: kernel/5036: sparc64 nfs server panics somewhat randomly with "mem address not aligned"

Pedro Martelletto
The following reply was made to PR kernel/5036; it has been noted by GNATS.

From: Pedro Martelletto <[hidden email]>
To: Nicholas Marriott <[hidden email]>
Cc: [hidden email]
Subject: Re: kernel/5036: sparc64 nfs server panics somewhat randomly with "mem address not aligned"
Date: Wed, 1 Mar 2006 10:47:11 -0300

 Is the NFS traffic on the server going through rl(4) or hme(4)?
 
 -p.

Re: kernel/5036: sparc64 nfs server panics somewhat randomly with "mem address not aligned"

Nicholas Marriott
The following reply was made to PR kernel/5036; it has been noted by GNATS.

From: Nicholas Marriott <[hidden email]>
To: Pedro Martelletto <[hidden email]>
Cc: [hidden email]
Subject: Re: kernel/5036: sparc64 nfs server panics somewhat randomly with "mem address not aligned"
Date: Wed, 1 Mar 2006 14:28:30 +0000 (GMT)

 > Is the NFS traffic on the server going through rl(4) or hme(4)?
 
 I should have said, with that panic it is going through the rl.
 
 -- Nicholas.

Re: kernel/5036: sparc64 nfs server panics somewhat randomly with "mem address not aligned"

Nicholas Marriott
The following reply was made to PR kernel/5036; it has been noted by GNATS.

From: Nicholas Marriott <[hidden email]>
To: Pedro Martelletto <[hidden email]>
Cc: [hidden email]
Subject: Re: kernel/5036: sparc64 nfs server panics somewhat randomly with "mem address not aligned"
Date: Wed, 1 Mar 2006 14:23:05 +0000 (GMT)

 > Is the NFS traffic on the server going through rl(4) or hme(4)?
 
 I tried both and had the same problem.
 
 -- Nicholas.