kernel/4604: [Fwd: fxp nics + pf + bridge = panic]

kernel/4604: [Fwd: fxp nics + pf + bridge = panic]

Don Feliciano
>Number:         4604
>Category:       kernel
>Synopsis:       fxp nics + pf + bridge = panic
>Confidential:   yes
>Severity:       critical
>Priority:       high
>Responsible:    bugs
>State:          open
>Quarter:        
>Keywords:      
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   net
>Arrival-Date:   Mon Nov 07 17:10:02 GMT 2005
>Closed-Date:
>Last-Modified:
>Originator:     Charlie Root
>Release:        3.8 GENERIC#138 i386
>Organization:
net
>Environment:
        Embedded VIA Eden/C3 Platform
        Onboard Eden 533 MHz FAN FREE CPU
        Onboard 3/4 x Intel PRO/100+ LAN Interface
        Onboard 1/4 x Intel PRO/1000 GbE Gigabit Ethernet
        http://www.commell-sys.com/Product/SBC/LE-564.htm
        System      : OpenBSD 3.8
        Architecture: OpenBSD.i386
        Machine     : i386
>Description:
        I have reproduced this problem in 3.6, 3.7, and now 3.8.  The original platform was an HP Kayak Xu-800 i386 PC with Intel Pro NICs.  The new platform is a VIA Eden with Intel NICs (http://www.commell-sys.com/Product/SBC/LE-564.htm).  The problem is the same:  after initial configuration as a transparent bridge + pf, the system becomes unstable (panics) after approximately 3 days of uptime.  After the initial panic, the system crashes again within seconds, until the bridge is disabled.  Having now ruled out hardware as a possible cause, this seems to be an issue with bridging Intel NICs.  Flushing the pf rules has no effect.  The problem does not occur on FreeBSD 5.3 + pf.

dmesg:

OpenBSD 3.8 (GENERIC) #138: Sat Sep 10 15:41:37 MDT 2005
    [hidden email]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: VIA Samuel 2 ("CentaurHauls" 686-class) 533 MHz
cpu0: FPU,DE,TSC,MSR,MTRR,PGE,MMX
real mem  = 125415424 (122476K)
avail mem = 107687936 (105164K)
using 1556 buffers containing 6373376 bytes (6224K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(51) BIOS, date 11/24/03, BIOS32 rev. 0 @ 0xfb590
apm0 at bios0: Power Management spec V1.2
apm0: AC on, battery charge unknown
apm0: flags 70102 dobusy 1 doidle 1
pcibios0 at bios0: rev 2.1 @ 0xf0000/0xdef4
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfde50/160 (8 entries)
pcibios0: PCI Exclusive IRQs: 5 10 11 12
pcibios0: PCI Interrupt Router at 000:07:0 ("VIA VT82C596A ISA" rev 0x00)
pcibios0: PCI bus #1 is the last bus
bios0: ROM list: 0xc0000/0xc000 0xcc000/0x4000!
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 "VIA VT8601 PCI" rev 0x05
ppb0 at pci0 dev 1 function 0 "VIA VT82C601 AGP" rev 0x00
pci1 at ppb0 bus 1
vga1 at pci1 dev 0 function 0 "Trident CyberBlade i1" rev 0x6a
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
pcib0 at pci0 dev 7 function 0 "VIA VT82C686 ISA" rev 0x40
pciide0 at pci0 dev 7 function 1 "VIA VT82C571 IDE" rev 0x06: ATA100, channel 0 configured to compatibility, channel 1 configured to compatibility
wd0 at pciide0 channel 0 drive 0: <SanDisk SDCFH-512>
wd0: 1-sector PIO, LBA, 488MB, 1000944 sectors
wd0(pciide0:0:0): using PIO mode 4, DMA mode 2
pciide0: channel 1 disabled (no drives)
uhci0 at pci0 dev 7 function 2 "VIA VT83C572 USB" rev 0x1a: irq 10
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1 at pci0 dev 7 function 3 "VIA VT83C572 USB" rev 0x1a: irq 10
usb1 at uhci1: USB revision 1.0
uhub1 at usb1
uhub1: VIA UHCI root hub, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
viaenv0 at pci0 dev 7 function 4 "VIA VT82C686 SMBus" rev 0x40
em0 at pci0 dev 16 function 0 "Intel PRO/1000MT (82540EM)" rev 0x02: irq 5, address: 00:03:1d:02:18:ab
fxp0 at pci0 dev 17 function 0 "Intel 82557" rev 0x10, i82551: irq 12, address 00:03:1d:02:18:ac
inphy0 at fxp0 phy 1: i82555 10/100 PHY, rev. 4
fxp1 at pci0 dev 18 function 0 "Intel 82557" rev 0x10, i82551: irq 10, address 00:03:1d:02:18:ad
inphy1 at fxp1 phy 1: i82555 10/100 PHY, rev. 4
fxp2 at pci0 dev 19 function 0 "Intel 82557" rev 0x10, i82551: irq 11, address 00:03:1d:02:18:ae
inphy2 at fxp2 phy 1: i82555 10/100 PHY, rev. 4
isa0 at pcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: <PC speaker>
spkr0 at pcppi0
sysbeep0 at pcppi0
lpt0 at isa0 port 0x378/4 irq 7
npx0 at isa0 port 0xf0/16: using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pccom0: console
pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
biomask e745 netmask ff65 ttymask ffe7
pctr: user-level cycle counter enabled
dkcsum: wd0 matches BIOS drive 0x80
root on wd0a
rootdev=0x0 rrootdev=0x300 rawdev=0x302

ddb trace:

Debugger(0,0,0,d38c7800,d05d2760) at Debugger+0x4
panic(d04f6bc0,d04f8b89,69bcd1dc,d38c7000,d38c7800) at panic+0x63
pool_get(d05d2760,0,d3908400,1,d06f1dfc) at pool_get+0x315
fxp_start(d0958040,d06f1dfc,e,d09e3a80) at fxp_start+0x2ac
bridge_ifenqueue(d09ed000,d0958040,d392d500,d0958040,d392d500) at bridge_ifenqueue+0xa2
bridgeintr_frame(d09ed000,d392d500,3,d06f1e9c) at bridgeintr_frame+0x270
bridgeintr(58,10,10,10,d06f1e9c) at bridgeintr+0x6a
Bad frame pointer: 0xd06f1e44

ddb ps:

   PID   PPID   PGRP    UID  S       FLAGS  WAIT       COMMAND
 22174      1  22174      0  3      0x4086  ttyin      getty
 21980      1  21980      0  3      0x4086  ttyin      getty
 28224      1  28224      0  3      0x4086  ttyin      getty
  1377      1   1377      0  3      0x4086  ttyin      getty
  9042      1   9042      0  3      0x4086  ttyin      getty
    29      1     29      0  3      0x4086  ttyin      getty
 15231      1  15231      0  3        0x84  select     cron
  9721      1   9721      0  3        0x84  select     sshd
 12433      1  12433      0  3       0x184  select     inetd
 28925      1  28925      0  3        0x84  poll       ntpd
 16155      1  17961     83  3       0x186  poll       ntpd
 23319  20717  20717     74  3       0x184  bpf        pflogd
 20717      1  20717      0  3        0x84  netio      pflogd
  9191  28146  28146     73  3       0x184  poll       syslogd
 28146      1  28146      0  3        0x84  netio      syslogd
    13      0      0      0  3    0x100204  crypto_wa  crypto
    12      0      0      0  3    0x100204  aiodoned   aiodoned
    11      0      0      0  3    0x100204  syncer     update
    10      0      0      0  3    0x100204  cleaner    cleaner
     9      0      0      0  3    0x100204  reaper     reaper
     8      0      0      0  3    0x100204  pgdaemon   pagedaemon
     7      0      0      0  3    0x100204  pftm       pfpurge
     6      0      0      0  3    0x100204  usbevt     usb1
     5      0      0      0  3    0x100204  usbtsk     usbtask
     4      0      0      0  3    0x100204  usbevt     usb0
     3      0      0      0  3    0x100204  apmev      apm0
     2      0      0      0  3    0x100204  kmalloc    kmthread
     1      0      1      0  3      0x4084  wait       init
     0     -1      0      0  3     0x80204  scheduler  swapper

>How-To-Repeat:
        On a system with 3 Intel NICs, configure one with an IP address and bridge the other two.  Configure some minimal pf rules to filter inbound traffic on the bridge.  Generate traffic across the bridge.  After some time (a few days, initially), the kernel will panic.
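
The setup described in How-To-Repeat can be sketched roughly as follows.
This is only a minimal sketch under assumptions that are not stated in the
report: that fxp0 and fxp1 are the two bridge members, that em0 carries the
management address, and that 192.0.2.10/24 stands in for that address.  The
RUN variable and the setup_bridge function are hypothetical helpers for
illustration.  By default the script prints the commands instead of running
them.

```shell
#!/bin/sh
# Sketch of the transparent-bridge setup from How-To-Repeat.
# Assumptions (not from the report): em0 holds the management address,
# 192.0.2.10/24 is a placeholder, fxp0/fxp1 are the bridge members.
# Dry run by default: every command is printed, not executed.
# To really apply it (as root, on the bridge host), invoke with RUN="".
RUN=${RUN-echo}

setup_bridge() {
    $RUN ifconfig em0 inet 192.0.2.10 netmask 255.255.255.0 up
    $RUN ifconfig fxp0 up            # bridge member, no IP address
    $RUN ifconfig fxp1 up            # bridge member, no IP address
    $RUN ifconfig bridge0 create     # only needed if bridge0 does not exist yet
    $RUN brconfig bridge0 add fxp0 add fxp1 up
    $RUN pfctl -e -f /etc/pf.conf    # enable pf and load the ruleset
}

setup_bridge
```

Printing the commands first makes it easy to compare the plan against the
actual machine before anything is changed.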


>Fix:
>Release-Note:
>Audit-Trail:
>Unformatted:
 Sorry for the spam, but sendbug ignored my Reply-To field (or, more
 likely, I messed something up).  The correct reply-to address is:
 [hidden email].  [hidden email] is not a valid address.
 Please adjust the bug details.  Thanks!
 
 -------- Original Message --------
 Subject: fxp nics + pf + bridge = panic
 Date: Mon, 7 Nov 2005 10:29:38 -0500 (EST)
 From: [hidden email]
 Reply-To: [hidden email]
 To: [hidden email]
 CC: [hidden email]

Re: kernel/4604: [Fwd: fxp nics + pf + bridge = panic]

Pedro Martelletto
The following reply was made to PR kernel/4604; it has been noted by GNATS.

From: Pedro Martelletto <[hidden email]>
To: [hidden email]
Cc: [hidden email]
Subject: Re: kernel/4604: [Fwd: fxp nics + pf + bridge = panic]
Date: Mon, 7 Nov 2005 16:37:09 -0200

 What's the panic message?
 
 -p.

Re: kernel/4604: [Fwd: fxp nics + pf + bridge = panic]

Don Feliciano
The following reply was made to PR kernel/4604; it has been noted by GNATS.

From: Don Feliciano <[hidden email]>
To: Pedro Martelletto <[hidden email]>
Cc: [hidden email], [hidden email]
Subject: Re: kernel/4604: [Fwd: fxp nics + pf + bridge = panic]
Date: Tue, 08 Nov 2005 08:01:03 -0500

 Right, sorry.
 
 kernel: page fault trap, code=0
 Stopped at      phtree_SPLAY+0x46:      cmpl    0x1c(%edx),%eax
 
 
 Pedro Martelletto wrote the following on 11/7/2005 1:37 PM:
 
 >What's the panic message?
 >
 >-p.

Re: kernel/4604: [Fwd: fxp nics + pf + bridge = panic]

Don Feliciano
The following reply was made to PR kernel/4604; it has been noted by GNATS.

From: Don Feliciano <[hidden email]>
To: [hidden email], [hidden email]
Cc:  
Subject: Re: kernel/4604: [Fwd: fxp nics + pf + bridge = panic]
Date: Wed, 09 Nov 2005 09:29:04 -0500

 Some more info...
 
 I disabled pf, and am running as a plain transparent bridge.  I hammered
 the box overnight with iperf, and no panic.  I will continue to run this
 way, but my feeling is that the panic only occurs when pf is enabled.  
 Not sure if that helps.
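
The pf-less test described above could be reproduced roughly like this.
It is a sketch only: the report does not say where iperf ran or with which
options, so the peer address 192.0.2.20, the 8-hour duration, and the
RUN/PEER/DURATION variables and function names are all assumptions made for
illustration.  Commands are printed rather than executed unless RUN is set
to an empty string.

```shell
#!/bin/sh
# Sketch of the "bridge without pf" soak test.
# Assumptions (not from the report): iperf runs on hosts on either side of
# the bridge; PEER and DURATION are placeholders.
# Dry run by default: commands are printed, not executed.
RUN=${RUN-echo}
PEER=${PEER-192.0.2.20}       # placeholder address of the iperf server
DURATION=${DURATION-28800}    # seconds (8 hours), roughly "overnight"

# On the bridge: disable the packet filter, leaving the bridge itself up.
bridge_disable_pf() { $RUN pfctl -d; }

# On a host on one side of the bridge: start the iperf server.
iperf_server() { $RUN iperf -s; }

# On a host on the other side: push traffic through the bridge.
iperf_client() { $RUN iperf -c "$PEER" -t "$DURATION"; }

# Pick the role for this machine; with no argument, show the whole plan
# (intended for the dry run only).
case "${1-plan}" in
    bridge) bridge_disable_pf ;;
    server) iperf_server ;;
    client) iperf_client ;;
    *)      bridge_disable_pf; iperf_server; iperf_client ;;
esac
```

Re-running the same load with pf enabled again (pfctl -e) would show
whether the panic really depends on pf.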
 
 Here's my ruleset, in case it's of any value:
 
 #########################################################################
 # OpenBSD bridged packet filter /etc/pf.conf
 # Created: Don Feliciano 7/13/2004
 
 # Matching TCP packets based on flags is most often used to filter TCP
 # packets that are attempting to open a new connection. The TCP flags and
 # their meanings are listed here:
 
 #   * F : Fin - Finish; end of session
 #   * S : SYN - Synchronize; indicates request to start session
 #   * R : RST - Reset; drop a connection
 #   * P : PUSH - Push; packet is sent immediately
 #   * A : ACK - Acknowledgement
 #   * U : URG - Urgent
 #   * E : ECE - Explicit Congestion Notification Echo
 #   * W : CWR - Congestion Window Reduced
 
 tcp_flags = "S/SA"
 
 #### Interface aliases
 # Interface aliases for ease of administration.
 
 ext_if = "fxp0"        # Untrusted (to HYSL LAN)
 int_if = "fxp1"        # Trusted (to isolated switch)
 
 #### Set the interface for which PF should gather statistics such as bytes
 # in/out and packets passed/blocked. Statistics can only be gathered for one
 # interface at a time.
 set loginterface $ext_if
 
 #### Trusted hosts
 # Allow greater access to certain hosts
 table <essbase_servers> {rtfm.mydomain.com}
 table <reports_servers> {obsolete.mydomain.com}
 table <oracle_servers> {rtfm.mydomain.com}
 table <websphere_nodemanagers> {obsolete.mydomain.com}
 table <nfs_servers> {ogre.mydomain.com}
 table <nfs_clients> {rtfm.mydomain.com,obsolete.mydomain.com}
 
 #### Traffic Normalization
 # Prevent fragmentation attacks
 scrub in on $ext_if all fragment reassemble no-df
 scrub out on $ext_if all fragment reassemble random-id no-df
 
 #### Enable queueing on the external interface
 #altq on $ext_if cbq bandwidth 100Mb queue { std_out, wan }
 
 #### Define the parameters for the child queues.
 # std_out      - the standard queue. any filter rule below that does not
 #                explicitly specify a queue will have its traffic added
 #                to this one.
 # wan          - Hyperion Reports traffic
 #queue std_out bandwidth 100Mb cbq(default)
 #queue wan bandwidth 768Kb cbq(red)
 
 ### Pass traffic on the loopback interface in either direction
 pass quick on lo0 all
 
 #### Internal Bridge interface rules
 # Filter on external interface - in bridge mode,
 # we only filter on one interface.
 pass in quick on $int_if all
 pass out quick on $int_if all
 
 #### External Bridge interface rules (main ruleset)
 # Rule order does not matter
 
 # Block (Deny) all inbound by default
 #block return in log on $ext_if all
 block return in on $ext_if all
 
 ### Inbound Filtering Rules
 # Allow selected TCP traffic in to all, full throttle
 # NOTE: Add netbios-ssn to allow mapping of drives
 pass in quick on $ext_if proto tcp from any to any port \
               {ssh,5900,3389} \
               keep state flags $tcp_flags
 
 # Allow DHCP & DNS
 pass in quick on $ext_if proto udp from any to any \
               port {bootpc,domain} keep state
 
 # Allow http through to rtfm
 pass in quick on $ext_if proto tcp from any to rtfm.mydomain.com port http \
               keep state flags $tcp_flags
 
 # Allow license server access through to obsolete
 #pass in quick on $ext_if proto tcp from any to obsolete.mydomain.com port 27000 \
 #              keep state flags $tcp_flags
 
 # Allow Reports traffic to pass through - optionally throttle bandwidth
 #pass in quick on $ext_if proto tcp from any to <reports_servers> port \
 #              { 1098><1105, 8200, 19000 } keep state flags $tcp_flags # queue(wan)
 
 # Allow 9090 in to WebSphere ND Servers
 pass in quick on $ext_if proto tcp from any to <websphere_nodemanagers> \
               port 9090 keep state flags $tcp_flags
 
 # Allow 1521 in to Oracle Servers
 pass in quick on $ext_if proto tcp from any to <oracle_servers> \
               port 1521 keep state flags $tcp_flags
 
 # Allow 1422><1430 in for Essbase
 pass in quick on $ext_if proto tcp from any to <essbase_servers> \
               port { 1422><1430 } keep state flags $tcp_flags
 
 # Allow ALL in from <nfs_servers> to <nfs_clients>
 pass in quick on $ext_if proto {tcp,udp} from <nfs_servers> to <nfs_clients> \
               keep state flags $tcp_flags
 
 # Allow ICMP ping requests
 pass in quick on $ext_if inet proto icmp all icmp-type 8 code 0 keep state
 
 ### Outbound Filtering Rules
 # Allow ICMP ping requests
 pass out quick on $ext_if inet proto icmp all icmp-type 8 code 0 keep state
 
 # Allow all UDP/TCP, and keep state
 pass out quick on $ext_if proto udp all keep state
 pass out quick on $ext_if proto tcp all modulate state
 
 ########################################################################
 
 
 Gnats wrote the following on 11/7/2005 12:10 PM:
 
 >Thank you very much for your problem report.
 >It has the internal identification `kernel/4604'.
 >The individual assigned to look at your
 >report is: bugs.
 >
 >  
 >
 >>Category:       kernel
 >>Responsible:    bugs
 >>Synopsis:       fxp nics + pf + bridge = panic
 >>Arrival-Date:   Mon Nov 07 17:10:02 GMT 2005