ieee80211 panic on athn reconfig

Jan Stary
This is current/i386 on an ALIX (dmesg below) with

  athn0 at pci0 dev 12 function 0 "Atheros AR9280" rev 0x01: irq 9
  athn0: AR9280 rev 2 (2T2R), ROM rev 22, address 04:f0:21:01:d6:86

# cat hostname.athn0
inet 192.168.33.1 255.255.255.0 NONE
media autoselect mode 11g mediaopt hostap chan 2
nwid stare.cz wpakey hovnoPrdel123

After changing the password, or the channel, or the mode, and doing

# sh /etc/netstart athn0

the machine reproducibly panics (cereal script below).

I have no idea why it panics in ieee80211_encrypt().
It happens both with clients associated and not.

Is this known with athn(4)?
How can I help debug this?

        Jan


ddb> show panic
ieee80211_encrypt: key unset for sw crypto: 0

ddb> trace
db_enter() at db_enter+0x4
panic(d0b83788) at panic+0xcc
ieee80211_encrypt(d194e030,d195bc00,d194eb00) at ieee80211_encrypt+0x70
ar5008_tx(d194e000,d195bc00,d19a0000,2) at ar5008_tx+0x9a
ar5008_swba_intr(d194e000) at ar5008_swba_intr+0x238
ar5008_intr(d194e000) at ar5008_intr+0x12f
intr_handler(f3b1d67c,d1945480) at intr_handler+0x18
Xintr_legacy9_untramp() at Xintr_legacy9_untramp+0xf7
end of kernel

ddb> ps
   PID   TID PPID UID  S      FLAGS  WAIT   COMMAND
*51886 239363 46526   0  7 0x3   ifconfig
 46526 94296 51119   0  3   0x10008b  pause   sh
 51119 210938    1   0  3   0x10008b  pause   ksh
  4074 355325    1   0  3   0x100098  poll   cron
 26296 319573 67907  74  3   0x100092  bpf   pflogd
 67907 395783    1   0  3       0x80  netio   pflogd
 78788 338207    1  79  3   0x100090  kqread   tftpd
 64936 466062 45121  95  3   0x100092  kqread   smtpd
 78617 19784 45121 103  3   0x100092  kqread   smtpd
 89735 422787 45121  95  3   0x100092  kqread   smtpd
 40031 127196 45121  95  3   0x100092  kqread   smtpd
  3003 366634 45121  95  3   0x100092  kqread   smtpd
 67429 418102 45121  95  3   0x100092  kqread   smtpd
 45121 87978    1   0  3   0x100080  kqread   smtpd
 89293  9339    1  77  3   0x100090  poll   dhcpd
 32523 33766    1   0  3       0x80  select   sshd
 16723 521208    1   0  3   0x100080  poll   ntpd
 41803 404697 97594  83  3   0x100092  poll   ntpd
 97594 270290    1  83  3   0x100092  poll   ntpd
 27672 104019    1  53  3       0x90  kqread   unbound
 85121 72700 81754  97  3   0x100090  kqread   nsd
 81754 133366 40270  97  3   0x100090  poll   nsd
 40270 98718    1  97  3   0x100090  kqread   nsd
  2198 36071 55390  74  3   0x100092  bpf   pflogd
 55390 372523    1   0  3       0x80  netio   pflogd
 82531 87748 87247  73  3   0x100090  kqread   syslogd
 87247 309403    1   0  3   0x100082  netio   syslogd
 98924 510112 35334 115  3   0x100092  kqread   slaacd
 93418 61048 35334 115  3   0x100092  kqread   slaacd
 35334 428553    1   0  3   0x100080  kqread   slaacd
  5139 163288    0   0  3    0x14200  bored   smr
 22383 413035    0   0  2    0x14200   zerothread
 77055 99704    0   0  3    0x14200  aiodoned   aiodoned
 61899 379872    0   0  3    0x14200  syncer   update
 81836 124433    0   0  3    0x14200  cleaner   cleaner
 55117 45992    0   0  3    0x14200  reaper   reaper
 50811 60573    0   0  3    0x14200  pgdaemon   pagedaemon
 15077 352797    0   0  3    0x14200  bored   crynlk
  6357 442984    0   0  3    0x14200  bored   crypto
 45388 138131    0   0  3    0x14200  usbtsk   usbtask
 45899 192598    0   0  3    0x14200  usbatsk   usbatsk
 77300 116231    0   0  3    0x14200  bored   sensors
 36473 508486    0   0  3    0x14200  bored   softnet
 73636 394873    0   0  3    0x14200  bored   systqmp
 64894 356410    0   0  3    0x14200  bored   systq
  4636 461286    0   0  3 0x40014200  bored   softclock
 59424 288681    0   0  3 0x40014200   idle0
 19281 244490    0   0  3    0x14200  kmalloc   kmthread
     1 431117    0   0  3       0x82  wait   init
     0     0   -1   0  3    0x10200  scheduler   swapper



OpenBSD 6.7-beta (GENERIC) #108: Thu Apr  9 11:00:54 MDT 2020
    [hidden email]:/usr/src/sys/arch/i386/compile/GENERIC
real mem  = 267931648 (255MB)
avail mem = 247336960 (235MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: date 11/05/08, BIOS32 rev. 0 @ 0xfd088
pcibios0 at bios0: rev 2.1 @ 0xf0000/0x10000
pcibios0: pcibios_get_intr_routing - function not supported
pcibios0: PCI IRQ Routing information unavailable.
pcibios0: PCI bus #0 is the last bus
bios0: ROM list: 0xe0000/0xa800
cpu0 at mainbus0: (uniprocessor)
cpu0: Geode(TM) Integrated Processor by AMD PCS ("AuthenticAMD" 586-class) 499 MHz, 05-0a-02
cpu0: FPU,DE,PSE,TSC,MSR,CX8,SEP,PGE,CMOV,CFLUSH,MMX,MMXX,3DNOW2,3DNOW
mtrr: K6-family MTRR support (2 registers)
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 1 function 0 "AMD Geode LX" rev 0x33
glxsb0 at pci0 dev 1 function 2 "AMD Geode LX Crypto" rev 0x00: RNG AES
vr0 at pci0 dev 9 function 0 "VIA VT6105M RhineIII" rev 0x96: irq 10, address 00:0d:b9:1a:a4:10
ukphy0 at vr0 phy 1: Generic IEEE 802.3u media interface, rev. 3: OUI 0x004063, model 0x0034
vr1 at pci0 dev 10 function 0 "VIA VT6105M RhineIII" rev 0x96: irq 11, address 00:0d:b9:1a:a4:11
ukphy1 at vr1 phy 1: Generic IEEE 802.3u media interface, rev. 3: OUI 0x004063, model 0x0034
vr2 at pci0 dev 11 function 0 "VIA VT6105M RhineIII" rev 0x96: irq 15, address 00:0d:b9:1a:a4:12
ukphy2 at vr2 phy 1: Generic IEEE 802.3u media interface, rev. 3: OUI 0x004063, model 0x0034
athn0 at pci0 dev 12 function 0 "Atheros AR9280" rev 0x01: irq 9
athn0: AR9280 rev 2 (2T2R), ROM rev 22, address 04:f0:21:01:d6:86
glxpcib0 at pci0 dev 15 function 0 "AMD CS5536 ISA" rev 0x03: rev 3, 32-bit 3579545Hz timer, watchdog, gpio, i2c
gpio0 at glxpcib0: 32 pins
iic0 at glxpcib0
maxtmp0 at iic0 addr 0x4c: lm86
pciide0 at pci0 dev 15 function 2 "AMD CS5536 IDE" rev 0x01: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 0: <SDCFHS-016G>
wd0: 1-sector PIO, LBA48, 15279MB, 31293360 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2
pciide0: channel 1 ignored (disabled)
ohci0 at pci0 dev 15 function 4 "AMD CS5536 USB" rev 0x02: irq 12, version 1.0, legacy support
ehci0 at pci0 dev 15 function 5 "AMD CS5536 USB" rev 0x02: irq 12
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 configuration 1 interface 0 "AMD EHCI root hub" rev 2.00/1.00 addr 1
isa0 at glxpcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com0: console
com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
usb1 at ohci0: USB revision 1.0
uhub1 at usb1 configuration 1 interface 0 "AMD OHCI root hub" rev 1.00/1.00 addr 1
nvram: invalid checksum
vscsi0 at root
scsibus1 at vscsi0: 256 targets
softraid0 at root
scsibus2 at softraid0: 256 targets
root on wd0a (9cd0e5ba033bd225.a) swap on wd0b dump on wd0b
clock: unknown CMOS layout

Re: ieee80211 panic on athn reconfig

Stefan Sperling-5
On Fri, Apr 17, 2020 at 12:08:39PM +0200, Jan Stary wrote:

> This is current/i386 on an ALIX (dmesg below) with
>
>   athn0 at pci0 dev 12 function 0 "Atheros AR9280" rev 0x01: irq 9
>   athn0: AR9280 rev 2 (2T2R), ROM rev 22, address 04:f0:21:01:d6:86
>
> # cat hostname.athn0
> inet 192.168.33.1 255.255.255.0 NONE
> media autoselect mode 11g mediaopt hostap chan 2
> nwid stare.cz wpakey hovnoPrdel123
>
> After changing the password, or the channel, or the mode, and doing
>
> # sh /etc/netstart athn0
>
> the machine reproducibly panics (cereal script below).
>
> I have no idea why it panics in ieee80211_encrypt().
> It happens both with clients associated and not.
>
> Is this known with athn(4)?

No, but it is definitely a bug.

> How can I help debug this?

Could you try to find a short sequence of 'ifconfig athn0' commands that
will trigger it, instead of /etc/netstart? That would help me already.

Re: ieee80211 panic on athn reconfig

Stefan Sperling-5
In reply to this post by Jan Stary
On Fri, Apr 17, 2020 at 12:08:39PM +0200, Jan Stary wrote:

> This is current/i386 on an ALIX (dmesg below) with
>
>   athn0 at pci0 dev 12 function 0 "Atheros AR9280" rev 0x01: irq 9
>   athn0: AR9280 rev 2 (2T2R), ROM rev 22, address 04:f0:21:01:d6:86
>
> # cat hostname.athn0
> inet 192.168.33.1 255.255.255.0 NONE
> media autoselect mode 11g mediaopt hostap chan 2
> nwid stare.cz wpakey hovnoPrdel123
>
> After changing the password, or the channel, or the mode, and doing
>
> # sh /etc/netstart athn0
>
> the machine reproducibly panics (cereal script below).
>
> I have no idea why it panics in ieee80211_encrypt().
> It happens both with clients associated and not.
>
> Is this known with athn(4)?
> How can I help debug this?
>
> Jan
>
>
> ddb> show panic
> ieee80211_encrypt: key unset for sw crypto: 0
>
> ddb> trace
> db_enter() at db_enter+0x4
> panic(d0b83788) at panic+0xcc
> ieee80211_encrypt(d194e030,d195bc00,d194eb00) at ieee80211_encrypt+0x70
> ar5008_tx(d194e000,d195bc00,d19a0000,2) at ar5008_tx+0x9a
> ar5008_swba_intr(d194e000) at ar5008_swba_intr+0x238
> ar5008_intr(d194e000) at ar5008_intr+0x12f
> intr_handler(f3b1d67c,d1945480) at intr_handler+0x18
> Xintr_legacy9_untramp() at Xintr_legacy9_untramp+0xf7
> end of kernel

Are you using clients which use powersave mode, such as phones?

This trace goes through ar5008_swba_intr(). The only way to get into
ar5008_tx() from there is when group-addressed frames are queued on the
powersave queue of the AP (ic_bss->ni_savedq).

I cannot see this queue being purged anywhere when the interface goes down.
So it seems what happened is that a stale frame was sitting on this queue
and a fatal transmit attempt occurred when the interface came back up after
being re-configured.

Can you please try this diff?

The same panic and trace has also been reported to me by Ted Patterson.

diff ffca677e9e7ca9efd316fa2f2b6572b193c50cf8 /usr/src
blob - f6349c70279687b18ce89f670b732a62f3696271
file + sys/net80211/ieee80211_node.c
--- sys/net80211/ieee80211_node.c
+++ sys/net80211/ieee80211_node.c
@@ -1595,6 +1595,10 @@ ieee80211_node_cleanup(struct ieee80211com *ic, struct
 	free(ni->ni_unref_arg, M_DEVBUF, ni->ni_unref_arg_size);
 	ni->ni_unref_arg = NULL;
 	ni->ni_unref_arg_size = 0;
+
+#ifndef IEEE80211_STA_ONLY
+	mq_purge(&ni->ni_savedq);
+#endif
 }
 
 void
@@ -2047,7 +2051,7 @@ ieee80211_free_allnodes(struct ieee80211com *ic, int c
 	splx(s);
 
 	if (clear_ic_bss && ic->ic_bss != NULL)
-		ieee80211_node_cleanup(ic, ic->ic_bss); /* for station mode */
+		ieee80211_node_cleanup(ic, ic->ic_bss);
 }
 
 void

Re: ieee80211 panic on athn reconfig

Jan Stary
On May 03 19:21:17, [hidden email] wrote:

> On Fri, Apr 17, 2020 at 12:08:39PM +0200, Jan Stary wrote:
> > This is current/i386 on an ALIX (dmesg below) with
> >
> >   athn0 at pci0 dev 12 function 0 "Atheros AR9280" rev 0x01: irq 9
> >   athn0: AR9280 rev 2 (2T2R), ROM rev 22, address 04:f0:21:01:d6:86
> >
> > # cat hostname.athn0
> > inet 192.168.33.1 255.255.255.0 NONE
> > media autoselect mode 11g mediaopt hostap chan 2
> > nwid stare.cz wpakey hovnoPrdel123
> >
> > After changing the password, or the channel, or the mode, and doing
> >
> > # sh /etc/netstart athn0
> >
> > the machine reproducibly panics (cereal script below).
> >
> > I have no idea why it panics in ieee80211_encrypt().
> > It happens both with clients associated and not.
> >
> > Is this known with athn(4)?
> > How can I help debug this?
> >
> > Jan
> >
> >
> > ddb> show panic
> > ieee80211_encrypt: key unset for sw crypto: 0
> >
> > ddb> trace
> > db_enter() at db_enter+0x4
> > panic(d0b83788) at panic+0xcc
> > ieee80211_encrypt(d194e030,d195bc00,d194eb00) at ieee80211_encrypt+0x70
> > ar5008_tx(d194e000,d195bc00,d19a0000,2) at ar5008_tx+0x9a
> > ar5008_swba_intr(d194e000) at ar5008_swba_intr+0x238
> > ar5008_intr(d194e000) at ar5008_intr+0x12f
> > intr_handler(f3b1d67c,d1945480) at intr_handler+0x18
> > Xintr_legacy9_untramp() at Xintr_legacy9_untramp+0xf7
> > end of kernel

Sorry for being so late; apparently, the fix is already in.

I can confirm that none of the above happens any more:
changing the password, the mode, or the channel
does not result in a panic, with and without clients connected.


> Are you using clients which use powersave mode, such as phones?

Yes, Android phones.

        Jan