recent troubles with iwn(4)

classic Classic list List threaded Threaded
11 messages Options
Reply | Threaded
Open this post in threaded view
|

recent troubles with iwn(4)

Bryan Stenson
Hi all -

I'm writing to "misc" rather than "bugs" as I'm not yet sure this is a
bug.  I'm hoping to help triage this with assistance from this list.

I'm running -CURRENT and the iwn(4) driver for my wireless card.  Over
the past year, this has been working great, but recently (within the
last month or so), I've had issues where the NIC just stops working
after a few hours of usage.  I don't have a solid steps for
reproduction.

I realize "stops working" is not a very accurate account here...but
I'm confused on how to get more descriptive information of the
problem.  When it stops, "ifconfig" shows iwn0 with an IP address, but
I'm unable to ping.  Additionally, I'm not seeing any
warnings/messages in "dmesg" about the device...so I'm confused.

A simple "ifconfig iwn0 down; sh /etc/netstart iwn0" seems to fix the
problem, but I haven't had to do that in the past...it just feels like
a recent change (iwn(4) work?) has put me in this state.

I'm really wanting to help here.  How can I run the iwn(4) in debug
mode, or increase logging verbosity?  And/or, should I try to capture
packets via tcpdump?  And/or, can I run an older bsd.mp (without
having to downgrade packages to older versions) in order to try and
"bisect" where the problem may have been introduced?

The following are my kernel and wireless details:

# uname -a
OpenBSD e530c.siliconvortex.com 6.6 GENERIC.MP#289 amd64

# pcidump -v
...
 3:0:0: Intel Centrino Wireless-N 2230
        0x0000: Vendor ID: 8086, Product ID: 0888
        0x0004: Command: 0006, Status: 0010
        0x0008: Class: 02 Network, Subclass: 80 Miscellaneous,
                Interface: 00, Revision: c4
        0x000c: BIST: 00, Header Type: 00, Latency Timer: 00,
                Cache Line Size: 10
        0x0010: BAR mem 64bit addr: 0x00000000f2d00000/0x00002000
        0x0018: BAR empty (00000000)
        0x001c: BAR empty (00000000)
        0x0020: BAR empty (00000000)
        0x0024: BAR empty (00000000)
        0x0028: Cardbus CIS: 00000000
        0x002c: Subsystem Vendor ID: 8086 Product ID: 4262
        0x0030: Expansion ROM Base Address: 00000000
        0x0038: 00000000
        0x003c: Interrupt Pin: 01 Line: 0b Min Gnt: 00 Max Lat: 00
        0x00c8: Capability 0x01: Power Management
                State: D0
        0x00d0: Capability 0x05: Message Signalled Interrupts (MSI)
                Enabled: yes
        0x00e0: Capability 0x10: PCI Express
                Link Speed: 2.5 / 2.5 GT/s, Link Width: x1 / x1
        0x0100: Enhanced Capability 0x01: Advanced Error Reporting
        0x0140: Enhanced Capability 0x03: Device Serial Number
                Serial Number: 6036ddffffed4a81

With humility, an open mind, and eagerness to learn/help:

Bryan

Reply | Threaded
Open this post in threaded view
|

Re: recent troubles with iwn(4)

Magnus Wild
On Sun, 2019-09-08 at 20:31 +0000, Bryan Stenson wrote:

> I'm running -CURRENT and the iwn(4) driver for my wireless
> card.  Over
> the past year, this has been working great, but recently (within the
> last month or so), I've had issues where the NIC just stops working
> after a few hours of usage.  I don't have a solid steps for
> reproduction.

> A simple "ifconfig iwn0 down; sh /etc/netstart iwn0" seems to fix the
> problem, but I haven't had to do that in the past...it just feels
> like
> a recent change (iwn(4) work?) has put me in this state.

Hello! I have no solution for your problem, but when I read your mail I
was pretty interested, because I've been having the same issue with
previous snapshots from the past year or so, but the other way around;
I've had issues on certain networks, for example the network in my
home, where I get disconnected or my connections times out frequently.

Sometimes it would require me to do the exact thing you are suggesting,
namely take down the interface with either a "ifconfig iwn0 down" or
sometimes also "ifconfig iwn0 delete" and then "sh /etc/netstart".
Other times it was just a matter of having to wait forever before I
could connect to the wireless at home. Or rather, wait to use it,
because almost anything timed out at first. Then after a while it
started working as expected.

This issue has completely gone away some time during the last few weeks
though, which I found relieving. My hardware is a Thinkpad x230 with
the following output from "pcidump -v":

3:0:0: Intel Centrino Advanced-N 6205
        0x0000: Vendor ID: 8086, Product ID: 0085
        0x0004: Command: 0006, Status: 0010
        0x0008: Class: 02 Network, Subclass: 80 Miscellaneous,
                Interface: 00, Revision: 34
        0x000c: BIST: 00, Header Type: 00, Latency Timer: 00,
                Cache Line Size: 10
        0x0010: BAR mem 64bit addr: 0x00000000f1c00000/0x00002000
        0x0018: BAR empty (00000000)
        0x001c: BAR empty (00000000)
        0x0020: BAR empty (00000000)
        0x0024: BAR empty (00000000)
        0x0028: Cardbus CIS: 00000000
        0x002c: Subsystem Vendor ID: 8086 Product ID: 1311
        0x0030: Expansion ROM Base Address: 00000000
        0x0038: 00000000
        0x003c: Interrupt Pin: 01 Line: 0a Min Gnt: 00 Max Lat: 00
        0x00c8: Capability 0x01: Power Management
                State: D0
        0x00d0: Capability 0x05: Message Signalled Interrupts (MSI)
                Enabled: yes
        0x00e0: Capability 0x10: PCI Express
                Link Speed: 2.5 / 2.5 GT/s, Link Width: x1 / x1
        0x0100: Enhanced Capability 0x01: Advanced Error Reporting
        0x0140: Enhanced Capability 0x03: Device Serial Number
                Serial Number: 6c8814ffffdaa8e4

I never bothered to report it, because I rarely use this particular
machine at home, and since I only had issues there I assumed that it
was some kind of issue with my AP.

/Magnus

Reply | Threaded
Open this post in threaded view
|

Re: recent troubles with iwn(4)

Paulm-7
In reply to this post by Bryan Stenson
On Sun, Sep 08, 2019 at 08:31:55PM +0000, Bryan Stenson wrote:

> Hi all -
>
> I'm writing to "misc" rather than "bugs" as I'm not yet sure this is a
> bug.  I'm hoping to help triage this with assistance from this list.
>
> I'm running -CURRENT and the iwn(4) driver for my wireless card.  Over
> the past year, this has been working great, but recently (within the
> last month or so), I've had issues where the NIC just stops working
> after a few hours of usage.  I don't have a solid steps for
> reproduction.
>
> I realize "stops working" is not a very accurate account here...but
> I'm confused on how to get more descriptive information of the
> problem.  When it stops, "ifconfig" shows iwn0 with an IP address, but
> I'm unable to ping.  Additionally, I'm not seeing any
> warnings/messages in "dmesg" about the device...so I'm confused.

The iwn driver on my host runnning 6.4 stable writes to
/var/log/messages (as well as the system message buffer).  Is there
nothing in the log files on your system?

--Paul


Reply | Threaded
Open this post in threaded view
|

Re: recent troubles with iwn(4)

Stefan Sperling-5
In reply to this post by Bryan Stenson
On Sun, Sep 08, 2019 at 08:31:55PM +0000, Bryan Stenson wrote:

> Hi all -
>
> I'm writing to "misc" rather than "bugs" as I'm not yet sure this is a
> bug.  I'm hoping to help triage this with assistance from this list.
>
> I'm running -CURRENT and the iwn(4) driver for my wireless card.  Over
> the past year, this has been working great, but recently (within the
> last month or so), I've had issues where the NIC just stops working
> after a few hours of usage.  I don't have a solid steps for
> reproduction.
>
> I realize "stops working" is not a very accurate account here...but
> I'm confused on how to get more descriptive information of the
> problem.  When it stops, "ifconfig" shows iwn0 with an IP address, but
> I'm unable to ping.  Additionally, I'm not seeing any
> warnings/messages in "dmesg" about the device...so I'm confused.
>
> A simple "ifconfig iwn0 down; sh /etc/netstart iwn0" seems to fix the
> problem, but I haven't had to do that in the past...it just feels like
> a recent change (iwn(4) work?) has put me in this state.
>
> I'm really wanting to help here.  How can I run the iwn(4) in debug
> mode, or increase logging verbosity?  And/or, should I try to capture
> packets via tcpdump?  And/or, can I run an older bsd.mp (without
> having to downgrade packages to older versions) in order to try and
> "bisect" where the problem may have been introduced?

Please enable debugging with 'ifconfig iwn0 debug' and when the problem
reoccurs check whether you can find any relevant information in the
file /var/log/messages.

Reply | Threaded
Open this post in threaded view
|

Re: recent troubles with iwn(4)

Raf Czlonka-2
In reply to this post by Bryan Stenson
On Sun, Sep 08, 2019 at 09:31:55PM BST, Bryan Stenson wrote:

> Hi all -
>
> I'm writing to "misc" rather than "bugs" as I'm not yet sure this is a
> bug.  I'm hoping to help triage this with assistance from this list.
>
> I'm running -CURRENT and the iwn(4) driver for my wireless card.  Over
> the past year, this has been working great, but recently (within the
> last month or so), I've had issues where the NIC just stops working
> after a few hours of usage.  I don't have a solid steps for
> reproduction.
>
> I realize "stops working" is not a very accurate account here...but
> I'm confused on how to get more descriptive information of the
> problem.  When it stops, "ifconfig" shows iwn0 with an IP address, but
> I'm unable to ping.  Additionally, I'm not seeing any
> warnings/messages in "dmesg" about the device...so I'm confused.
>
> A simple "ifconfig iwn0 down; sh /etc/netstart iwn0" seems to fix the
> problem, but I haven't had to do that in the past...it just feels like
> a recent change (iwn(4) work?) has put me in this state.
>

Hi Bryan,

I've noticed exactly the same(?) behaviour since about July but
hadn't used the machine enough and, despite some changes to the
iwn(4) and/or ieee80211, I thought it might have had something to
do with the new AP I got.

Your email prompted me today to have a look at what is happening
while the laptop loses connectivity - it's been disconnected for 3
days after about 30 minutes of being connected (enough time to run
'pkg_add -u'). After putting the interface into debug modes:

        # ifconfig iwn0 debug

/var/log/messages shows:

        iwn0: AUTH -> SCAN
        iwn0: end active scan
        iwn0: - [...]
        iwn0: - 12:34:56:12:34:56   11  +212 54M ess  privacy   rsn  "MYNWID2"!
        iwn0: + 12:34:56:12:34:57   44  +203 54M ess  privacy   rsn  "MYNWID5"
        iwn0: - [...]
        iwn0: SCAN -> AUTH
        iwn0: sending auth to 12:34:56:12:34:57 on channel 44 mode 11a

What seems strange there is the mode - after a very brief look, it
seems like the laptop is "stuck" trying to use 11a mode.

After forcing the 11n mode:

        # ifconfig iwn0 mode 11n

it connects instantaneously:

        iwn0: SCAN -> INIT
        iwn0: begin active scan
        iwn0: INIT -> SCAN
        iwn0: end active scan
        iwn0: AP 12:34:56:12:34:56 "MYNWID5" score 52
        iwn0: best AP 12:34:56:12:34:57 "MYNWID2" score 52
        iwn0: switching to network "MYNWID2"
        iwn0: [...]
        iwn0: + 12:34:56:12:34:56   11  +216 54M   ess  privacy   rsn "MYNWID2"
        iwn0: - 12:34:56:12:34:57   44    +0 54M   ess  privacy   rsn "MYNWID5"!
        iwn0: [...]
        iwn0: SCAN -> AUTH
        iwn0: sending auth to 12:34:56:12:34:56 on channel 11 mode 11g
        iwn0: AUTH -> ASSOC
        iwn0: sending assoc_req to 12:34:56:12:34:56 on channel 11 mode 11g
        iwn0: ASSOC -> RUN
        iwn0: associated with 12:34:56:12:34:56 ssid "MYNWID2" channel 11
        start MCS 0 short preamble long slot time HT enabled
        iwn0: missed beacon threshold set to 7 beacons, beacon interval is 100 TU
        iwn0: received msg 1/4 of the 4-way handshake from 12:34:56:12:34:56
        iwn0: sending msg 2/4 of the 4-way handshake to 12:34:56:12:34:56
        iwn0: received msg 3/4 of the 4-way handshake from 12:34:56:12:34:56
        iwn0: sending msg 4/4 of the 4-way handshake to 12:34:56:12:34:56
        iwn0: sending action to 12:34:56:12:34:56 on channel 11 mode 11n

MYNWID2 and MYNWID5 are the two NWIDs I use - 2.4 and 5GHz respectively.

I use the same hostname.if on all of my laptops:

        join MYNWID5 wpakey ...
        join MYNWID2 wpakey ...
        [...]
        dhcp

BTW, after reboot (upgrade to a new snapshot) and ~1h run, it
disconnected again.

Any thoughts, Stefan?

Regards,

Raf

Reply | Threaded
Open this post in threaded view
|

Re: recent troubles with iwn(4)

Stefan Sperling-5
On Mon, Sep 09, 2019 at 10:26:37AM +0100, Raf Czlonka wrote:

> Your email prompted me today to have a look at what is happening
> while the laptop loses connectivity - it's been disconnected for 3
> days after about 30 minutes of being connected (enough time to run
> 'pkg_add -u'). After putting the interface into debug modes:
>
> # ifconfig iwn0 debug
>
> /var/log/messages shows:
>
> iwn0: AUTH -> SCAN
> iwn0: end active scan
> iwn0: - [...]
> iwn0: - 12:34:56:12:34:56   11  +212 54M ess  privacy   rsn  "MYNWID2"!
> iwn0: + 12:34:56:12:34:57   44  +203 54M ess  privacy   rsn  "MYNWID5"
> iwn0: - [...]
> iwn0: SCAN -> AUTH
> iwn0: sending auth to 12:34:56:12:34:57 on channel 44 mode 11a

This small part of the log is not useful by itself, unfortunately.
You need to show debug output where iwn left RUN state in the first place.

> What seems strange there is the mode - after a very brief look, it
> seems like the laptop is "stuck" trying to use 11a mode.
>
> After forcing the 11n mode:
>
> # ifconfig iwn0 mode 11n
>
> it connects instantaneously:

This command resets everything and the condition which triggered the
problem is now gone.

It connected fine to your 2 GHz AP. You have only shown failing connection
attempts to your 5 GHz AP. Did your 5 GHz AP ever actually work with iwn?

Reply | Threaded
Open this post in threaded view
|

Re: recent troubles with iwn(4)

Bryan Stenson
In reply to this post by Stefan Sperling-5
doh...I don't know why I didn't think of that...

Good news, with 'ifconfig iwn0 debug' set, once the strange behavior
starts, I see LOTS of repeated messages, the pattern happens about
once every 4 seconds, and dumps the following into /var/log/messages:

...
# continuous spamming of /var/log/messages from after the network has
been in the troubled/failed state for a while
Sep 10 09:00:06 e530c /bsd: iwn0: SCAN -> AUTH
Sep 10 09:00:06 e530c /bsd: iwn0: sending auth to 80:2a:a8:57:5e:17 on
channel 6 mode 11g
Sep 10 09:00:10 e530c /bsd: iwn0: AUTH -> SCAN
Sep 10 09:00:11 e530c /bsd: iwn0: end active scan
Sep 10 09:00:11 e530c /bsd: iwn0: - 00:0d:67:7d:a9:41    1  +173 54M
ess  privacy   rsn! "XFINITY"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 00:0d:67:7d:a9:42    1  +173 54M
ess       no!  rsn! "xfinitywifi"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 00:0d:67:7d:a9:43    1  +172 54M
ess       no!  rsn! "CableWiFi"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 00:0d:67:7d:a9:48    1  +175 54M
ess  privacy   rsn  ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - 00:11:32:5f:d5:cb    9  +171 54M
ess  privacy   rsn  "CBArchitectural"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 00:1d:d4:3a:46:e0    1  +170 54M
ess  privacy   rsn  "HOME-46E2"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 08:86:3b:b6:2f:80    1  +177 54M
ess  privacy   rsn  "belkin.f80"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 10:86:8c:53:a7:84   11  +175 54M
ess  privacy   rsn  "Bentley"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 10:86:8c:77:cc:2c   11  +194 54M
ess  privacy   rsn  "Rivera-2.4"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 12:86:8c:53:a7:84   11  +172 54M
ess  privacy   rsn  ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - 14:59:c0:69:7b:23    6  +182 54M
ess  privacy   rsn  "My WiFi"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 20:b0:01:40:f2:7c   11  +187 54M
ess  privacy   rsn  "CenturyLink3466"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 20:c9:d0:1a:88:91    1  +173 54M
ess  privacy   rsn  "Byron Bentley's N
etwork"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 22:86:8c:77:cc:2c   11  +191 54M
ess       no!  rsn! "xfinitywifi"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 24:de:c6:7e:1c:42    1  +172 54M
ess  privacy   rsn  "foss-vessel"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 24:de:c6:7e:1c:43    1  +174 54M
ess  privacy   rsn  "foss-guest"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 28:80:88:14:76:53   11  +172 54M
ess  privacy   rsn  "NETGEAR51"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 28:80:88:ec:b7:ba    1  +192 54M
ess  privacy   rsn  "Elephants1008_EXT
"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 32:86:8c:53:a7:84   11  +172 54M
ess  privacy   rsn  ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - 32:86:8c:77:cc:2c   11  +190 54M
ess  privacy   rsn  ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - 3c:7a:8a:9a:5a:d8   11  +171 54M
ess  privacy   rsn  "5ADA-2.4"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 40:b0:34:e1:fb:78    1  +174 54M
ess  privacy   rsn  "DIRECT-77-HP ENVY
 4520 series"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 4e:7a:8a:9a:5a:d8   11  +170 54M
ess       no!  rsn! "xfinitywifi"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 50:95:51:c7:c9:db   11  +173 54M
ess  privacy   rsn  "broadway-2.4"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 52:86:8c:53:a7:84   11  +172 54M
ess  privacy   rsn! ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - 52:86:8c:77:cc:2c   11  +190 54M
ess  privacy   rsn! ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - 60:38:e0:0e:be:02   11  +170 54M
ess  privacy   rsn  "Linksys02322"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 62:95:51:c7:c9:db   11  +170 54M
ess       no!  rsn! "xfinitywifi"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 70:e4:22:b5:1d:80    6  +172 54M
ess  privacy   rsn! 0x00!
Sep 10 09:00:11 e530c /bsd: iwn0: - 72:95:51:c7:c9:db   11  +172 54M
ess  privacy   rsn  ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - 74:85:2a:e5:f4:c8    6  +182 54M
ess  privacy   rsn  "Idontknow"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 7e:85:2a:e5:f4:c8    6  +183 54M
ess       no!  rsn! "xfinitywifi"!
Sep 10 09:00:11 e530c /bsd: iwn0: + 80:2a:a8:57:5e:17    6  +201 54M
ess  privacy   rsn  "SV2"
Sep 10 09:00:11 e530c /bsd: iwn0: - 80:37:73:e0:a3:c4   11  +170 54M
ess  privacy   rsn  "DarneilleGraham-Back"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 82:2a:a8:57:5e:17    6  +203 54M
ess  privacy   rsn  ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - 82:85:2a:e5:f4:c8    6  +182 54M
ess  privacy   rsn  ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - 84:1b:5e:d8:fe:f0    7  +186 54M
ess  privacy   rsn  "Trevman"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 8a:85:2a:e5:f4:c8    6  +182 54M
ess  privacy   rsn! ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - 8c:0f:6f:e7:84:18    1  +191 54M
ess  privacy   rsn  "Elephants1008"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 8c:0f:6f:eb:24:78   11  +201 54M
ess  privacy   rsn  "Cleveland-Bales Casa"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 90:c7:92:4f:28:80    1  +175 54M
ess  privacy   rsn  "HOME-2882"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 92:95:51:c7:c9:db   11  +172 54M
ess  privacy   rsn! ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - 96:0f:6f:e7:84:18    1  +191 54M
ess       no!  rsn! "xfinitywifi"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 96:c7:92:4f:28:80    1  +172 54M
ess       no!  rsn! "xfinitywifi"!
Sep 10 09:00:11 e530c /bsd: iwn0: - 9a:0f:6f:e7:84:18    1  +192 54M
ess  privacy   rsn  ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - 9a:0f:6f:eb:24:78   11  +200 54M
ess  privacy   rsn  ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - 9c:3d:cf:43:74:8a    8  +172 54M
ess  privacy   rsn  "NETGEAR18"!
Sep 10 09:00:11 e530c /bsd: iwn0: - a2:0f:6f:e7:84:18    1  +192 54M
ess  privacy   rsn! ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - a2:0f:6f:eb:24:78   11  +199 54M
ess  privacy   rsn! ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - a4:56:cc:cd:e6:89    1  +174 54M
ess  privacy   rsn  "No ID"!
Sep 10 09:00:11 e530c /bsd: iwn0: - a4:56:cc:cd:e6:8c    1  +172 54M
ess  privacy   rsn  ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - a4:56:cc:cd:e6:8d    1  +173 54M
ess  privacy   rsn! "OutOfService"!
Sep 10 09:00:11 e530c /bsd: iwn0: - a4:56:cc:cd:e6:8e    1  +174 54M
ess  privacy   rsn! ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - aa:93:5b:0c:9b:3d    6  +184 54M
ess       no!  rsn! "xfinitywifi"!
Sep 10 09:00:11 e530c /bsd: iwn0: - ae:93:5b:0c:9b:3d    6  +193 54M
ess  privacy   rsn  ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - b0:39:56:23:b4:6b    8  +172 54M
ess  privacy   rsn  "Meul"!
Sep 10 09:00:11 e530c /bsd: iwn0: - b0:93:5b:0c:9b:3d    6  +191 54M
ess  privacy   rsn  "2wernergals"!
Sep 10 09:00:11 e530c /bsd: iwn0: - b2:93:5b:0c:9b:3d    6  +191 54M
ess  privacy   rsn! "OutOfService"!
Sep 10 09:00:11 e530c /bsd: iwn0: - b6:93:5b:0c:9b:3d    6  +190 54M
ess  privacy   rsn! ""!
Sep 10 09:00:11 e530c /bsd: iwn0: - cc:40:d0:17:22:c3    1  +182 54M
ess  privacy   rsn  "Cleveland-Bales Casa 5_2GEXT"!
Sep 10 09:00:11 e530c /bsd: iwn0: - d8:97:ba:be:19:70    6  +170 54M
ess  privacy   rsn! "CUSPNet_2.4"!
Sep 10 09:00:11 e530c /bsd: iwn0: - e8:37:7a:be:c4:a7    1  +173 54M
ess  privacy   rsn  "CenturyLink3718"!
Sep 10 09:00:11 e530c /bsd: iwn0: SCAN -> AUTH
Sep 10 09:00:11 e530c /bsd: iwn0: sending auth to 80:2a:a8:57:5e:17 on
channel 6 mode 11g
Sep 10 09:00:15 e530c /bsd: iwn0: AUTH -> SCAN
Sep 10 09:00:16 e530c /bsd: iwn0: end active scan
...

This scan repeats about once every 4 seconds.

A few other data points (forgive me if this is obvious...I mostly
wanted to share my general thoughts on what I've look into):
* I wondered if this was triggered by dhcp lease renewal (iwn0 uses
IPv4 dhcp only, bad packet or something), but I have observed the iwn0
driver in the "troubled" state well before my current lease expires.
* timing seems odd : once the iwn0 stops working, it takes a while for
the "scan" log entries (above) to appear in /var/log/messages
(gathering specific data on this now, but it seems to be at least 10s
of minutes).
* I have verified there are no states in the firewall, other than
those marked "SINGLE:NO_TRAFFIC" from my machine attempting to query
DNS.
* During the repeated scans above, `tcpdump` reports no UDP traffic
for this iwn0.

tl;dr -
1.) I still don't know what's triggering this.
2.) When it's triggered, it takes a while to report anything in
/var/log/messages
3.) Once it does, iwn seems to be in a 4-5 second loop, continuously
scanning all APs.
4.) I'm able to reset the driver via "doas ifconfig iwn0 down; doas sh
/etc/netstart iwn0" and it all works again...for a while (see #1).

Thank you for reading this far.  Any other pointers/suggestions?  So
many opportunities to learn. :)

Bryan


On Mon, Sep 9, 2019 at 9:23 AM Stefan Sperling <[hidden email]> wrote:

>
> On Sun, Sep 08, 2019 at 08:31:55PM +0000, Bryan Stenson wrote:
> > Hi all -
> >
> > I'm writing to "misc" rather than "bugs" as I'm not yet sure this is a
> > bug.  I'm hoping to help triage this with assistance from this list.
> >
> > I'm running -CURRENT and the iwn(4) driver for my wireless card.  Over
> > the past year, this has been working great, but recently (within the
> > last month or so), I've had issues where the NIC just stops working
> > after a few hours of usage.  I don't have a solid steps for
> > reproduction.
> >
> > I realize "stops working" is not a very accurate account here...but
> > I'm confused on how to get more descriptive information of the
> > problem.  When it stops, "ifconfig" shows iwn0 with an IP address, but
> > I'm unable to ping.  Additionally, I'm not seeing any
> > warnings/messages in "dmesg" about the device...so I'm confused.
> >
> > A simple "ifconfig iwn0 down; sh /etc/netstart iwn0" seems to fix the
> > problem, but I haven't had to do that in the past...it just feels like
> > a recent change (iwn(4) work?) has put me in this state.
> >
> > I'm really wanting to help here.  How can I run the iwn(4) in debug
> > mode, or increase logging verbosity?  And/or, should I try to capture
> > packets via tcpdump?  And/or, can I run an older bsd.mp (without
> > having to downgrade packages to older versions) in order to try and
> > "bisect" where the problem may have been introduced?
>
> Please enable debugging with 'ifconfig iwn0 debug' and when the problem
> reoccurs check whether you can find any relevant information in the
> file /var/log/messages.

Reply | Threaded
Open this post in threaded view
|

Re: recent troubles with iwn(4)

Stefan Sperling-5
On Wed, Sep 11, 2019 at 12:16:06AM -0700, Bryan Stenson wrote:
> doh...I don't know why I didn't think of that...
>
> Good news, with 'ifconfig iwn0 debug' set, once the strange behavior
> starts, I see LOTS of repeated messages, the pattern happens about
> once every 4 seconds, and dumps the following into /var/log/messages:
>
> ...
> # continuous spamming of /var/log/messages from after the network has
> been in the troubled/failed state for a while

You snipped the exciting part.

I need to know why it decides to do a transition of the form:

        RUN -> something

This should be somewhere at the top of this stream of output.

Reply | Threaded
Open this post in threaded view
|

Re: recent troubles with iwn(4)

Bryan Stenson
sorry about that...here's the most recent one:

Sep 11 06:31:13 e530c /bsd: iwn0: sending probe_req to
80:2a:a8:57:5e:17 on channel 6 mode 11n
Sep 11 06:31:15 e530c ntpd[87584]: DNS lookup tempfail
Sep 11 06:31:16 e530c dhclient[9122]: iwn0: writev(DHCPREQUEST): No
buffer space available
Sep 11 06:31:18 e530c /bsd: iwn0: RUN -> SCAN
Sep 11 06:31:18 e530c /bsd: iwn0: end active scan
Sep 11 06:31:18 e530c /bsd: iwn0: - 00:0d:67:7d:a9:43    1  +172 54M
ess       no!  rsn! "CableWiFi"!
Sep 11 06:31:18 e530c /bsd: iwn0: - 08:86:3b:b6:2f:80    1  +182 54M
ess  privacy   rsn  "belkin.f80"!
...

On Wed, Sep 11, 2019 at 7:53 AM Stefan Sperling <[hidden email]> wrote:

>
> On Wed, Sep 11, 2019 at 12:16:06AM -0700, Bryan Stenson wrote:
> > doh...I don't know why I didn't think of that...
> >
> > Good news, with 'ifconfig iwn0 debug' set, once the strange behavior
> > starts, I see LOTS of repeated messages, the pattern happens about
> > once every 4 seconds, and dumps the following into /var/log/messages:
> >
> > ...
> > # continuous spamming of /var/log/messages from after the network has
> > been in the troubled/failed state for a while
>
> You snipped the exciting part.
>
> I need to know why it decides to do a transition of the form:
>
>         RUN -> something
>
> This should be somewhere at the top of this stream of output.

Reply | Threaded
Open this post in threaded view
|

Re: recent troubles with iwn(4)

Stefan Sperling-5
On Wed, Sep 11, 2019 at 08:01:25AM +0000, Bryan Stenson wrote:
> sorry about that...here's the most recent one:
>
> Sep 11 06:31:13 e530c /bsd: iwn0: sending probe_req to
> 80:2a:a8:57:5e:17 on channel 6 mode 11n

> Sep 11 06:31:18 e530c /bsd: iwn0: RUN -> SCAN

This means it has stopped receiving beacons from your AP.

Did the AP decide to switch its channel?

Reply | Threaded
Open this post in threaded view
|

Re: recent troubles with iwn(4)

Raf Czlonka-2
In reply to this post by Stefan Sperling-5
On Mon, Sep 09, 2019 at 10:38:25AM BST, Stefan Sperling wrote:

> On Mon, Sep 09, 2019 at 10:26:37AM +0100, Raf Czlonka wrote:
> > Your email prompted me today to have a look at what is happening
> > while the laptop loses connectivity - it's been disconnected for 3
> > days after about 30 minutes of being connected (enough time to run
> > 'pkg_add -u'). After putting the interface into debug modes:
> >
> > # ifconfig iwn0 debug
> >
> > /var/log/messages shows:
> >
> > iwn0: AUTH -> SCAN
> > iwn0: end active scan
> > iwn0: - [...]
> > iwn0: - 12:34:56:12:34:56   11  +212 54M ess  privacy   rsn  "MYNWID2"!
> > iwn0: + 12:34:56:12:34:57   44  +203 54M ess  privacy   rsn  "MYNWID5"
> > iwn0: - [...]
> > iwn0: SCAN -> AUTH
> > iwn0: sending auth to 12:34:56:12:34:57 on channel 44 mode 11a
>
> This small part of the log is not useful by itself, unfortunately.
> You need to show debug output where iwn left RUN state in the first place.

After 10 hours of running without issues it had a similar blip but
managed to reconnect fine:

        Sep 10 20:44:35 host /bsd: iwn0: RUN -> AUTH
        Sep 10 20:44:35 host /bsd: iwn0: sending action to 12:34:56:12:34:57 on channel 44 mode 11n
        Sep 10 20:44:35 host /bsd: iwn0: sending action to 12:34:56:12:34:57 on channel 44 mode 11n
        Sep 10 20:44:35 host /bsd: iwn0: sending auth to 12:34:56:12:34:57 on channel 44 mode 11n
        Sep 10 20:44:39 host /bsd: iwn0: AUTH -> SCAN
        Sep 10 20:44:42 host /bsd: iwn0: end active scan
        Sep 10 20:44:42 host /bsd: iwn0: - 12:34:56:12:34:56   11  +214 54M   ess  privacy   rsn  "MYNWID2"!
        Sep 10 20:44:42 host /bsd: iwn0: + 12:34:56:12:34:57   44  +198 54M   ess  privacy   rsn  "MYNWID5"
        Sep 10 20:44:42 host /bsd: iwn0: SCAN -> AUTH
        Sep 10 20:44:42 host /bsd: iwn0: sending auth to 12:34:56:12:34:57 on channel 44 mode 11a
        Sep 10 20:44:47 host /bsd: iwn0: AUTH -> SCAN
        Sep 10 20:44:50 host /bsd: iwn0: end active scan
        Sep 10 20:44:50 host /bsd: iwn0: - 12:34:56:12:34:56   11  +214 54M   ess  privacy   rsn  "MYNWID2"!
        Sep 10 20:44:50 host /bsd: iwn0: + 12:34:56:12:34:57   44  +198 54M   ess  privacy   rsn  "MYNWID5"
        Sep 10 20:44:50 host /bsd: iwn0: SCAN -> AUTH
        Sep 10 20:44:50 host /bsd: iwn0: sending auth to 12:34:56:12:34:57 on channel 44 mode 11a
        Sep 10 20:44:50 host /bsd: iwn0: AUTH -> ASSOC
        Sep 10 20:44:50 host /bsd: iwn0: sending assoc_req to 12:34:56:12:34:57 on channel 44 mode 11a
        Sep 10 20:44:50 host /bsd: iwn0: ASSOC -> RUN
        Sep 10 20:44:50 host /bsd: iwn0: associated with 12:34:56:12:34:57 ssid "MYNWID5" channel 44 start MCS 0 long preamble short slot time HT enabled
        Sep 10 20:44:50 host /bsd: iwn0: missed beacon threshold set to 7 beacons, beacon interval is 100 TU
        Sep 10 20:44:50 host /bsd: iwn0: received msg 1/4 of the 4-way handshake from 12:34:56:12:34:57
        Sep 10 20:44:50 host /bsd: iwn0: sending msg 2/4 of the 4-way handshake to 12:34:56:12:34:57
        Sep 10 20:44:50 host /bsd: iwn0: received msg 3/4 of the 4-way handshake from 12:34:56:12:34:57
        Sep 10 20:44:50 host /bsd: iwn0: sending msg 4/4 of the 4-way handshake to 12:34:56:12:34:57
        Sep 10 20:44:50 host /bsd: iwn0: sending action to 12:34:56:12:34:57 on channel 44 mode 11n
        Sep 10 20:44:51 host last message repeated 3 times
        Sep 10 20:44:54 host /bsd: iwn0: RUN -> AUTH
        Sep 10 20:44:54 host /bsd: iwn0: sending action to 12:34:56:12:34:57 on channel 44 mode 11n
        Sep 10 20:44:54 host /bsd: iwn0: sending auth to 12:34:56:12:34:57 on channel 44 mode 11n
        Sep 10 20:44:59 host /bsd: iwn0: AUTH -> SCAN
        Sep 10 20:45:02 host /bsd: iwn0: end active scan
        Sep 10 20:45:02 host /bsd: iwn0: - 12:34:56:12:34:56   11  +214 54M   ess  privacy   rsn  "MYNWID2"!
        Sep 10 20:45:02 host /bsd: iwn0: + 12:34:56:12:34:57   44  +197 54M   ess  privacy   rsn  "MYNWID5"
        Sep 10 20:45:02 host /bsd: iwn0: SCAN -> AUTH
        Sep 10 20:45:02 host /bsd: iwn0: sending auth to 12:34:56:12:34:57 on channel 44 mode 11a
        Sep 10 20:45:07 host /bsd: iwn0: AUTH -> SCAN
        Sep 10 20:45:10 host /bsd: iwn0: end active scan
        Sep 10 20:45:10 host /bsd: iwn0: - 12:34:56:12:34:56   11  +214 54M   ess  privacy   rsn  "MYNWID2"!
        Sep 10 20:45:10 host /bsd: iwn0: + 12:34:56:12:34:57   44  +197 54M   ess  privacy   rsn  "MYNWID5"
        Sep 10 20:45:10 host /bsd: iwn0: SCAN -> AUTH
        Sep 10 20:45:10 host /bsd: iwn0: sending auth to 12:34:56:12:34:57 on channel 44 mode 11a
        Sep 10 20:45:15 host /bsd: iwn0: AUTH -> SCAN
        Sep 10 20:45:18 host /bsd: iwn0: end active scan
        Sep 10 20:45:18 host /bsd: iwn0: - 12:34:56:12:34:56   11  +208 54M   ess  privacy   rsn  "MYNWID2"!
        Sep 10 20:45:18 host /bsd: iwn0: + 12:34:56:12:34:57   44  +197 54M   ess  privacy   rsn  "MYNWID5"
        Sep 10 20:45:18 host /bsd: iwn0: SCAN -> AUTH
        Sep 10 20:45:18 host /bsd: iwn0: sending auth to 12:34:56:12:34:57 on channel 44 mode 11a
        Sep 10 20:45:18 host /bsd: iwn0: AUTH -> ASSOC
        Sep 10 20:45:18 host /bsd: iwn0: sending assoc_req to 12:34:56:12:34:57 on channel 44 mode 11a
        Sep 10 20:45:18 host /bsd: iwn0: ASSOC -> RUN
        Sep 10 20:45:18 host /bsd: iwn0: associated with 12:34:56:12:34:57 ssid "MYNWID5" channel 44 start MCS 0 long preamble short slot time HT enabled
        Sep 10 20:45:18 host /bsd: iwn0: missed beacon threshold set to 7 beacons, beacon interval is 100 TU
        Sep 10 20:45:18 host /bsd: iwn0: received msg 1/4 of the 4-way handshake from 12:34:56:12:34:57
        Sep 10 20:45:18 host /bsd: iwn0: sending msg 2/4 of the 4-way handshake to 12:34:56:12:34:57
        Sep 10 20:45:18 host /bsd: iwn0: received msg 3/4 of the 4-way handshake from 12:34:56:12:34:57
        Sep 10 20:45:18 host /bsd: iwn0: sending msg 4/4 of the 4-way handshake to 12:34:56:12:34:57
        Sep 10 20:45:18 host /bsd: iwn0: sending action to 12:34:56:12:34:57 on channel 44 mode 11n
        Sep 10 20:45:21 host last message repeated 3 times
        Sep 10 21:45:14 host /bsd: iwn0: received msg 1/2 of the group key handshake from 12:34:56:12:34:57
        Sep 10 21:45:14 host /bsd: iwn0: sending msg 2/2 of the group key handshake to 12:34:56:12:34:57

> > What seems strange there is the mode - after a very brief look, it
> > seems like the laptop is "stuck" trying to use 11a mode.
> >
> > After forcing the 11n mode:
> >
> > # ifconfig iwn0 mode 11n
> >
> > it connects instantaneously:
>
> This command resets everything and the condition which triggered the
> problem is now gone.

Sure, that was only to show how to get iwn to reconnect with the AP.

Was there anything I could have done/checked in that state which
would have given more information? As per above, I've only enabled
the debug mode post factum.

> It connected fine to your 2 GHz AP. You have only shown failing connection
> attempts to your 5 GHz AP. Did your 5 GHz AP ever actually work with iwn?
>

As per above debug log - yes, iwn and my 5 GHz AP work fine together:

iwn0: flags=808847<UP,BROADCAST,DEBUG,RUNNING,SIMPLEX,MULTICAST,AUTOCONF4> mtu 1500
        lladdr 00:11:22:aa:bb:cc
        index 2 priority 4 llprio 3
        groups: wlan egress
        media: IEEE802.11 autoselect (HT-MCS7 mode 11n)
        status: active
        ieee80211: join MYNWID5 chan 44 bssid 12:34:56:12:34:57 -56dBm wpakey wpaprotos wpa2 wpaakms psk wpaciphers ccmp wpagroupcipher ccmp
        inet 10.0.0.12 netmask 0xffffff00 broadcast 10.0.0.255

And no, I don't know whether channel has changed before.

Is this at all useful?

Regards,

Raf