net80211: more steady Tx rate with MiRa (please test)

Stefan Sperling
While working on Tx aggregation I noticed that TCP streams will start
stalling whenever MiRa decides to start sending frames at Tx rates
which the AP tends to fail to receive. This means we're dropping far
too many frames while trying to find an optimal Tx rate to use.

The problem can be observed with tcpbench falling into 0.0 Mbps
occasionally and taking lots of time to recover.
This problem disappeared when I hacked a fixed working Tx rate into
the driver and disabled MiRa.

So I took a closer look at MiRa again and measured the percentage of good/bad
MCS used for transmission. The diff below eliminates a lot of frames being
sent at bad rates and results in more steady overall Tx performance while
testing my WIP Tx aggregation code.

I get my numbers by recording outgoing frame headers in a pcap file while
running tcpbench: tcpdump -n -i iwn0 -y IEEE802_11_RADIO -w /tmp/iwn.pcap
Now I filter this pcap file by MCS index in wireshark, with an expression
such as 'radiotap.mcs.index == 7', and look at the 'Displayed:' percentage
in the bottom right corner of wireshark's window.
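
The percentage described above can also be computed outside of wireshark. The
following is only a minimal sketch: it assumes the MCS index of every captured
frame has already been extracted into an array (how that extraction is done is
left open here), and the function name mcs_share_percent is made up for this
illustration.

```c
#include <stddef.h>

/*
 * Return the share of frames sent at MCS index `mcs`, as a percentage
 * of all n recorded frames. Returns 0.0 when no frames were recorded.
 * `indexes` is a hypothetical array holding one MCS index per frame.
 */
double
mcs_share_percent(const int *indexes, size_t n, int mcs)
{
	size_t i, hits = 0;

	if (n == 0)
		return 0.0;

	for (i = 0; i < n; i++) {
		if (indexes[i] == mcs)
			hits++;
	}

	return 100.0 * (double)hits / (double)n;
}
```

Calling this once per MCS index before and after applying the diff would give
roughly the same comparison as the 'Displayed:' figure.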

Since I am knee-deep in Tx aggregation right now, I would like to delegate
testing of the diff below against plain -current to the community.
If some of you could test the diff below and report back to me I would
appreciate it.
You don't need to get numbers from wireshark for this if you don't want to.
Letting me know if Tx is faster or not and whether there are any perceived
regressions is sufficient.

The drivers affected by this change are athn(4), iwn(4), and iwm(4).
Don't bother testing with any other drivers.
I have tested this diff with iwn(4) only so far.

diff refs/heads/master refs/heads/txagg
blob - 0d14daff4d1c8973b1d1180a9885a25c6754940b
blob + 8a0f3b90198e9707ae08ef53aa4e68706ce2fa15
--- sys/net80211/ieee80211_mira.c
+++ sys/net80211/ieee80211_mira.c
@@ -510,6 +510,12 @@ ieee80211_mira_reset_driver_stats(struct ieee80211_mir
 /* Number of bytes which, alternatively, render a probe valid. */
 #define IEEE80211_MIRA_MIN_PROBE_BYTES IEEE80211_MAX_LEN
 
+/* Number of Tx failures which, alternatively, render a probe valid. */
+#define IEEE80211_MIRA_MAX_PROBE_TXFAIL 1
+
+/* Number of Tx retries which, alternatively, render a probe valid. */
+#define IEEE80211_MIRA_MAX_PROBE_RETRIES 4
+
 int
 ieee80211_mira_next_lower_intra_rate(struct ieee80211_mira_node *mn,
     struct ieee80211_node *ni)
@@ -719,7 +725,9 @@ ieee80211_mira_probe_valid(struct ieee80211_mira_node
  struct ieee80211_mira_goodput_stats *g = &mn->g[ni->ni_txmcs];
 
  return (g->nprobes >= IEEE80211_MIRA_MIN_PROBE_FRAMES ||
-    g->nprobe_bytes >= IEEE80211_MIRA_MIN_PROBE_BYTES);
+    g->nprobe_bytes >= IEEE80211_MIRA_MIN_PROBE_BYTES ||
+    mn->txfail >= IEEE80211_MIRA_MAX_PROBE_TXFAIL ||
+    mn->retries >= IEEE80211_MIRA_MAX_PROBE_RETRIES);
 }
 
 void
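
For readers who want to experiment with the new condition outside the kernel,
here is a standalone sketch of the check the diff produces. It is not the
kernel code: struct probe_stats is a hypothetical flattening of the fields
used in ieee80211_mira_probe_valid(), and the values chosen for
MIN_PROBE_FRAMES and MIN_PROBE_BYTES are placeholders, since the diff does not
show them. Only the two MAX_* thresholds are taken from the diff.

```c
#include <stdint.h>

#define MIN_PROBE_FRAMES	4	/* placeholder, not shown in the diff */
#define MIN_PROBE_BYTES		2346	/* placeholder, not shown in the diff */
#define MAX_PROBE_TXFAIL	1	/* value from the diff */
#define MAX_PROBE_RETRIES	4	/* value from the diff */

/* Hypothetical flattened view of the counters the check looks at. */
struct probe_stats {
	uint32_t nprobes;	/* frames sent at the probed rate */
	uint32_t nprobe_bytes;	/* bytes sent at the probed rate */
	uint32_t txfail;	/* Tx failures seen so far */
	uint32_t retries;	/* Tx retries seen so far */
};

/*
 * A probe counts as valid once enough frames or bytes were sent,
 * or, with the diff applied, as soon as a single Tx failure or
 * four retries have been observed.
 */
int
probe_valid(const struct probe_stats *p)
{
	return (p->nprobes >= MIN_PROBE_FRAMES ||
	    p->nprobe_bytes >= MIN_PROBE_BYTES ||
	    p->txfail >= MAX_PROBE_TXFAIL ||
	    p->retries >= MAX_PROBE_RETRIES);
}
```

The point of the two extra conditions is that a probe at a bad rate is judged
early instead of waiting for the frame or byte minimum to be reached.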

Re: net80211: more steady Tx rate with MiRa (please test)

Matthias Schmidt
Hi Stefan,

* Stefan Sperling wrote:
>
> Since I am knee-deep in Tx aggregation right now, I would like to delegate
> testing of the diff below against plain -current to the community.
> If some of you could test the diff below and report back to me I would
> appreciate it.
> You don't need to get numbers from wireshark for this if you don't want to.
> Letting me know if Tx is faster or not and whether there are any perceived
> regressions is sufficient.

I tested your diff for the last two days and noticed a regression.
After some time one of two things happens:

* Transfer rates drop to 0.  Directly visible if I run tcpbench,
  indirectly if I cannot work with the network any longer.  I waited
  for quite some time (> 10m) for something to happen, however, nothing
  changed.  Then I restarted the interface.
* My Thinkpad completely loses connection to my AP (Fritzbox) such that
  I have to take iwm0 down and run sh /etc/netstart iwm0.

It happens when I work as usual (SSH, email, surfing, etc) and if I do
nothing else than running tcpbench between the Thinkpad and an APU2
running 6.5.

I run the diff with the following hardware on latest -current:

iwm0 at pci2 dev 0 function 0 "Intel Dual Band Wireless-AC 8265" rev 0x78, msi
iwm0: hw rev 0x230, fw ver 22.361476.0, address 7c:2a:31:4d:1c:b9

Cheers

        Matthias

Re: net80211: more steady Tx rate with MiRa (please test)

Stefan Sperling
On Fri, Jun 14, 2019 at 01:01:58PM +0200, Matthias Schmidt wrote:

> Hi Stefan,
>
> * Stefan Sperling wrote:
> >
> > Since I am knee-deep in Tx aggregation right now, I would like to delegate
> > testing of the diff below against plain -current to the community.
> > If some of you could test the diff below and report back to me I would
> > appreciate it.
> > You don't need to get numbers from wireshark for this if you don't want to.
> > Letting me know if Tx is faster or not and whether there are any perceived
> > regressions is sufficient.
>
> I tested your diff for the last two days and noticed a regression.
> After some time one of two things happens:

Are you sure these problems are introduced by this diff?
I am quite certain that these symptoms must be unrelated.

> * Transfer rates drop to 0.  Directly visible if I run tcpbench,
>   indirectly if I cannot work with the network any longer.  I waited
>   for quite some time (> 10m) for something to happen, however, nothing
>   changed.  Then I restarted the interface.
> * My Thinkpad completely loses connection to my AP (Fritzbox) such that
>   I have to take iwm0 down and run sh /etc/netstart iwm0.
>
> It happens when I work as usual (SSH, email, surfing, etc) and if I do
> nothing else than running tcpbench between the Thinkpad and an APU2
> running 6.5.
>
> I run the diff with the following hardware on latest -current:
>
> iwm0 at pci2 dev 0 function 0 "Intel Dual Band Wireless-AC 8265" rev 0x78, msi
> iwm0: hw rev 0x230, fw ver 22.361476.0, address 7c:2a:31:4d:1c:b9
>
> Cheers
>
> Matthias
>

Re: net80211: more steady Tx rate with MiRa (please test)

Matthias Schmidt
Hi,

* Stefan Sperling wrote:

> On Fri, Jun 14, 2019 at 01:01:58PM +0200, Matthias Schmidt wrote:
> > Hi Stefan,
> >
> > * Stefan Sperling wrote:
> > >
> > > Since I am knee-deep in Tx aggregation right now, I would like to delegate
> > > testing of the diff below against plain -current to the community.
> > > If some of you could test the diff below and report back to me I would
> > > appreciate it.
> > > You don't need to get numbers from wireshark for this if you don't want to.
> > > Letting me know if Tx is faster or not and whether there are any perceived
> > > regressions is sufficient.
> >
> > I tested your diff for the last two days and noticed a regression.
> > After some time one of two things happens:
>
> Are you sure these problems are introduced by this diff?
> I am quite certain that these symptoms must be unrelated.

The first problem also shows up without your diff, however, the
reconnect happens a lot faster.  I will spend some more time testing.

Cheers

        Matthias

Re: net80211: more steady Tx rate with MiRa (please test)

Stefan Sperling
On Fri, Jun 14, 2019 at 05:33:41PM +0200, Matthias Schmidt wrote:

> Hi,
>
> * Stefan Sperling wrote:
> > On Fri, Jun 14, 2019 at 01:01:58PM +0200, Matthias Schmidt wrote:
> > > Hi Stefan,
> > >
> > > * Stefan Sperling wrote:
> > > >
> > > > Since I am knee-deep in Tx aggregation right now, I would like to delegate
> > > > testing of the diff below against plain -current to the community.
> > > > If some of you could test the diff below and report back to me I would
> > > > appreciate it.
> > > > You don't need to get numbers from wireshark for this if you don't want to.
> > > > Letting me know if Tx is faster or not and whether there are any perceived
> > > > regressions is sufficient.
> > >
> > > I tested your diff for the last two days and noticed a regression.
> > > After some time one of two things happens:
> >
> > Are you sure these problems are introduced by this diff?
> > I am quite certain that these symptoms must be unrelated.
>
> The first problem also shows up without your diff, however, the
> reconnect happens a lot faster.  I will spend some more time testing.

This diff has no effect on management frames; it only affects the transmit
rate of data frames while in associated state; association state is
kept alive by received frames, not by frames being sent.

I don't see how the diff could be causing either of your issues.
They must have been present already.

Check 'ifconfig iwn0 debug' and see which messages correlate to disconnects.
You're probably running into known issues with background scan (sends deauth
to old AP but never switches to new AP; stays dead until down/up) and/or
dead AP detection (sends probe request to AP, never gets a response, drops
to SCAN state, takes some time to find the AP again, reconnects).
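
To correlate debug messages with disconnects, it can help to pull the state
transitions out of the log. The helper below is only a sketch under one
assumption: that the debug output contains lines ending in something like
"iwm0: RUN -> INIT", with the old and new state separated by " -> ". The
function name parse_transition is invented for this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Extract the old and new state from a log line that contains
 * "OLD -> NEW". Both output buffers must hold `len` bytes.
 * Returns 1 on success, 0 if the line holds no transition or a
 * state name does not fit into the buffers.
 */
int
parse_transition(const char *line, char *from, char *to, size_t len)
{
	const char *arrow, *start, *next;
	size_t n;

	arrow = strstr(line, " -> ");
	if (arrow == NULL)
		return 0;

	/* The old state is the word immediately before the arrow. */
	start = arrow;
	while (start > line && start[-1] != ' ')
		start--;
	n = (size_t)(arrow - start);
	if (n == 0 || n >= len)
		return 0;
	memcpy(from, start, n);
	from[n] = '\0';

	/* The new state is the word immediately after the arrow. */
	next = arrow + 4;
	n = strcspn(next, " \t\r\n");
	if (n == 0 || n >= len)
		return 0;
	memcpy(to, next, n);
	to[n] = '\0';

	return 1;
}
```

A transition out of RUN is the interesting one here; the messages logged just
before it should hint at which of the issues above is being hit.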

Re: net80211: more steady Tx rate with MiRa (please test)

Matthias Schmidt
Hi,

* Stefan Sperling wrote:

>
> This diff has no effect on management frames; it only affects the transmit
> rate of data frames while in associated state; association state is
> kept alive by received frames, not by frames being sent.
>
> I don't see how the diff could be causing either of your issues.
> They must have been present already.
>
> Check 'ifconfig iwn0 debug' and see which messages correlate to disconnects.
> You're probably running into known issues with background scan (sends deauth
> to old AP but never switches to new AP; stays dead until down/up) and/or
> dead AP detection (sends probe request to AP, never gets a response, drops
> to SCAN state, takes some time to find the AP again, reconnects).

Seems you were quite right.  The first disconnect I had today was
related to a firmware error.

Cheers

        Matthias

2019-06-15T10:57:42.422Z sigma /bsd: iwm0: fatal firmware error
2019-06-15T10:57:42.422Z sigma /bsd: iwm0: RUN -> INIT
2019-06-15T10:57:42.592Z sigma /bsd: iwm0: begin active scan
2019-06-15T10:57:42.593Z sigma /bsd: iwm0: INIT -> SCAN
2019-06-15T10:57:45.797Z sigma /bsd: iwm0: end active scan
2019-06-15T10:57:45.797Z sigma /bsd: iwm0: - 00:1e:2a:e1:18:90    6   +17 54M   ess  privacy   rsn  "ChaosUnlimited"!
2019-06-15T10:57:45.798Z sigma /bsd: iwm0: - 12:62:e5:d1:fd:a9    6   +21 54M   ess  privacy   rsn  "DIRECT-A9-HP OfficeJet 5200"!
2019-06-15T10:57:45.798Z sigma /bsd: iwm0: - 1c:3a:de:64:90:5b   13   +18 54M   ess  privacy   rsn  "CelenoInitialAP64905B"!
2019-06-15T10:57:45.798Z sigma /bsd: iwm0: - 38:10:d5:79:a3:4a    6   +21 54M   ess  privacy   rsn  "FRITZ!BS"!
2019-06-15T10:57:45.799Z sigma /bsd: iwm0: - 44:4e:6d:98:c6:17    6   +21 54M   ess  privacy   rsn  "FRITZ!Box 6490 Cable"!
2019-06-15T10:57:45.799Z sigma /bsd: iwm0: - 44:4e:6d:ec:3c:d3    1   +21 54M   ess  privacy   rsn  "FRITZ!Box 6490 Cable"!
2019-06-15T10:57:45.799Z sigma /bsd: iwm0: - 44:fe:3b:10:c5:dc    1   +17 54M   ess       no!  rsn! "Telekom_FON"!
2019-06-15T10:57:45.800Z sigma /bsd: iwm0: - 46:4e:6d:ec:3c:d3    1   +21 54M   ess  privacy   rsn  "FRITZ!"!
2019-06-15T10:57:45.800Z sigma /bsd: iwm0: - 54:67:51:3d:90:46   11   +49 54M   ess  privacy   rsn  "melbourne2016"!
2019-06-15T10:57:45.800Z sigma /bsd: iwm0: - 54:67:51:3d:90:c8   36   +35 54M   ess  privacy   rsn  "melbourne2016"!
2019-06-15T10:57:45.800Z sigma /bsd: iwm0: - 56:67:11:3d:90:46   11   +48 54M   ess  privacy   rsn! "Unitymedia WifiSpot"!
2019-06-15T10:57:45.800Z sigma /bsd: iwm0: - 90:5c:44:24:40:fa  100   +21 54M   ess  privacy   rsn  "UPC6ED3663"!
2019-06-15T10:57:45.801Z sigma /bsd: iwm0: - 90:5c:44:24:41:03   11   +17 54M   ess  privacy   rsn  "UPC6ED3663"!
2019-06-15T10:57:45.801Z sigma /bsd: iwm0: - 90:5c:44:27:c6:1b    6   +17 54M   ess  privacy   rsn  "UPC89142D1"!
2019-06-15T10:57:45.801Z sigma /bsd: iwm0: - 90:5c:44:cf:2c:e6    6   +17 54M   ess  privacy   rsn  "UPCE5AEF49"!
2019-06-15T10:57:45.802Z sigma /bsd: iwm0: - 90:5c:44:db:c8:e5    6   +19 54M   ess  privacy   rsn  "UPC877738E"!
2019-06-15T10:57:45.802Z sigma /bsd: iwm0: - 90:5c:44:dd:72:48    6   +17 54M   ess  privacy   rsn  "UPCA7A2229"!
2019-06-15T10:57:45.802Z sigma /bsd: iwm0: - 92:5c:14:24:41:03   11   +17 54M   ess  privacy   rsn! "Unitymedia WifiSpot"!
2019-06-15T10:57:45.802Z sigma /bsd: iwm0: - ae:22:15:d0:db:bf   11   +17 54M   ess  privacy   rsn! "Unitymedia WifiSpot"!
2019-06-15T10:57:45.803Z sigma /bsd: iwm0: - cc:ce:1e:8b:cf:d1   60   +31 54M   ess  privacy   rsn  "hs.ka.v01d"!
2019-06-15T10:57:45.803Z sigma /bsd: iwm0: + cc:ce:1e:8b:cf:d2    3   +46 54M   ess  privacy   rsn  "karlsruhe.v01d"
2019-06-15T10:57:45.803Z sigma /bsd: iwm0: - d4:21:22:53:3a:9b    1   +17 54M   ess  privacy   rsn  "WLAN-604342"!
2019-06-15T10:57:45.804Z sigma /bsd: iwm0: - e0:28:6d:16:2b:18   11   +21 54M   ess  privacy   rsn  "FRITZ!Box 7490"!
2019-06-15T10:57:45.804Z sigma /bsd: iwm0: SCAN -> AUTH
2019-06-15T10:57:45.804Z sigma /bsd: iwm0: sending auth to cc:ce:1e:8b:cf:d2 on channel 3 mode 11g
2019-06-15T10:57:45.813Z sigma /bsd: iwm0: AUTH -> ASSOC
2019-06-15T10:57:45.813Z sigma /bsd: iwm0: sending assoc_req to cc:ce:1e:8b:cf:d2 on channel 3 mode 11g
2019-06-15T10:57:45.825Z sigma /bsd: iwm0: ASSOC -> AUTH
2019-06-15T10:57:50.549Z sigma /bsd: iwm0: AUTH -> SCAN
2019-06-15T10:57:53.682Z sigma /bsd: iwm0: end active scan
2019-06-15T10:57:53.682Z sigma /bsd: iwm0: - 00:1e:2a:e1:18:90    6   +18 54M   ess  privacy   rsn  "ChaosUnlimited"!
2019-06-15T10:57:53.682Z sigma /bsd: iwm0: - 44:4e:6d:ee:7e:ae    1   +21 54M   ess  privacy   rsn  "FRITZ!Box 6490 Cable"!
2019-06-15T10:57:53.683Z sigma /bsd: iwm0: - 44:fe:3b:42:1d:22   11   +17 54M   ess  privacy   rsn  "WLAN-331218"!
2019-06-15T10:57:53.683Z sigma /bsd: iwm0: - 44:fe:3b:42:1d:24   11   +17 54M   ess       no!  rsn! "Telekom_FON"!
2019-06-15T10:57:53.683Z sigma /bsd: iwm0: - 54:67:51:3d:90:46   11   +56 54M   ess  privacy   rsn  "melbourne2016"!
2019-06-15T10:57:53.683Z sigma /bsd: iwm0: - 56:67:11:3d:90:46   11   +55 54M   ess  privacy   rsn! "Unitymedia WifiSpot"!
2019-06-15T10:57:53.684Z sigma /bsd: iwm0: - 56:67:11:de:44:ae    1   +17 54M   ess  privacy   rsn! "Unitymedia WifiSpot"!
2019-06-15T10:57:53.684Z sigma /bsd: iwm0: - 90:5c:44:cf:2c:e6    6   +17 54M   ess  privacy   rsn  "UPCE5AEF49"!
2019-06-15T10:57:53.684Z sigma /bsd: iwm0: - 90:5c:44:db:c8:e5    6   +20 54M   ess  privacy   rsn  "UPC877738E"!
2019-06-15T10:57:53.684Z sigma /bsd: iwm0: - 90:5c:44:dd:72:48    6   +17 54M   ess  privacy   rsn  "UPCA7A2229"!
2019-06-15T10:57:53.685Z sigma /bsd: iwm0: - 9c:c7:a6:6e:63:ae   11   +17 54M   ess  privacy   rsn  "FRITZ!Box Fon WLAN 7360 SL"!
2019-06-15T10:57:53.685Z sigma /bsd: iwm0: - ac:22:05:1f:cc:ec    6   +17 54M   ess  privacy   rsn  "Wireless Taco Delight"!
2019-06-15T10:57:53.685Z sigma /bsd: iwm0: - ac:22:05:f9:83:ff   11   +17 54M   ess  privacy   rsn  "UPCB6CF764"!
2019-06-15T10:57:53.685Z sigma /bsd: iwm0: - b8:c7:5d:05:5e:a1    1   +17 54M   ess  privacy   rsn  ""!
2019-06-15T10:57:53.685Z sigma /bsd: iwm0: + cc:ce:1e:8b:cf:d2    3   +42 54M   ess  privacy   rsn  "karlsruhe.v01d"
2019-06-15T10:57:53.686Z sigma /bsd: iwm0: - f0:79:59:ce:3a:78   11   +17 54M   ess  privacy   rsn  "Keine W-LAN Netzwerke gefunden"!
2019-06-15T10:57:53.686Z sigma /bsd: iwm0: - fc:75:16:7a:a8:4b    6   +17 54M   ess  privacy   rsn  "Tallulah"!
2019-06-15T10:57:53.686Z sigma /bsd: iwm0: SCAN -> AUTH
2019-06-15T10:57:53.687Z sigma /bsd: iwm0: sending auth to cc:ce:1e:8b:cf:d2 on channel 3 mode 11g
2019-06-15T10:57:53.692Z sigma /bsd: iwm0: AUTH -> ASSOC
2019-06-15T10:57:53.692Z sigma /bsd: iwm0: sending assoc_req to cc:ce:1e:8b:cf:d2 on channel 3 mode 11g
2019-06-15T10:57:53.700Z sigma /bsd: iwm0: ASSOC -> RUN
2019-06-15T10:57:53.701Z sigma /bsd: iwm0: associated with cc:ce:1e:8b:cf:d2 ssid "karlsruhe.v01d" channel 3 start MCS 0 short preamble short slot time HT enabled
2019-06-15T10:57:53.701Z sigma /bsd: iwm0: missed beacon threshold set to 7 beacons, beacon interval is 100 TU
2019-06-15T10:57:53.709Z sigma /bsd: iwm0: received msg 1/4 of the 4-way handshake from cc:ce:1e:8b:cf:d2
2019-06-15T10:57:53.710Z sigma /bsd: iwm0: sending msg 2/4 of the 4-way handshake to cc:ce:1e:8b:cf:d2
2019-06-15T10:57:53.724Z sigma /bsd: iwm0: received msg 3/4 of the 4-way handshake from cc:ce:1e:8b:cf:d2
2019-06-15T10:57:53.726Z sigma /bsd: iwm0: sending msg 4/4 of the 4-way handshake to cc:ce:1e:8b:cf:d2
2019-06-15T10:57:53.735Z sigma /bsd: iwm0: sending action to cc:ce:1e:8b:cf:d2 on channel 3 mode 11n
2019-06-15T10:58:07.401Z sigma /bsd: iwm0: sending action to cc:ce:1e:8b:cf:d2 on channel 3 mode 11n