bnxt(4), myx(4), vr(4): refill timeouts: timeout_add(..., 0) -> timeout_add(..., 1)

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

bnxt(4), myx(4), vr(4): refill timeouts: timeout_add(..., 0) -> timeout_add(..., 1)

Scott Cheloha
Hi,

I want to schedule timeouts more lazily from softclock() in order to
spend less time processing timeouts that will be cancelled and avoid
double-processing timeouts that will fire at the next softclock()
anyway.

This means introducing an additional timeout queue, probably named
"timeout_new", and appending timeouts to that queue from
timeout_add(9) instead of appending them to "timeout_todo" as we do
today.

However, before I do that, I need to make sure I'm not breaking
drivers that rely on the ability to spin the softclock():

There are a handful of timeouts in the tree that set a timeout of zero
ticks.  Currently these will fire *immediately* if they are added
during an ongoing softclock().  I hope that these are just typos, but
I need to be sure.

Here's a first batch of conversions: rx refill timeouts for bnxt(4),
myx(4), and vr(4).  All of these can run during a softclock().  Will
changing these to one tick break these drivers?

If not, ok?

Index: pci/if_myx.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/if_myx.c,v
retrieving revision 1.108
diff -u -p -r1.108 if_myx.c
--- pci/if_myx.c 3 Jul 2019 10:34:59 -0000 1.108
+++ pci/if_myx.c 16 Jan 2020 17:37:58 -0000
@@ -1792,7 +1792,7 @@ myx_rxeof(struct myx_softc *sc)
  if_rxr_put(&mrr->mrr_rxr, rxfree[ring]);
  myx_rx_fill(sc, mrr);
  if (mrr->mrr_prod == mrr->mrr_cons)
- timeout_add(&mrr->mrr_refill, 0);
+ timeout_add(&mrr->mrr_refill, 1);
  }
 }
 
Index: pci/if_bnxt.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/if_bnxt.c,v
retrieving revision 1.21
diff -u -p -r1.21 if_bnxt.c
--- pci/if_bnxt.c 3 Sep 2019 09:00:44 -0000 1.21
+++ pci/if_bnxt.c 16 Jan 2020 17:37:58 -0000
@@ -1306,7 +1306,7 @@ bnxt_intr(void *xsc)
  bnxt_rx_fill(sc);
  if ((sc->sc_rx_cons == sc->sc_rx_prod) ||
     (sc->sc_rx_ag_cons == sc->sc_rx_ag_prod))
- timeout_add(&sc->sc_rx_refill, 0);
+ timeout_add(&sc->sc_rx_refill, 1);
 
  if_input(&sc->sc_ac.ac_if, &ml);
  }
Index: pci/if_vr.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/if_vr.c,v
retrieving revision 1.153
diff -u -p -r1.153 if_vr.c
--- pci/if_vr.c 22 Jan 2017 10:17:38 -0000 1.153
+++ pci/if_vr.c 16 Jan 2020 17:37:58 -0000
@@ -807,7 +807,7 @@ vr_fill_rx_ring(struct vr_softc *sc)
 
  if_rxr_put(&sc->sc_rxring, slots);
  if (if_rxr_inuse(&sc->sc_rxring) == 0)
- timeout_add(&sc->sc_rxto, 0);
+ timeout_add(&sc->sc_rxto, 1);
 }
 
 /*

Reply | Threaded
Open this post in threaded view
|

Re: bnxt(4), myx(4), vr(4): refill timeouts: timeout_add(..., 0) -> timeout_add(..., 1)

Hrvoje Popovski
On 16.1.2020. 18:45, Scott Cheloha wrote:
> Here's a first batch of conversions: rx refill timeouts for bnxt(4),
> myx(4), and vr(4).  All of these can run during a softclock().  Will
> changing these to one tick break these drivers?

Hi all,

i tried this diff with myx and performance are the same as with plain
kernel.

myx0 at pci4 dev 0 function 0 "Myricom Z8E" rev 0x01: msi, model
10G-PCIE2-8BL2-2S, address 00:60:dd:45:ba:f8
myx1 at pci5 dev 0 function 0 "Myricom Z8E" rev 0x01: msi, model
10G-PCIE2-8BL2-2S, address 00:60:dd:45:ba:f9

Reply | Threaded
Open this post in threaded view
|

Re: bnxt(4), myx(4), vr(4): refill timeouts: timeout_add(..., 0) -> timeout_add(..., 1)

Scott Cheloha
> On Jan 20, 2020, at 4:47 AM, Hrvoje Popovski <[hidden email]> wrote:
>
> On 16.1.2020. 18:45, Scott Cheloha wrote:
>> Here's a first batch of conversions: rx refill timeouts for bnxt(4),
>> myx(4), and vr(4).  All of these can run during a softclock().  Will
>> changing these to one tick break these drivers?
>
> Hi all,
>
> i tried this diff with myx and performance are the same as with plain
> kernel.
>
> myx0 at pci4 dev 0 function 0 "Myricom Z8E" rev 0x01: msi, model
> 10G-PCIE2-8BL2-2S, address 00:60:dd:45:ba:f8
> myx1 at pci5 dev 0 function 0 "Myricom Z8E" rev 0x01: msi, model
> 10G-PCIE2-8BL2-2S, address 00:60:dd:45:ba:f9

Appreciate the testing.

Given what dlg@ has said in the past I think there should only be a
performance change in a livelock situation.

What metric did you use to compare performance?

I think you could induce a livelock with tcpbench(8) or iperf3 from ports.
CC'd bluhm@, who would know more.

To be clear, I'm of the opinion that we should not be spinning the softclock()
unless the alternative is palpably reduced performance for a significant group
of users.  However, I don't want to tread on network performance without analysis
and buy-in from the people who work on that, hence all the CCs.

Probably missing a few CCs, but I figured I'd start with the API and driver
authors.

Reply | Threaded
Open this post in threaded view
|

Re: bnxt(4), myx(4), vr(4): refill timeouts: timeout_add(..., 0) -> timeout_add(..., 1)

Hrvoje Popovski
On 20.1.2020. 17:40, Scott Cheloha wrote:
> Appreciate the testing.

np, i like testing network stuff :)

> Given what dlg@ has said in the past I think there should only be a
> performance change in a livelock situation.

yeah, that could be problem with this testing ...
kern.netlivelocks=6 after few minutes of sending 14Mpps through myx :)

> What metric did you use to compare performance?

i'm generating 64byte udp packets with pktgen through openbsd box (plain
forwarding only)
https://www.kernel.org/doc/Documentation/networking/pktgen.txt

on linux box i'm counting pps on interface from which i'm sending
packets to openbsd box and on interface on which i'm receiving packets
from openbsd box ..

on box with 12 x Xenon CPU E5-2620 v2 @ 2.10GHz, 2400.01 MHz, 06-3e-04
with or without this diff i'm getting 890Kpps of plain forwarding