Bufferbloat, FQ-CoDel, and performance

Bufferbloat, FQ-CoDel, and performance

Steven Shockley
I have OpenBSD 6.8 running on a Dell R210-II acting as a
firewall/router.  To combat bufferbloat I tried implementing FQ-CoDel
queueing.  The WAN bandwidth is advertised as 940 Mbit/sec down and 840
Mbit/sec up.

I've tried adding one or the other of these lines to my pf.conf:

queue outq on $ext_if flows 1024 bandwidth 1024M max 1024M qlimit 1024 default
or
queue outq on $ext_if flows 1024 qlimit 1024 default

In both cases, upload speeds drop from ~800 Mbit/sec to < 100 Mbit/sec.
Changing the 1024M to other values makes little or no difference.  To be
fair, bufferbloat does improve, but that's quite a hit.  I'm measuring
using the dslreports.com speed test via wired Ethernet through a Cisco
3750x.

One possible complexity is that the internal interface is tagged VLANs,
but if it were an MTU issue I'd expect it to affect performance across
the board.

Any suggestions?  I'm happy to post dmesg/pf.conf/diagrams if they'd
help.  Thanks.


Re: Bufferbloat, FQ-CoDel, and performance

brian-2


> On Feb 22, 2021, at 8:51 PM, Steven Shockley <[hidden email]> wrote:
>
> I've tried adding one or the other of these lines to my pf.conf:
>
> queue outq on $ext_if flows 1024 bandwidth 1024M max 1024M qlimit 1024 default
> or
> queue outq on $ext_if flows 1024 qlimit 1024 default
> [...]

Hi, I have a connection with similar bandwidth.  I don't have a solution for your problem, but I do want to make one suggestion: don't use a line like your first one.  pf ignores "flows" when the queue also specifies min/max bandwidth, so you won't end up using FQ-CoDel at all.  Do something like this instead, to get the benefit of capping upload bandwidth while still using FQ-CoDel:

queue outq_parent on $ext_if bandwidth 760M max 800M
queue outq parent outq_parent bandwidth 760M flows 1024 qlimit 1024 default

I found I had better results capping upload bandwidth at 10% below my connection’s stated amount (880M in my case).
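
To double-check which discipline you actually got, look at the loaded
queues.  A rough sketch (the output format is from memory, so details
may vary by release):

  # verbose queue listing; a flow queue should report its "flows"
  # parameter alongside the HFSC bandwidth settings
  pfctl -vvsq

If "flows" doesn't show up on your queue, pf quietly fell back to
plain HFSC.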


Best,
Brian


Re: Bufferbloat, FQ-CoDel, and performance

Sebastien Marie-3
On Mon, Feb 22, 2021 at 08:51:32PM -0500, Steven Shockley wrote:

> I have OpenBSD 6.8 running on a Dell R210-II acting as a firewall/router.
> To combat bufferbloat I tried implementing FQ-CoDel queueing.  The WAN
> bandwidth is advertised as 940 Mbit/sec down and 840 Mbit/sec up.
> [...]

Here is what I am doing.

First rule: apply queues only on the real interface, not on a vlan
interface.

Next, some context: I have an ADSL uplink, and my router has only one
physical interface, re0.  The router is connected to the ADSL modem via
a vlan, and a pppoe0 session is built on top of that vlan.  On the local
network side, several vlans exist.

Using http://www.dslreports.com/speedtest/:
- download: 8.140 Mbit/s (90% = 7.326M)
- upload:   0.827 Mbit/s (90% = 0.744M)

I apply a queue on pppoe0 to control outgoing traffic, and a queue on
re0 to control incoming traffic.

  # on pppoe0 : outgoing traffic
  queue rootq  on pppoe0 bandwidth 0.744M max 0.744M
  queue netq   on pppoe0 parent rootq flows 1024 bandwidth 0.744M max 0.744M qlimit 32 default
  queue guessq on pppoe0 parent rootq flows 1024 bandwidth 0.150M max 0.150M qlimit 32

  # on re0 : incoming traffic
  queue rootq  on re0 bandwidth  1G max 1G
  queue stdq   on re0 parent rootq flows 1024 bandwidth   1G     max 1G     qlimit 1024 default
  queue netq   on re0 parent rootq flows 1024 bandwidth   7.326M max 7.326M qlimit   32
  queue guessq on re0 parent rootq flows 1024 bandwidth   0.500M max 1.000M qlimit   16
         
Next, I assign traffic to queues with rules.  (Note that I use the
"group" parameter on interfaces, which is where names like "guess",
"internet", and "with_internet" come from.)

  anchor "outgoing" out on internet received-on with_internet {
    pass out label "outgoing"
    match out set queue netq
    match out received-on guess set queue guessq
  }
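
For completeness: the groups are ordinary interface groups assigned
with ifconfig(8).  A minimal sketch with hypothetical hostname.if(5)
files (the interface names are just examples from my setup):

  # /etc/hostname.vlan5 -- a LAN vlan whose traffic should be throttled
  group guess

  # /etc/hostname.pppoe0 -- the uplink
  group internet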

I hope it helps, even if my network speeds aren't comparable to yours :)

Thanks.
--
Sebastien Marie


Re: Bufferbloat, FQ-CoDel, and performance

Stefan Sperling-5
On Mon, Feb 22, 2021 at 08:51:32PM -0500, Steven Shockley wrote:

> I have OpenBSD 6.8 running on a Dell R210-II acting as a firewall/router.
> To combat bufferbloat I tried implementing FQ-CoDel queueing.  The WAN
> bandwidth is advertised as 940 Mbit/sec down and 840 Mbit/sec up.
> [...]

I've noticed a similar effect on a slower link (VDSL with 50 down / 10 up).
In this case the VDSL modem presents an Ethernet switch, so there is no
pppoe or vlan involved in the box that runs pf.

As soon as I enable this example given in pf.conf(5):

         queue outq on em0 bandwidth 9M max 9M flows 1024 qlimit 1024 \
               default

I see only about 2 or 3 Mbit/s max upload during tcpbench, which is
indeed quite a hit compared to 10M.

Without the queue, tcpbench goes up to 9 Mbit/s, though it varies a lot
between 5 and 9, which I thought might be a factor in my issue with
queueing enabled.

Currently, I am simply running this setup without any queueing.


Re: Bufferbloat, FQ-CoDel, and performance

Todd C. Miller-3
On Tue, 23 Feb 2021 11:29:00 +0100, Stefan Sperling wrote:

> I've noticed a similar effect on a slower link (VDSL with 50 down / 10 up).
> [...]
> I see only about 2 or 3 Mbit/s max upload during tcpbench, which is
> indeed quite a hit compared to 10M.

That's odd.  I haven't had any problems with a VDSL connection with
100 down / 11 up.  My config is very similar to yours:

queue outq on em2 flows 1024 bandwidth 10M max 10M qlimit 1024 default

where em2 is the underlying interface used by pppoe0.  Without queueing
I have major problems when utilizing the upstream bandwidth, probably
due to dropped ACKs.

 - todd


Re: Bufferbloat, FQ-CoDel, and performance

Horia Racoviceanu
I noticed this effect as well.  I'm not sure if it's the right thing to
do, but if a "min" value is added to the hfsc queue, fq_codel will use
the full link bandwidth, e.g.:

queue outq on em0 bandwidth 9M min 1M max 9M flows 1024 qlimit 1024 \
      default

On 2/23/21, Todd C. Miller <[hidden email]> wrote:
> [...]


Re: Bufferbloat, FQ-CoDel, and performance

Jean-Pierre de Villiers
Have you tried running the same test over UDP?  This is done via
tcpbench's "-u" option.

Regards,
Jean-Pierre

On 21/02/23 11:10am, Horia Racoviceanu wrote:

> I noticed this effect as well.  I'm not sure if it's the right thing to
> do, but if a "min" value is added to the hfsc queue, fq_codel will use
> the full link bandwidth, e.g.:
>
> queue outq on em0 bandwidth 9M min 1M max 9M flows 1024 qlimit 1024 \
>       default
> [...]


Re: Bufferbloat, FQ-CoDel, and performance

Stuart Henderson
On 2021-02-23, Steven Shockley <[hidden email]> wrote:
> I have OpenBSD 6.8 running on a Dell R210-II acting as a
> firewall/router.  To combat bufferbloat I tried implementing FQ-CoDel
> queueing.  The WAN bandwidth is advertised as 940 Mbit/sec down and 840
> Mbit/sec up.

Flow queues are broken in 6.8 on interfaces with hw checksum offloading.
Fix is in -current or sys/net/pf.c r1.1096



Re: Bufferbloat, FQ-CoDel, and performance

Stuart Henderson
On 2021-02-23, Stuart Henderson <[hidden email]> wrote:

> Flow queues are broken in 6.8 on interfaces with hw checksum offloading.
> Fix is in -current or sys/net/pf.c r1.1096

Oops, on interfaces *without* hw checksum offloading, like this:

$ ifconfig em0 hwfeatures
em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        hwfeatures=10<VLAN_MTU> hardmtu 9216
..



Re: Bufferbloat, FQ-CoDel, and performance

Horia Racoviceanu
I just tried tcpbench: UDP is OK, only TCP is slow unless I add a
"min" value to the queue.

On 2/23/21, Jean-Pierre de Villiers <[hidden email]> wrote:

> Have you tried running the same test over UDP?  This is done via
> tcpbench's "-u" option.
> [...]


Re: Bufferbloat, FQ-CoDel, and performance

glaess
Hi,

I have the same issue, but only on the kernel pppoe interface:

interface re3  -> vlan 7 -> pppoe0


$ ifconfig re3 hwfeatures
re3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> rdomain 40 mtu 1518
     hwfeatures=8037<CSUM_IPv4,CSUM_TCPv4,CSUM_UDPv4,VLAN_MTU,VLAN_HWTAGGING,WOL> hardmtu 9194
     lladdr 4c:02:89:0d:cb:78
     index 4 priority 0 llprio 3
     media: Ethernet autoselect (1000baseT full-duplex,rxpause,txpause)
     status: active
     inet 192.168.1.250 netmask 0xffffff00 broadcast 192.168.1.255


On my cable connection (vlan5) the queue is working:

interface aggr0 -> vlan5

The parent interfaces for the aggr are also re.


The speeds were measured with the network test on my Xbox; with the
queue on pppoe0, upload drops from 30 Mbit to 9 Mbit.

Holger



On 23.02.21 at 22:04, Stuart Henderson wrote:

> Oops, on interfaces *without* hw checksum offloading, like this:
> [...]


Re: Bufferbloat, FQ-CoDel, and performance

Stefan Sperling-5
On Tue, Feb 23, 2021 at 09:04:13PM -0000, Stuart Henderson wrote:

> Oops, on interfaces *without* hw checksum offloading, like this:
> [...]

Thank you Stuart! This seems to have fixed the issue for me (on a 6.8
system which now also carries the r1.1096 pf.c patch).


Re: Bufferbloat, FQ-CoDel, and performance

Steven Shockley
On 2/23/2021 4:04 PM, Stuart Henderson wrote:
> Oops, on interfaces *without* hw checksum offloading, like this:
>
> $ ifconfig em0 hwfeatures
> em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
> hwfeatures=10<VLAN_MTU> hardmtu 9216
> ..

I can try it, but I don't think it'll help in my case:

bnx0: flags=808843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,AUTOCONF4> mtu 1500
         hwfeatures=26<CSUM_TCPv4,CSUM_UDPv4,VLAN_HWTAGGING> hardmtu 9008

Thanks, though.


Re: Bufferbloat, FQ-CoDel, and performance

Sven F.
On Thu, Feb 25, 2021 at 8:38 PM Steven Shockley
<[hidden email]> wrote:

> I can try it, but I don't think it'll help in my case:
>
> bnx0: flags=808843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,AUTOCONF4> mtu 1500
>          hwfeatures=26<CSUM_TCPv4,CSUM_UDPv4,VLAN_HWTAGGING> hardmtu 9008
> [...]


Can the patch sys/net/pf.c r1.1096 be applied on 6.8, or does it need
other files to be changed as well?

--
Knowing is not enough; we must apply. Willing is not enough; we must do.


Re: Bufferbloat, FQ-CoDel, and performance

Stuart Henderson
On 2021-02-26, Sven F. <[hidden email]> wrote:
> On Thu, Feb 25, 2021 at 8:38 PM Steven Shockley
><[hidden email]> wrote:
>>
>> I can try it, but I don't think it'll help in my case:

It's worth trying anyway, I think.

> Can the patch sys/net/pf.c r1.1096 be applied on 6.8, or does it need
> other files to be changed as well?

It can be applied directly.
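
Roughly, for a source tree checked out under /usr/src on the
OPENBSD_6_8 branch (a sketch, untested here):

  # pull exactly that revision of pf.c from CVS
  cd /usr/src/sys/net
  cvs up -r1.1096 pf.c

  # rebuild and install the kernel as usual
  cd /usr/src/sys/arch/$(machine)/compile/GENERIC.MP
  make obj && make config && make && make install

(cvs up -r leaves a sticky revision on pf.c; cvs up -A pf.c clears it
when you want to follow the tree again.)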



Re: Bufferbloat, FQ-CoDel, and performance

David Higgs
On Fri, Feb 26, 2021 at 3:58 AM Stuart Henderson <[hidden email]>
wrote:

> > Can the patch sys/net/pf.c r1.1096 be applied on 6.8, or does it
> > need other files to be changed as well?
>
> It can be applied directly.

For a month or two my home router has been running 6.8-stable augmented
with this pf.c fix as well as sys/net/fq_codel.c r1.14, without any obvious
performance regressions.

The commit message (and code) on the fq_codel.c fix seemed to indicate that
it might improve performance in much the same way as the pf.c fix, though
perhaps under different circumstances / configurations.
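
If anyone wants to try the same combination, fetching both files works
the same way as for pf.c (a sketch, same caveats as before):

  cd /usr/src/sys/net
  cvs up -r1.1096 pf.c
  cvs up -r1.14 fq_codel.c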

However, I'm not a dev and my opinion isn't worth very much, so YMMV.

--david