load balancing outgoing traffic with 4 uplinks

Thomas Huber
Hi misc,

this is again about my OpenBSD -stable load-balancer setup on an APU2 board:

I have four ADSL uplinks provided by two different ISPs:
- pppoe0 runs on em0, is directly connected to a modem, and has a
static IP address from the ISP.
- pppoe1 runs over vlan2 via em1 to a managed switch (switch1), to
which a dedicated bridge modem with a dynamic IP from the ISP is
connected.
- vlan[3|4] run over em1 to switch1 and on to two router modems,
which handle the PPPoE connection themselves.

I did not manage (although I thought I had) to run PPPoE within OpenBSD
for the third and fourth uplinks, which is why it is set up like this.
See here for that issue: https://marc.info/?l=openbsd-misc&m=155277213709648

On the LAN-side I have
vlan32 (10.10.10.0/24) and
vlan64 (10.64.0.0/10) via em2 to another managed switch (switch2).

As further information, this is a hotel setup:
vlan32 is internal (office computers, VoIP and gear)
vlan64 is guest Wi-Fi with a UniFi controller and 10 APs
with ~20-100 connected devices.

The hostname.pppoeX and hostname.vlanX files look like this:

$hostname.pppoe0
inet 0.0.0.0 255.255.255.255 NONE \
        pppoedev em0 authproto pap authname 'xxx' authkey 'xx' up
dest 0.0.0.1
!/sbin/route add -mpath default -ifp pppoe0 0.0.0.1

$hostname.vlan3:
dhcp vlan 3 vlandev em1
!/sbin/route add -mpath default -ifp vlan3 192.168.3.1

$hostname.vlan4:
dhcp vlan 4 vlandev em1
!/sbin/route add -mpath default -ifp vlan4 192.168.4.1

All of pppoe[0|1] and vlan[3|4] are successfully connected to the ISP
or router modem, and due to the -mpath default routes added by the
!/sbin/route commands, all interfaces are in the egress interface group:

# ifconfig egress
pppoe0: flags=8851<UP,POINTOPOINT,RUNNING,SIMPLEX,MULTICAST> mtu 1492
        index 7 priority 0 llprio 3
        dev: em0 state: session
        sid: 0x185 PADI retries: 0 PADR retries: 0 time: 1107d 16:47:37
        sppp: phase network authproto pap authname "my-first-adsl-username"
        groups: pppoe egress
        status: active
        inet 79.140.xxx.xxx --> 62.27.xxx.xxx netmask 0xffffffff
pppoe1: flags=8851<UP,POINTOPOINT,RUNNING,SIMPLEX,MULTICAST> mtu 1492
        index 8 priority 0 llprio 3
        dev: vlan2 state: session
        sid: 0x186 PADI retries: 0 PADR retries: 0 time: 1107d 16:47:37
        sppp: phase network authproto pap authname "my-second-adsl-username"
        groups: pppoe egress
        status: active
        inet6 fe80::98f8:2562:d5f3:23a3%pppoe1 ->  prefixlen 64 scopeid 0x8
        inet 85.212.xxx.xxx --> 62.27.xxx.xxx netmask 0xffffffff
vlan4: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:0d:b9:43:43:b5
        index 42 priority 0 llprio 3
        encap: vnetid 4 parent em1
        groups: vlan egress
        media: Ethernet autoselect (1000baseT full-duplex,rxpause,txpause)
        status: active
        inet 192.168.4.2 netmask 0xffffff00 broadcast 192.168.4.255
vlan3: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:0d:b9:43:43:b5
        index 43 priority 0 llprio 3
        encap: vnetid 3 parent em1
        groups: vlan egress
        media: Ethernet autoselect (1000baseT full-duplex,rxpause,txpause)
        status: active
        inet 192.168.3.2 netmask 0xffffff00 broadcast 192.168.3.255



# route show -gateway -inet
Routing tables

Internet:
Destination        Gateway            Flags   Refs      Use   Mtu  Prio Iface
default            192.168.4.1        UGSP       8    30803     -     8 vlan4
default            62.27.93.140       UGSP       0        2     -     8 pppoe0
default            62.27.93.143       UGSP       0        4     -     8 pppoe1
default            192.168.3.1        UGSP       0        3     -     8 vlan3
base-address.mcast localhost          URS        0        0 32768     8 lo0
10.64/10           10.64.0.1          UCn       50        0     -     4 vlan64
10.10.10/24        10.10.10.1         UCn        8        4     -     4 vlan32
10.10.10.255       10.10.10.1         UHb        0        0     -     1 vlan32
10.127.255.255     10.64.0.1          UHb        0        1     -     1 vlan64
62.27.93.140       79.140.177.216     UHh        1        1     -     8 pppoe0
62.27.93.143       55d4e174.access.ec UHh        1        1     -     8 pppoe1
79.140.177.216     79.140.177.216     UHl        0     2779     -     1 pppoe0
55d4e174.access.ec 55d4e174.access.ec UHl        0     1657     -     1 pppoe1
127/8              localhost          UGRS       0        0 32768     8 lo0
localhost          localhost          UHhl      13     2010 32768     1 lo0
192.168.3/24       192.168.3.2        UCn        1        0     -     4 vlan3
192.168.3.255      192.168.3.2        UHb        0        0     -     1 vlan3
192.168.4/24       192.168.4.2        UCn        1        2     -     4 vlan4
192.168.4.255      192.168.4.2        UHb        0        0     -     1 vlan4


I would like to achieve the following:
1. roughly even usage of the 4 ADSL uplinks
2. prioritize VoIP traffic over vlan32 traffic over vlan64 traffic
3. ssh should always be available through the static IP on pppoe0
4. vlan32 (internal) should not be reachable from vlan64 (hotel guests)

To do so, I mostly followed /faq/pf/pools.html, with the following change:
I assume that almost all traffic in my setup is https these days,
so I don't see the point in two different pass in rules for https and
non-https.
To address the problem with "secure" web applications*
I use the source-hash method for nat-to and route-to.

This is my working pf.conf to do the load balancing across
the two pppoe interfaces:

# cat /etc/pf_pppoe.conf


int_if = "{ vlan32, vlan64 }"
int_lan = "{ 10.10.10.0/24, 10.64.0.0/10}"

table <martians> { 0.0.0.0/8 10.0.0.0/8 127.0.0.0/8 169.254.0.0/16     \
                   172.16.0.0/12 192.0.0.0/24 192.0.2.0/24 224.0.0.0/3 \
                   192.168.0.0/16 198.18.0.0/15 198.51.100.0/24        \
                   203.0.113.0/24 }
set block-policy drop
#set loginterface egress
set skip on lo0

match in all scrub (no-df random-id max-mss 1440)
match out on pppoe from $int_lan nat-to (pppoe) source-hash #least-states sticky-address

# VOIP Prio
match on vlan32 proto { tcp udp } to port { 5060 5064 } set prio 7
match on vlan32 proto udp from port 11780:12780 set prio 7

#Internal prio
match on vlan32 set prio 5

block in quick on pppoe from <martians> to any
block return out quick on pppoe from any to <martians>
block in
pass quick on vlan32 to vlan32:network
pass quick on vlan64 to vlan64:network
pass out on egress

block return in on vlan from vlan64:network to vlan32:network #no guests to office
block return in on vlan inet proto tcp from any to any port 25 #avoid spam out

pass in on $int_if route-to { (pppoe0 pppoe0:network), \
        (pppoe1 pppoe1:network) } source-hash

#these lines are commented out because everything seems to work with the
#source-hash method
#pass out on pppoe0 from pppoe1 route-to (pppoe1 pppoe1:network)
#pass out on pppoe1 from pppoe0 route-to (pppoe0 pppoe0:network)

pass in on egress inet proto icmp all
pass in on pppoe0 proto tcp from any to (pppoe0) port ssh



Basically everything works, but I notice some strange things:

1. Sometimes the traffic is not evenly distributed between the uplinks.
My guess is that this is due to the source-hash method, which
(if I understand correctly) distributes traffic per IP and not per
connection.
When I use [round-robin | least-states] sticky-address I have problems
with my VoIP.
And maybe some guests have problems with "secure" web apps* too.
Does anybody have an idea how to do proper load balancing with almost
only https traffic?

2. I tried to customize these rules to also include vlan[3|4] in the
load balancing.
2.1. use the egress group instead of the pppoe group for nat-to:

match out on egress from $int_lan nat-to (egress) source-hash

2.2. add vlan[3|4] to the route-to rule:

pass in on $int_if route-to { (pppoe0 pppoe0:network), \
 (pppoe1 pppoe1:network), \
 (vlan3 vlan3:network), (vlan4 vlan4:network) } source-hash

But it didn't work: no internet connection from vlan32 and vlan64.


3. ping with the -I flag behaves strangely.
To see if my uplinks are working I used to run:
# ping -I [assigned or static IP] 8.8.8.8
Sometimes it works for one destination and not for another, like:
# ping -I [my static IP] 8.8.8.8 works
# ping -I [my static IP] 1.1.1.1 doesn't work
# ping 1.1.1.1 works

# ping -I [dynamic IP] 8.8.8.8 doesn't work
# ping -I [dynamic IP] 1.1.1.1 works
# ping 8.8.8.8 works

I don't have any clue about this or where to look besides the routing table.
This problem is a bit odd, because it stops me from properly investigating
the issue. ping from the vlan IP to the vlan gateway works fine:

# ping -I 192.168.3.2 192.168.3.1
PING 192.168.3.1 (192.168.3.1): 56 data bytes
64 bytes from 192.168.3.1: icmp_seq=0 ttl=64 time=1.475 ms
64 bytes from 192.168.3.1: icmp_seq=1 ttl=64 time=0.719 ms
64 bytes from 192.168.3.1: icmp_seq=2 ttl=64 time=0.762 ms

# ping -I 192.168.4.2 192.168.4.1
PING 192.168.4.1 (192.168.4.1): 56 data bytes
64 bytes from 192.168.4.1: icmp_seq=0 ttl=64 time=0.828 ms
64 bytes from 192.168.4.1: icmp_seq=1 ttl=64 time=0.834 ms


4. My static IP is not always reachable from the outside.
One day it works, the next day it doesn't.
I guess this could be a problem with the update of the dynamic IPs,
but this is only a guess, based on the fact that they are updated every 24h.
Beyond that, I don't know where to look or investigate further here either.

Hope someone has a clue on this...
Thanks in advance and all the best

Thomas

*) when writing "secure" in quotation marks, please understand it
as in the example at /faq/pf/pools.html

Re: load balancing outgoing traffic with 4 uplinks

Stuart Henderson
On 2019-03-23, Thomas Huber <[hidden email]> wrote:
> I would like to achieve the following:
> 1. roughly even usage of the 4 ADSL uplinks
> 2. prioritize VoIP traffic over vlan32 traffic over vlan64 traffic
> 3. ssh should always be available through the static IP on pppoe0
> 4. vlan32 (internal) should not be reachable from vlan64 (hotel guests)


> 1. Sometimes the traffic is not evenly distributed between the uplinks.
> My guess is that this is due to the source-hash method, which
> (if I understand correctly) distributes traffic per IP and not per
> connection.
> When I use [round-robin | least-states] sticky-address I have problems
> with my VoIP.
> And maybe some guests have problems with "secure" web apps* too.
> Does anybody have an idea how to do proper load balancing with almost
> only https traffic?

The only way you're likely to do better is to tunnel the traffic
to another machine on decent bandwidth using a multilink protocol
that knows how to deal with this - mlvpn comes to mind (it's in
packages).

> 2. I tried to customize these rules to also include vlan[3|4] in the
> load balancing.
> 2.1. use the egress group instead of the pppoe group for nat-to:
>
> match out on egress from $int_lan nat-to (egress) source-hash
>
> 2.2. add vlan[3|4] to the route-to rule:
>
> pass in on $int_if route-to { (pppoe0 pppoe0:network), \
>  (pppoe1 pppoe1:network), \
>  (vlan3 vlan3:network), (vlan4 vlan4:network) } source-hash
>
> But it didn't work: no internet connection from vlan32 and vlan64.

It's been a long time since I had to do this but at least you'll need to
nat on each pppoe interface individually to the correct address for that
interface. e.g. "match out on pppoe0 from ... nat-to (pppoe0)"

What you are doing now will rewrite the address to *one* of the egress
interface addresses. Which will only be correct if the packet is being
sent out of the interface with that address.
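
A minimal sketch of the per-interface NAT described above, reusing the
$int_lan macro from the original pf.conf. Extending it to vlan3 and vlan4
is an assumption based on the stated goal of balancing over all four
uplinks; it is untested against this setup.

```
# One nat-to rule per uplink, so the translated source address always
# belongs to the interface the packet actually leaves on.
# The parentheses make pf follow address changes on the dynamic links.
match out on pppoe0 from $int_lan nat-to (pppoe0)
match out on pppoe1 from $int_lan nat-to (pppoe1)
match out on vlan3  from $int_lan nat-to (vlan3)
match out on vlan4  from $int_lan nat-to (vlan4)
```

Each rule only matches traffic leaving its own interface, so whichever
uplink the route-to rule selects, the packet should be translated to that
uplink's address.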

> 3. ping with the -I flag behaves strangely.
> To see if my uplinks are working I used to run:
> # ping -I [assigned or static IP] 8.8.8.8
> Sometimes it works for one destination and not for another, like:
> # ping -I [my static IP] 8.8.8.8 works
> # ping -I [my static IP] 1.1.1.1 doesn't work
> # ping 1.1.1.1 works
>
> # ping -I [dynamic IP] 8.8.8.8 doesn't work
> # ping -I [dynamic IP] 1.1.1.1 works
> # ping 8.8.8.8 works

I never came up with a satisfying way to do this. The dirty method is to
find some specific "always on" addresses, direct one to one ISP,
another to another ISP, etc., and ping those.

There's another method of diverting traffic over multiple ISPs,
using multiple route tables + rdomains, but the selector in PF is a
bit simpler. To achieve balancing you can use the "probability"
modifier, but there's no stickiness, so you are likely to have the
problem with VoIP and banks etc.
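
A rough sketch of the routing-table variant mentioned above, under several
assumptions that are not from the thread: alternate routing tables 1 to 3
already exist and each holds a default route via one uplink (for instance
added with route(8)'s -T option), table 0 keeps the remaining uplink, and
NAT is done per interface as suggested earlier. The percentages rely on
pf's last-matching-rule-wins evaluation to give each table roughly a
quarter of new connections.

```
# Baseline: routing table 0 is used unless a later rule matches.
pass in on $int_if
# The last matching rule wins, so the later rules override the earlier ones.
# Net effect: about 25% of new connections end up in each table.
pass in on $int_if probability 50% rtable 1
pass in on $int_if probability 33% rtable 2
pass in on $int_if probability 25% rtable 3
```

As noted in the reply, nothing here keeps a given client on the same
uplink across connections, so the VoIP and "secure" web application
problems are likely to reappear.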



Re: load balancing outgoing traffic with 4 uplinks

Thomas Huber
I just read some tutorials and (again) the _great_ "Book of PF" and
simplified my pf.conf, and everything still works fine:

int_if = "{ vlan32, vlan64 }"
int_lan = "{ 10.10.10.0/24, 10.64.0.0/10}"
table <martians> { 0.0.0.0/8 10.0.0.0/8 127.0.0.0/8 169.254.0.0/16     \
                   172.16.0.0/12 192.0.0.0/24 192.0.2.0/24 224.0.0.0/3 \
                   192.168.0.0/16 198.18.0.0/15 198.51.100.0/24        \
                   203.0.113.0/24 }

set block-policy drop
#set loginterface egress
set skip on lo0
match in all scrub (no-df random-id max-mss 1440)
match out on pppoe from $int_lan nat-to (pppoe)

# VOIP Prio
match on vlan32 proto { tcp udp } to port { 5060 5064 } set prio 7
match on vlan32 proto udp from port 11780:12780 set prio 7

#Internal prio
match on vlan32 set prio 5

block in quick on pppoe from <martians> to any
block return out quick on pppoe from any to <martians>

block in
pass out on egress
pass quick on vlan32 to vlan32:network
pass quick on vlan64 to vlan64:network

block return in on vlan from vlan64:network to vlan32:network #no guests to office
block return in on vlan inet proto tcp from any to any port 25 #avoid spam out

pass in on $int_if route-to { (pppoe0 pppoe0:network), \
        (pppoe1 pppoe1:network) } least-states sticky-address

pass in on egress inet proto icmp all
pass in on pppoe0 proto tcp from any to (pppoe0) port ssh



> > 1. Sometimes the traffic is not evenly distributed between the uplinks.
> > My guess is that this is due to the source-hash method, which
> > (if I understand correctly) distributes traffic per IP and not per
> > connection.
> > When I use [round-robin | least-states] sticky-address I have problems
> > with my VoIP.
> > And maybe some guests have problems with "secure" web apps* too.
> > Does anybody have an idea how to do proper load balancing with almost
> > only https traffic?
>
> The only way you're likely to do better is to tunnel the traffic
> to another machine on decent bandwidth using a multilink protocol
> that knows how to deal with this - mlvpn comes to mind (it's in
> packages).
>

I guess a clean and simple solution here would be a
"Provider Independent" IPv6 range and multipath routing, or am I missing
something with this concept?


> > 2. I tried to customize these rules to also include vlan[3|4] in the
> > load balancing.
> > 2.1. use the egress group instead of the pppoe group for nat-to:
> >
> > match out on egress from $int_lan nat-to (egress) source-hash
> >
> > 2.2. add vlan[3|4] to the route-to rule:
> >
> > pass in on $int_if route-to { (pppoe0 pppoe0:network), \
> >  (pppoe1 pppoe1:network), \
> >  (vlan3 vlan3:network), (vlan4 vlan4:network) } source-hash
> >
> > But it didn't work: no internet connection from vlan32 and vlan64.
>
> It's been a long time since I had to do this but at least you'll need to
> nat on each pppoe interface individually to the correct address for that
> interface. e.g. "match out on pppoe0 from ... nat-to (pppoe0)"
>
> What you are doing now will rewrite the address to *one* of the egress
> interface addresses. Which will only be correct if the packet is being
> sent out of the interface with that address.
>
> > 3. ping with the -I flag behaves strangely.
> > To see if my uplinks are working I used to run:
> > # ping -I [assigned or static IP] 8.8.8.8
> > Sometimes it works for one destination and not for another, like:
> > # ping -I [my static IP] 8.8.8.8 works
> > # ping -I [my static IP] 1.1.1.1 doesn't work
> > # ping 1.1.1.1 works
> >
> > # ping -I [dynamic IP] 8.8.8.8 doesn't work
> > # ping -I [dynamic IP] 1.1.1.1 works
> > # ping 8.8.8.8 works
>
> I never came up with a satisfying way to do this. The dirty method is to
> find some specific "always on" addresses, direct one to one ISP,
> another to another ISP, etc., and ping those.
>
> There's another method of diverting traffic over multiple ISPs,
> using multiple route tables + rdomains, but the selector in PF is a
> bit simpler. To achieve balancing you can use the "probability"
> modifier, but there's no stickiness, so you are likely to have the
> problem with VoIP and banks etc.
>

I just need that for testing purposes and was not sure if I was doing
something wrong with the ping command. But it doesn't seem so.
It does seem, though, that this issue is related to my ssh connection issue.
I tried to connect from somewhere else. It didn't work directly, but
going through my openbsd.amsterdam VM worked...
kind of strange.


So basically, I'm back to getting the pppoe connections for uplinks
3 and 4 up within the OpenBSD box, and I hope that load balancing works
when extending the pf rules with these two (pppoe2, pppoe3) interfaces.

Thanks again Stuart and everybody else.
Thomas