Strange issue due to stale udp route cache

Strange issue due to stale udp route cache

Yannick Gravel
> Synopsis: Strange issue due to stale udp route cache
> Category: kernel
> Environment:
        System      : OpenBSD 6.5
        Details     : OpenBSD 6.5 (GENERIC) #3: Sat Apr 13 14:42:43 MDT 2019
                         [hidden email]:/usr/src/sys/arch/amd64/compile/GENERIC

        Architecture: OpenBSD.amd64
        Machine     : amd64
> Description:

Problem with UDP Socket when there is a route change.

Back in November 2004, I was deploying VPN hubs with OpenBSD 3.6 using
OpenBGPD for dynamic route distribution and OpenVPN for tunnels.
(The problem is still present in 6.5.)

The VPN part was working fine, but some added services installed were
not running as expected:

- Logging to a central syslog server stopped working on VPN link restart,
  even after the VPN was back up and the route to the central syslog server
  had been added again. Syslog packets were sent through the default route
  even after the more specific route toward the syslog server was restored.

- Similar problem with a DNS server running as slave/secondary master
  on the VPN hub: it stopped fetching its zone in the same context as
  the previous one.

At first, I could not find a solution, so I made a rule to always split
anything router/firewall/VPN from server/services. Not a fix but my way
of staying away from the issue.

For a while, I could not wrap my mind around this. But some reading and
research led me to this explanation.

* On binding a UDP socket, a route entry is cached.
* On a routing fault, the cached route entry is invalidated and replaced
  (by a less specific route).
* The original cached route entry is never restored once the specific route
  is back.
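
The three points above can be illustrated with a small toy model. This is
only a sketch of the behaviour as I understand it, not the kernel code: the
class names (Route, RoutingTable, UdpSocket) are made up for illustration,
and the model only re-does the lookup when the cached entry has been marked
as down.

```python
import ipaddress


class Route:
    """One routing table entry (hypothetical model, not a kernel rtentry)."""

    def __init__(self, net, gateway):
        self.net = net
        self.gateway = gateway
        self.up = True  # cleared when the route is deleted


class RoutingTable:
    """Toy routing table with longest-prefix-match lookup."""

    def __init__(self):
        self.routes = {}

    def add(self, prefix, gateway):
        net = ipaddress.ip_network(prefix)
        self.routes[net] = Route(net, gateway)

    def delete(self, prefix):
        route = self.routes.pop(ipaddress.ip_network(prefix))
        route.up = False  # anyone still holding this entry sees it as invalid

    def lookup(self, dst):
        addr = ipaddress.ip_address(dst)
        matches = [r for r in self.routes.values() if addr in r.net]
        return max(matches, key=lambda r: r.net.prefixlen, default=None)


class UdpSocket:
    """Socket that caches a route and only re-resolves it when it went down."""

    def __init__(self, table):
        self.table = table
        self.cached = None

    def next_hop(self, dst):
        # Assumption of this model: the cache is refreshed only on a "fault",
        # never because a better (more specific) route has appeared.
        if self.cached is None or not self.cached.up:
            self.cached = self.table.lookup(dst)
        return self.cached.gateway if self.cached else None


def demo():
    loghost = "172.31.255.129"
    table = RoutingTable()
    table.add("0.0.0.0/0", "172.16.247.2")             # default route
    table.add("172.31.255.129/32", "172.16.247.129")   # host route
    sock = UdpSocket(table)

    hops = [sock.next_hop(loghost)]                    # host route is cached
    table.delete("172.31.255.129/32")
    hops.append(sock.next_hop(loghost))                # falls back to default
    table.add("172.31.255.129/32", "172.16.247.129")
    hops.append(sock.next_hop(loghost))                # still the default
    return hops


if __name__ == "__main__":
    print(demo())
```

In this model the last send still goes to 172.16.247.2 even though the host
route is back, while a socket created after the route was restored picks
172.16.247.129 right away.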

The issue points to the function in_pcbrtentry and related code in
src/sys/netinet/in_pcb.c

Here is a post from shortly after FreeBSD moved its code away from the
common code that all the BSDs shared. It is a follow-up to a problem report
that, as I now know, describes the same issue I am reporting here.

http://lists.freebsd.org/pipermail/freebsd-current/2004-May/027072.html

Back then the NetBSD code looked very similar to the OpenBSD code, but it
had changed a lot in 2008 when I first investigated this.

> How-To-Repeat:

The following is a simple setup to reproduce the problem.
Don't focus on syslogd but on the kernel and the route cache, because
this could affect other base and third-party applications using UDP.

                                 lo1: 172.31.255.129
+----------+                      +---------------+
|  router  |                      | Syslog server |
+-----+----+                      +-------+-------+
      | 172.16.247.2                      | em0: 172.16.247.129
      | 00:50:56:e3:ff:c4                 | 00:0c:29:eb:85:38
      |                                   |
------+--------------------+--------------+------
                           |
                           | em0: 172.16.247.128
                           | 00:0c:29:23:b7:9b
                   +-------+-------+
                   | OpenBSD Host  |
                   +---------------+

-- Configure Target (Syslog server)

- Enable Syslogd listening to UDP

# echo 'syslogd_flags="-u"' >> /etc/rc.conf.local

- Enable loopback interface

# echo 'inet 172.31.255.129 255.255.255.255 NONE' >> /etc/hostname.lo1

-- Configure Source

- Define loghost

# echo '172.31.255.129 loghost' >> /etc/hosts

- Send logs to loghost

# vi /etc/syslog.conf

- Uncomment the @loghost lines

- Add a host route to the loghost via the syslog server

# route add -host 172.31.255.129 172.16.247.129

- Restart syslogd

# pkill syslogd
# syslogd -a /var/empty/dev/log

---- Test sequence (from source)
(with network packet capture # tcpdump -n -vvv -e port syslog)

# logger "test before removing route"
- Entry received on syslog server

22:46:15.669993 00:0c:29:23:b7:9b 00:0c:29:eb:85:38 0800 94: 172.16.247.128.514 > 172.31.255.129.514: [udp sum ok] udp 52 (ttl 64, id 55483, len 80)

# route delete -host 172.31.255.129 172.16.247.129
# logger "test route removed"
- Entry not received on syslog server

22:46:59.965395 00:0c:29:23:b7:9b 00:50:56:e3:ff:c4 0800 86: 172.16.247.128.514 > 172.31.255.129.514: [udp sum ok] udp 44 (ttl 64, id 47566, len 72)

# route add -host 172.31.255.129 172.16.247.129
# logger "test route added again"
- Entry not received on syslog server

22:47:37.083375 00:0c:29:23:b7:9b 00:50:56:e3:ff:c4 0800 90: 172.16.247.128.514 > 172.31.255.129.514: [udp sum ok] udp 48 (ttl 64, id 33787, len 76)
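
To take syslogd out of the picture, the same sequence can probably be driven
from any long-lived UDP socket. The script below is a minimal sketch of such
a sender; the function name send_probes, the "probe" payload and the command
line arguments are my own choices for illustration. It keeps one socket open
and sends a numbered datagram every second, so the route can be deleted and
added again while tcpdump shows which MAC address each packet goes to.

```python
import socket
import sys
import time


def send_probes(sock, dest, count, interval=0.0, tag="probe"):
    """Send `count` numbered datagrams to `dest`, all on the SAME socket.

    Reusing one socket is the point: whatever route the kernel caches for
    this socket is what every later datagram depends on.
    """
    sent = []
    for seq in range(count):
        payload = f"{tag} {seq}".encode()
        sock.sendto(payload, dest)
        sent.append(payload)
        if interval:
            time.sleep(interval)
    return sent


if __name__ == "__main__":
    if len(sys.argv) == 3:
        # Example: python3 probe.py 172.31.255.129 514
        host, port = sys.argv[1], int(sys.argv[2])
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
            send_probes(s, (host, port), count=120, interval=1.0)
    else:
        print("usage: probe.py host port")
```

While it runs, repeat the "route delete" and "route add" commands from the
test sequence and compare the destination MAC addresses in the capture.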

> Fix:

- Restart the process / daemon.
- Add some route-to rules in pf.conf.
- Do not install such apps on servers involved in dynamic routing.
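
For software you control, a lighter variant of the first workaround may be
to recreate the UDP socket from time to time instead of restarting the whole
process. A new socket starts without a cached route, so, assuming the
explanation above is right, its first send should trigger a fresh lookup.
The sketch below shows the idea; the class name ReopeningUdpSender and the
max_age parameter are hypothetical, and this is untested against the actual
problem.

```python
import socket
import time


class ReopeningUdpSender:
    """UDP sender that replaces its socket once it is older than max_age."""

    def __init__(self, dest, max_age=30.0, clock=time.monotonic):
        self.dest = dest
        self.max_age = max_age      # seconds a socket may be reused
        self.clock = clock          # injectable so it can be tested
        self.sock = None
        self.opened_at = 0.0
        self.reopen_count = 0

    def _fresh_socket(self):
        # Closing the old socket drops whatever state the kernel kept for it.
        if self.sock is not None:
            self.sock.close()
        self.sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        self.opened_at = self.clock()
        self.reopen_count += 1

    def send(self, payload):
        too_old = self.clock() - self.opened_at >= self.max_age
        if self.sock is None or too_old:
            self._fresh_socket()
        return self.sock.sendto(payload, self.dest)

    def close(self):
        if self.sock is not None:
            self.sock.close()
            self.sock = None


if __name__ == "__main__":
    sender = ReopeningUdpSender(("127.0.0.1", 5514), max_age=30.0)
    sender.send(b"hello")
    sender.close()
```

The trade-off is that packets sent between the route change and the next
reopen still follow the stale route, and the source port changes with every
new socket unless the socket is bound explicitly.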

I'm no developer, but there should be events other than a fault that
invalidate a route cache entry. Such events would allow a cached route entry
to be replaced by a more specific route, instead of only ever sliding toward
a more general route and, in the end, the default router.

I have seen slides from 2009 explaining the issue and solution:
http://quigon.bsws.de/papers/dcbsdcon2009/
(slides 59 and 60).


dmesg:
OpenBSD 6.5 (GENERIC) #3: Sat Apr 13 14:42:43 MDT 2019
   [hidden email]:/usr/src/sys/arch/amd64/compile/GENERIC
real mem = 251658240 (240MB)
avail mem = 236101632 (225MB)
warning: no entropy supplied by boot loader
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.6 @ 0xf101f (9 entries)
bios0: vendor BHYVE version "1.00" date 03/14/2014
bios0: bhyve BHYVE
acpi0 at bios0: rev 2
acpi0: sleep states S5
acpi0: tables DSDT APIC FACP HPET MCFG
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD GX-412TC SOC, 998.64 MHz, 16-30-01
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,HV,NXE,MMXX,FFXSR,PAGE1GB,LONG,LAHF,CMPLEG,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,SKINIT,DBKP,PERFTSC,ITSC,BMI1,XSAVEOPT
cpu0: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 2MB 64b/line 16-way L2 cache
cpu0: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu0: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative
cpu0: smt 0, core 1, package 0
mtrr: CPU supports MTRRs but not enabled by BIOS
cpu0: apic clock running at 134MHz
ioapic0 at mainbus0: apid 0 pa 0xfec00000, version 11, 32 pins
acpihpet0 at acpi0: 16777216 Hz
acpimcfg0 at acpi0
acpimcfg0: addr 0xe0000000, bus 0-255
acpiprt0 at acpi0: bus 0 (PC00)
acpipci0 at acpi0 PC00: _OSC failed
acpicmos0 at acpi0
pvbus0 at mainbus0: bhyve
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 unknown vendor 0x1275 product 0x1275 rev 0x00
virtio0 at pci0 dev 4 function 0 "Qumranet Virtio Storage" rev 0x00
vioblk0 at virtio0
scsibus1 at vioblk0: 2 targets
sd0 at scsibus1 targ 0 lun 0: <VirtIO, Block Device, > SCSI3 0/direct fixed
sd0: 8192MB, 512 bytes/sector, 16777216 sectors
virtio0: msix shared
virtio1 at pci0 dev 5 function 0 "Qumranet Virtio Network" rev 0x00
vio0 at virtio1: address 58:9c:fc:ff:aa:aa
virtio1: msix shared
pcib0 at pci0 dev 31 function 0 "Intel 82371SB ISA" rev 0x00
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
com0: console
com1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5 irq 1 irq 12
pckbd0 at pckbc0 (kbd slot)
wskbd0 at pckbd0 mux 1
pms0 at pckbc0 (aux slot)
wsmouse0 at pms0 mux 0
/dev/ksyms: Symbol table not valid.
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
root on sd0a (7b471a0413e650e5.a) swap on sd0b dump on sd0b

usbdevs:
usbdevs: no USB controllers found

pcidump:
Domain /dev/pci0:
0:0:0: unknown unknown
        0x0000: Vendor ID: 1275, Product ID: 1275
        0x0004: Command: 0007, Status: 0010
        0x0008: Class: 06 Bridge, Subclass: 00 Host,
                Interface: 00, Revision: 00
        0x000c: BIST: 00, Header Type: 00, Latency Timer: 00,
                Cache Line Size: 00
        0x0010: BAR empty (00000000)
        0x0014: BAR empty (00000000)
        0x0018: BAR empty (00000000)
        0x001c: BAR empty (00000000)
        0x0020: BAR empty (00000000)
        0x0024: BAR empty (00000000)
        0x0028: Cardbus CIS: 00000000
        0x002c: Subsystem Vendor ID: 0000 Product ID: 0000
        0x0030: Expansion ROM Base Address: 00000000
        0x0038: 00000000
        0x003c: Interrupt Pin: 00 Line: ff Min Gnt: 00 Max Lat: 00
        0x0040: Capability 0x10: PCI Express
                Link Speed: 2.5 / 2.5 GT/s, Link Width: x1 / x1
        0x0100: Enhanced Capability 0x00: Unknown
        0x0000: 12751275 00100007 06000000 00000000
        0x0010: 00000000 00000000 00000000 00000000
        0x0020: 00000000 00000000 00000000 00000000
        0x0030: 00000000 00000040 00000000 000000ff
        0x0040: 00420010 00000000 00000000 00000411
        0x0050: 00110000 00000000 00000000 00000000
        0x0060: 00000000 00000000 00000000 00000000
        0x0070: 00000000 00000000 00000000 00000000
        0x0080: 00000000 00000000 00000000 00000000
        0x0090: 00000000 00000000 00000000 00000000
        0x00a0: 00000000 00000000 00000000 00000000
        0x00b0: 00000000 00000000 00000000 00000000
        0x00c0: 00000000 00000000 00000000 00000000
        0x00d0: 00000000 00000000 00000000 00000000
        0x00e0: 00000000 00000000 00000000 00000000
        0x00f0: 00000000 00000000 00000000 00000000
0:4:0: Qumranet Virtio Storage
        0x0000: Vendor ID: 1af4, Product ID: 1001
        0x0004: Command: 0007, Status: 0010
        0x0008: Class: 01 Mass Storage, Subclass: 00 SCSI,
                Interface: 00, Revision: 00
        0x000c: BIST: 00, Header Type: 00, Latency Timer: 00,
                Cache Line Size: 00
        0x0010: BAR io addr: 0x00002000/0x0040
        0x0014: BAR mem 32bit addr: 0xc0000000/0x00002000
        0x0018: BAR empty (00000000)
        0x001c: BAR empty (00000000)
        0x0020: BAR empty (00000000)
        0x0024: BAR empty (00000000)
        0x0028: Cardbus CIS: 00000000
        0x002c: Subsystem Vendor ID: 1af4 Product ID: 0002
        0x0030: Expansion ROM Base Address: 00000000
        0x0038: 00000000
        0x003c: Interrupt Pin: 01 Line: 05 Min Gnt: 00 Max Lat: 00
        0x0040: Capability 0x11: Extended Message Signalled Interrupts (MSI-X)
                Enabled: yes; table size 2 (BAR 1:0)
        0x004c: Capability 0x05: Message Signalled Interrupts (MSI)
                Enabled: no
        0x0000: 10011af4 00100007 01000000 00000000
        0x0010: 00002001 c0000000 00000000 00000000
        0x0020: 00000000 00000000 00000000 00021af4
        0x0030: 00000000 00000040 00000000 00000105
        0x0040: 80014c11 00000001 00001001 00800005
        0x0050: 00000000 00000000 00000000 00000000
        0x0060: 00000000 00000000 00000000 00000000
        0x0070: 00000000 00000000 00000000 00000000
        0x0080: 00000000 00000000 00000000 00000000
        0x0090: 00000000 00000000 00000000 00000000
        0x00a0: 00000000 00000000 00000000 00000000
        0x00b0: 00000000 00000000 00000000 00000000
        0x00c0: 00000000 00000000 00000000 00000000
        0x00d0: 00000000 00000000 00000000 00000000
        0x00e0: 00000000 00000000 00000000 00000000
        0x00f0: 00000000 00000000 00000000 00000000
0:5:0: Qumranet Virtio Network
        0x0000: Vendor ID: 1af4, Product ID: 1000
        0x0004: Command: 0007, Status: 0010
        0x0008: Class: 02 Network, Subclass: 00 Ethernet,
                Interface: 00, Revision: 00
        0x000c: BIST: 00, Header Type: 00, Latency Timer: 00,
                Cache Line Size: 00
        0x0010: BAR io addr: 0x00002040/0x0020
        0x0014: BAR mem 32bit addr: 0xc0002000/0x00002000
        0x0018: BAR empty (00000000)
        0x001c: BAR empty (00000000)
        0x0020: BAR empty (00000000)
        0x0024: BAR empty (00000000)
        0x0028: Cardbus CIS: 00000000
        0x002c: Subsystem Vendor ID: 1af4 Product ID: 0001
        0x0030: Expansion ROM Base Address: 00000000
        0x0038: 00000000
        0x003c: Interrupt Pin: 01 Line: 06 Min Gnt: 00 Max Lat: 00
        0x0040: Capability 0x11: Extended Message Signalled Interrupts (MSI-X)
                Enabled: yes; table size 3 (BAR 1:0)
        0x004c: Capability 0x05: Message Signalled Interrupts (MSI)
                Enabled: no
        0x0000: 10001af4 00100007 02000000 00000000
        0x0010: 00002041 c0002000 00000000 00000000
        0x0020: 00000000 00000000 00000000 00011af4
        0x0030: 00000000 00000040 00000000 00000106
        0x0040: 80024c11 00000001 00001001 00800005
        0x0050: 00000000 00000000 00000000 00000000
        0x0060: 00000000 00000000 00000000 00000000
        0x0070: 00000000 00000000 00000000 00000000
        0x0080: 00000000 00000000 00000000 00000000
        0x0090: 00000000 00000000 00000000 00000000
        0x00a0: 00000000 00000000 00000000 00000000
        0x00b0: 00000000 00000000 00000000 00000000
        0x00c0: 00000000 00000000 00000000 00000000
        0x00d0: 00000000 00000000 00000000 00000000
        0x00e0: 00000000 00000000 00000000 00000000
        0x00f0: 00000000 00000000 00000000 00000000
0:31:0: Intel 82371SB ISA
        0x0000: Vendor ID: 8086, Product ID: 7000
        0x0004: Command: 0007, Status: 0000
        0x0008: Class: 06 Bridge, Subclass: 01 ISA,
                Interface: 00, Revision: 00
        0x000c: BIST: 00, Header Type: 00, Latency Timer: 00,
                Cache Line Size: 00
        0x0010: BAR empty (00000000)
        0x0014: BAR empty (00000000)
        0x0018: BAR empty (00000000)
        0x001c: BAR empty (00000000)
        0x0020: BAR empty (00000000)
        0x0024: BAR empty (00000000)
        0x0028: Cardbus CIS: 00000000
        0x002c: Subsystem Vendor ID: 0000 Product ID: 0000
        0x0030: Expansion ROM Base Address: 00000000
        0x0038: 00000000
        0x003c: Interrupt Pin: 00 Line: ff Min Gnt: 00 Max Lat: 00
        0x0000: 70008086 00000007 06010000 00000000
        0x0010: 00000000 00000000 00000000 00000000
        0x0020: 00000000 00000000 00000000 00000000
        0x0030: 00000000 00000000 00000000 000000ff
        0x0040: 00000000 00000000 00000000 00000000
        0x0050: 00000000 00000000 00000000 00000000
        0x0060: 80800605 00000000 80808080 00000000
        0x0070: 00000000 00000000 00000000 00000000
        0x0080: 00000000 00000000 00000000 00000000
        0x0090: 00000000 00000000 00000000 00000000
        0x00a0: 00000000 00000000 00000000 00000000
        0x00b0: 00000000 00000000 00000000 00000000
        0x00c0: 00000000 00000000 00000000 00000000
        0x00d0: 00000000 00000000 00000000 00000000
        0x00e0: 00000000 00000000 00000000 00000000
        0x00f0: 00000000 00000000 00000000 00000000

acpidump:
begin-base64 644 APIC.1
QVBJQ1oAAAABZ0JIWVZFIEJWTUFEVCAgAQAAAElOVEwDEBggAADg/gEAAAAACAAAAQAAAAEMAAAA
AMD+AAAAAAIKAAACAAAABQACCgAJCQAAAA8ABAb/BQAB
====
begin-base64 644 DSDT.3
RFNEVCwJAAACnUJIWVZFIEJWRFNEVCAgAQAAAElOVEwDEBggCF9TNV8SBQIKBQAIUElDTQAUDF9Q
SUMBcGhQSUNNEECLX1NCX1uCSIpQQzAwCF9ISUQMQdAKAwhfQURSABQIX0JCTgCkAAhfQ1JTEUYJ
CpKIDQACDAAAAAAAAAAAAAEARwH4DPgMAQiIDQABDAMAAAAA9wwAAPgMiA0AAQwDAAAADf8fAAAA
E4gNAAEMAwAAACB/IAAAgACHFwAADAEAAAAAAAAAwP//H8AAAAAAAAAgAIorAAAMAQAAAAAAAAAA
AAAAANAAAAD//w8A0AAAAAAAAAAAAAAAAAAQAAAAAAB5AAhQUFJUEigCEhIEDP//BAAALklTQV9M
TktBABISBAz//wUAAC5JU0FfTE5LQgAIQVBSVBIaAhILBAz//wQAAAAKEBILBAz//wUAAAAKERQY
X1BSVACgClBJQ02kQVBSVKEGpFBQUlRbgkR4SVNBXwhfQURSDAAAHwBbgExQQ1ICAAsAAVuBM0xQ
Q1IAAEAwUElSQQhQSVJCCFBJUkMIUElSRAgAIFBJUkUIUElSRghQSVJHCFBJUkgIW4ItS0JEXwhf
SElEDEHQAwMIX0NSUxEYChVHAWAAYAABAUcBZABkAAEBIgIAeQBbgi1NT1VfCF9ISUQMQdAPEwhf
Q1JTERgKFUcBYABgAAEBRwFkAGQAAQEiABB5ABQuUElSVgGgCHtoCoAApAB7aAoPYKAHlWAKA6QA
oAeTYAoIpACgB5NgCg2kAKQBW4JKCkxOS0EIX0hJRAxB0AwPCF9VSUQBFBhfU1RBAKAMUElSVlBJ
UkGkCguhBKQKCQhfUFJTEQkKBiP43hh5AAhDQjAxEQkKBiMAABh5AItDQjAxAUNJUkEUKV9DUlMA
e1BJUkEKj2CgDVBJUlZgeQFgQ0lSQaEHcABDSVJBpENCMDEUDV9ESVMAcAqAUElSQRQaX1NSUwGL
aAFTSVJBglNJUkFgcHZgUElSQVuCSwpMTktCCF9ISUQMQdAMDwhfVUlECgIUGF9TVEEAoAxQSVJW
UElSQqQKC6EEpAoJCF9QUlMRCQoGI/jeGHkACENCMDIRCQoGIwAAGHkAi0NCMDIBQ0lSQhQpX0NS
UwB7UElSQgqPYKANUElSVmB5AWBDSVJCoQdwAENJUkKkQ0IwMhQNX0RJUwBwCoBQSVJCFBpfU1JT
AYtoAVNJUkKCU0lSQmBwdmBQSVJCW4JLCkxOS0MIX0hJRAxB0AwPCF9VSUQKAxQYX1NUQQCgDFBJ
UlZQSVJDpAoLoQSkCgkIX1BSUxEJCgYj+N4YeQAIQ0IwMxEJCgYjAAAYeQCLQ0IwMwFDSVJDFClf
Q1JTAHtQSVJDCo9goA1QSVJWYHkBYENJUkOhB3AAQ0lSQ6RDQjAzFA1fRElTAHAKgFBJUkMUGl9T
UlMBi2gBU0lSQ4JTSVJDYHB2YFBJUkNbgksKTE5LRAhfSElEDEHQDA8IX1VJRAoEFBhfU1RBAKAM
UElSVlBJUkSkCguhBKQKCQhfUFJTEQkKBiP43hh5AAhDQjA0EQkKBiMAABh5AItDQjA0AUNJUkQU
KV9DUlMAe1BJUkQKj2CgDVBJUlZgeQFgQ0lSRKEHcABDSVJEpENCMDQUDV9ESVMAcAqAUElSRBQa
X1NSUwGLaAFTSVJEglNJUkRgcHZgUElSRFuCSwpMTktFCF9ISUQMQdAMDwhfVUlECgUUGF9TVEEA
oAxQSVJWUElSRaQKC6EEpAoJCF9QUlMRCQoGI/jeGHkACENCMDURCQoGIwAAGHkAi0NCMDUBQ0lS
RRQpX0NSUwB7UElSRQqPYKANUElSVmB5AWBDSVJFoQdwAENJUkWkQ0IwNRQNX0RJUwBwCoBQSVJF
FBpfU1JTAYtoAVNJUkWCU0lSRWBwdmBQSVJFW4JLCkxOS0YIX0hJRAxB0AwPCF9VSUQKBhQYX1NU
QQCgDFBJUlZQSVJGpAoLoQSkCgkIX1BSUxEJCgYj+N4YeQAIQ0IwNhEJCgYjAAAYeQCLQ0IwNgFD
SVJGFClfQ1JTAHtQSVJGCo9goA1QSVJWYHkBYENJUkahB3AAQ0lSRqRDQjA2FA1fRElTAHAKgFBJ
UkYUGl9TUlMBi2gBU0lSRoJTSVJGYHB2YFBJUkZbgksKTE5LRwhfSElEDEHQDA8IX1VJRAoHFBhf
U1RBAKAMUElSVlBJUkekCguhBKQKCQhfUFJTEQkKBiP43hh5AAhDQjA3EQkKBiMAABh5AItDQjA3
AUNJUkcUKV9DUlMAe1BJUkcKj2CgDVBJUlZgeQFgQ0lSR6EHcABDSVJHpENCMDcUDV9ESVMAcAqA
UElSRxQaX1NSUwGLaAFTSVJHglNJUkdgcHZgUElSR1uCSwpMTktICF9ISUQMQdAMDwhfVUlECggU
GF9TVEEAoAxQSVJWUElSSKQKC6EEpAoJCF9QUlMRCQoGI/jeGHkACENCMDgRCQoGIwAAGHkAi0NC
MDgBQ0lSSBQpX0NSUwB7UElSSAqPYKANUElSVmB5AWBDSVJIoQdwAENJUkikQ0IwOBQNX0RJUwBw
CoBQSVJIFBpfU1JTAYtoAVNJUkiCU0lSSGBwdmBQSVJIW4JIBlNJT18IX0hJRAxB0AwCCF9DUlMR
QgUKTkcBIAIgAgEERwEkAiQCAQSGCQABAAAA4AAAABBHAdAE0AQBAkcBYQBhAAEBRwEABAAEAQhH
AbIAsgABAUcBhACEAAEBRwFyAHIAAQZ5AFuCK0NPTTEIX0hJRAxB0AUBCF9VSUQBCF9DUlMREAoN
RwH4A/gDAQgiEAB5AFuCLENPTTIIX0hJRAxB0AUBCF9VSUQKAghfQ1JTERAKDUcB+AL4AgEIIggA
eQBbgiVSVENfCF9ISUQMQdALAAhfQ1JTERAKDUcBcABwAAECIgABeQBbgitQSUNfCF9ISUQLQdAI
X0NSUxEYChVHASAAIAABAkcBoACgAAECIgQAeQBbgiVUSU1SCF9ISUQMQdABAAhfQ1JTERAKDUcB
QABAAAEEIgEAeQAQOC5fU0JfUEMwMFuCLEhQRVQIX0hJRAxB0AEDCF9VSUQACF9DUlMREQoOhgkA
AQAA0P4ABAAAeQA=
====
begin-base64 644 FACP.2
RkFDUAwBAAAFFkJIWVZFIEJWRkFDUCAgAQAAAElOVEwDEBggwCcPAAAoDwABAAkAsgAAAKChAAAA
BAAAAAAAAAQEAAAAAAAAAAAAAAgEAAAAAAAAAAAAAAQCAAQAAAAAAAAAAAAAAAAAAAAAMhQAACUV
CAABCAAB+QwAAAAAAAAGAAABwCcPAAAAAAAAKA8AAAAAAAEgAAIABAAAAAAAAAEAAAAAAAAAAAAA
AAEQAAIEBAAAAAAAAAEAAAAAAAAAAAAAAAEIAAAAAAAAAAAAAAEgAAMIBAAAAAAAAAEAAAEAAAAA
AAAAAAEAAAAAAAAAAAAAAAEIAAEAAAAAAAAAAAEIAAEAAAAAAAAAAA==
====
begin-base64 644 HPET.4
SFBFVDgAAAABj0JIWVZFIEJWSFBFVCAgAQAAAElOVEwDEBggAQeGgAAAAAAAAND+AAAAAAAAAAE=
====
begin-base64 644 MCFG.5
TUNGRzwAAAABsUJIWVZFIEJWTUNGRyAgAQAAAElOVEwDEBggAAAAAAAAAAAAAADgAAAAAAAAAP8A
AAAA
====
begin-base64 644 RSDT.0
UlNEVDQAAAABw0JIWVZFIEJWUlNEVCAgAQAAAElOVEwDEBggACUPAAAmDwBAJw8AgCcPAA==
====
begin-base64 644 headers
ClJTRCBQVFI6IENoZWNrc3VtPTIwNiwgT0VNSUQ9QkhZVkUsIFJldmlzaW9uPTIsIFJzZHRBZGRy
ZXNzPTB4MDAwZjI0NDAKCUxlbmd0aD0zNiwgWHNkdEFkZHJlc3M9MHgwMDAwMDAwMDAwMGYyNDgw
LCBFeHRlbmRlZCBDaGVja3N1bT00MQoKClJTRFQ6IExlbmd0aD01MiwgUmV2aXNpb249MSwgQ2hl
Y2tzdW09MTk1LAoJT0VNSUQ9QkhZVkUsIE9FTSBUYWJsZSBJRD1CVlJTRFQsIE9FTSBSZXZpc2lv
bj0weDEsCglDcmVhdG9yIElEPUlOVEwsIENyZWF0b3IgUmV2aXNpb249MHgyMDE4MTAwMwoKCglF
bnRyaWVzPXsgMHgwMDBmMjUwMCwgMHgwMDBmMjYwMCwgMHgwMDBmMjc0MCwgMHgwMDBmMjc4MCB9
CgoKQVBJQzogTGVuZ3RoPTkwLCBSZXZpc2lvbj0xLCBDaGVja3N1bT0xMDMsCglPRU1JRD1CSFlW
RSwgT0VNIFRhYmxlIElEPUJWTUFEVCwgT0VNIFJldmlzaW9uPTB4MSwKCUNyZWF0b3IgSUQ9SU5U
TCwgQ3JlYXRvciBSZXZpc2lvbj0weDIwMTgxMDAzCgoKCURTRFQ9MHhmMjgwMAoJSU5UX01PREVM
PUFQSUMKCVNDSV9JTlQ9OQoJU01JX0NNRD0weGIyLCBBQ1BJX0VOQUJMRT0weGEwLCBBQ1BJX0RJ
U0FCTEU9MHhhMSwgUzRCSU9TX1JFUT0weDAKCVBNMWFfRVZUX0JMSz0weDQwMC0weDQwMwoJUE0x
YV9DTlRfQkxLPTB4NDA0LTB4NDA1CglQTTJfVE1SX0JMSz0weDQwOC0weDQwYgoJUF9MVkwyX0xB
VD0wbXMsIFBfTFZMM19MQVQ9MG1zCglGTFVTSF9TSVpFPTAsIEZMVVNIX1NUUklERT0wCglEVVRZ
X09GRlNFVD0wLCBEVVRZX1dJRFRIPTAKCURBWV9BTFJNPTAsIE1PTl9BTFJNPTAsIENFTlRVUlk9
NTAKCUZsYWdzPXtXQklOVkQsUFJPQ19DMSxTTFBfQlVUVE9OLFRNUl9WQUxfRVhUfQoKCkRTRFQ6
IExlbmd0aD0yMzQ4LCBSZXZpc2lvbj0yLCBDaGVja3N1bT0xNTcsCglPRU1JRD1CSFlWRSwgT0VN
IFRhYmxlIElEPUJWRFNEVCwgT0VNIFJldmlzaW9uPTB4MSwKCUNyZWF0b3IgSUQ9SU5UTCwgQ3Jl
YXRvciBSZXZpc2lvbj0weDIwMTgxMDAzCgoKSFBFVDogTGVuZ3RoPTU2LCBSZXZpc2lvbj0xLCBD
aGVja3N1bT0xNDMsCglPRU1JRD1CSFlWRSwgT0VNIFRhYmxlIElEPUJWSFBFVCwgT0VNIFJldmlz
aW9uPTB4MSwKCUNyZWF0b3IgSUQ9SU5UTCwgQ3JlYXRvciBSZXZpc2lvbj0weDIwMTgxMDAzCgoK
TUNGRzogTGVuZ3RoPTYwLCBSZXZpc2lvbj0xLCBDaGVja3N1bT0xNzcsCglPRU1JRD1CSFlWRSwg
T0VNIFRhYmxlIElEPUJWTUNGRywgT0VNIFJldmlzaW9uPTB4MSwKCUNyZWF0b3IgSUQ9SU5UTCwg
Q3JlYXRvciBSZXZpc2lvbj0weDIwMTgxMDAzCgo=
====

Re: Strange issue due to stale udp route cache

Martin Pieuchot
Hello Yannick,

On 15/05/19(Wed) 15:50, Yannick Gravel wrote:

> > Synopsis: Strange issue due to stale udp route cache
> > Category: kernel
> > Environment:
> System      : OpenBSD 6.5
> Details     : OpenBSD 6.5 (GENERIC) #3: Sat Apr 13 14:42:43 MDT 2019
> [hidden email]:/usr/src/sys/arch/amd64/compile/GENERIC
>
> Architecture: OpenBSD.amd64
> Machine     : amd64
> > Description:
>
> Problem with UDP Socket when there is a route change.
>
> Back in november 2004, I was deploying VPN hubs with OpenBSD 3.6 using
> OpenBGPD for dynamic route distribution and OpenVPN for tunnels.
> (Problem is still present in 6.5)
>
> The VPN part was working fine, but some added services installed were
> not running as expected:
>
> - Logging to a central syslog server stopped working on VPN link restart,
>   even after the VPN was back up and the route to the central syslog server
>   added again. Syslog packet were sent through the default route even after
>   the more specific route toward the syslog server was restored.
>
> - Similar problem with a DNS server running as slave/secoundary master
>   running on the VPN hub that stopped fetching it's zone in the same
>   context as the previous one.
>
> At first, I could not find a solution, so I made a rule to always split
> anything router/firewall/VPN from server/services. Not a fix but my way
> of staying away from the issue.
>
> For a while, I could not wrap my mind around this. But some reading and
> research lead me to this explanation.
>
> * On binding a UDP socket a route entry is cached
> * On a routing fault the route entry in cache invalidated and replaced
> (by a less specific rule)
> * The cached route entry is never restored once the specific route is restored
>
> The issue is pointing to the function in_pcbrtentry and related in
> src/sys/netinet/in_pcb.c
>
> Here is an Post shortly after when FreeBSD updated their code away from
> the common BSD code that all BSD shared... A follow-up to a problem report
> that I now know that is the same that what I am reporting here.
>
> http://lists.freebsd.org/pipermail/freebsd-current/2004-May/027072.html
>
> Back then the NetBSD code looked really similar to the OpenBSD code but
> really changed in 2008 when I first investigated this.

What you're describing is a known limitation of the current logic for
caching route entries.  There have been multiple attempts to improve it
but none of them landed in the tree.

Cheers,
Martin