TCP Out-of-order packets on a machine behind an OpenBGPd based router

bernd-34
Hi misc,

I'm about to set up two OpenBGPd machines. At the moment they are each
connected to two different upstream providers running OpenBGPd (and
OpenOSPFd on the internal interfaces). Operating system is

OpenBSD test-a.openbgp.bla.com 5.0 GENERIC.MP#0 amd64

(dmesg below)

On a host reserved for testing (CentOS 6.2 x86_64), which sits logically
(seen from the internet) behind those machines, in an otherwise empty
/22, I see weird network problems (tcpdumping traffic on port 25, and
loading it into wireshark for further analysis):

When receiving mail (port 25, plain SMTP, a 3 MiB attachment) from an
external mail server, which comes in via one of the new BGP machines, I
see a large number of 'TCP out of order' messages in wireshark, as well
as 'TCP Dup ACK' messages. This is on the testbed machine itself.

On the OpenBGPd router, where I captured exactly the same traffic,
everything looks perfect.
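
One way to put numbers on the difference between the two captures is to
export the (sequence number, payload length) pairs of the inbound data
segments from each capture and count how many segments start below the
highest byte already seen. The following is only a minimal sketch in
Python: the function name and the sample values are hypothetical, it
ignores sequence number wraparound, and Wireshark's own TCP analysis
uses further heuristics, so the counts need not match its labels exactly.

```python
def count_out_of_order(segments):
    """Count data segments that start below the highest byte already seen.

    segments: iterable of (seq, length) pairs in capture order, taken from
    one direction of a single TCP connection. Sequence number wraparound
    is not handled in this sketch.
    """
    highest_end = None  # sequence number just past the furthest byte seen
    late = 0
    for seq, length in segments:
        if length == 0:
            continue  # pure ACKs carry no payload and are skipped
        if highest_end is not None and seq < highest_end:
            late += 1  # arrived after a segment with a higher sequence number
        end = seq + length
        if highest_end is None or end > highest_end:
            highest_end = end
    return late


# Hypothetical sample data: the same four segments as seen at two points.
router_view = [(1, 1448), (1449, 1448), (2897, 1448), (4345, 1448)]
host_view = [(1, 1448), (2897, 1448), (1449, 1448), (4345, 1448)]

print(count_out_of_order(router_view))
print(count_out_of_order(host_view))
```

If the count is zero on the router and non-zero on the test host for the
same connection, the reordering happens somewhere between the two capture
points.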

There are two Cisco switches sitting between test-a.openbgp.bla.com and
the testbed mail server, all interfaces perfectly clean, no duplex
problems, no underruns, no runts, nothing -- perfect.

Traffic within my AS is also absolutely no problem, the Linux machine
runs here perfectly as well.

Any idea where to look?

Thanks,

Bernd

$ dmesg

OpenBSD 5.0-stable (GENERIC.MP) #0: Mon Mar 19 08:29:55 CET 2012
     [hidden email]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 4285071360 (4086MB)
avail mem = 4156882944 (3964MB)
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.6 @ 0x9f000 (74 entries)
bios0: vendor American Megatrends Inc. version "1.0c" date 05/27/2010
bios0: Supermicro X8SIE
acpi0 at bios0: rev 2
acpi0: sleep states S0 S1 S4 S5
acpi0: tables DSDT FACP APIC MCFG OEMB HPET GSCI SSDT EINJ BERT ERST HEST
acpi0: wakeup devices P0P1(S4) P0P3(S4) P0P4(S4) P0P5(S4) P0P6(S4) BR1E(S4) PS2K(S4) PS2M(S4) USB0(S4) USB1(S4) USB2(S4) USB3(S4) USB4(S4) USB5(S4) USB6(S4) GBE_(S4) BR20(S4) BR21(S4) BR22(S4) BR23(S4) BR24(S4) BR25(S4) BR26(S4) BR27(S4) EUSB(S4) USBE(S4) SLPB(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Xeon(R) CPU X3430 @ 2.40GHz, 2400.35 MHz
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT,NXE,LONG
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: apic clock running at 133MHz
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Xeon(R) CPU X3430 @ 2.40GHz, 2400.00 MHz
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT,NXE,LONG
cpu1: 256KB 64b/line 8-way L2 cache
cpu2 at mainbus0: apid 4 (application processor)
cpu2: Intel(R) Xeon(R) CPU X3430 @ 2.40GHz, 2400.00 MHz
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT,NXE,LONG
cpu2: 256KB 64b/line 8-way L2 cache
cpu3 at mainbus0: apid 6 (application processor)
cpu3: Intel(R) Xeon(R) CPU X3430 @ 2.40GHz, 2400.00 MHz
cpu3: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,POPCNT,NXE,LONG
cpu3: 256KB 64b/line 8-way L2 cache
ioapic0 at mainbus0: apid 7 pa 0xfec00000, version 20, 24 pins
ioapic0: misconfigured as apic 1, remapped to apid 7
acpimcfg0 at acpi0 addr 0xe0000000, bus 0-255
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (P0P1)
acpiprt2 at acpi0: bus 1 (P0P3)
acpiprt3 at acpi0: bus -1 (P0P6)
acpiprt4 at acpi0: bus 7 (BR1E)
acpiprt5 at acpi0: bus 2 (BR20)
acpiprt6 at acpi0: bus 3 (BR24)
acpiprt7 at acpi0: bus 4 (BR25)
acpiprt8 at acpi0: bus 5 (BR26)
acpiprt9 at acpi0: bus 6 (BR27)
acpicpu0 at acpi0: C3, C2, C1, PSS
acpicpu1 at acpi0: C3, C2, C1, PSS
acpicpu2 at acpi0: C3, C2, C1, PSS
acpicpu3 at acpi0: C3, C2, C1, PSS
acpibtn0 at acpi0: SLPB
acpibtn1 at acpi0: PWRB
ipmi at mainbus0 not configured
cpu0: Enhanced SpeedStep 2400 MHz: speeds: 2401, 2400, 2267, 2133, 2000, 1867, 1733, 1600, 1467, 1333, 1200 MHz
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel Core DMI" rev 0x11
ppb0 at pci0 dev 3 function 0 "Intel Core PCIE" rev 0x11: msi
pci1 at ppb0 bus 1
em0 at pci1 dev 0 function 0 "Intel PRO/1000 (82576)" rev 0x01: msi, address 00:1b:21:b7:29:bc
em1 at pci1 dev 0 function 1 "Intel PRO/1000 (82576)" rev 0x01: msi, address 00:1b:21:b7:29:bd
"Intel Core Management" rev 0x11 at pci0 dev 8 function 0 not configured
"Intel Core Scratch" rev 0x11 at pci0 dev 8 function 1 not configured
"Intel Core Control" rev 0x11 at pci0 dev 8 function 2 not configured
"Intel Core Misc" rev 0x11 at pci0 dev 8 function 3 not configured
"Intel Core QPI Link" rev 0x11 at pci0 dev 16 function 0 not configured
"Intel Core QPI Routing" rev 0x11 at pci0 dev 16 function 1 not configured
ehci0 at pci0 dev 26 function 0 "Intel 3400 USB" rev 0x05: apic 7 int 21
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
ppb1 at pci0 dev 28 function 0 "Intel 3400 PCIE" rev 0x05: msi
pci2 at ppb1 bus 2
ppb2 at pci0 dev 28 function 4 "Intel 3400 PCIE" rev 0x05: msi
pci3 at ppb2 bus 3
em2 at pci3 dev 0 function 0 "Intel PRO/1000 MT (82574L)" rev 0x00: msi, address 00:25:90:39:72:44
ppb3 at pci0 dev 28 function 5 "Intel 3400 PCIE" rev 0x05: msi
pci4 at ppb3 bus 4
em3 at pci4 dev 0 function 0 "Intel PRO/1000 MT (82574L)" rev 0x00: msi, address 00:25:90:39:72:45
ppb4 at pci0 dev 28 function 6 "Intel 3400 PCIE" rev 0x05: msi
pci5 at ppb4 bus 5
em4 at pci5 dev 0 function 0 "Intel PRO/1000 MT (82574L)" rev 0x00: msi, address 00:25:90:39:72:46
ppb5 at pci0 dev 28 function 7 "Intel 3400 PCIE" rev 0x05: msi
pci6 at ppb5 bus 6
em5 at pci6 dev 0 function 0 "Intel PRO/1000 MT (82574L)" rev 0x00: msi, address 00:25:90:39:72:47
ehci1 at pci0 dev 29 function 0 "Intel 3400 USB" rev 0x05: apic 7 int 23
usb1 at ehci1: USB revision 2.0
uhub1 at usb1 "Intel EHCI root hub" rev 2.00/1.00 addr 1
ppb6 at pci0 dev 30 function 0 "Intel 82801BA Hub-to-PCI" rev 0xa5
pci7 at ppb6 bus 7
vga1 at pci7 dev 3 function 0 "Matrox MGA G200eW" rev 0x0a
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
pcib0 at pci0 dev 31 function 0 "Intel 3420 LPC" rev 0x05
ahci0 at pci0 dev 31 function 2 "Intel 3400 AHCI" rev 0x05: msi, AHCI 1.3
scsibus0 at ahci0: 32 targets
sd0 at scsibus0 targ 0 lun 0: <ATA, INTEL SSDSA2CT04, 4PC1> SCSI3 0/direct fixed naa.5001517959519da6
sd0: 38166MB, 512 bytes/sector, 78165360 sectors, thin
ichiic0 at pci0 dev 31 function 3 "Intel 3400 SMBus" rev 0x05: apic 7 int 18
iic0 at ichiic0
iic0: addr 0x18 00=00 01=02 02=00 03=00 04=04 05=42 06=10 07=03 08=01 09=00 0a=00 0b=00 words 00=006f 01=020c 02=0000 03=0000 04=0490 05=4204 06=104a 07=0300
iic0: addr 0x1a 00=00 01=02 02=00 03=00 04=04 05=42 06=10 07=03 08=01 09=00 0a=00 0b=00 words 00=006f 01=020c 02=0000 03=0000 04=0490 05=4208 06=104a 07=0300
"eeprom" at iic0 addr 0x50 not configured
"eeprom" at iic0 addr 0x52 not configured
isa0 at pcib0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
wbsio0 at isa0 port 0x2e/2: W83627DHG rev 0x25
lm1 at wbsio0 port 0xa10/8: W83627DHG
mtrr: Pentium Pro MTRR support
uhub2 at uhub0 port 1 "Intel Rate Matching Hub" rev 2.00/0.00 addr 2
uhidev0 at uhub2 port 2 configuration 1 interface 0 "Winbond Electronics Corp Hermon USB hidmouse Device" rev 1.10/0.01 addr 3
uhidev0: iclass 3/1
ums0 at uhidev0: 3 buttons, Z dir
wsmouse0 at ums0 mux 0
uhidev1 at uhub2 port 2 configuration 1 interface 1 "Winbond Electronics Corp Hermon USB hidmouse Device" rev 1.10/0.01 addr 3
uhidev1: iclass 3/1
ukbd0 at uhidev1: 8 modifier keys, 6 key codes
wskbd1 at ukbd0 mux 1
wskbd1: connecting to wsdisplay0
uhub3 at uhub1 port 1 "Intel Rate Matching Hub" rev 2.00/0.00 addr 2
vscsi0 at root
scsibus1 at vscsi0: 256 targets
softraid0 at root
scsibus2 at softraid0: 256 targets
root on sd0a (4d69391e1a09846f.a) swap on sd0b dump on sd0b

Re: TCP Out-of-order packets on a machine behind an OpenBGPd based router

Stuart Henderson
On 2012-05-08, [hidden email] <[hidden email]> wrote:

> Hi misc,
>
> I'm about to set up two OpenBGPd machines. At the moment they are each
> connected to two different upstream providers running OpenBGPd (and
> OpenOSPFd on the internal interfaces). Operating system is
>
> OpenBSD test-a.openbgp.bla.com 5.0 GENERIC.MP#0 amd64
>
> (dmesg below)
>
> On a host reserved for testing (CentOS 6.2 x86_64), which sits logically
> (seen from the internet) behind those machines, in an otherwise empty
> /22, I see weird network problems (tcpdumping traffic on port 25, and
> loading it into wireshark for further analysis):
>
> Receiving mails (port 25, plain SMTP, a 3MiByte attachment) from an
> external mail server, which comes in via one of the new BGP machines, I
> see massive 'TCP out of order' messages in wireshark, as well as 'TCP
> Dup ACK' messages. This is on the testbed machine itself.
>
> On the OpenBGPd router, captured exactly the same traffic, all seems
> perfect.
>
> There are two Cisco switches sitting between test-a.openbgp.bla.com and
> the testbed mail server, all interfaces perfectly clean, no duplex
> problems, no underruns, no runts, nothing -- perfect.
>
> Traffic within my AS is also absolutely no problem, the Linux machine
> runs here perfectly as well.
>
> Any idea where to look?

Is PF in use? If so, have you done anything to make sure that you
aren't running into problems due to the stateful firewall only seeing
half the packets (i.e. inbound via one machine, outbound via the
other)?

(Specifically, if this is happening and unavoidable, you could look
at 'defer' in pfsync, or sloppy states in PF).


>
> Thanks,
>
> Bernd
>
> $ dmesg

[ snipped from quote, but thanks for including it :) ]

Re: TCP Out-of-order packets on a machine behind an OpenBGPd based router

bernd-34
On 2012-05-08 16:02, Stuart Henderson wrote:

> On 2012-05-08, [hidden email] <[hidden email]>
> wrote:
>> Hi misc,
>>
>> I'm about to set up two OpenBGPd machines. At the moment they are
>> each
>> connected to two different upstream providers running OpenBGPd (and
>> OpenOSPFd on the internal interfaces). Operating system is
>>
>> OpenBSD test-a.openbgp.bla.com 5.0 GENERIC.MP#0 amd64
>>
>> (dmesg below)
>>
>> On a host reserved for testing (CentOS 6.2 x86_64), which sits logically
>> (seen from the internet) behind those machines, in an otherwise empty
>> /22, I see weird network problems (tcpdumping traffic on port 25,
>> and
>> loading it into wireshark for further analysis):
>>
>> Receiving mails (port 25, plain SMTP, a 3MiByte attachment) from an
>> external mail server, which comes in via one of the new BGP
>> machines, I
>> see massive 'TCP out of order' messages in wireshark, as well as
>> 'TCP
>> Dup ACK' messages. This is on the testbed machine itself.
>>
>> On the OpenBGPd router, captured exactly the same traffic, all seems
>> perfect.
>>
>> There are two Cisco switches sitting between test-a.openbgp.bla.com
>> and
>> the testbed mail server, all interfaces perfectly clean, no duplex
>> problems, no underruns, no runts, nothing -- perfect.
>>
>> Traffic within my AS is also absolutely no problem, the Linux
>> machine
>> runs here perfectly as well.
>>
>> Any idea where to look?
>
> Is PF in use? if so, have you done anything to make sure that you
> aren't running into problems due to stateful firewall only seeing
> half the packets (i.e. inbound via one machine, outbound via the
> other)?

That's a good point. I do have asymmetric routing at the moment, as only
the (now active) Ciscos announce the /22 in question to the rest of the
world. So ingress traffic crosses my OpenBSD machine, while egress
traffic does not.

However, the problem remains even after I disable PF with 'pfctl -d'.

> (Specifically, if this is happening and unavoidable, you could look
> at 'defer' in pfsync, or sloppy states in PF).
>
>
>>
>> Thanks,
>>
>> Bernd
>>
>> $ dmesg
>
> [ snipped from quote, but thanks for including it :) ]