IPv6 fragmentation woes

Laurent Caron (Mobile)
Hi,

Setup:
OpenBSD 5.9 box
Network interface: ix (Intel 1G/10G X520)

ix0: flags=18843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST,MPSAFE> mtu 1500
         lladdr 90:e2:ba:ba:c5:cc
         priority: 0
         media: Ethernet autoselect (1000baseT full-duplex,rxpause,txpause)
         status: active

vlan4: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
         lladdr 90:e2:ba:ba:c5:cc
         priority: 0
         vlan: 4 parent interface: ix0
         groups: vlan
         status: active
         inet 37.49.236.145 netmask 0xfffffe00 broadcast 37.49.237.255
         inet6 fe80::92e2:baff:feba:c5cc%vlan4 prefixlen 64 scopeid 0x12
         inet6 2001:7f8:54::145 prefixlen 64

When sending a ping6 to a destination not accepting fragmented packets,
I experience loss with "big" (but < 1500 bytes) packets.

% ping6 -s 1234 2001:7f8:54::250

Ex:
14:03:07.959532 2001:7f8:54::145 > 2001:7f8:54::250: frag
(0xbfb11fea:1232@0+) icmp6: echo request
14:03:07.959536 2001:7f8:54::145 > 2001:7f8:54::250: frag
(0xbfb11fea:10@1232)
14:03:07.960179 2001:7f8:54::250 > 2001:7f8:54::145: icmp6: echo reply

IPv4, however, is fine, as is IPv6 with ping6 -s $SIZE when $SIZE <= 1232.

Do you guys have any idea about it ?

Thanks

Re: IPv6 fragmentation woes

Christoph Viethen-2
Hello,

on 17.05.2016 14:05, Laurent CARON wrote:

> When sending a ping6 to a destination not accepting fragmented
> packets, I experience loss with "big" (but < 1500 bytes) packets.
>
> % ping6 -s 1234 2001:7f8:54::250
>
> Ex:
> 14:03:07.959532 2001:7f8:54::145 > 2001:7f8:54::250: frag
> (0xbfb11fea:1232@0+) icmp6: echo request
> 14:03:07.959536 2001:7f8:54::145 > 2001:7f8:54::250: frag
> (0xbfb11fea:10@1232)
> 14:03:07.960179 2001:7f8:54::250 > 2001:7f8:54::145: icmp6: echo reply

Sorry, just been to lunch and probably my brain is off ... but where
exactly is the loss here? You are getting a reply, right? Maybe the
receiver did receive all the fragments, figured the path MTU is large
enough and sent you back a non-fragmented echo reply?

And what do you mean by saying "destination not accepting fragmented
packets"? Do all the fragments reach the destination (and the
destination then has to figure out what to do with them)? Or does some
middlebox (I'm thinking switches / routers) touch the traffic,
discarding e.g. non-initial fragments?

Can you do tcpdump or tshark etc. on the source as well as on the
destination box to see what exactly gets through and whether your
problem is sender/destination-related or rather middlebox-related?


Oh, and what exactly caused the fragmentation in the first place? Your
local MTU (1500) is larger than the 1282 bytes this packet needs (1234
bytes of data + 8 bytes ICMPv6 header + 40 bytes IPv6 header), so path
MTU discovery must have kicked in somewhere on the way. Shouldn't you
have gotten ICMPv6 Packet Too Big (type 2) messages on ..::145? Those
should be visible in tcpdump, but you are not showing any.
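
For what it's worth, the fragment sizes in the capture can be checked by hand. Below is a minimal sketch, assuming standard header sizes (40-byte IPv6 header, 8-byte ICMPv6 echo header, 8-byte Fragment header) and a sender that fragments to a given MTU. The function names are made up for illustration; this is not how any particular kernel implements it.

```python
IPV6_HEADER = 40     # fixed IPv6 header, bytes
ICMPV6_HEADER = 8    # ICMPv6 echo header: type, code, checksum, id, seq
FRAG_HEADER = 8      # IPv6 Fragment extension header


def echo_packet_size(data_bytes):
    """Size of the unfragmented IPv6 packet for `ping6 -s data_bytes`."""
    return IPV6_HEADER + ICMPV6_HEADER + data_bytes


def fragment(data_bytes, mtu):
    """Return [(length, offset, more)] in the style of tcpdump's len@offset[+].

    An empty list means the packet fits in `mtu` and is sent whole.
    """
    fragmentable = ICMPV6_HEADER + data_bytes
    if IPV6_HEADER + fragmentable <= mtu:
        return []
    # Payload per fragment, rounded down to a multiple of 8 bytes.
    per_frag = (mtu - IPV6_HEADER - FRAG_HEADER) // 8 * 8
    frags = []
    offset = 0
    while offset < fragmentable:
        length = min(per_frag, fragmentable - offset)
        more = offset + length < fragmentable
        frags.append((length, offset, more))
        offset += length
    return frags


print(echo_packet_size(1234))   # 1282
print(fragment(1234, 1500))     # []
print(fragment(1234, 1280))     # [(1232, 0, True), (10, 1232, False)]
```

With an MTU of 1280 (the IPv6 minimum) this reproduces the 1232@0+ and 10@1232 pair from the capture, whereas a 1500-byte MTU would need no fragmentation at all.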


I tried "ping6 2001:7f8:54::250" myself right now. Seems that my ICMP
echo request goes through unfragmented, whereas I'm getting two
fragments (first: 1232 bytes) back. So on the path from ...::145 to
me, fragmentation seems to happen, but not in the other direction.
IMHO, this means that ..::145 must have successfully performed PMTUD.


Cheers,

   Christoph

--
  [hidden email]

Re: IPv6 fragmentation woes

Christoph Viethen-2
> I tried "ping6 2001:7f8:54::250" myself right now.

Sorry, that should have read:

"I tried

$ ping6 -s 1234 2001:7f8:54::145  "

(I did send the packets to ..::145 , not the other one)

Re: IPv6 fragmentation woes

Laurent Caron (Mobile)
Hi,

Back to this issue:

Setup:
Source: Linux box: 2a02:27d0:100:115:6000::200
Destination: OpenBSD 5.9-stable box: 2a02:27d0:116::3

Source#:  ping6 -M do -s 1232 2a02:27d0:100:114::3
PING 2a02:27d0:100:114::3(2a02:27d0:100:114::3) 1232 data bytes
1240 bytes from 2a02:27d0:100:114::3: icmp_seq=1 ttl=63 time=0.224 ms
...
1240 bytes from 2a02:27d0:100:114::3: icmp_seq=4 ttl=63 time=0.274 ms
^C
--- 2a02:27d0:100:114::3 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2998ms
rtt min/avg/max/mdev = 0.203/0.241/0.274/0.034 ms

Destination#: tcpdump -ni trunk0 host 2a02:27d0:100:115:6000::200
tcpdump: listening on trunk0, link-type EN10MB
16:26:22.236667 2a02:27d0:100:115:6000::200 > 2a02:27d0:100:114::3:
icmp6: echo request [flowlabel 0x9d84e]
16:26:22.236684 2a02:27d0:100:114::3 > 2a02:27d0:100:115:6000::200:
icmp6: echo reply
16:26:23.235712 2a02:27d0:100:115:6000::200 > 2a02:27d0:100:114::3:
icmp6: echo request [flowlabel 0x9d84e]
16:26:23.235732 2a02:27d0:100:114::3 > 2a02:27d0:100:115:6000::200:
icmp6: echo reply
16:26:24.234770 2a02:27d0:100:115:6000::200 > 2a02:27d0:100:114::3:
icmp6: echo request [flowlabel 0x9d84e]
16:26:24.234786 2a02:27d0:100:114::3 > 2a02:27d0:100:115:6000::200:
icmp6: echo reply


Now when increasing to 1233 data bytes:

Source#: ping6 -M do -s 1233 2a02:27d0:100:114::3
PING 2a02:27d0:100:114::3(2a02:27d0:100:114::3) 1233 data bytes
1241 bytes from 2a02:27d0:100:114::3: icmp_seq=1 ttl=63 time=0.212 ms
...
1241 bytes from 2a02:27d0:100:114::3: icmp_seq=12 ttl=63 time=0.232 ms
^C
--- 2a02:27d0:100:114::3 ping statistics ---
12 packets transmitted, 12 received, 0% packet loss, time 10998ms
rtt min/avg/max/mdev = 0.206/0.236/0.342/0.043 ms

Destination#: tcpdump -ni trunk0 host 2a02:27d0:100:115:6000::200
16:28:23.922257 2a02:27d0:100:115:6000::200 > 2a02:27d0:100:114::3:
icmp6: echo request [flowlabel 0x9d84e]
16:28:23.922284 2a02:27d0:100:114::3 > 2a02:27d0:100:115:6000::200: frag
(0x11cd00dd:1232@0+) icmp6: echo reply
16:28:23.922289 2a02:27d0:100:114::3 > 2a02:27d0:100:115:6000::200: frag
(0x11cd00dd:9@1232)
16:28:24.921229 2a02:27d0:100:115:6000::200 > 2a02:27d0:100:114::3:
icmp6: echo request [flowlabel 0x9d84e]
16:28:24.921256 2a02:27d0:100:114::3 > 2a02:27d0:100:115:6000::200: frag
(0x1be2a72d:1232@0+) icmp6: echo reply
16:28:24.921260 2a02:27d0:100:114::3 > 2a02:27d0:100:115:6000::200: frag
(0x1be2a72d:9@1232)
16:28:25.920252 2a02:27d0:100:115:6000::200 > 2a02:27d0:100:114::3:
icmp6: echo request [flowlabel 0x9d84e]
16:28:25.920290 2a02:27d0:100:114::3 > 2a02:27d0:100:115:6000::200: frag
(0x15850f70:1232@0+) icmp6: echo reply
16:28:25.920294 2a02:27d0:100:114::3 > 2a02:27d0:100:115:6000::200: frag
(0x15850f70:9@1232)
16:28:26.919167 2a02:27d0:100:115:6000::200 > 2a02:27d0:100:114::3:
icmp6: echo request [flowlabel 0x9d84e]
16:28:26.919194 2a02:27d0:100:114::3 > 2a02:27d0:100:115:6000::200: frag
(0x30d81daa:1232@0+) icmp6: echo reply
16:28:26.919200 2a02:27d0:100:114::3 > 2a02:27d0:100:115:6000::200: frag
(0x30d81daa:9@1232)


Sounds like replies are fragmented.
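
To double-check that the fragments add up to the full reply, here is a small sketch that sums the frag entries per fragment ID. It assumes the OpenBSD tcpdump notation seen above, `frag (id:len@offset[+])`, and copes with the line wrapping introduced by mail. The function name and regular expression are my own, not part of any tool.

```python
import re
from collections import defaultdict

# Matches e.g. "frag (0x11cd00dd:1232@0+)", even when wrapped after "frag".
FRAG_RE = re.compile(r"frag\s*\((0x[0-9a-f]+):(\d+)@(\d+)(\+?)\)")


def reassembled_sizes(capture_text):
    """Map fragment ID -> total fragmentable bytes, for complete sets only."""
    frags = defaultdict(list)
    for frag_id, length, offset, more in FRAG_RE.findall(capture_text):
        frags[frag_id].append((int(offset), int(length), more == "+"))

    totals = {}
    for frag_id, parts in frags.items():
        parts.sort()
        expected = 0
        complete = True
        for offset, length, _more in parts:
            if offset != expected:   # gap or overlap: a fragment is missing
                complete = False
                break
            expected += length
        # The last fragment must not have the "more fragments" flag set.
        if complete and not parts[-1][2]:
            totals[frag_id] = expected
    return totals


sample = """\
16:28:23.922284 2a02:27d0:100:114::3 > 2a02:27d0:100:115:6000::200: frag
(0x11cd00dd:1232@0+) icmp6: echo reply
16:28:23.922289 2a02:27d0:100:114::3 > 2a02:27d0:100:115:6000::200: frag
(0x11cd00dd:9@1232)
"""
print(reassembled_sizes(sample))  # {'0x11cd00dd': 1241}
```

The 1241 bytes match the "1241 bytes from" lines in the ping6 output (1233 bytes of data plus the 8-byte ICMPv6 header), so nothing is lost; the reply is simply split in two.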

Please note trunk0 is composed of one em and one bnx interface:

trunk0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
         lladdr 00:1b:21:b5:2a:8d
         priority: 0
         trunk: trunkproto lacp
         trunk id: [(8000,00:1b:21:b5:2a:8d,4074,0000,0000),
                  (007F,64:87:88:bc:db:00,0006,0000,0000)]
                 trunkport bnx2 active,collecting,distributing
                 trunkport em3 active,collecting,distributing
         groups: trunk
         media: Ethernet autoselect
         status: active

Am I mistaken on something, or is this behavior perfectly normal ?

Please note # tracepath6 from the linux box to the openbsd one reports:
Resume: pmtu 1500 hops 2 back 2

Thanks

Laurent

Re: IPv6 fragmentation woes

Laurent Caron (Mobile)
Hi,

After some more tests:

Source: Linux machine with IPv6: 2a02:27d0:0:5e0d:1a03:73ff:feba:50b4
Destination: Linux machine with IPv6: 2a02:27d0:0:5e0d:428d:5cff:fea5:501e

source# ping6 -M do -s 1300 2a02:27d0:0:5e0d:428d:5cff:fea5:501e
destination# tcpdump -ni enp3s0 host 2a02:27d0:0:5e0d:1a03:73ff:feba:50b4
09:38:55.735387 IP6 2a02:27d0:0:5e0d:1a03:73ff:feba:50b4 > 2a02:27d0:0:5e0d:428d:5cff:fea5:501e: ICMP6, echo request, seq 1, length 1308
09:38:55.735826 IP6 2a02:27d0:0:5e0d:428d:5cff:fea5:501e > 2a02:27d0:0:5e0d:1a03:73ff:feba:50b4: ICMP6, echo reply, seq 1, length 1308
09:38:56.736882 IP6 2a02:27d0:0:5e0d:1a03:73ff:feba:50b4 > 2a02:27d0:0:5e0d:428d:5cff:fea5:501e: ICMP6, echo request, seq 2, length 1308
09:38:56.736998 IP6 2a02:27d0:0:5e0d:428d:5cff:fea5:501e > 2a02:27d0:0:5e0d:1a03:73ff:feba:50b4: ICMP6, echo reply, seq 2, length 1308
...

Source: Linux machine with IPv6: 2a02:27d0:0:5e0d:1a03:73ff:feba:50b4
Destination machine OpenBSD 5.8 with IPv6: 2a02:27d0:0:5e0c:d6be:d9ff:fe95:a236

source# ping6 -M do -s 1300 2a02:27d0:0:5e0c:d6be:d9ff:fe95:a236
destination# tcpdump -ni bge0 host 2a02:27d0:0:5e0d:1a03:73ff:feba:50b4
17:58:03.040184 2a02:27d0:0:5e0d:1a03:73ff:feba:50b4 > 2a02:27d0:0:5e0c:d6be:d9ff:fe95:a236: icmp6: echo request [flowlabel 0x467ca]
17:58:03.040295 2a02:27d0:0:5e0c:d6be:d9ff:fe95:a236 > 2a02:27d0:0:5e0d:1a03:73ff:feba:50b4: frag (0x04183a32:1232@0+) icmp6: echo reply
17:58:03.040296 2a02:27d0:0:5e0c:d6be:d9ff:fe95:a236 > 2a02:27d0:0:5e0d:1a03:73ff:feba:50b4: frag (0x04183a32:76@1232)
17:58:04.034362 2a02:27d0:0:5e0d:1a03:73ff:feba:50b4 > 2a02:27d0:0:5e0c:d6be:d9ff:fe95:a236: icmp6: echo request [flowlabel 0x467ca]
17:58:04.034430 2a02:27d0:0:5e0c:d6be:d9ff:fe95:a236 > 2a02:27d0:0:5e0d:1a03:73ff:feba:50b4: frag (0x36bc0632:1232@0+) icmp6: echo reply
17:58:04.034432 2a02:27d0:0:5e0c:d6be:d9ff:fe95:a236 > 2a02:27d0:0:5e0d:1a03:73ff:feba:50b4: frag (0x36bc0632:76@1232)

source: Linux machine with IPv6: 2a02:27d0:0:5e0d:1a03:73ff:feba:50b4
Destination machine OpenBSD 5.8 with IPv6: 2a02:27d0:0:5e0c:d6be:d9ff:fe95:a236

source# ping6 -M do -s 1232 2a02:27d0:0:5e0c:d6be:d9ff:fe95:a236
leads to replies that fit in a single packet (1280 bytes), with no
fragmentation whatsoever.

source# ping6 -M do -s 1233 leads to fragmentation

==================================================================
I wonder why ICMPv6 echo replies with more than 1232 data bytes are always fragmented.
==================================================================
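
The 1232-byte threshold is exactly what you get if the replying host treats the path as having the IPv6 minimum MTU of 1280 bytes. A quick sketch of that arithmetic follows, assuming standard header sizes. The helper names are hypothetical, and this only shows that the numbers are consistent with a 1280-byte limit, not why the OpenBSD box applies one.

```python
IPV6_MIN_MTU = 1280      # minimum MTU every IPv6 link must support
IPV6_HEADER = 40         # fixed IPv6 header
ICMPV6_ECHO_HEADER = 8   # type, code, checksum, identifier, sequence
FRAG_HEADER = 8          # IPv6 Fragment extension header


def max_data_without_fragmentation(mtu):
    """Largest `ping6 -s` value whose echo reply fits in one packet."""
    return mtu - IPV6_HEADER - ICMPV6_ECHO_HEADER


def second_fragment_bytes(data_bytes, mtu=IPV6_MIN_MTU):
    """Bytes spilling into a second fragment (0 if none is needed).

    Only meant for replies that need at most two fragments.
    """
    total = ICMPV6_ECHO_HEADER + data_bytes
    if IPV6_HEADER + total <= mtu:
        return 0
    first = (mtu - IPV6_HEADER - FRAG_HEADER) // 8 * 8
    return total - first


print(max_data_without_fragmentation(IPV6_MIN_MTU))  # 1232
print(second_fragment_bytes(1232))                   # 0
print(second_fragment_bytes(1233))                   # 9
print(second_fragment_bytes(1300))                   # 76
```

The 9-byte and 76-byte tails are the same values tcpdump reports as 9@1232 and 76@1232 in the captures.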

# netstat -rnf inet6 | grep bge0
default                            fe80::42b4:f0ff:fec6:7201%bge0 UG         0      211     -    56 bge0
2a02:27d0:0:5e0c::/64              fe80::d6be:d9ff:fe95:a236%bge0 UC         0        0     -     4 bge0
fe80::%bge0/64                     fe80::d6be:d9ff:fe95:a236%bge0 UC         1        0     -     4 bge0
fe80::42b4:f0ff:fec6:7201%bge0     40:b4:f0:c6:72:01              UHLc       1       13     -     4 bge0
fe80::d6be:d9ff:fe95:a236%bge0     d4:be:d9:95:a2:36              UHLl       0        0     -     1 lo0  
ff01::%bge0/32                     fe80::d6be:d9ff:fe95:a236%bge0 UC         0        0     -     4 bge0
ff02::%bge0/32                     fe80::d6be:d9ff:fe95:a236%bge0 UC         0        0     -     4 bge0

Please note, replacing bge0 with em0 leads to the very same result,
ruling out a difference of ICMPv6 handling between the 2 drivers.

Re: IPv6 fragmentation woes

Laurent Caron (Mobile)
Hi,

Does anybody have a clue about this issue ? Thanks




Re: IPv6 fragmentation woes

Fernando Gont-2
On 08/09/2016 07:42 AM, Laurent CARON wrote:
> Hi,
>
> Does anybody have a clue about this issue ? Thanks

Based on a quick look at what you sent, this is not what I would expect.


> Am I mistaken on something, or is this behavior perfectly normal ?
>
> Please note # tracepath6 from the linux box to the openbsd one reports:
> Resume: pmtu 1500 hops 2 back 2

This doesn't really matter. PMTU can be asymmetric. So you should run
tracepath6 from OpenBSD to Linux, since that's the direction in which
traffic is being fragmented.
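
To illustrate the asymmetry with a toy model: the path MTU is the smallest link MTU along a path, and the two directions need not cross the same links. The link MTUs below are purely hypothetical, not measurements from this network.

```python
def path_mtu(link_mtus):
    """The path MTU is the smallest link MTU along the path."""
    return min(link_mtus)


# Hypothetical link MTUs for each direction (assumed values for illustration).
linux_to_openbsd = [1500, 1500]
openbsd_to_linux = [1500, 1280, 1500]

print(path_mtu(linux_to_openbsd))  # 1500
print(path_mtu(openbsd_to_linux))  # 1280
```

In such a case a tracepath6 run from the Linux side would still report pmtu 1500 while traffic in the other direction is limited to 1280.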

Thanks,
--
Fernando Gont
e-mail: [hidden email] || [hidden email]
PGP Fingerprint: 7809 84F5 322E 45C7 F1C9 3945 96EE A9EF D076 FFF1

Re: IPv6 fragmentation woes

Stuart Henderson
On 2016-09-13, Fernando Gont <[hidden email]> wrote:

> On 08/09/2016 07:42 AM, Laurent CARON wrote:
>> Hi,
>>
>> Does anybody have a clue about this issue ? Thanks
>
> Based on a quick look at what you sent, this is not what I would expect.
>
>
>> Am I mistaken on something, or is this behavior perfectly normal ?
>>
>> Please note # tracepath6 from the linux box to the openbsd one reports:
>> Resume: pmtu 1500 hops 2 back 2
>
> This doesn't really matter. PMTU can be asymmetric. So you should run
> tracepath6 from OpenBSD to Linux, since that's the direction in which
> traffic is being fragmented.
>
> Thanks,

Last time I looked at tracepath it was quite Linux-specific code and
not very portable. You can get similar information from the useful tool
"scamper":

$ scamper -I 'trace -M [ip_address]'

or with the OpenBSD package you can do 'scamper-trace -M [ip]'
- note that it wants a numeric v4/v6 address not a hostname.