fec(0) malformed packets?

Otto Moerbeek
Hi,

I am seeing packet corruption using fec(4) on my armv7 machine.

A case that goes wrong is a packet of length 59 (a DNS query asking a
root server for the DS record of the nl. domain). Other queries of
length 59 also go wrong. The strange thing is that if I manually do
the same query with dig (this also produces 59 bytes):

        dig @199.9.14.201 nl. ds +dnssec +bufsize=1232

it does get through uncorrupted....

In the first case the packet is sent out with sendto(2) from
pdns_recursor, with dig it is sendmsg(2).
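
To test whether the system call matters independently of pdns_recursor and dig, the same 31-byte DNS payload could be sent once with each call. The following is only a minimal Python sketch under stated assumptions, not what either program actually does: the name `send_both` is made up for illustration, and the payload (including its query ID) is copied from the UDP payload of the 59-byte packet in the captures in this message.

```python
import socket

# UDP payload of the 59-byte packet shown in this message
# (59 bytes total - 20 bytes IP header - 8 bytes UDP header = 31 bytes).
QUERY = bytes.fromhex(
    "8dbc 0000 0001 0000 0000 0001"   # DNS header: id 36284, 1 question, 1 additional
    "026e 6c00 002b 0001"             # question: nl. DS IN
    "0000 2904 d000 0080 0000 00"     # OPT record: UDP size 1232, DO bit set
)


def send_both(dest):
    """Send QUERY to dest=(host, port) once via sendto and once via sendmsg.

    Hypothetical helper: returns the byte counts reported by each call.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        sent_sendto = s.sendto(QUERY, dest)
        sent_sendmsg = s.sendmsg([QUERY], [], 0, dest)
    return sent_sendto, sent_sendmsg
```

Calling `send_both(("199.9.14.201", 53))` on the affected machine while capturing on both ends would show whether both datagrams arrive intact; the destination must be reached through the interface under test for the comparison to mean anything.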

Two packet captures of the case that goes wrong:

Sent out by fec(4) as seen by tcpdump

16:14:46.961423 10.1.1.9.47893 > 199.9.14.201.53: [udp sum ok] 36284
[1au] DS? nl. ar: . OPT UDPsize=1232 DO(31) (ttl 64, id 46567, len 59)
  0000: 4500 003b b5e7 0000 4011 e3ee 0a01 0109  E..;....@.......
  0010: c709 0ec9 bb15 0035 0027 6d9b 8dbc 0000  .......5.'m.....
  0020: 0001 0000 0000 0001 026e 6c00 002b 0001  .........nl..+..
  0030: 0000 2904 d000 0080 0000 00              ..)........

Same packet as received on the router with em(4), the incoming
interface:

16:14:46.962582 10.1.1.9.47893 > 199.9.14.201.53: [bad udp cksum 9b6d!
-> 9a81] 36284 [1au] DS? nl. ar: . OPT[|domain] (ttl 64, id 46567, len 59)
  0000: 4500 003b b5e7 0000 4011 e3ee 0a01 0109  E..;....@.......
  0010: c709 0ec9 bb15 0035 0027 6d9b 8dbc 0000  .......5.'m.....
  0020: 0001 0000 0000 0001 026e 6c00 002b 0001  .........nl..+..
  0030: 0000 2904 d000 0080 0000 ec              ..)........

Note that the last byte differs.
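
The comparison can also be done mechanically. A small Python sketch, with the hex rows copied from the two dumps above (the variable names are only illustrative):

```python
# Hex rows copied from the two tcpdump outputs above.
SENT = bytes.fromhex("".join([
    "4500 003b b5e7 0000 4011 e3ee 0a01 0109",
    "c709 0ec9 bb15 0035 0027 6d9b 8dbc 0000",
    "0001 0000 0000 0001 026e 6c00 002b 0001",
    "0000 2904 d000 0080 0000 00",
]))
RECEIVED = bytes.fromhex("".join([
    "4500 003b b5e7 0000 4011 e3ee 0a01 0109",
    "c709 0ec9 bb15 0035 0027 6d9b 8dbc 0000",
    "0001 0000 0000 0001 026e 6c00 002b 0001",
    "0000 2904 d000 0080 0000 ec",
]))

# Collect (offset, sent byte, received byte) for every position that differs.
diffs = [(off, a, b) for off, (a, b) in enumerate(zip(SENT, RECEIVED)) if a != b]

for off, a, b in diffs:
    print(f"offset {off}: sent 0x{a:02x}, received 0x{b:02x}")
```

For these two dumps the only difference is at offset 58, the final byte of the 59-byte packet.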

dmesg below

        -Otto

OpenBSD 6.6 (GENERIC) #226: Sat Oct 12 08:36:17 MDT 2019
    [hidden email]:/usr/src/sys/arch/armv7/compile/GENERIC
real mem  = 2111655936 (2013MB)
avail mem = 2060648448 (1965MB)
mainbus0 at root: Wandboard i.MX6 Quad Board rev B1
cpu0 at mainbus0 mpidr 0: ARM Cortex-A9 r2p10
cpu0: 32KB 32b/line 4-way L1 VIPT I-cache, 32KB 32b/line 4-way L1 D-cache
cortex0 at mainbus0
amptimer0 at cortex0: tick rate 396000 KHz
armliicc0 at cortex0: rtl 7 waymask: 0x0000000f
imxtemp0 at mainbus0simplebus0 at mainbus0: "soc"
ampintc0 at simplebus0 nirq 160, ncpu 4: "interrupt-controller"
"dma-apbh" at simplebus0 not configured
"hdmi" at simplebus0 not configured
"gpu" at simplebus0 not configured
"gpu" at simplebus0 not configured
"timer" at simplebus0 not configured
"l2-cache" at simplebus0 not configured
simplebus1 at simplebus0: "aips-bus"
imxccm0 at simplebus1
imxanatop0 at simplebus1
syscon0 at simplebus1: "snvs"
imxrtc0 at syscon0
"snvs-lpgpr" at syscon0 not configured
syscon1 at simplebus1: "iomuxc-gpr"
"mux-controller" at syscon1 not configured
"ipu1_csi0_mux" at syscon1 not configured
"ipu2_csi1_mux" at syscon1 not configured
imxiomuxc0 at simplebus1
simplebus2 at simplebus1: "spba-bus"
"spdif" at simplebus2 not configured
imxuart0 at simplebus2: console
"ssi" at simplebus2 not configured
"asrc" at simplebus2 not configured
"vpu" at simplebus1 not configured
"gpt" at simplebus1 not configured
imxgpio0 at simplebus1
imxgpio1 at simplebus1
imxgpio2 at simplebus1
imxgpio3 at simplebus1
imxgpio4 at simplebus1
imxgpio5 at simplebus1
imxgpio6 at simplebus1
imxdog0 at simplebus1
"usbphy" at simplebus1 not configured
"usbphy" at simplebus1 not configured
"src" at simplebus1 not configured
imxgpc0 at simplebus1
"sdma" at simplebus1 not configured
simplebus3 at simplebus0: "aips-bus"
syscon2 at simplebus3: "ocotp"
"caam" at simplebus3 not configured
imxehci0 at simplebus3
usb0 at imxehci0: USB revision 2.0
uhub0 at usb0 configuration 1 interface 0 "i.MX EHCI root hub" rev 2.00/1.00 addr 1
imxehci1 at simplebus3
usb1 at imxehci1: USB revision 2.0
uhub1 at usb1 configuration 1 interface 0 "i.MX EHCI root hub" rev 2.00/1.00 addr 1
"usbmisc" at simplebus3 not configured
fec0 at simplebus3
fec0: address 00:1f:7b:b4:06:10
atphy0 at fec0 phy 1: AR8035 10/100/1000 PHY, rev. 4
imxesdhc0 at simplebus3
imxesdhc0: 198 MHz base clock
sdmmc0 at imxesdhc0: 4-bit, sd high-speed, mmc high-speed, dma
imxesdhc1 at simplebus3
imxesdhc1: 198 MHz base clock
sdmmc1 at imxesdhc1: 4-bit, sd high-speed, mmc high-speed, dma
imxesdhc2 at simplebus3
imxesdhc2: 198 MHz base clock
sdmmc2 at imxesdhc2: 4-bit, sd high-speed, mmc high-speed, dma
imxiic0 at simplebus3
iic0 at imxiic0
imxiic1 at simplebus3
iic1 at imxiic1
"fsl,sgtl5000" at iic1 addr 0xa not configured
"mmdc" at simplebus3 not configured
"audmux" at simplebus3 not configured
"vdoa" at simplebus3 not configured
imxuart1 at simplebus3
"ipu" at simplebus0 not configured
"sram" at simplebus0 not configured
imxahci0 at simplebus0: AHCI 1.3
imxahci0: port 0: 3.0Gb/s
scsibus0 at imxahci0: 32 targets
sd0 at scsibus0 targ 0 lun 0: <ATA, KINGSTON SV300S3, 608A> naa.50026b736503c6f1
sd0: 114473MB, 512 bytes/sector, 234441648 sectors, thin
"gpu" at simplebus0 not configured
"ipu" at simplebus0 not configured
scsibus1 at sdmmc2: 2 targets, initiator 0
sd1 at scsibus1 targ 1 lun 0: <SD/MMC, SU16G, 0080> removable
sd1: 15193MB, 512 bytes/sector, 31116288 sectors
bwfm0 at sdmmc1 function 1
bwfm0: SoC interconnect SB not implemented
bwfm0: cannot attach chip
manufacturer 0x02d0, product 0x4329 at sdmmc1 function 2 not configured
manufacturer 0x02d0, product 0x4329 at sdmmc1 function 3 not configured
axen0 at uhub1 port 1 configuration 1 interface 0 "Sitecom Europe BV Sitecom USB 3.0 Gigabit" rev 2.10/1.00 addr 2
axen0: AX88179, address 00:0c:f6:ff:e0:3b
rgephy0 at axen0 phy 3: RTL8169S/8110S/8211 PHY, rev. 5
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
bootfile: sd0a:/bsd
boot device: sd0
root on sd1a (d1e4f9a54e2211c2.a) swap on sd1b dump on sd1b




Re: fec(0) malformed packets?

Theo de Raadt-2
I've been reporting this since the dawn of the architecture.  I
saw it as a network hiccup.  I didn't notice an earlier copy
of the packet that was corrupted.


Re: fec(0) malformed packets?

Otto Moerbeek
On Tue, Oct 29, 2019 at 09:58:08AM -0600, Theo de Raadt wrote:

> I've been reporting this since the dawn of the architecture.  I
> saw it as a network hiccup.  I didn't notice an earlier copy
> of the packet that was corrupted.
>

So it *is* size related and can be reproduced with ping:

[otto@wand:61]$ ping -c 1 -s 31 10.1.1.3
PING 10.1.1.3 (10.1.1.3): 31 data bytes

Outgoing:
# tcpdump -Xn -i fec0 -s 1500  -vvv icmp  
tcpdump: listening on fec0, link-type EN10MB
20:07:09.231929 10.1.1.9 > 10.1.1.3: icmp: echo request (id:9678
seq:0) [icmp cksum ok] (ttl 255, id 27755, len 59)
  0000: 4500 003b 6c6b 0000 ff01 3949 0a01 0109  E..;lk....9I....
  0010: 0a01 0103 0800 eb11 9678 0000 aabf 55f1  .........x....U.
  0020: 3356 ef18 87a0 ca13 74a6 c8ea 995c 4475  3V......t....\Du
  0030: 90dc e90f 1819 1a1b 1c1d 1e              ...........

Incoming:
# tcpdump -Xn -i em1 -s 1500  -vvv icmp
tcpdump: listening on em1, link-type EN10MB
20:07:09.231896 10.1.1.9 > 10.1.1.3: icmp: echo request (id:9678
seq:0) [bad icmp cksum eb11! -> b11] (ttl 255, id 27755, len 59)
  0000: 4500 003b 6c6b 0000 ff01 3949 0a01 0109  E..;lk....9I....
  0010: 0a01 0103 0800 eb11 9678 0000 aabf 55f1  .........x....U.
  0020: 3356 ef18 87a0 ca13 74a6 c8ea 995c 4475  3V......t....\Du
  0030: 90dc e90f 1819 1a1b 1c1d fe              ...........
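
The checksum values tcpdump prints can be reproduced from the dumps. This is a sketch of the standard Internet checksum (one's-complement sum of 16-bit words, odd trailing byte padded with zero) in Python; the function names are illustrative and the hex rows are copied from the two captures above.

```python
def internet_checksum(data: bytes) -> int:
    """One's-complement Internet checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"  # pad an odd trailing byte with zero
    total = sum(int.from_bytes(data[i:i + 2], "big")
                for i in range(0, len(data), 2))
    while total >> 16:   # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def expected_icmp_checksum(ip_packet: bytes) -> int:
    """Checksum the ICMP message should carry, given the bytes that follow the IP header."""
    ihl = (ip_packet[0] & 0x0F) * 4      # IP header length in bytes
    icmp = bytearray(ip_packet[ihl:])
    icmp[2:4] = b"\x00\x00"              # zero the checksum field before summing
    return internet_checksum(bytes(icmp))


OUTGOING = bytes.fromhex("".join([
    "4500 003b 6c6b 0000 ff01 3949 0a01 0109",
    "0a01 0103 0800 eb11 9678 0000 aabf 55f1",
    "3356 ef18 87a0 ca13 74a6 c8ea 995c 4475",
    "90dc e90f 1819 1a1b 1c1d 1e",
]))
INCOMING = OUTGOING[:-1] + b"\xfe"       # as seen on em1: only the last byte differs

print(hex(expected_icmp_checksum(OUTGOING)))  # checksum matching the bytes seen on fec0
print(hex(expected_icmp_checksum(INCOMING)))  # checksum matching the bytes seen on em1
```

The first value equals the eb11 carried in the packet, and the second equals the b11 that tcpdump reports as the expected value on em1. So the checksum was computed over the original data, and the last byte changed afterwards.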


Re: fec(0) malformed packets?

Otto Moerbeek
On Tue, Oct 29, 2019 at 08:09:59PM +0100, Otto Moerbeek wrote:

> On Tue, Oct 29, 2019 at 09:58:08AM -0600, Theo de Raadt wrote:
>
> > I've been reporting this since the dawn of the architecture.  I
> > saw it as a network hiccup.  I didn't notice an earlier copy
> > of the packet that was corrupted.
> >
>
> So it *is* size related and can be reproduced with ping:
>
> [otto@wand:61]$ ping -c 1 -s 31 10.1.1.3
> PING 10.1.1.3 (10.1.1.3): 31 data bytes
>
> Outgoing:
> # tcpdump -Xn -i fec0 -s 1500  -vvv icmp  
> tcpdump: listening on fec0, link-type EN10MB
> 20:07:09.231929 10.1.1.9 > 10.1.1.3: icmp: echo request (id:9678
> seq:0) [icmp cksum ok] (ttl 255, id 27755, len 59)
>   0000: 4500 003b 6c6b 0000 ff01 3949 0a01 0109  E..;lk....9I....
>   0010: 0a01 0103 0800 eb11 9678 0000 aabf 55f1  .........x....U.
>   0020: 3356 ef18 87a0 ca13 74a6 c8ea 995c 4475  3V......t....\Du
>   0030: 90dc e90f 1819 1a1b 1c1d 1e              ...........
>
> Incoming:
> # tcpdump -Xn -i em1 -s 1500  -vvv icmp
> tcpdump: listening on em1, link-type EN10MB
> 20:07:09.231896 10.1.1.9 > 10.1.1.3: icmp: echo request (id:9678
> seq:0) [bad icmp cksum eb11! -> b11] (ttl 255, id 27755, len 59)
>   0000: 4500 003b 6c6b 0000 ff01 3949 0a01 0109  E..;lk....9I....
>   0010: 0a01 0103 0800 eb11 9678 0000 aabf 55f1  .........x....U.
>   0020: 3356 ef18 87a0 ca13 74a6 c8ea 995c 4475  3V......t....\Du
>   0030: 90dc e90f 1819 1a1b 1c1d fe              ...........
>

I did not mention that any other size I tried works ok.

        -Otto


Re: fec(0) malformed packets?

Otto Moerbeek
On Tue, Oct 29, 2019 at 08:12:04PM +0100, Otto Moerbeek wrote:

> On Tue, Oct 29, 2019 at 08:09:59PM +0100, Otto Moerbeek wrote:
>
> > On Tue, Oct 29, 2019 at 09:58:08AM -0600, Theo de Raadt wrote:
> >
> > > I've been reporting this since the dawn of the architecture.  I
> > > saw it as a network hiccup.  I didn't notice an earlier copy
> > > of the packet that was corrupted.
> > >
> >
> > So it *is* size related and can be reproduced with ping:
> >
> > [otto@wand:61]$ ping -c 1 -s 31 10.1.1.3
> > PING 10.1.1.3 (10.1.1.3): 31 data bytes
> >
> > Outgoing:
> > # tcpdump -Xn -i fec0 -s 1500  -vvv icmp  
> > tcpdump: listening on fec0, link-type EN10MB
> > 20:07:09.231929 10.1.1.9 > 10.1.1.3: icmp: echo request (id:9678
> > seq:0) [icmp cksum ok] (ttl 255, id 27755, len 59)
> >   0000: 4500 003b 6c6b 0000 ff01 3949 0a01 0109  E..;lk....9I....
> >   0010: 0a01 0103 0800 eb11 9678 0000 aabf 55f1  .........x....U.
> >   0020: 3356 ef18 87a0 ca13 74a6 c8ea 995c 4475  3V......t....\Du
> >   0030: 90dc e90f 1819 1a1b 1c1d 1e              ...........
> >
> > Incoming:
> > # tcpdump -Xn -i em1 -s 1500  -vvv icmp
> > tcpdump: listening on em1, link-type EN10MB
> > 20:07:09.231896 10.1.1.9 > 10.1.1.3: icmp: echo request (id:9678
> > seq:0) [bad icmp cksum eb11! -> b11] (ttl 255, id 27755, len 59)
> >   0000: 4500 003b 6c6b 0000 ff01 3949 0a01 0109  E..;lk....9I....
> >   0010: 0a01 0103 0800 eb11 9678 0000 aabf 55f1  .........x....U.
> >   0020: 3356 ef18 87a0 ca13 74a6 c8ea 995c 4475  3V......t....\Du
> >   0030: 90dc e90f 1819 1a1b 1c1d fe              ...........
> >
>
> I did not mention that any other size I tried works ok.
>
> -Otto
>

This is not true; a little script shows there are more sizes failing. With -p 00:
31 nok
95 nok
151 nok
247 nok
311 nok
375 nok
407 nok
471 nok
503 nok
759 nok
791 nok
823 nok
855 nok
983 nok
1015 nok

With other -p arguments this varies a bit, but only odd packet lengths
are reported. This is the script:

#!/bin/ksh
# Send one echo request per payload size; report sizes that get no reply.
for i in $(jot 1025); do
    if ! ping -w 1 -c 1 -s "$i" -p00 10.1.1.3 > /dev/null; then
        echo "$i nok"
    fi
done
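
One way to look for a pattern in the failing sizes is to reduce them modulo a few powers of two. This Python sketch uses only the sizes listed above for -p 00. It describes that one run and is not a diagnosis, since the results vary with other -p arguments.

```python
# Payload sizes (ping -s) reported as "nok" with -p 00 above.
FAILING = [31, 95, 151, 247, 311, 375, 407, 471, 503,
           759, 791, 823, 855, 983, 1015]

# For each modulus, collect the distinct residues of the failing sizes.
residues = {m: sorted({size % m for size in FAILING}) for m in (2, 4, 8, 16, 32)}

for modulus, values in residues.items():
    print(f"size mod {modulus}: {values}")
```

For this list, every failing size is 7 modulo 8, while modulo 16 and 32 the residues split into two groups. Note that -s sets the ICMP payload size; the IP packet is 28 bytes longer (20 bytes IP header plus 8 bytes ICMP header), which is how -s 31 gives the 59-byte packet seen earlier.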


Re: fec(0) malformed packets?

Nick Holland
On 2019-10-29 15:35, Otto Moerbeek wrote:
...

>
> This is not true; a little script shows there are more sizes failing. With -p 00:
> 31 nok
> 95 nok
> 151 nok
> 247 nok
> 311 nok
> 375 nok
> 407 nok
> 471 nok
> 503 nok
> 759 nok
> 791 nok
> 823 nok
> 855 nok
> 983 nok
> 1015 nok
>
> With other -p arguments this varies a bit, but only odd packet lengths
> are reported. This is the script:
>
> #!/bin/ksh
> # Send one echo request per payload size; report sizes that get no reply.
> for i in $(jot 1025); do
>     if ! ping -w 1 -c 1 -s "$i" -p00 10.1.1.3 > /dev/null; then
>         echo "$i nok"
>     fi
> done
 
Some time back (Oct 26, 2013), I reported something similar.
Rather than doing one ping, I did a "flood" ping of 100 packets, and saw
some really weird stuff, all packet size related.  Some sizes didn't work at
all, some would work perfectly, some would drop SOME packets, and a number
would drop ONE packet sometimes, never two or more.

My really bad ones were also odd sizes, but I saw lesser problems
on at least -s480.

Mine was a Beaglebone Black, with its cpsw(4).
https://marc.info/?l=openbsd-bugs&m=138275913126582&w=2

Nick.