sis(4) induced uvm faults on net4801


Nathanael Rensen-3
tl;dr: Zero-length packets from sis(4) on net4801 result in negative
length mbufs causing uvm faults.

I have observed uvm faults shortly after bringing up a sis(4) interface on
a Soekris net4801:

uvm_fault(0xd3adbbf0, 0xd3ee5000, 0, 1) -> e
kernel: page fault trap, code=0
Stopped at      memcpy+0x13:    repe movsl      (%esi),%es:(%edi)

ddb> trace
memcpy(d6025e00,1a280e,3b9aca00,2,1) at memcpy+0x13
m_copym2(d6025e00,e,3b9aca00,2) at m_copym2+0x19
bridge_m_dup(d6025e00,d3eed000,f13c1d78,d02a0d4b) at bridge_m_dup+0x2f
bridge_localbroadcast(d3eed000,d3ee0c00,f13c1dda,d6025e00) at bridge_localbroadcast+0x47
bridge_broadcast(d3eed000,d3d3d834,f13c1dda,d6025e00,d3d3d834) at bridge_broadcast+0xaf
bridgeintr_frame(d3eed000,d3d3d834,d6025e00,d3ed9700) at bridgeintr_frame+0x1ed
bridge_process(d3d3d834,d6025e00,f13c1e6c,f13c1e6c,d6025c00) at bridge_process+0x257
bridgeintr(d029312e,d3b22b20,d6025c00,f13c1ea8,d3d5c800) at bridgeintr+0x54
netintr(0,d020224c,0,f13c1efc) at netintr+0x47
softintr_dispatch(1) at softintr_dispatch+0x6c
Xsoftnet() at Xsoftnet+0x12
--- interrupt ---
0:

ddb> show mbuf 0xd6025e00
mbuf 0xd6025e00
m_type: 1       m_flags: b<M_EXT,M_PKTHDR,M_EXTWR>
m_next: 0x0     m_nextpkt: 0x0
m_data: 0xd3d42000      m_len: 4294967292
m_dat: 0xd6025e14       m_pktdat: 0xd6025e44
m_ptkhdr.ph_ifidx: 1    m_pkthdr.len: -4
m_ptkhdr.ph_tags: 0x0   m_pkthdr.ph_tagsset: 0
m_pkthdr.ph_flowid: 0
m_pkthdr.csum_flags: 0
m_pkthdr.ether_vtag: 0  m_ptkhdr.ph_rtableid: 0
m_pkthdr.pf.statekey: 0x0       m_pkthdr.pf.inp 0x0
m_pkthdr.pf.qid: 0      m_pkthdr.pf.tag: 0
m_pkthdr.pf.flags: 0
m_pkthdr.pf.routed: 0   m_pkthdr.pf.prio: 3
m_ext.ext_buf: 0xd3d42000       m_ext.ext_size: 2048
m_ext.ext_free: 0xd027a717      m_ext.ext_arg: 0xd3b0050c
m_ext.ext_nextref: 0xd6025e00   m_ext.ext_prevref: 0xd6025e00

# dmesg | grep sis0
sis0 at pci0 dev 6 function 0 vendor 0x100b product 0x0020 rev 0x00, DP83816A: irq 10, address 00:00:24:c4:fe:3c
nsphyter0 at sis0 phy 0: DP83815 10/100 PHY, rev. 1

I traced this to the DP83816A sometimes producing a small number of zero-
length packets soon after initialisation. The sis(4) driver unconditionally
subtracts ETHER_CRC_LEN from the packet length. When the packet length is
zero this results in a negative length mbuf.

#define SIS_RXBYTES(x) \
        ((letoh32((x)->sis_ctl) & SIS_CMDSTS_BUFLEN) - ETHER_CRC_LEN)

total_len = SIS_RXBYTES(cur_rx);

m->m_pkthdr.len = m->m_len = total_len;
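
To illustrate the arithmetic, here is a minimal userland sketch (not driver
code). The mask value 0x00000fff for SIS_CMDSTS_BUFLEN and all helper names
are assumptions made for illustration. The two field types mirror the ddb
output above, where m_len prints as an unsigned value and m_pkthdr.len as a
signed one.

```c
#include <stdint.h>

#define ETHER_CRC_LEN	4
/* Assumed value of SIS_CMDSTS_BUFLEN: the low 12 bits of cmdsts. */
#define CMDSTS_BUFLEN	0x00000fffU

/* Hypothetical stand-in for the old SIS_RXBYTES(): always subtracts the CRC. */
static uint32_t
rxbytes_old(uint32_t cmdsts)
{
	/* Unsigned arithmetic: 0 - 4 wraps around to 0xfffffffc. */
	return (cmdsts & CMDSTS_BUFLEN) - ETHER_CRC_LEN;
}

/* The length as seen through a signed field such as m_pkthdr.len. */
static int
as_pkthdr_len(uint32_t cmdsts)
{
	/* Conversion is implementation-defined; two's complement gives -4. */
	return (int)rxbytes_old(cmdsts);
}

/* The length as seen through an unsigned field such as m_len. */
static unsigned int
as_m_len(uint32_t cmdsts)
{
	return (unsigned int)rxbytes_old(cmdsts);
}
```

With cmdsts 0xd0000000 (length field zero), as_pkthdr_len() yields -4 and
as_m_len() yields 4294967292 on a platform with 32-bit int, which matches
the two values in the mbuf dump.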

I haven't been able to nail down the root cause of this behaviour. It is
not a recently introduced issue. For me this happens commonly enough that
it is readily reproduced, but the zero-length packets do not occur on
every boot and not every zero-length packet results in a uvm fault.

A web search didn't turn up any similar reports so perhaps there is
something unusual about my situation. Maybe having sis0 on a bridge
introduces a code path that increases the likelihood of a uvm_fault in
response to a negative length mbuf.
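
One possible reason the bridge path is sensitive: in the trace, the third
argument to m_copym2() is 0x3b9aca00, which is 1000000000 and matches
M_COPYALL, and the second is 0xe, the 14-byte Ethernet header length. The
sketch below is a simplified model and not the kernel's m_copym2(). It
assumes the number of bytes copied from a single mbuf is the smaller of the
requested length and m_len minus the offset. Every name other than
M_COPYALL is hypothetical.

```c
#define M_COPYALL	1000000000	/* "copy to end of chain" sentinel */
#define CLUSTER_SIZE	2048U		/* m_ext.ext_size from the ddb output */

/*
 * Simplified model (assumption): bytes copied from one mbuf are
 * min(len, m_len - off), with m_len treated as unsigned.
 */
static unsigned int
model_copy_len(unsigned int m_len, unsigned int off, unsigned int len)
{
	unsigned int avail = m_len - off;

	return (len < avail) ? len : avail;
}

/* Nonzero if copying n bytes starting at off would run past the cluster. */
static int
overruns_cluster(unsigned int off, unsigned int n)
{
	return n > CLUSTER_SIZE || off > CLUSTER_SIZE - n;
}
```

Under this model an m_len of 4294967292 with offset 14 and M_COPYALL gives
a copy length of 1000000000 bytes from a 2048-byte cluster, whereas a
normal 60-byte frame gives 46 bytes and stays within the cluster.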

The descriptor cmdsts for these zero-length packets is consistently:

    0xd0000000 = SIS_CMDSTS_OWN | SIS_CMDSTS_MORE | SIS_CMDSTS_CRC
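
As a cross-check, the value can be decoded with a few lines of C. The bit
values below are assumptions chosen to be consistent with the sum above
(they are not copied from if_sisreg.h), and the helper names are
hypothetical.

```c
#include <stdint.h>

/* Assumed bit layout of the receive descriptor cmdsts word. */
#define CMDSTS_BUFLEN	0x00000fffU	/* length field */
#define CMDSTS_PKT_OK	0x08000000U	/* packet received OK */
#define CMDSTS_CRC	0x10000000U
#define CMDSTS_MORE	0x40000000U
#define CMDSTS_OWN	0x80000000U

/* Length field of the descriptor, CRC still included. */
static uint32_t
cmdsts_len(uint32_t cmdsts)
{
	return cmdsts & CMDSTS_BUFLEN;
}

/* Nonzero for a completed descriptor with zero length and no OK bit. */
static int
cmdsts_is_suspect(uint32_t cmdsts)
{
	return (cmdsts & CMDSTS_OWN) != 0 &&
	    (cmdsts & CMDSTS_PKT_OK) == 0 &&
	    cmdsts_len(cmdsts) == 0;
}
```

For 0xd0000000 the length field is zero and the OK bit is clear, so
cmdsts_is_suspect() returns nonzero.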

I found the following comment in the DP83816A datasheet:

    Care must be taken when setting DRTH to a value lower than the number
    of bytes needed to determine if packet should be accepted or rejected.
    In this case, the packet might be rejected after the bus master
    operation to begin transferring the packet into memory has begun. When
    this occurs, neither the OK bit or any error status bit in the
    descriptor's cmdsts will be set.

sis(4) sets DRTH to 64 bytes, which should be sufficient for packet
identification, so this caveat would not seem to apply. In the end, maybe
this is just buggy DP83816A behaviour.

In any case the following diff works around the problem for me.

Nathanael


Index: if_sisreg.h
===================================================================
RCS file: /cvs/src/sys/dev/pci/if_sisreg.h,v
retrieving revision 1.34
diff -u -p -r1.34 if_sisreg.h
--- if_sisreg.h 11 Feb 2015 21:36:02 -0000 1.34
+++ if_sisreg.h 6 Jan 2016 14:33:11 -0000
@@ -343,8 +343,7 @@ struct sis_desc {
 #define SIS_LASTDESC(x) (!(letoh32((x)->sis_ctl) & SIS_CMDSTS_MORE)))
 #define SIS_OWNDESC(x) (letoh32((x)->sis_ctl) & SIS_CMDSTS_OWN)
 #define SIS_INC(x, y) (x) = ((x) == ((y)-1)) ? 0 : (x)+1
-#define SIS_RXBYTES(x) \
- ((letoh32((x)->sis_ctl) & SIS_CMDSTS_BUFLEN) - ETHER_CRC_LEN)
+#define SIS_RXBYTES(x) (letoh32((x)->sis_ctl) & SIS_CMDSTS_BUFLEN)
 
 #define SIS_RXSTAT_COLL 0x00010000
 #define SIS_RXSTAT_LOOPBK 0x00020000
Index: if_sis.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/if_sis.c,v
retrieving revision 1.132
diff -u -p -r1.132 if_sis.c
--- if_sis.c 25 Nov 2015 03:09:59 -0000 1.132
+++ if_sis.c 6 Jan 2016 14:33:11 -0000
@@ -1389,6 +1389,18 @@ sis_rxeof(struct sis_softc *sc)
  if_rxr_put(&sc->sis_cdata.sis_rx_ring, 1);
 
  /*
+ * DP83816A sometimes produces zero-length packets
+ * shortly after initialisation.
+ */
+ if (total_len == 0) {
+ m_freem(m);
+ continue;
+ }
+
+ /* The ethernet CRC is always included */
+ total_len -= ETHER_CRC_LEN;
+
+ /*
  * If an error occurs, update stats, clear the
  * status word and leave the mbuf cluster in place:
  * it should simply get re-used next time this descriptor
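
The length check added in sis_rxeof() can be modelled in isolation as
follows. This is a sketch, not a replacement for the diff. It also rejects
lengths 1 to 3, which would still go negative after the CRC is subtracted.
That stricter bound is an assumption of this sketch and is not part of the
diff above, as are the helper name and the mask value.

```c
#include <stdint.h>

#define ETHER_CRC_LEN	4
#define CMDSTS_BUFLEN	0x00000fffU	/* assumed mask value */

/*
 * Hypothetical helper: returns the frame length without the CRC,
 * or -1 if the descriptor length is too short to hold a CRC at all.
 */
static int
rx_payload_len(uint32_t cmdsts)
{
	uint32_t total_len = cmdsts & CMDSTS_BUFLEN;

	/* Check before subtracting so the result can never wrap. */
	if (total_len < ETHER_CRC_LEN)
		return -1;
	return (int)(total_len - ETHER_CRC_LEN);
}
```

A caller would free the mbuf and continue when the helper returns -1, in
the same way the diff does for total_len == 0.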


Re: sis(4) induced uvm faults on net4801

evelyne.levieux
 

Hi,

I have the same problem with an re(4) adapter and CARP.

My post is on the bugs mailing list (UVM Fault with CARP on re(4),
2016-01-17 09:15).

Fabrice

On 2016-01-07 12:36, Nathanael Rensen wrote:
