em(4): Don't count RX overruns and missed packets as input errros

classic Classic list List threaded Threaded
15 messages Options
Reply | Threaded
Open this post in threaded view
|

em(4): Don't count RX overruns and missed packets as input errros

Brad Smith-14
Don't count RX overruns and missed packets as inputs errors. They're expected to
increment when using MCLGETI.

OK?


Index: if_em.c
===================================================================
RCS file: /cvs/src/sys/dev/pci/if_em.c,v
retrieving revision 1.275
diff -u -p -u -p -r1.275 if_em.c
--- if_em.c 28 Dec 2013 03:34:54 -0000 1.275
+++ if_em.c 31 Dec 2013 06:03:55 -0000
@@ -3246,9 +3248,9 @@ em_update_stats_counters(struct em_softc
     sc->stats.rxerrc +
     sc->stats.crcerrs +
     sc->stats.algnerrc +
-    sc->stats.ruc + sc->stats.roc +
-    sc->stats.mpc + sc->stats.cexterr +
-    sc->rx_overruns;
+    sc->stats.ruc +
+    sc->stats.roc +
+    sc->stats.cexterr;
 
  /* Tx Errors */
  ifp->if_oerrors = sc->stats.ecol + sc->stats.latecol +

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

Reply | Threaded
Open this post in threaded view
|

Re: em(4): Don't count RX overruns and missed packets as input errros

Mark Kettenis
> Date: Tue, 31 Dec 2013 01:28:04 -0500
> From: Brad Smith <[hidden email]>
>
> Don't count RX overruns and missed packets as inputs errors. They're
> expected to increment when using MCLGETI.
>
> OK?

These may be "expected", but they're still packets that were not
received.  And it is useful to know about these, for example when
debugging TCP performance issues.

> Index: if_em.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/pci/if_em.c,v
> retrieving revision 1.275
> diff -u -p -u -p -r1.275 if_em.c
> --- if_em.c 28 Dec 2013 03:34:54 -0000 1.275
> +++ if_em.c 31 Dec 2013 06:03:55 -0000
> @@ -3246,9 +3248,9 @@ em_update_stats_counters(struct em_softc
>      sc->stats.rxerrc +
>      sc->stats.crcerrs +
>      sc->stats.algnerrc +
> -    sc->stats.ruc + sc->stats.roc +
> -    sc->stats.mpc + sc->stats.cexterr +
> -    sc->rx_overruns;
> +    sc->stats.ruc +
> +    sc->stats.roc +
> +    sc->stats.cexterr;
>  
>   /* Tx Errors */
>   ifp->if_oerrors = sc->stats.ecol + sc->stats.latecol +
>
> --
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.
>
>

Reply | Threaded
Open this post in threaded view
|

Re: em(4): Don't count RX overruns and missed packets as input errros

Brad Smith-14
On 31/12/13 3:14 AM, Mark Kettenis wrote:

>> Date: Tue, 31 Dec 2013 01:28:04 -0500
>> From: Brad Smith <[hidden email]>
>>
>> Don't count RX overruns and missed packets as inputs errors. They're
>> expected to increment when using MCLGETI.
>>
>> OK?
>
> These may be "expected", but they're still packets that were not
> received.  And it is useful to know about these, for example when
> debugging TCP performance issues.

Well do we want to keep just the missed packets or both? Part of the
diff was inspired by this commit when I was looking at what counters
were incrementing..

for bge(4)..

revision 1.334
date: 2013/06/06 00:05:30;  author: dlg;  state: Exp;  lines: +2 -4;
dont count rx ring overruns as input errors. with MCLGETI controlling the
ring we expect to run out of rx descriptors as a matter of course, its not
an error.

ok mikeb@


>> Index: if_em.c
>> ===================================================================
>> RCS file: /cvs/src/sys/dev/pci/if_em.c,v
>> retrieving revision 1.275
>> diff -u -p -u -p -r1.275 if_em.c
>> --- if_em.c 28 Dec 2013 03:34:54 -0000 1.275
>> +++ if_em.c 31 Dec 2013 06:03:55 -0000
>> @@ -3246,9 +3248,9 @@ em_update_stats_counters(struct em_softc
>>      sc->stats.rxerrc +
>>      sc->stats.crcerrs +
>>      sc->stats.algnerrc +
>> -    sc->stats.ruc + sc->stats.roc +
>> -    sc->stats.mpc + sc->stats.cexterr +
>> -    sc->rx_overruns;
>> +    sc->stats.ruc +
>> +    sc->stats.roc +
>> +    sc->stats.cexterr;
>>
>>   /* Tx Errors */
>>   ifp->if_oerrors = sc->stats.ecol + sc->stats.latecol +
>>
>> --
>> This message has been scanned for viruses and
>> dangerous content by MailScanner, and is
>> believed to be clean.
>>
>>
>


--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

Reply | Threaded
Open this post in threaded view
|

Re: em(4): Don't count RX overruns and missed packets as input errros

Mike Belopuhov-5
On 31 December 2013 09:46, Brad Smith <[hidden email]> wrote:

> On 31/12/13 3:14 AM, Mark Kettenis wrote:
>>>
>>> Date: Tue, 31 Dec 2013 01:28:04 -0500
>>> From: Brad Smith <[hidden email]>
>>>
>>> Don't count RX overruns and missed packets as inputs errors. They're
>>> expected to increment when using MCLGETI.
>>>
>>> OK?
>>
>>
>> These may be "expected", but they're still packets that were not
>> received.  And it is useful to know about these, for example when
>> debugging TCP performance issues.
>
>
> Well do we want to keep just the missed packets or both? Part of the
> diff was inspired by this commit when I was looking at what counters
> were incrementing..
>
> for bge(4)..
>
> revision 1.334
> date: 2013/06/06 00:05:30;  author: dlg;  state: Exp;  lines: +2 -4;
> dont count rx ring overruns as input errors. with MCLGETI controlling the
> ring we expect to run out of rx descriptors as a matter of course, its not
> an error.
>
> ok mikeb@
>
>

it does screws up statistics big time.  does mpc counter follow rx_overruns?
why did we add them up both previously?

Reply | Threaded
Open this post in threaded view
|

Re: em(4): Don't count RX overruns and missed packets as input errros

Brad Smith-14
On 31/12/13 5:50 AM, Mike Belopuhov wrote:

> On 31 December 2013 09:46, Brad Smith <[hidden email]> wrote:
>> On 31/12/13 3:14 AM, Mark Kettenis wrote:
>>>>
>>>> Date: Tue, 31 Dec 2013 01:28:04 -0500
>>>> From: Brad Smith <[hidden email]>
>>>>
>>>> Don't count RX overruns and missed packets as inputs errors. They're
>>>> expected to increment when using MCLGETI.
>>>>
>>>> OK?
>>>
>>>
>>> These may be "expected", but they're still packets that were not
>>> received.  And it is useful to know about these, for example when
>>> debugging TCP performance issues.
>>
>>
>> Well do we want to keep just the missed packets or both? Part of the
>> diff was inspired by this commit when I was looking at what counters
>> were incrementing..
>>
>> for bge(4)..
>>
>> revision 1.334
>> date: 2013/06/06 00:05:30;  author: dlg;  state: Exp;  lines: +2 -4;
>> dont count rx ring overruns as input errors. with MCLGETI controlling the
>> ring we expect to run out of rx descriptors as a matter of course, its not
>> an error.
>>
>> ok mikeb@
>>
>>
>
> it does screws up statistics big time.  does mpc counter follow rx_overruns?
> why did we add them up both previously?

.e.g.

em0: Missed Packets = 1925
em0: RX overruns = 267
em0: Good Packets Rcvd = 22279749

this already wasn't being counted but for reference..

em0: Receive No Buffers = 24382

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

Reply | Threaded
Open this post in threaded view
|

Re: em(4): Don't count RX overruns and missed packets as input errros

Brad Smith-14
In reply to this post by Mike Belopuhov-5
On 31/12/13 5:50 AM, Mike Belopuhov wrote:

> On 31 December 2013 09:46, Brad Smith <[hidden email]> wrote:
>> On 31/12/13 3:14 AM, Mark Kettenis wrote:
>>>>
>>>> Date: Tue, 31 Dec 2013 01:28:04 -0500
>>>> From: Brad Smith <[hidden email]>
>>>>
>>>> Don't count RX overruns and missed packets as inputs errors. They're
>>>> expected to increment when using MCLGETI.
>>>>
>>>> OK?
>>>
>>>
>>> These may be "expected", but they're still packets that were not
>>> received.  And it is useful to know about these, for example when
>>> debugging TCP performance issues.
>>
>>
>> Well do we want to keep just the missed packets or both? Part of the
>> diff was inspired by this commit when I was looking at what counters
>> were incrementing..
>>
>> for bge(4)..
>>
>> revision 1.334
>> date: 2013/06/06 00:05:30;  author: dlg;  state: Exp;  lines: +2 -4;
>> dont count rx ring overruns as input errors. with MCLGETI controlling the
>> ring we expect to run out of rx descriptors as a matter of course, its not
>> an error.
>>
>> ok mikeb@
>>
>>
>
> it does screws up statistics big time.  does mpc counter follow rx_overruns?
> why did we add them up both previously?

Yes, it does. I can't say why exactly but before MCLGETI for most
environments it was unlikely to have RX overruns.


--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

Reply | Threaded
Open this post in threaded view
|

Re: em(4): Don't count RX overruns and missed packets as input errros

David Gwynne-5

On 26 Jan 2014, at 11:31 am, Brad Smith <[hidden email]> wrote:

> On 31/12/13 5:50 AM, Mike Belopuhov wrote:
>> On 31 December 2013 09:46, Brad Smith <[hidden email]> wrote:
>>> On 31/12/13 3:14 AM, Mark Kettenis wrote:
>>>>>
>>>>> Date: Tue, 31 Dec 2013 01:28:04 -0500
>>>>> From: Brad Smith <[hidden email]>
>>>>>
>>>>> Don't count RX overruns and missed packets as inputs errors. They're
>>>>> expected to increment when using MCLGETI.
>>>>>
>>>>> OK?
>>>>
>>>>
>>>> These may be "expected", but they're still packets that were not
>>>> received.  And it is useful to know about these, for example when
>>>> debugging TCP performance issues.
>>>
>>>
>>> Well do we want to keep just the missed packets or both? Part of the
>>> diff was inspired by this commit when I was looking at what counters
>>> were incrementing..
>>>
>>> for bge(4)..
>>>
>>> revision 1.334
>>> date: 2013/06/06 00:05:30;  author: dlg;  state: Exp;  lines: +2 -4;
>>> dont count rx ring overruns as input errors. with MCLGETI controlling the
>>> ring we expect to run out of rx descriptors as a matter of course, its not
>>> an error.
>>>
>>> ok mikeb@
>>>
>>>
>>
>> it does screws up statistics big time.  does mpc counter follow rx_overruns?
>> why did we add them up both previously?
>
> Yes, it does. I can't say why exactly but before MCLGETI for most environments it was unlikely to have RX overruns.

its not obvious?

rx rings are usually massively over provisioned. eg, my myx has 512 entries in its rx ring. however, its interrupt mitigation is currently configured for approximately 16k interrupts a second. if you're sustaining 1m pps, then you can divide that by the interrupt rate to figure out the average usage of the rx ring. so 1000 / 16 is about 60-65 descriptors per interrupt. 512 is roughly an order of magnitude more than what you need for that workload.

if you were hitting the ring limits before MCLGETI, then that means you dont have enough cpu to process the pps. increasing the ring size would make it worse cos you'd spend more time freeing mbufs because you were too far behind on the pps you could deal with.

>
>
> --
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.
>


Reply | Threaded
Open this post in threaded view
|

Re: em(4): Don't count RX overruns and missed packets as input errros

Brad Smith-14
On Tue, Jan 28, 2014 at 01:21:46PM +1000, David Gwynne wrote:

>
> On 26 Jan 2014, at 11:31 am, Brad Smith <[hidden email]> wrote:
>
> > On 31/12/13 5:50 AM, Mike Belopuhov wrote:
> >> On 31 December 2013 09:46, Brad Smith <[hidden email]> wrote:
> >>> On 31/12/13 3:14 AM, Mark Kettenis wrote:
> >>>>>
> >>>>> Date: Tue, 31 Dec 2013 01:28:04 -0500
> >>>>> From: Brad Smith <[hidden email]>
> >>>>>
> >>>>> Don't count RX overruns and missed packets as inputs errors. They're
> >>>>> expected to increment when using MCLGETI.
> >>>>>
> >>>>> OK?
> >>>>
> >>>>
> >>>> These may be "expected", but they're still packets that were not
> >>>> received.  And it is useful to know about these, for example when
> >>>> debugging TCP performance issues.
> >>>
> >>>
> >>> Well do we want to keep just the missed packets or both? Part of the
> >>> diff was inspired by this commit when I was looking at what counters
> >>> were incrementing..
> >>>
> >>> for bge(4)..
> >>>
> >>> revision 1.334
> >>> date: 2013/06/06 00:05:30;  author: dlg;  state: Exp;  lines: +2 -4;
> >>> dont count rx ring overruns as input errors. with MCLGETI controlling the
> >>> ring we expect to run out of rx descriptors as a matter of course, its not
> >>> an error.
> >>>
> >>> ok mikeb@
> >>>
> >>>
> >>
> >> it does screws up statistics big time.  does mpc counter follow rx_overruns?
> >> why did we add them up both previously?
> >
> > Yes, it does. I can't say why exactly but before MCLGETI for most environments
> > it was unlikely to have RX overruns.
>
> its not obvious?
>
> rx rings are usually massively over provisioned. eg, my myx has 512 entries in its
> rx ring. however, its interrupt mitigation is currently configured for approximately
> 16k interrupts a second. if you're sustaining 1m pps, then you can divide that by the
> interrupt rate to figure out the average usage of the rx ring. so 1000 / 16 is about
> 60-65 descriptors per interrupt. 512 is roughly an order of magnitude more than what
> you need for that workload.
>
> if you were hitting the ring limits before MCLGETI, then that means you dont have
> enough cpu to process the pps. increasing the ring size would make it worse cos you'd
> spend more time freeing mbufs because you were too far behind on the pps you could
> deal with.

Ya, I don't know why I ultimately said I can't say why exactly as I was thinking about
what you said regaring having a lot of buffers allocated and that's why I said it was
unlikely to have RX overruns.

Since this was changed for bge(4) then the other drivers using MCLGETI should be changed
as well if there is consensus about not adding the RX overruns to the interfaces input
errors counter. But I think kettenis has a point as well that this information is useful
its just we don't have any way of exposing that info to userland.

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

Reply | Threaded
Open this post in threaded view
|

Re: em(4): Don't count RX overruns and missed packets as input errros

Brad Smith-14
On Tue, Jan 28, 2014 at 02:08:09AM -0500, Brad Smith wrote:

> On Tue, Jan 28, 2014 at 01:21:46PM +1000, David Gwynne wrote:
> >
> > On 26 Jan 2014, at 11:31 am, Brad Smith <[hidden email]> wrote:
> >
> > > On 31/12/13 5:50 AM, Mike Belopuhov wrote:
> > >> On 31 December 2013 09:46, Brad Smith <[hidden email]> wrote:
> > >>> On 31/12/13 3:14 AM, Mark Kettenis wrote:
> > >>>>>
> > >>>>> Date: Tue, 31 Dec 2013 01:28:04 -0500
> > >>>>> From: Brad Smith <[hidden email]>
> > >>>>>
> > >>>>> Don't count RX overruns and missed packets as inputs errors. They're
> > >>>>> expected to increment when using MCLGETI.
> > >>>>>
> > >>>>> OK?
> > >>>>
> > >>>>
> > >>>> These may be "expected", but they're still packets that were not
> > >>>> received.  And it is useful to know about these, for example when
> > >>>> debugging TCP performance issues.
> > >>>
> > >>>
> > >>> Well do we want to keep just the missed packets or both? Part of the
> > >>> diff was inspired by this commit when I was looking at what counters
> > >>> were incrementing..
> > >>>
> > >>> for bge(4)..
> > >>>
> > >>> revision 1.334
> > >>> date: 2013/06/06 00:05:30;  author: dlg;  state: Exp;  lines: +2 -4;
> > >>> dont count rx ring overruns as input errors. with MCLGETI controlling the
> > >>> ring we expect to run out of rx descriptors as a matter of course, its not
> > >>> an error.
> > >>>
> > >>> ok mikeb@
> > >>>
> > >>>
> > >>
> > >> it does screws up statistics big time.  does mpc counter follow rx_overruns?
> > >> why did we add them up both previously?
> > >
> > > Yes, it does. I can't say why exactly but before MCLGETI for most environments
> > > it was unlikely to have RX overruns.
> >
> > its not obvious?
> >
> > rx rings are usually massively over provisioned. eg, my myx has 512 entries in its
> > rx ring. however, its interrupt mitigation is currently configured for approximately
> > 16k interrupts a second. if you're sustaining 1m pps, then you can divide that by the
> > interrupt rate to figure out the average usage of the rx ring. so 1000 / 16 is about
> > 60-65 descriptors per interrupt. 512 is roughly an order of magnitude more than what
> > you need for that workload.
> >
> > if you were hitting the ring limits before MCLGETI, then that means you dont have
> > enough cpu to process the pps. increasing the ring size would make it worse cos you'd
> > spend more time freeing mbufs because you were too far behind on the pps you could
> > deal with.
>
> Ya, I don't know why I ultimately said I can't say why exactly as I was thinking about
> what you said regaring having a lot of buffers allocated and that's why I said it was
> unlikely to have RX overruns.
>
> Since this was changed for bge(4) then the other drivers using MCLGETI should be changed
> as well if there is consensus about not adding the RX overruns to the interfaces input
> errors counter. But I think kettenis has a point as well that this information is useful
> its just we don't have any way of exposing that info to userland.

Ok, so I looked at the other drivers using MCLGETI and there are a few others also
counting FIFO overruns towards input errors.


Index: arch/socppc/dev/if_tsec.c
===================================================================
RCS file: /home/cvs/src/sys/arch/socppc/dev/if_tsec.c,v
retrieving revision 1.29
diff -u -p -u -p -r1.29 if_tsec.c
--- arch/socppc/dev/if_tsec.c 29 Nov 2012 21:10:31 -0000 1.29
+++ arch/socppc/dev/if_tsec.c 28 Jan 2014 05:16:24 -0000
@@ -779,7 +779,6 @@ tsec_errintr(void *arg)
  */
  tsec_rx_proc(sc);
  tsec_write(sc, TSEC_RSTAT, TSEC_RSTAT_QHLT);
- ifp->if_ierrors++;
  }
 
  return (1);
Index: dev/pci/if_se.c
===================================================================
RCS file: /home/cvs/src/sys/dev/pci/if_se.c,v
retrieving revision 1.9
diff -u -p -u -p -r1.9 if_se.c
--- dev/pci/if_se.c 28 Dec 2013 03:34:54 -0000 1.9
+++ dev/pci/if_se.c 28 Jan 2014 04:49:53 -0000
@@ -939,7 +939,8 @@ se_rxeof(struct se_softc *sc)
  printf("%s: rx error %b\n",
     ifp->if_xname, rxstat, RX_ERR_BITS);
  se_discard_rxbuf(sc, i);
- ifp->if_ierrors++;
+ if ((rxstat & RDS_OVRUN) == 0)
+ ifp->if_ierrors++;
  continue;
  }
 
Index: dev/pci/if_sis.c
===================================================================
RCS file: /home/cvs/src/sys/dev/pci/if_sis.c,v
retrieving revision 1.115
diff -u -p -u -p -r1.115 if_sis.c
--- dev/pci/if_sis.c 28 Dec 2013 03:34:54 -0000 1.115
+++ dev/pci/if_sis.c 28 Jan 2014 04:51:53 -0000
@@ -1378,7 +1378,8 @@ sis_rxeof(struct sis_softc *sc)
     total_len <= (ETHER_MAX_DIX_LEN - ETHER_CRC_LEN))
  rxstat &= ~SIS_RXSTAT_GIANT;
  if (SIS_RXSTAT_ERROR(rxstat)) {
- ifp->if_ierrors++;
+ if ((rxstat & SIS_RXSTAT_OVERRUN) == 0)
+ ifp->if_ierrors++;
  if (rxstat & SIS_RXSTAT_COLL)
  ifp->if_collisions++;
  m_freem(m);
Index: dev/pci/if_vr.c
===================================================================
RCS file: /home/cvs/src/sys/dev/pci/if_vr.c,v
retrieving revision 1.132
diff -u -p -u -p -r1.132 if_vr.c
--- dev/pci/if_vr.c 28 Dec 2013 03:34:54 -0000 1.132
+++ dev/pci/if_vr.c 28 Jan 2014 04:56:19 -0000
@@ -858,7 +858,8 @@ vr_rxeof(struct vr_softc *sc)
  * comes up in the ring.
  */
  if ((rxstat & VR_RXSTAT_RX_OK) == 0) {
- ifp->if_ierrors++;
+ if ((rxstat & VR_RXSTAT_FIFOOFLOW) == 0)
+ ifp->if_ierrors++;
 #ifdef VR_DEBUG
  printf("%s: rx error (%02x):",
     sc->sc_dev.dv_xname, rxstat & 0x000000ff);

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

Reply | Threaded
Open this post in threaded view
|

Re: em(4): Don't count RX overruns and missed packets as input errros

Brad Smith-14
On Fri, Feb 07, 2014 at 06:15:44AM -0500, Brad Smith wrote:

> On Tue, Jan 28, 2014 at 02:08:09AM -0500, Brad Smith wrote:
> > On Tue, Jan 28, 2014 at 01:21:46PM +1000, David Gwynne wrote:
> > >
> > > On 26 Jan 2014, at 11:31 am, Brad Smith <[hidden email]> wrote:
> > >
> > > > On 31/12/13 5:50 AM, Mike Belopuhov wrote:
> > > >> On 31 December 2013 09:46, Brad Smith <[hidden email]> wrote:
> > > >>> On 31/12/13 3:14 AM, Mark Kettenis wrote:
> > > >>>>>
> > > >>>>> Date: Tue, 31 Dec 2013 01:28:04 -0500
> > > >>>>> From: Brad Smith <[hidden email]>
> > > >>>>>
> > > >>>>> Don't count RX overruns and missed packets as inputs errors. They're
> > > >>>>> expected to increment when using MCLGETI.
> > > >>>>>
> > > >>>>> OK?
> > > >>>>
> > > >>>>
> > > >>>> These may be "expected", but they're still packets that were not
> > > >>>> received.  And it is useful to know about these, for example when
> > > >>>> debugging TCP performance issues.
> > > >>>
> > > >>>
> > > >>> Well do we want to keep just the missed packets or both? Part of the
> > > >>> diff was inspired by this commit when I was looking at what counters
> > > >>> were incrementing..
> > > >>>
> > > >>> for bge(4)..
> > > >>>
> > > >>> revision 1.334
> > > >>> date: 2013/06/06 00:05:30;  author: dlg;  state: Exp;  lines: +2 -4;
> > > >>> dont count rx ring overruns as input errors. with MCLGETI controlling the
> > > >>> ring we expect to run out of rx descriptors as a matter of course, its not
> > > >>> an error.
> > > >>>
> > > >>> ok mikeb@
> > > >>>
> > > >>>
> > > >>
> > > >> it does screws up statistics big time.  does mpc counter follow rx_overruns?
> > > >> why did we add them up both previously?
> > > >
> > > > Yes, it does. I can't say why exactly but before MCLGETI for most environments
> > > > it was unlikely to have RX overruns.
> > >
> > > its not obvious?
> > >
> > > rx rings are usually massively over provisioned. eg, my myx has 512 entries in its
> > > rx ring. however, its interrupt mitigation is currently configured for approximately
> > > 16k interrupts a second. if you're sustaining 1m pps, then you can divide that by the
> > > interrupt rate to figure out the average usage of the rx ring. so 1000 / 16 is about
> > > 60-65 descriptors per interrupt. 512 is roughly an order of magnitude more than what
> > > you need for that workload.
> > >
> > > if you were hitting the ring limits before MCLGETI, then that means you dont have
> > > enough cpu to process the pps. increasing the ring size would make it worse cos you'd
> > > spend more time freeing mbufs because you were too far behind on the pps you could
> > > deal with.
> >
> > Ya, I don't know why I ultimately said I can't say why exactly as I was thinking about
> > what you said regaring having a lot of buffers allocated and that's why I said it was
> > unlikely to have RX overruns.
> >
> > Since this was changed for bge(4) then the other drivers using MCLGETI should be changed
> > as well if there is consensus about not adding the RX overruns to the interfaces input
> > errors counter. But I think kettenis has a point as well that this information is useful
> > its just we don't have any way of exposing that info to userland.
>
> Ok, so I looked at the other drivers using MCLGETI and there are a few others also
> counting FIFO overruns towards input errors.
 
ping.

> Index: arch/socppc/dev/if_tsec.c
> ===================================================================
> RCS file: /home/cvs/src/sys/arch/socppc/dev/if_tsec.c,v
> retrieving revision 1.29
> diff -u -p -u -p -r1.29 if_tsec.c
> --- arch/socppc/dev/if_tsec.c 29 Nov 2012 21:10:31 -0000 1.29
> +++ arch/socppc/dev/if_tsec.c 28 Jan 2014 05:16:24 -0000
> @@ -779,7 +779,6 @@ tsec_errintr(void *arg)
>   */
>   tsec_rx_proc(sc);
>   tsec_write(sc, TSEC_RSTAT, TSEC_RSTAT_QHLT);
> - ifp->if_ierrors++;
>   }
>  
>   return (1);
> Index: dev/pci/if_se.c
> ===================================================================
> RCS file: /home/cvs/src/sys/dev/pci/if_se.c,v
> retrieving revision 1.9
> diff -u -p -u -p -r1.9 if_se.c
> --- dev/pci/if_se.c 28 Dec 2013 03:34:54 -0000 1.9
> +++ dev/pci/if_se.c 28 Jan 2014 04:49:53 -0000
> @@ -939,7 +939,8 @@ se_rxeof(struct se_softc *sc)
>   printf("%s: rx error %b\n",
>      ifp->if_xname, rxstat, RX_ERR_BITS);
>   se_discard_rxbuf(sc, i);
> - ifp->if_ierrors++;
> + if ((rxstat & RDS_OVRUN) == 0)
> + ifp->if_ierrors++;
>   continue;
>   }
>  
> Index: dev/pci/if_sis.c
> ===================================================================
> RCS file: /home/cvs/src/sys/dev/pci/if_sis.c,v
> retrieving revision 1.115
> diff -u -p -u -p -r1.115 if_sis.c
> --- dev/pci/if_sis.c 28 Dec 2013 03:34:54 -0000 1.115
> +++ dev/pci/if_sis.c 28 Jan 2014 04:51:53 -0000
> @@ -1378,7 +1378,8 @@ sis_rxeof(struct sis_softc *sc)
>      total_len <= (ETHER_MAX_DIX_LEN - ETHER_CRC_LEN))
>   rxstat &= ~SIS_RXSTAT_GIANT;
>   if (SIS_RXSTAT_ERROR(rxstat)) {
> - ifp->if_ierrors++;
> + if ((rxstat & SIS_RXSTAT_OVERRUN) == 0)
> + ifp->if_ierrors++;
>   if (rxstat & SIS_RXSTAT_COLL)
>   ifp->if_collisions++;
>   m_freem(m);
> Index: dev/pci/if_vr.c
> ===================================================================
> RCS file: /home/cvs/src/sys/dev/pci/if_vr.c,v
> retrieving revision 1.132
> diff -u -p -u -p -r1.132 if_vr.c
> --- dev/pci/if_vr.c 28 Dec 2013 03:34:54 -0000 1.132
> +++ dev/pci/if_vr.c 28 Jan 2014 04:56:19 -0000
> @@ -858,7 +858,8 @@ vr_rxeof(struct vr_softc *sc)
>   * comes up in the ring.
>   */
>   if ((rxstat & VR_RXSTAT_RX_OK) == 0) {
> - ifp->if_ierrors++;
> + if ((rxstat & VR_RXSTAT_FIFOOFLOW) == 0)
> + ifp->if_ierrors++;
>  #ifdef VR_DEBUG
>   printf("%s: rx error (%02x):",
>      sc->sc_dev.dv_xname, rxstat & 0x000000ff);
>
> --
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.
>

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

Reply | Threaded
Open this post in threaded view
|

Re: em(4): Don't count RX overruns and missed packets as input errros

Mark Kettenis
> Date: Tue, 11 Feb 2014 13:30:47 -0500
> From: Brad Smith <[hidden email]>
>
> > Index: arch/socppc/dev/if_tsec.c
> > ===================================================================
> > RCS file: /home/cvs/src/sys/arch/socppc/dev/if_tsec.c,v
> > retrieving revision 1.29
> > diff -u -p -u -p -r1.29 if_tsec.c
> > --- arch/socppc/dev/if_tsec.c 29 Nov 2012 21:10:31 -0000 1.29
> > +++ arch/socppc/dev/if_tsec.c 28 Jan 2014 05:16:24 -0000
> > @@ -779,7 +779,6 @@ tsec_errintr(void *arg)
> >   */
> >   tsec_rx_proc(sc);
> >   tsec_write(sc, TSEC_RSTAT, TSEC_RSTAT_QHLT);
> > - ifp->if_ierrors++;
> >   }
> >  
> >   return (1);


This one doesn't seem right.  This is the only place where the driver
actually increases if_ierrors.

I also still fundamentally disagree with the direction.  I you guys
really want to make a distinction between packets dropped because
we're out of descriptors and packets that were not correctly received
for other reasons, add a counter for that first and then change the
drivers.

Reply | Threaded
Open this post in threaded view
|

Re: em(4): Don't count RX overruns and missed packets as input errros

Brad Smith-14
On Tue, Feb 11, 2014 at 07:43:51PM +0100, Mark Kettenis wrote:

> > Date: Tue, 11 Feb 2014 13:30:47 -0500
> > From: Brad Smith <[hidden email]>
> >
> > > Index: arch/socppc/dev/if_tsec.c
> > > ===================================================================
> > > RCS file: /home/cvs/src/sys/arch/socppc/dev/if_tsec.c,v
> > > retrieving revision 1.29
> > > diff -u -p -u -p -r1.29 if_tsec.c
> > > --- arch/socppc/dev/if_tsec.c 29 Nov 2012 21:10:31 -0000 1.29
> > > +++ arch/socppc/dev/if_tsec.c 28 Jan 2014 05:16:24 -0000
> > > @@ -779,7 +779,6 @@ tsec_errintr(void *arg)
> > >   */
> > >   tsec_rx_proc(sc);
> > >   tsec_write(sc, TSEC_RSTAT, TSEC_RSTAT_QHLT);
> > > - ifp->if_ierrors++;
> > >   }
> > >  
> > >   return (1);
>
>
> This one doesn't seem right.  This is the only place where the driver
> actually increases if_ierrors.

Being the only place input errors are incremented is irrelevant.
Its being incremented because the particular "error" is a FIFO overrun.

> I also still fundamentally disagree with the direction.  I you guys
> really want to make a distinction between packets dropped because
> we're out of descriptors and packets that were not correctly received
> for other reasons, add a counter for that first and then change the
> drivers.

I don't necessarily disagree with what you have said. I think we should
have some additional counters to deal with some of the counters we
are lumping into error counters.

Since we can't seem to come to any consensus about how to deal with
this I'm going to revert the bge(4) commit in question.

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

Reply | Threaded
Open this post in threaded view
|

Re: em(4): Don't count RX overruns and missed packets as input errros

Claudio Jeker
In reply to this post by Mark Kettenis
On Tue, Feb 11, 2014 at 07:43:51PM +0100, Mark Kettenis wrote:

> > Date: Tue, 11 Feb 2014 13:30:47 -0500
> > From: Brad Smith <[hidden email]>
> >
> > > Index: arch/socppc/dev/if_tsec.c
> > > ===================================================================
> > > RCS file: /home/cvs/src/sys/arch/socppc/dev/if_tsec.c,v
> > > retrieving revision 1.29
> > > diff -u -p -u -p -r1.29 if_tsec.c
> > > --- arch/socppc/dev/if_tsec.c 29 Nov 2012 21:10:31 -0000 1.29
> > > +++ arch/socppc/dev/if_tsec.c 28 Jan 2014 05:16:24 -0000
> > > @@ -779,7 +779,6 @@ tsec_errintr(void *arg)
> > >   */
> > >   tsec_rx_proc(sc);
> > >   tsec_write(sc, TSEC_RSTAT, TSEC_RSTAT_QHLT);
> > > - ifp->if_ierrors++;
> > >   }
> > >  
> > >   return (1);
>
>
> This one doesn't seem right.  This is the only place where the driver
> actually increases if_ierrors.
>
> I also still fundamentally disagree with the direction.  I you guys
> really want to make a distinction between packets dropped because
> we're out of descriptors and packets that were not correctly received
> for other reasons, add a counter for that first and then change the
> drivers.
>

I totally agree here. Packets are dropped and should be counted in some
ways else you may think all is fine but actually a lot of packets are
dropped by the driver.

--
:wq Claudio

Reply | Threaded
Open this post in threaded view
|

Re: em(4): Don't count RX overruns and missed packets as input errros

Mike Belopuhov-5
In reply to this post by Brad Smith-14
On 11 February 2014 20:05, Brad Smith <[hidden email]> wrote:

> On Tue, Feb 11, 2014 at 07:43:51PM +0100, Mark Kettenis wrote:
>> > Date: Tue, 11 Feb 2014 13:30:47 -0500
>> > From: Brad Smith <[hidden email]>
>> >
>> > > Index: arch/socppc/dev/if_tsec.c
>> > > ===================================================================
>> > > RCS file: /home/cvs/src/sys/arch/socppc/dev/if_tsec.c,v
>> > > retrieving revision 1.29
>> > > diff -u -p -u -p -r1.29 if_tsec.c
>> > > --- arch/socppc/dev/if_tsec.c     29 Nov 2012 21:10:31 -0000      1.29
>> > > +++ arch/socppc/dev/if_tsec.c     28 Jan 2014 05:16:24 -0000
>> > > @@ -779,7 +779,6 @@ tsec_errintr(void *arg)
>> > >            */
>> > >           tsec_rx_proc(sc);
>> > >           tsec_write(sc, TSEC_RSTAT, TSEC_RSTAT_QHLT);
>> > > -         ifp->if_ierrors++;
>> > >   }
>> > >
>> > >   return (1);
>>
>>
>> This one doesn't seem right.  This is the only place where the driver
>> actually increases if_ierrors.
>
> Being the only place input errors are incremented is irrelevant.
> Its being incremented because the particular "error" is a FIFO overrun.
>
>> I also still fundamentally disagree with the direction.  I you guys
>> really want to make a distinction between packets dropped because
>> we're out of descriptors and packets that were not correctly received
>> for other reasons, add a counter for that first and then change the
>> drivers.
>
> I don't necessarily disagree with what you have said. I think we should
> have some additional counters to deal with some of the counters we
> are lumping into error counters.
>
> Since we can't seem to come to any consensus about how to deal with
> this I'm going to revert the bge(4) commit in question.
>

no way.  counting drops caused by the mclgeti should not be accounted
as input errors.  it makes it hard to debug things.

> --
> This message has been scanned for viruses and
> dangerous content by MailScanner, and is
> believed to be clean.
>

Reply | Threaded
Open this post in threaded view
|

Re: em(4): Don't count RX overruns and missed packets as input errros

gwes-2
On 02/12/2014 05:44 AM, Mike Belopuhov wrote:

> On 11 February 2014 20:05, Brad Smith <[hidden email]> wrote:
>> On Tue, Feb 11, 2014 at 07:43:51PM +0100, Mark Kettenis wrote:
>>>> Date: Tue, 11 Feb 2014 13:30:47 -0500
>>>> From: Brad Smith <[hidden email]>
>>>>
>>>>> Index: arch/socppc/dev/if_tsec.c
>>>>> ===================================================================
>>>>> RCS file: /home/cvs/src/sys/arch/socppc/dev/if_tsec.c,v
>>>>> retrieving revision 1.29
>>>>> diff -u -p -u -p -r1.29 if_tsec.c
>>>>> --- arch/socppc/dev/if_tsec.c     29 Nov 2012 21:10:31 -0000      1.29
>>>>> +++ arch/socppc/dev/if_tsec.c     28 Jan 2014 05:16:24 -0000
>>>>> @@ -779,7 +779,6 @@ tsec_errintr(void *arg)
>>>>>             */
>>>>>            tsec_rx_proc(sc);
>>>>>            tsec_write(sc, TSEC_RSTAT, TSEC_RSTAT_QHLT);
>>>>> -         ifp->if_ierrors++;
>>>>>    }
>>>>>
>>>>>    return (1);
>>>
>>> This one doesn't seem right.  This is the only place where the driver
>>> actually increases if_ierrors.
>> Being the only place input errors are incremented is irrelevant.
>> Its being incremented because the particular "error" is a FIFO overrun.
>>
>>> I also still fundamentally disagree with the direction.  I you guys
>>> really want to make a distinction between packets dropped because
>>> we're out of descriptors and packets that were not correctly received
>>> for other reasons, add a counter for that first and then change the
>>> drivers.
>> I don't necessarily disagree with what you have said. I think we should
>> have some additional counters to deal with some of the counters we
>> are lumping into error counters.
>>
>> Since we can't seem to come to any consensus about how to deal with
>> this I'm going to revert the bge(4) commit in question.
>>
> no way.  counting drops caused by the mclgeti should not be accounted
> as input errors.  it makes it hard to debug things.
>
>> --
>> This message has been scanned for viruses and
>> dangerous content by MailScanner, and is
>> believed to be clean.
>>
Locating where and why packets are dropped is essential to debugging network
overload problems. A per-interface "packets dropped because of lack of
resources"
would make that process easier. There used to be several places in packet
reception where packets were dropped with no record. I inserted counters
to get at least a global count. It's conceptually very ugly to silently fail
in a vital part of the network stack. It's definitely a frustrating
inconvenience.

Geoff Steckel