fix pppx(4) with net/ifq.c rev 1.38

classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

fix pppx(4) with net/ifq.c rev 1.38

Vitaliy Makkoveev
I got splassert with pppx(4) and net/ifq.c rev 1.38 raised by
NET_ASSERT_LOCKED() in netinet/ip_output.c:113 and underlaying routines.

net/ifq.c rev 1.38 is not in snapshot yet so you need to checkout and
build kernel to reproduce.

---- dmesg begin ----

splassert: ip_output: want 2 have 0
Starting stack trace...
ip_output(fffffd801f8c8800,0,0,0,0,0) at ip_output+0x8f
pipex_l2tp_output(fffffd801f8c8800,ffff80000e787808) at
pipex_l2tp_output+0x21d
pipex_ppp_output(fffffd801f8c8800,ffff80000e787808,21) at
pipex_ppp_output+0xda
pppx_if_start(ffff80000e787268) at pppx_if_start+0x83
if_qstart_compat(ffff80000e7874e0) at if_qstart_compat+0x2e
ifq_serialize(ffff80000e7874e0,ffff80000e7875b0) at ifq_serialize+0x103
taskq_thread(ffffffff81f3df30) at taskq_thread+0x4d
end trace frame: 0x0, count: 250
End of stack trace.
splassert: ipsp_spd_lookup: want 2 have 0
Starting stack trace...
ipsp_spd_lookup(fffffd801f8c8800,2,14,ffff80000e726c9c,2,0) at
ipsp_spd_lookup+0x80
ip_output_ipsec_lookup(fffffd801f8c8800,14,ffff80000e726c9c,0,1) at
ip_output_ipsec_lookup+0x4d
ip_output(fffffd801f8c8800,0,0,0,0,0) at ip_output+0x4fa
pipex_l2tp_output(fffffd801f8c8800,ffff80000e787808) at
pipex_l2tp_output+0x21d
pipex_ppp_output(fffffd801f8c8800,ffff80000e787808,21) at
pipex_ppp_output+0xda
pppx_if_start(ffff80000e787268) at pppx_if_start+0x83
if_qstart_compat(ffff80000e7874e0) at if_qstart_compat+0x2e
ifq_serialize(ffff80000e7874e0,ffff80000e7875b0) at ifq_serialize+0x103
taskq_thread(ffffffff81f3df30) at taskq_thread+0x4d
end trace frame: 0x0, count: 248
End of stack trace.
splassert: spd_table_get: want 2 have 0

---- dmesg end ----

1. `pxi_if' owned by struct pppx_if has IFXF_MPSAFE flag unset
  1.1 pppx(4) sets IFQ_SET_MAXLEN(&ifp->if_snd, 1) at net/if_pppx.c:866
2. pppx_if_output() is called under NET_LOCK()
3. pppx_if_output() calls if_enqueue() at net/if_pppx.c:1123
4. pppx(4) doesn't set `ifp->if_enqueue' so if_enqueue() calls
    if_enqueue_ifq() at net/if.c:709 (which is set in net/if.c:639)
5. if_enqueue_ifq() calls ifq_start() at net/if.c:734
6. ifq_start() we a still under NET_LOCK() here

6.a. in net/ifq.c rev 1.37 ifq_start() checks "ifq_len(ifq) >=
min(ifp->if_txmit, ifq->ifq_maxlen)" and this was always true because
(1.1) so we always call ifq_run_start() which calls ifq_serialize().

ifq_serialize() will call if_qstart_compat() which calls
pppx_if_start() which calls pipex_ppp_output() etc while we still
holding NET_LOCK() so the assertions I reported above are not raised.

6.b. net/ifq.c rev 1.38 introduce checks of IFXF_MPSAFE flag. so we are
always going to net/ifq.c:132 where we adding out task to `systq'
referenced by `ifq_softnet' (ifq_softnet set to `systq' at
net/ifq.c:199).

taskq_thread() doesn't grab NET_LOCK() so after net/ifq.c rev 1.38
ifq_serialize() and underlaying pppx_if_start() call without NET_LOCK()
and corresponding asserts raised.

The problem I pointed is not in net/ifq.c rev 1.38 but in pppx(4).
`if_start' routines should grab NET_LOCK() by themself if it is required
but pppx_if_start() and pppac_start() did't do that. pppac_start() has
no underlaying NET_ASSERT_LOCKED() so the pppx(4) is the only case were
the problem is shown.

Since NET_LOCK() is required by pipex(4), diff below adds it to
pppx_if_start() and pppac_start().

After net/ifq.c rev 1.38 pppx_if_start() will newer be called from
pppx_if_output() but from `systq' only so I don't add lock/unlock
dances around if_enqueue() at net/if_pppx.c:1123.

Diff tested for both pppx(4) and pppac(4) cases.

Index: sys/net/if_pppx.c
===================================================================
RCS file: /cvs/src/sys/net/if_pppx.c,v
retrieving revision 1.84
diff -u -p -r1.84 if_pppx.c
--- sys/net/if_pppx.c 18 Apr 2020 04:03:56 -0000 1.84
+++ sys/net/if_pppx.c 20 May 2020 14:17:20 -0000
@@ -1054,8 +1054,9 @@ pppx_if_start(struct ifnet *ifp)
  struct mbuf *m;
  int proto;
 
+ NET_LOCK();
  if (!ISSET(ifp->if_flags, IFF_RUNNING))
- return;
+ goto unlock;
 
  for (;;) {
  IFQ_DEQUEUE(&ifp->if_snd, m);
@@ -1071,6 +1072,8 @@ pppx_if_start(struct ifnet *ifp)
 
  pipex_ppp_output(m, &pxi->pxi_session, proto);
  }
+unlock:
+ NET_UNLOCK();
 }
 
 int
@@ -1640,6 +1643,8 @@ pppac_start(struct ifnet *ifp)
  struct pppac_softc *sc = ifp->if_softc;
  struct mbuf *m;
 
+ NET_LOCK();
+
  while ((m = ifq_dequeue(&ifp->if_snd)) != NULL) {
 #if NBPFILTER > 0
  if (ifp->if_bpf) {
@@ -1668,4 +1673,6 @@ pppac_start(struct ifnet *ifp)
  wakeup(sc);
  selwakeup(&sc->sc_rsel);
  }
+
+ NET_UNLOCK();
 }

Reply | Threaded
Open this post in threaded view
|

Re: fix pppx(4) with net/ifq.c rev 1.38

Vitaliy Makkoveev
After net/ifq.c rev 1.38 was reverted pppac(4) still has this problem.

pppac_output() called under NET_LOCK(). pppac_output() calls if_enqueue()
which calls ifq_start(). But now ifq_start() can run pppac_input()
directly under NET_LOCK() and also is can enqueue work and pppac_input()
will be called without NET_LOCK(). pppac_input() calls pipex_output() which
requires NET_LOCK().

We can be sure what pppx_if_input() will be always called under
NET_LOCK() because it's `if_snd' queue size is 1. I guess to make this
restriction to pppac(4) just to be sure pppac_input() is always called under
NET_LOCK() as temporary solution. I guess it's better then unlock/lock
dances around if_enqueue().

Also 6.7 release has this problem. And I like to know, how to fix it too.

Index: sys/net/if_pppx.c
===================================================================
RCS file: /cvs/src/sys/net/if_pppx.c,v
retrieving revision 1.84
diff -u -p -r1.84 if_pppx.c
--- sys/net/if_pppx.c 18 Apr 2020 04:03:56 -0000 1.84
+++ sys/net/if_pppx.c 21 May 2020 12:31:49 -0000
@@ -1054,6 +1054,8 @@ pppx_if_start(struct ifnet *ifp)
  struct mbuf *m;
  int proto;
 
+ NET_ASSERT_LOCKED();
+
  if (!ISSET(ifp->if_flags, IFF_RUNNING))
  return;
 
@@ -1290,6 +1292,8 @@ pppacopen(dev_t dev, int flags, int mode
  ifp->if_output = pppac_output;
  ifp->if_start = pppac_start;
  ifp->if_ioctl = pppac_ioctl;
+ /* XXX: To be sure pppac_input() is called under NET_LOCK() */
+ IFQ_SET_MAXLEN(&ifp->if_snd, 1);
 
  if_counters_alloc(ifp);
  if_attach(ifp);
@@ -1609,6 +1613,8 @@ pppac_output(struct ifnet *ifp, struct m
 {
  int error;
 
+ NET_ASSERT_LOCKED();
+
  if (!ISSET(ifp->if_flags, IFF_RUNNING)) {
  error = EHOSTDOWN;
  goto drop;
@@ -1639,6 +1645,8 @@ pppac_start(struct ifnet *ifp)
 {
  struct pppac_softc *sc = ifp->if_softc;
  struct mbuf *m;
+
+ NET_ASSERT_LOCKED();
 
  while ((m = ifq_dequeue(&ifp->if_snd)) != NULL) {
 #if NBPFILTER > 0

Reply | Threaded
Open this post in threaded view
|

Re: fix pppx(4) with net/ifq.c rev 1.38

Theo de Raadt-2
Vitaliy Makkoveev <[hidden email]> wrote:

> Also 6.7 release has this problem. And I like to know, how to fix it too.

Is this is a critical security fix, or an issue that lots of people will
encounter?

Errata creation is hours of work.  We don't do it for tiny things.  I
actually don't appreciate seeing pressure for us create erratas, and
expend such effort.  We'll make those decisions on our own, thank you
very much.

Reply | Threaded
Open this post in threaded view
|

Re: fix pppx(4) with net/ifq.c rev 1.38

David Gwynne-5
In reply to this post by Vitaliy Makkoveev
On Wed, May 20, 2020 at 05:42:35PM +0300, Vitaliy Makkoveev wrote:

> I got splassert with pppx(4) and net/ifq.c rev 1.38 raised by
> NET_ASSERT_LOCKED() in netinet/ip_output.c:113 and underlaying routines.
>
> net/ifq.c rev 1.38 is not in snapshot yet so you need to checkout and
> build kernel to reproduce.
>
> ---- dmesg begin ----
>
> splassert: ip_output: want 2 have 0
> Starting stack trace...
> ip_output(fffffd801f8c8800,0,0,0,0,0) at ip_output+0x8f
> pipex_l2tp_output(fffffd801f8c8800,ffff80000e787808) at
> pipex_l2tp_output+0x21d
> pipex_ppp_output(fffffd801f8c8800,ffff80000e787808,21) at
> pipex_ppp_output+0xda
> pppx_if_start(ffff80000e787268) at pppx_if_start+0x83
> if_qstart_compat(ffff80000e7874e0) at if_qstart_compat+0x2e
> ifq_serialize(ffff80000e7874e0,ffff80000e7875b0) at ifq_serialize+0x103
> taskq_thread(ffffffff81f3df30) at taskq_thread+0x4d
> end trace frame: 0x0, count: 250
> End of stack trace.
> splassert: ipsp_spd_lookup: want 2 have 0
> Starting stack trace...
> ipsp_spd_lookup(fffffd801f8c8800,2,14,ffff80000e726c9c,2,0) at
> ipsp_spd_lookup+0x80
> ip_output_ipsec_lookup(fffffd801f8c8800,14,ffff80000e726c9c,0,1) at
> ip_output_ipsec_lookup+0x4d
> ip_output(fffffd801f8c8800,0,0,0,0,0) at ip_output+0x4fa
> pipex_l2tp_output(fffffd801f8c8800,ffff80000e787808) at
> pipex_l2tp_output+0x21d
> pipex_ppp_output(fffffd801f8c8800,ffff80000e787808,21) at
> pipex_ppp_output+0xda
> pppx_if_start(ffff80000e787268) at pppx_if_start+0x83
> if_qstart_compat(ffff80000e7874e0) at if_qstart_compat+0x2e
> ifq_serialize(ffff80000e7874e0,ffff80000e7875b0) at ifq_serialize+0x103
> taskq_thread(ffffffff81f3df30) at taskq_thread+0x4d
> end trace frame: 0x0, count: 248
> End of stack trace.
> splassert: spd_table_get: want 2 have 0
>
> ---- dmesg end ----
>
> 1. `pxi_if' owned by struct pppx_if has IFXF_MPSAFE flag unset
>   1.1 pppx(4) sets IFQ_SET_MAXLEN(&ifp->if_snd, 1) at net/if_pppx.c:866
> 2. pppx_if_output() is called under NET_LOCK()
> 3. pppx_if_output() calls if_enqueue() at net/if_pppx.c:1123
> 4. pppx(4) doesn't set `ifp->if_enqueue' so if_enqueue() calls
>     if_enqueue_ifq() at net/if.c:709 (which is set in net/if.c:639)
> 5. if_enqueue_ifq() calls ifq_start() at net/if.c:734
> 6. ifq_start() we a still under NET_LOCK() here
>
> 6.a. in net/ifq.c rev 1.37 ifq_start() checks "ifq_len(ifq) >=
> min(ifp->if_txmit, ifq->ifq_maxlen)" and this was always true because
> (1.1) so we always call ifq_run_start() which calls ifq_serialize().
>
> ifq_serialize() will call if_qstart_compat() which calls
> pppx_if_start() which calls pipex_ppp_output() etc while we still
> holding NET_LOCK() so the assertions I reported above are not raised.
>
> 6.b. net/ifq.c rev 1.38 introduce checks of IFXF_MPSAFE flag. so we are
> always going to net/ifq.c:132 where we adding out task to `systq'
> referenced by `ifq_softnet' (ifq_softnet set to `systq' at
> net/ifq.c:199).
>
> taskq_thread() doesn't grab NET_LOCK() so after net/ifq.c rev 1.38
> ifq_serialize() and underlaying pppx_if_start() call without NET_LOCK()
> and corresponding asserts raised.
>
> The problem I pointed is not in net/ifq.c rev 1.38 but in pppx(4).
> `if_start' routines should grab NET_LOCK() by themself if it is required
> but pppx_if_start() and pppac_start() did't do that. pppac_start() has
> no underlaying NET_ASSERT_LOCKED() so the pppx(4) is the only case were
> the problem is shown.
>
> Since NET_LOCK() is required by pipex(4), diff below adds it to
> pppx_if_start() and pppac_start().
>
> After net/ifq.c rev 1.38 pppx_if_start() will newer be called from
> pppx_if_output() but from `systq' only so I don't add lock/unlock
> dances around if_enqueue() at net/if_pppx.c:1123.
>
> Diff tested for both pppx(4) and pppac(4) cases.

thanks for the detailed analysis. i wondered how the ifq change
triggered this exactly, and your mail makes it clear.

however, pppx/pppac/pipex are not the first or only drivers in the tree
that encapsulate a packet in IP from their if_start routine and send it
out with the network stack. the way this has been solved in every other
driver has been to call ip{,6}_send to transmit the packet instead
of ip{,6}_output.

can you try the following diff?

Index: pipex.c
===================================================================
RCS file: /cvs/src/sys/net/pipex.c,v
retrieving revision 1.113
diff -u -p -r1.113 pipex.c
--- pipex.c 7 Apr 2020 07:11:22 -0000 1.113
+++ pipex.c 21 May 2020 21:49:50 -0000
@@ -1453,10 +1453,7 @@ pipex_pptp_output(struct mbuf *m0, struc
  gre->flags = htons(gre->flags);
 
  m0->m_pkthdr.ph_ifidx = session->pipex_iface->ifnet_this->if_index;
- if (ip_output(m0, NULL, NULL, 0, NULL, NULL, 0) != 0) {
- PIPEX_DBG((session, LOG_DEBUG, "ip_output failed."));
- goto drop;
- }
+ ip_send(m0);
  if (len > 0) { /* network layer only */
  /* countup statistics */
  session->stat.opackets++;
@@ -1901,11 +1898,7 @@ pipex_l2tp_output(struct mbuf *m0, struc
  ip->ip_tos = 0;
  ip->ip_off = 0;
 
- if (ip_output(m0, NULL, NULL, 0, NULL, NULL,
-    session->proto.l2tp.ipsecflowinfo) != 0) {
- PIPEX_DBG((session, LOG_DEBUG, "ip_output failed."));
- goto drop;
- }
+ ip_send(m0);
  break;
 #ifdef INET6
  case AF_INET6:
@@ -1920,10 +1913,7 @@ pipex_l2tp_output(struct mbuf *m0, struc
     &session->peer.sin6, NULL);
  /* ip6->ip6_plen will be filled in ip6_output. */
 
- if (ip6_output(m0, NULL, NULL, 0, NULL, NULL) != 0) {
- PIPEX_DBG((session, LOG_DEBUG, "ip6_output failed."));
- goto drop;
- }
+ ip6_send(m0);
  break;
 #endif
  }

Reply | Threaded
Open this post in threaded view
|

Re: fix pppx(4) with net/ifq.c rev 1.38

Vitaliy Makkoveev
On Fri, May 22, 2020 at 07:57:13AM +1000, David Gwynne wrote:

> On Wed, May 20, 2020 at 05:42:35PM +0300, Vitaliy Makkoveev wrote:
> > I got splassert with pppx(4) and net/ifq.c rev 1.38 raised by
> > NET_ASSERT_LOCKED() in netinet/ip_output.c:113 and underlaying routines.
> >
> > net/ifq.c rev 1.38 is not in snapshot yet so you need to checkout and
> > build kernel to reproduce.
> >
> > ---- dmesg begin ----
> >
> > splassert: ip_output: want 2 have 0
> > Starting stack trace...
> > ip_output(fffffd801f8c8800,0,0,0,0,0) at ip_output+0x8f
> > pipex_l2tp_output(fffffd801f8c8800,ffff80000e787808) at
> > pipex_l2tp_output+0x21d
> > pipex_ppp_output(fffffd801f8c8800,ffff80000e787808,21) at
> > pipex_ppp_output+0xda
> > pppx_if_start(ffff80000e787268) at pppx_if_start+0x83
> > if_qstart_compat(ffff80000e7874e0) at if_qstart_compat+0x2e
> > ifq_serialize(ffff80000e7874e0,ffff80000e7875b0) at ifq_serialize+0x103
> > taskq_thread(ffffffff81f3df30) at taskq_thread+0x4d
> > end trace frame: 0x0, count: 250
> > End of stack trace.
> > splassert: ipsp_spd_lookup: want 2 have 0
> > Starting stack trace...
> > ipsp_spd_lookup(fffffd801f8c8800,2,14,ffff80000e726c9c,2,0) at
> > ipsp_spd_lookup+0x80
> > ip_output_ipsec_lookup(fffffd801f8c8800,14,ffff80000e726c9c,0,1) at
> > ip_output_ipsec_lookup+0x4d
> > ip_output(fffffd801f8c8800,0,0,0,0,0) at ip_output+0x4fa
> > pipex_l2tp_output(fffffd801f8c8800,ffff80000e787808) at
> > pipex_l2tp_output+0x21d
> > pipex_ppp_output(fffffd801f8c8800,ffff80000e787808,21) at
> > pipex_ppp_output+0xda
> > pppx_if_start(ffff80000e787268) at pppx_if_start+0x83
> > if_qstart_compat(ffff80000e7874e0) at if_qstart_compat+0x2e
> > ifq_serialize(ffff80000e7874e0,ffff80000e7875b0) at ifq_serialize+0x103
> > taskq_thread(ffffffff81f3df30) at taskq_thread+0x4d
> > end trace frame: 0x0, count: 248
> > End of stack trace.
> > splassert: spd_table_get: want 2 have 0
> >
> > ---- dmesg end ----
> >
> > 1. `pxi_if' owned by struct pppx_if has IFXF_MPSAFE flag unset
> >   1.1 pppx(4) sets IFQ_SET_MAXLEN(&ifp->if_snd, 1) at net/if_pppx.c:866
> > 2. pppx_if_output() is called under NET_LOCK()
> > 3. pppx_if_output() calls if_enqueue() at net/if_pppx.c:1123
> > 4. pppx(4) doesn't set `ifp->if_enqueue' so if_enqueue() calls
> >     if_enqueue_ifq() at net/if.c:709 (which is set in net/if.c:639)
> > 5. if_enqueue_ifq() calls ifq_start() at net/if.c:734
> > 6. ifq_start() we a still under NET_LOCK() here
> >
> > 6.a. in net/ifq.c rev 1.37 ifq_start() checks "ifq_len(ifq) >=
> > min(ifp->if_txmit, ifq->ifq_maxlen)" and this was always true because
> > (1.1) so we always call ifq_run_start() which calls ifq_serialize().
> >
> > ifq_serialize() will call if_qstart_compat() which calls
> > pppx_if_start() which calls pipex_ppp_output() etc while we still
> > holding NET_LOCK() so the assertions I reported above are not raised.
> >
> > 6.b. net/ifq.c rev 1.38 introduce checks of IFXF_MPSAFE flag. so we are
> > always going to net/ifq.c:132 where we adding out task to `systq'
> > referenced by `ifq_softnet' (ifq_softnet set to `systq' at
> > net/ifq.c:199).
> >
> > taskq_thread() doesn't grab NET_LOCK() so after net/ifq.c rev 1.38
> > ifq_serialize() and underlaying pppx_if_start() call without NET_LOCK()
> > and corresponding asserts raised.
> >
> > The problem I pointed is not in net/ifq.c rev 1.38 but in pppx(4).
> > `if_start' routines should grab NET_LOCK() by themself if it is required
> > but pppx_if_start() and pppac_start() did't do that. pppac_start() has
> > no underlaying NET_ASSERT_LOCKED() so the pppx(4) is the only case were
> > the problem is shown.
> >
> > Since NET_LOCK() is required by pipex(4), diff below adds it to
> > pppx_if_start() and pppac_start().
> >
> > After net/ifq.c rev 1.38 pppx_if_start() will newer be called from
> > pppx_if_output() but from `systq' only so I don't add lock/unlock
> > dances around if_enqueue() at net/if_pppx.c:1123.
> >
> > Diff tested for both pppx(4) and pppac(4) cases.
>
> thanks for the detailed analysis. i wondered how the ifq change
> triggered this exactly, and your mail makes it clear.
>
> however, pppx/pppac/pipex are not the first or only drivers in the tree
> that encapsulate a packet in IP from their if_start routine and send it
> out with the network stack. the way this has been solved in every other
> driver has been to call ip{,6}_send to transmit the packet instead
> of ip{,6}_output.
>
> can you try the following diff?
>

I tested this diff and it works for me. But the problem I pointed is
about pipex(4) locking.

pipex(4) requires NET_LOCK() be grabbed not only for underlaying
ip{,6}_output() but for itself too. But since pppac_start() has
unpredictable behavior I suggested to make it predictable [1].

With net/ifq.c rev 1.37 and rev 1.39 pppx_if_start() routine will be
allways called under NET_LOCK(), but pppac_start() will not. It depends
on packet count stored in `if_snd' (look at net/ifq.c:125).

With net/ifq.c rev 1.38 pppx_if_start() and pppac_start() have fully
identical behavior and we can simply add NET_LOCK() [2] required by
pipex(4).

Also I can write my own handler for `ifp->if_enqueue' to be sure about
`ifp->if_start' locking state, but I still have misunderstanding why
pppx(4) does "IFQ_SET_MAXLEN(&ifp->if_snd, 1)". Just to make locking
state of pppx_if_start() predictable or for avoid hidden bug triggering?

Also this time I'am not shure about KERNEL_LOCK() locking while pipex(4)
is accessed from the stack so I can't say is NET_LOCK() really required
for the whole pipex(4).

1. https://marc.info/?l=openbsd-tech&m=159006455720473&w=2
2. https://marc.info/?l=openbsd-tech&m=158998597126002&w=2

Reply | Threaded
Open this post in threaded view
|

Re: fix pppx(4) with net/ifq.c rev 1.38

Martin Pieuchot
On 22/05/20(Fri) 13:25, Vitaliy Makkoveev wrote:

> On Fri, May 22, 2020 at 07:57:13AM +1000, David Gwynne wrote:
> > [...]
> > can you try the following diff?
> >
>
> I tested this diff and it works for me. But the problem I pointed is
> about pipex(4) locking.
>
> pipex(4) requires NET_LOCK() be grabbed not only for underlaying
> ip{,6}_output() but for itself too. But since pppac_start() has
> unpredictable behavior I suggested to make it predictable [1].

What needs the NET_LOCK() in their?  We're talking about
pipex_ppp_output(), right?  Does it really need the NET_LOCK() or
the KERNEL_LOCK() is what protects those data structures?

> With net/ifq.c rev 1.37 and rev 1.39 pppx_if_start() routine will be
> allways called under NET_LOCK(), but pppac_start() will not. It depends
> on packet count stored in `if_snd' (look at net/ifq.c:125).

Ideally the *_start() routine should not require the NET_LOCK().
Ideally the pseudo-drivers should not require the NET_LOCK().  That's
what we should aim for.

In case of pipex(4) is isn't clear that the NET_LOCK() is necessary.

> With net/ifq.c rev 1.38 pppx_if_start() and pppac_start() have fully
> identical behavior and we can simply add NET_LOCK() [2] required by
> pipex(4).
>
> Also I can write my own handler for `ifp->if_enqueue' to be sure about
> `ifp->if_start' locking state, but I still have misunderstanding why
> pppx(4) does "IFQ_SET_MAXLEN(&ifp->if_snd, 1)". Just to make locking
> state of pppx_if_start() predictable or for avoid hidden bug triggering?
>
> Also this time I'am not shure about KERNEL_LOCK() locking while pipex(4)
> is accessed from the stack so I can't say is NET_LOCK() really required
> for the whole pipex(4).
>
> 1. https://marc.info/?l=openbsd-tech&m=159006455720473&w=2
> 2. https://marc.info/?l=openbsd-tech&m=158998597126002&w=2
>

Reply | Threaded
Open this post in threaded view
|

Re: fix pppx(4) with net/ifq.c rev 1.38

Vitaliy Makkoveev

> On 23 May 2020, at 12:54, Martin Pieuchot <[hidden email]> wrote:
>
> On 22/05/20(Fri) 13:25, Vitaliy Makkoveev wrote:
>> On Fri, May 22, 2020 at 07:57:13AM +1000, David Gwynne wrote:
>>> [...]
>>> can you try the following diff?
>>>
>>
>> I tested this diff and it works for me. But the problem I pointed is
>> about pipex(4) locking.
>>
>> pipex(4) requires NET_LOCK() be grabbed not only for underlaying
>> ip{,6}_output() but for itself too. But since pppac_start() has
>> unpredictable behavior I suggested to make it predictable [1].
>
> What needs the NET_LOCK() in their?  We're talking about
> pipex_ppp_output(), right?  Does it really need the NET_LOCK() or
> the KERNEL_LOCK() is what protects those data structures?

Yes, about pipex_ppp_output() and pipex_output(). Except
ip{,6}_output() nothing requires NET_LOCK(). As David Gwynne pointed,
they can be replaced by ip{,6}_send().

>
>> With net/ifq.c rev 1.37 and rev 1.39 pppx_if_start() routine will be
>> allways called under NET_LOCK(), but pppac_start() will not. It depends
>> on packet count stored in `if_snd' (look at net/ifq.c:125).
>
> Ideally the *_start() routine should not require the NET_LOCK().
> Ideally the pseudo-drivers should not require the NET_LOCK().  That's
> what we should aim for.

NET_LOCK() is not required. It’s locked while corresponding start routines
were called directly from pppx_if_output() and pppac_output(). In case of
pppac_start() you can't predict NET_LOCK() status.


>
> In case of pipex(4) is isn't clear that the NET_LOCK() is necessary.

I guess, pipex(4) was wrapped by NET_LOCK() to protect it while it’s
accessed through `pr_input’. Is NET_LOCK() required for this case?

Reply | Threaded
Open this post in threaded view
|

Re: fix pppx(4) with net/ifq.c rev 1.38

Martin Pieuchot
On 23/05/20(Sat) 15:38, Vitaliy Makkoveev wrote:

> > On 23 May 2020, at 12:54, Martin Pieuchot <[hidden email]> wrote:
> > On 22/05/20(Fri) 13:25, Vitaliy Makkoveev wrote:
> >> On Fri, May 22, 2020 at 07:57:13AM +1000, David Gwynne wrote:
> >>> [...]
> >>> can you try the following diff?
> >>>
> >>
> >> I tested this diff and it works for me. But the problem I pointed is
> >> about pipex(4) locking.
> >>
> >> pipex(4) requires NET_LOCK() be grabbed not only for underlaying
> >> ip{,6}_output() but for itself too. But since pppac_start() has
> >> unpredictable behavior I suggested to make it predictable [1].
> >
> > What needs the NET_LOCK() in their?  We're talking about
> > pipex_ppp_output(), right?  Does it really need the NET_LOCK() or
> > the KERNEL_LOCK() is what protects those data structures?
>
> Yes, about pipex_ppp_output() and pipex_output(). Except
> ip{,6}_output() nothing requires NET_LOCK(). As David Gwynne pointed,
> they can be replaced by ip{,6}_send().

Locks protect data structures, you're talking about functions, which
data structures are serialized by this lock?  I'm questioning whether
there is one.

> [...]
> > In case of pipex(4) is isn't clear that the NET_LOCK() is necessary.
>
> I guess, pipex(4) was wrapped by NET_LOCK() to protect it while it’s
> accessed through `pr_input’. Is NET_LOCK() required for this case?

pipex(4) like all the network stack has been wrapped in the NET_LOCK()
because it was easy to do.  That means it isn't a concious decision or
design.  The fact that pipex(4) code runs under the NET_LOCK() is a side
effect of how the rest of the stack evolved.  I'm questioning whether
this lock is required there.  In theory it shouldn't.  What is the
reality?