pipex(4): use reference counters for `ifnet'

classic Classic list List threaded Threaded
17 messages Options
Reply | Threaded
Open this post in threaded view
|

pipex(4): use reference counters for `ifnet'

Vitaliy Makkoveev
While `mbuf' enqueued to `pipexinq' or `pipexoutq' it has the reference
to corresponding pipex(4) session as `ph_cookie'. `ph_cookie' is
accessed by pipexintr() and it's always defferent context from context
where we destroy session. `ph_cookie' is protected only while we destroy
session by pipex_timer() but this protection is not effective. While we
destroy session related to pppx(4) `ph_cookie' is not potected. While we
destroy session related to pppac(4) by `PIPEXSMODE' ioctl() or by
closing pppac(4) device node `ph_cookie' is also not protected.

I'am going to use reference counters to protect `ph_cookie' but some
additional steps required to be done.

We have `pipex_iface' which holds the reference to external `ifnet'. The
pipex(4) session also has reference to this `ifnet'. With reference
counters pipex(4) session can still be in `pipexinq' or `pipexoutq' but
corresponding `pppx_if' or `pppac_softc' is already destroyed. I want to
be shure while we do if_detach() no one has reference to this `ifnet'.

Diff below grabs reference to `ifnet' each time it passed to pipex(4).
Also while we release this session we release the reference.

We attach `ifnet' before we have linked sessions and we detach `ifnet'
after we release all sessions. Now it's safe.

if_ref() was declared in sys/if.c so I moved if_ref() declaration to
sys/if.h.

Index: sys/net/if.c
===================================================================
RCS file: /cvs/src/sys/net/if.c,v
retrieving revision 1.610
diff -u -p -r1.610 if.c
--- sys/net/if.c 22 Jun 2020 09:45:13 -0000 1.610
+++ sys/net/if.c 24 Jun 2020 13:43:33 -0000
@@ -192,7 +192,6 @@ void if_qstart_compat(struct ifqueue *);
 
 void if_ifp_dtor(void *, void *);
 void if_map_dtor(void *, void *);
-struct ifnet *if_ref(struct ifnet *);
 
 /*
  * struct if_map
Index: sys/net/if.h
===================================================================
RCS file: /cvs/src/sys/net/if.h,v
retrieving revision 1.203
diff -u -p -r1.203 if.h
--- sys/net/if.h 25 Jul 2019 15:23:39 -0000 1.203
+++ sys/net/if.h 24 Jun 2020 13:43:33 -0000
@@ -548,6 +548,7 @@ int if_delgroup(struct ifnet *, const ch
 void if_group_routechange(struct sockaddr *, struct sockaddr *);
 struct ifnet *ifunit(const char *);
 struct ifnet *if_get(unsigned int);
+struct ifnet *if_ref(struct ifnet *);
 void if_put(struct ifnet *);
 void ifnewlladdr(struct ifnet *);
 void if_congestion(void);
Index: sys/net/if_pppx.c
===================================================================
RCS file: /cvs/src/sys/net/if_pppx.c,v
retrieving revision 1.90
diff -u -p -r1.90 if_pppx.c
--- sys/net/if_pppx.c 24 Jun 2020 08:52:53 -0000 1.90
+++ sys/net/if_pppx.c 24 Jun 2020 13:43:33 -0000
@@ -714,15 +714,15 @@ pppx_add_session(struct pppx_dev *pxd, s
  ifp->if_softc = pxi;
  /* ifp->if_rdomain = req->pr_rdomain; */
 
- error = pipex_link_session(session, &pxi->pxi_ifcontext);
- if (error)
- goto remove;
-
  /* XXXSMP breaks atomicity */
  NET_UNLOCK();
  if_attach(ifp);
  NET_LOCK();
 
+ error = pipex_link_session(session, &pxi->pxi_ifcontext);
+ if (error)
+ goto detach;
+
  if_addgroup(ifp, "pppx");
  if_alloc_sadl(ifp);
 
@@ -771,7 +771,12 @@ pppx_add_session(struct pppx_dev *pxd, s
 
  return (error);
 
-remove:
+detach:
+ /* XXXSMP breaks atomicity */
+ NET_UNLOCK();
+ if_detach(ifp);
+ NET_LOCK();
+
  if (RBT_REMOVE(pppx_ifs, &pppx_ifs, pxi) == NULL)
  panic("%s: inconsistent RB tree", __func__);
  LIST_REMOVE(pxi, pxi_list);
@@ -860,13 +865,13 @@ pppx_if_destroy(struct pppx_dev *pxd, st
  CLR(ifp->if_flags, IFF_RUNNING);
 
  pipex_unlink_session(session);
+ pipex_rele_session(session);
 
  /* XXXSMP breaks atomicity */
  NET_UNLOCK();
  if_detach(ifp);
  NET_LOCK();
 
- pipex_rele_session(session);
  if (RBT_REMOVE(pppx_ifs, &pppx_ifs, pxi) == NULL)
  panic("%s: inconsistent RB tree", __func__);
  LIST_REMOVE(pxi, pxi_list);
Index: sys/net/pipex.c
===================================================================
RCS file: /cvs/src/sys/net/pipex.c,v
retrieving revision 1.116
diff -u -p -r1.116 pipex.c
--- sys/net/pipex.c 22 Jun 2020 09:38:15 -0000 1.116
+++ sys/net/pipex.c 24 Jun 2020 13:43:34 -0000
@@ -147,6 +147,7 @@ pipex_iface_init(struct pipex_iface_cont
 {
  struct pipex_session *session;
 
+ if_ref(ifp);
  pipex_iface->pipexmode = 0;
  pipex_iface->ifnet_this = ifp;
 
@@ -164,6 +165,7 @@ pipex_iface_init(struct pipex_iface_cont
  /* virtual pipex_session entry for multicast */
  session = pool_get(&pipex_session_pool, PR_WAITOK | PR_ZERO);
  session->is_multicast = 1;
+ if_ref(pipex_iface->ifnet_this);
  session->pipex_iface = pipex_iface;
  pipex_iface->multicast_session = session;
 }
@@ -194,10 +196,11 @@ pipex_iface_stop(struct pipex_iface_cont
 void
 pipex_iface_fini(struct pipex_iface_context *pipex_iface)
 {
- pool_put(&pipex_session_pool, pipex_iface->multicast_session);
  NET_LOCK();
  pipex_iface_stop(pipex_iface);
  NET_UNLOCK();
+ pipex_rele_session(pipex_iface->multicast_session);
+ if_put(pipex_iface->ifnet_this);
 }
 
 int
@@ -421,6 +424,9 @@ pipex_init_session(struct pipex_session
 void
 pipex_rele_session(struct pipex_session *session)
 {
+ if (session->pipex_iface) {
+ if_put(session->pipex_iface->ifnet_this);
+ }
  if (session->mppe_recv.old_session_keys)
  pool_put(&mppe_key_pool, session->mppe_recv.old_session_keys);
  pool_put(&pipex_session_pool, session);
@@ -438,6 +444,7 @@ pipex_link_session(struct pipex_session
     session->session_id))
  return (EEXIST);
 
+ if_ref(iface->ifnet_this);
  session->pipex_iface = iface;
 
  LIST_INSERT_HEAD(&pipex_session_list, session, session_list);
@@ -655,7 +662,7 @@ pipex_destroy_session(struct pipex_sessi
  }
 
  pipex_unlink_session(session);
- pool_put(&pipex_session_pool, session);
+ pipex_rele_session(session);
 
  return (0);
 }

Reply | Threaded
Open this post in threaded view
|

Re: pipex(4): use reference counters for `ifnet'

Martin Pieuchot
On 24/06/20(Wed) 17:10, Vitaliy Makkoveev wrote:

> While `mbuf' enqueued to `pipexinq' or `pipexoutq' it has the reference
> to corresponding pipex(4) session as `ph_cookie'. `ph_cookie' is
> accessed by pipexintr() and it's always defferent context from context
> where we destroy session. `ph_cookie' is protected only while we destroy
> session by pipex_timer() but this protection is not effective. While we
> destroy session related to pppx(4) `ph_cookie' is not potected. While we
> destroy session related to pppac(4) by `PIPEXSMODE' ioctl() or by
> closing pppac(4) device node `ph_cookie' is also not protected.
>
> I'am going to use reference counters to protect `ph_cookie' but some
> additional steps required to be done.

Please no.  Store an ifidx in session instead of a pointer.  Such
index are guaranteed to be unique and can be used with if_get(9).

We deliberately kept if_ref() private to keep the code base coherent.

> We have `pipex_iface' which holds the reference to external `ifnet'. The
> pipex(4) session also has reference to this `ifnet'. With reference
> counters pipex(4) session can still be in `pipexinq' or `pipexoutq' but
> corresponding `pppx_if' or `pppac_softc' is already destroyed. I want to
> be shure while we do if_detach() no one has reference to this `ifnet'.
>
> Diff below grabs reference to `ifnet' each time it passed to pipex(4).
> Also while we release this session we release the reference.
>
> We attach `ifnet' before we have linked sessions and we detach `ifnet'
> after we release all sessions. Now it's safe.
>
> if_ref() was declared in sys/if.c so I moved if_ref() declaration to
> sys/if.h.
>
> Index: sys/net/if.c
> ===================================================================
> RCS file: /cvs/src/sys/net/if.c,v
> retrieving revision 1.610
> diff -u -p -r1.610 if.c
> --- sys/net/if.c 22 Jun 2020 09:45:13 -0000 1.610
> +++ sys/net/if.c 24 Jun 2020 13:43:33 -0000
> @@ -192,7 +192,6 @@ void if_qstart_compat(struct ifqueue *);
>  
>  void if_ifp_dtor(void *, void *);
>  void if_map_dtor(void *, void *);
> -struct ifnet *if_ref(struct ifnet *);
>  
>  /*
>   * struct if_map
> Index: sys/net/if.h
> ===================================================================
> RCS file: /cvs/src/sys/net/if.h,v
> retrieving revision 1.203
> diff -u -p -r1.203 if.h
> --- sys/net/if.h 25 Jul 2019 15:23:39 -0000 1.203
> +++ sys/net/if.h 24 Jun 2020 13:43:33 -0000
> @@ -548,6 +548,7 @@ int if_delgroup(struct ifnet *, const ch
>  void if_group_routechange(struct sockaddr *, struct sockaddr *);
>  struct ifnet *ifunit(const char *);
>  struct ifnet *if_get(unsigned int);
> +struct ifnet *if_ref(struct ifnet *);
>  void if_put(struct ifnet *);
>  void ifnewlladdr(struct ifnet *);
>  void if_congestion(void);
> Index: sys/net/if_pppx.c
> ===================================================================
> RCS file: /cvs/src/sys/net/if_pppx.c,v
> retrieving revision 1.90
> diff -u -p -r1.90 if_pppx.c
> --- sys/net/if_pppx.c 24 Jun 2020 08:52:53 -0000 1.90
> +++ sys/net/if_pppx.c 24 Jun 2020 13:43:33 -0000
> @@ -714,15 +714,15 @@ pppx_add_session(struct pppx_dev *pxd, s
>   ifp->if_softc = pxi;
>   /* ifp->if_rdomain = req->pr_rdomain; */
>  
> - error = pipex_link_session(session, &pxi->pxi_ifcontext);
> - if (error)
> - goto remove;
> -
>   /* XXXSMP breaks atomicity */
>   NET_UNLOCK();
>   if_attach(ifp);
>   NET_LOCK();
>  
> + error = pipex_link_session(session, &pxi->pxi_ifcontext);
> + if (error)
> + goto detach;
> +
>   if_addgroup(ifp, "pppx");
>   if_alloc_sadl(ifp);
>  
> @@ -771,7 +771,12 @@ pppx_add_session(struct pppx_dev *pxd, s
>  
>   return (error);
>  
> -remove:
> +detach:
> + /* XXXSMP breaks atomicity */
> + NET_UNLOCK();
> + if_detach(ifp);
> + NET_LOCK();
> +
>   if (RBT_REMOVE(pppx_ifs, &pppx_ifs, pxi) == NULL)
>   panic("%s: inconsistent RB tree", __func__);
>   LIST_REMOVE(pxi, pxi_list);
> @@ -860,13 +865,13 @@ pppx_if_destroy(struct pppx_dev *pxd, st
>   CLR(ifp->if_flags, IFF_RUNNING);
>  
>   pipex_unlink_session(session);
> + pipex_rele_session(session);
>  
>   /* XXXSMP breaks atomicity */
>   NET_UNLOCK();
>   if_detach(ifp);
>   NET_LOCK();
>  
> - pipex_rele_session(session);
>   if (RBT_REMOVE(pppx_ifs, &pppx_ifs, pxi) == NULL)
>   panic("%s: inconsistent RB tree", __func__);
>   LIST_REMOVE(pxi, pxi_list);
> Index: sys/net/pipex.c
> ===================================================================
> RCS file: /cvs/src/sys/net/pipex.c,v
> retrieving revision 1.116
> diff -u -p -r1.116 pipex.c
> --- sys/net/pipex.c 22 Jun 2020 09:38:15 -0000 1.116
> +++ sys/net/pipex.c 24 Jun 2020 13:43:34 -0000
> @@ -147,6 +147,7 @@ pipex_iface_init(struct pipex_iface_cont
>  {
>   struct pipex_session *session;
>  
> + if_ref(ifp);
>   pipex_iface->pipexmode = 0;
>   pipex_iface->ifnet_this = ifp;
>  
> @@ -164,6 +165,7 @@ pipex_iface_init(struct pipex_iface_cont
>   /* virtual pipex_session entry for multicast */
>   session = pool_get(&pipex_session_pool, PR_WAITOK | PR_ZERO);
>   session->is_multicast = 1;
> + if_ref(pipex_iface->ifnet_this);
>   session->pipex_iface = pipex_iface;
>   pipex_iface->multicast_session = session;
>  }
> @@ -194,10 +196,11 @@ pipex_iface_stop(struct pipex_iface_cont
>  void
>  pipex_iface_fini(struct pipex_iface_context *pipex_iface)
>  {
> - pool_put(&pipex_session_pool, pipex_iface->multicast_session);
>   NET_LOCK();
>   pipex_iface_stop(pipex_iface);
>   NET_UNLOCK();
> + pipex_rele_session(pipex_iface->multicast_session);
> + if_put(pipex_iface->ifnet_this);
>  }
>  
>  int
> @@ -421,6 +424,9 @@ pipex_init_session(struct pipex_session
>  void
>  pipex_rele_session(struct pipex_session *session)
>  {
> + if (session->pipex_iface) {
> + if_put(session->pipex_iface->ifnet_this);
> + }
>   if (session->mppe_recv.old_session_keys)
>   pool_put(&mppe_key_pool, session->mppe_recv.old_session_keys);
>   pool_put(&pipex_session_pool, session);
> @@ -438,6 +444,7 @@ pipex_link_session(struct pipex_session
>      session->session_id))
>   return (EEXIST);
>  
> + if_ref(iface->ifnet_this);
>   session->pipex_iface = iface;
>  
>   LIST_INSERT_HEAD(&pipex_session_list, session, session_list);
> @@ -655,7 +662,7 @@ pipex_destroy_session(struct pipex_sessi
>   }
>  
>   pipex_unlink_session(session);
> - pool_put(&pipex_session_pool, session);
> + pipex_rele_session(session);
>  
>   return (0);
>  }
>

Reply | Threaded
Open this post in threaded view
|

Re: pipex(4): use reference counters for `ifnet'

Vitaliy Makkoveev


> On 25 Jun 2020, at 11:55, Martin Pieuchot <[hidden email]> wrote:
>
> On 24/06/20(Wed) 17:10, Vitaliy Makkoveev wrote:
>> While `mbuf' enqueued to `pipexinq' or `pipexoutq' it has the reference
>> to corresponding pipex(4) session as `ph_cookie'. `ph_cookie' is
>> accessed by pipexintr() and it's always defferent context from context
>> where we destroy session. `ph_cookie' is protected only while we destroy
>> session by pipex_timer() but this protection is not effective. While we
>> destroy session related to pppx(4) `ph_cookie' is not potected. While we
>> destroy session related to pppac(4) by `PIPEXSMODE' ioctl() or by
>> closing pppac(4) device node `ph_cookie' is also not protected.
>>
>> I'am going to use reference counters to protect `ph_cookie' but some
>> additional steps required to be done.
>
> Please no.  Store an ifidx in session instead of a pointer.  Such
> index are guaranteed to be unique and can be used with if_get(9).

This means I should do if_get(9) before each `ifnet’ usage. Also I
should do checks and introduce error path. It's ugly.

Alternatively, I can pass `ifnet->if_index’ instead of `ifnet’ to
sessinon being linked and store obtained reference up to session
destruction. I should add one error path to pipex_link_session().
It’s ugly but a little bit.

>
> We deliberately kept if_ref() private to keep the code base coherent.

Could you explain please why if_ref() and if_get() are incoherent?
Reply | Threaded
Open this post in threaded view
|

Re: pipex(4): use reference counters for `ifnet'

Christiano F. Haesbaert
On Thu, 25 Jun 2020 at 14:06, Vitaliy Makkoveev
<[hidden email]> wrote:

>
>
>
> > On 25 Jun 2020, at 11:55, Martin Pieuchot <[hidden email]> wrote:
> >
> > On 24/06/20(Wed) 17:10, Vitaliy Makkoveev wrote:
> >> While `mbuf' enqueued to `pipexinq' or `pipexoutq' it has the reference
> >> to corresponding pipex(4) session as `ph_cookie'. `ph_cookie' is
> >> accessed by pipexintr() and it's always defferent context from context
> >> where we destroy session. `ph_cookie' is protected only while we destroy
> >> session by pipex_timer() but this protection is not effective. While we
> >> destroy session related to pppx(4) `ph_cookie' is not potected. While we
> >> destroy session related to pppac(4) by `PIPEXSMODE' ioctl() or by
> >> closing pppac(4) device node `ph_cookie' is also not protected.
> >>
> >> I'am going to use reference counters to protect `ph_cookie' but some
> >> additional steps required to be done.
> >
> > Please no.  Store an ifidx in session instead of a pointer.  Such
> > index are guaranteed to be unique and can be used with if_get(9).
>
> This means I should do if_get(9) before each `ifnet’ usage. Also I
> should do checks and introduce error path. It's ugly.
>

My 2 cents,

Not really, this is actually easier to reason with, it produces less
dependencies between data structures and the protocol is kinda clear,
you have to account for the loss of atomicity.
Reference counting is pretty much the worst thing you can do in MP development.
Very often it is actually desirable to pay for the cost of a lookup
(which is not the case here, since it's a silly O(1)) than to pay the
human/debug cost of interlinking data structures.
Also, in an ideal world if_get() would be RCUed (or SMRed) and would
not require bumping a reference count which implies an atomic
operation, it would also remove the need for if_put() since the epoch
is "until we context switch", as the kernel is not preemptive this is
easy to do without introducing new bugs.


> Alternatively, I can pass `ifnet->if_index’ instead of `ifnet’ to
> sessinon being linked and store obtained reference up to session
> destruction. I should add one error path to pipex_link_session().
> It’s ugly but a little bit.
>
> >
> > We deliberately kept if_ref() private to keep the code base coherent.
>
> Could you explain please why if_ref() and if_get() are incoherent?

Reply | Threaded
Open this post in threaded view
|

Re: pipex(4): use reference counters for `ifnet'

Vitaliy Makkoveev
> On 25 Jun 2020, at 16:35, Christiano F. Haesbaert <[hidden email]> wrote:
>
> On Thu, 25 Jun 2020 at 14:06, Vitaliy Makkoveev
> <[hidden email]> wrote:
>>
>>
>>
>>> On 25 Jun 2020, at 11:55, Martin Pieuchot <[hidden email]> wrote:
>>>
>>> On 24/06/20(Wed) 17:10, Vitaliy Makkoveev wrote:
>>>> While `mbuf' enqueued to `pipexinq' or `pipexoutq' it has the reference
>>>> to corresponding pipex(4) session as `ph_cookie'. `ph_cookie' is
>>>> accessed by pipexintr() and it's always defferent context from context
>>>> where we destroy session. `ph_cookie' is protected only while we destroy
>>>> session by pipex_timer() but this protection is not effective. While we
>>>> destroy session related to pppx(4) `ph_cookie' is not potected. While we
>>>> destroy session related to pppac(4) by `PIPEXSMODE' ioctl() or by
>>>> closing pppac(4) device node `ph_cookie' is also not protected.
>>>>
>>>> I'am going to use reference counters to protect `ph_cookie' but some
>>>> additional steps required to be done.
>>>
>>> Please no.  Store an ifidx in session instead of a pointer.  Such
>>> index are guaranteed to be unique and can be used with if_get(9).
>>
>> This means I should do if_get(9) before each `ifnet’ usage. Also I
>> should do checks and introduce error path. It's ugly.
>>
>
> My 2 cents,
>
> Not really, this is actually easier to reason with, it produces less
> dependencies between data structures and the protocol is kinda clear,
> you have to account for the loss of atomicity.
> Reference counting is pretty much the worst thing you can do in MP development.
> Very often it is actually desirable to pay for the cost of a lookup
> (which is not the case here, since it's a silly O(1)) than to pay the
> human/debug cost of interlinking data structures.
> Also, in an ideal world if_get() would be RCUed (or SMRed) and would
> not require bumping a reference count which implies an atomic
> operation, it would also remove the need for if_put() since the epoch
> is "until we context switch", as the kernel is not preemptive this is
> easy to do without introducing new bugs.
>

But I can’t store interface index with session instead of pointer.
Session can be alive after corresponding `ifnet’ was destroyed and
replaced by new `ifnet’.

>
>> Alternatively, I can pass `ifnet->if_index’ instead of `ifnet’ to
>> sessinon being linked and store obtained reference up to session
>> destruction. I should add one error path to pipex_link_session().
>> It’s ugly but a little bit.
>>
>>>
>>> We deliberately kept if_ref() private to keep the code base coherent.
>>
>> Could you explain please why if_ref() and if_get() are incoherent?

Reply | Threaded
Open this post in threaded view
|

Re: pipex(4): use reference counters for `ifnet'

Vitaliy Makkoveev
Updated diff.

OpenBSD uses 16 bit counter for allocate interface indexes. So we can't
store index in session and be sure if_get(9) returned `ifnet' is our
original `ifnet'.

Now each pipex(4) session has it's own reference to `ifnet'. Also pppoe
related sessions has the reference to it's ethernet interface.

All `ifnet's are obtained by if_get(9) and we use `if_index' from stored
reference to `ifnet'.

Index: net/if_pppx.c
===================================================================
RCS file: /cvs/src/sys/net/if_pppx.c,v
retrieving revision 1.90
diff -u -p -r1.90 if_pppx.c
--- net/if_pppx.c 24 Jun 2020 08:52:53 -0000 1.90
+++ net/if_pppx.c 25 Jun 2020 16:41:25 -0000
@@ -714,15 +714,15 @@ pppx_add_session(struct pppx_dev *pxd, s
  ifp->if_softc = pxi;
  /* ifp->if_rdomain = req->pr_rdomain; */
 
- error = pipex_link_session(session, &pxi->pxi_ifcontext);
- if (error)
- goto remove;
-
  /* XXXSMP breaks atomicity */
  NET_UNLOCK();
  if_attach(ifp);
  NET_LOCK();
 
+ error = pipex_link_session(session, &pxi->pxi_ifcontext);
+ if (error)
+ goto detach;
+
  if_addgroup(ifp, "pppx");
  if_alloc_sadl(ifp);
 
@@ -771,7 +771,12 @@ pppx_add_session(struct pppx_dev *pxd, s
 
  return (error);
 
-remove:
+detach:
+ /* XXXSMP breaks atomicity */
+ NET_UNLOCK();
+ if_detach(ifp);
+ NET_LOCK();
+
  if (RBT_REMOVE(pppx_ifs, &pppx_ifs, pxi) == NULL)
  panic("%s: inconsistent RB tree", __func__);
  LIST_REMOVE(pxi, pxi_list);
@@ -860,13 +865,13 @@ pppx_if_destroy(struct pppx_dev *pxd, st
  CLR(ifp->if_flags, IFF_RUNNING);
 
  pipex_unlink_session(session);
+ pipex_rele_session(session);
 
  /* XXXSMP breaks atomicity */
  NET_UNLOCK();
  if_detach(ifp);
  NET_LOCK();
 
- pipex_rele_session(session);
  if (RBT_REMOVE(pppx_ifs, &pppx_ifs, pxi) == NULL)
  panic("%s: inconsistent RB tree", __func__);
  LIST_REMOVE(pxi, pxi_list);
Index: net/pipex.c
===================================================================
RCS file: /cvs/src/sys/net/pipex.c,v
retrieving revision 1.116
diff -u -p -r1.116 pipex.c
--- net/pipex.c 22 Jun 2020 09:38:15 -0000 1.116
+++ net/pipex.c 25 Jun 2020 16:41:25 -0000
@@ -148,7 +148,7 @@ pipex_iface_init(struct pipex_iface_cont
  struct pipex_session *session;
 
  pipex_iface->pipexmode = 0;
- pipex_iface->ifnet_this = ifp;
+ pipex_iface->ifnet_this = if_get(ifp->if_index);
 
  if (pipex_rd_head4 == NULL) {
  if (!rn_inithead((void **)&pipex_rd_head4,
@@ -165,6 +165,7 @@ pipex_iface_init(struct pipex_iface_cont
  session = pool_get(&pipex_session_pool, PR_WAITOK | PR_ZERO);
  session->is_multicast = 1;
  session->pipex_iface = pipex_iface;
+ session->ifnet_this = if_get(ifp->if_index);
  pipex_iface->multicast_session = session;
 }
 
@@ -194,10 +195,11 @@ pipex_iface_stop(struct pipex_iface_cont
 void
 pipex_iface_fini(struct pipex_iface_context *pipex_iface)
 {
- pool_put(&pipex_session_pool, pipex_iface->multicast_session);
  NET_LOCK();
  pipex_iface_stop(pipex_iface);
  NET_UNLOCK();
+ pipex_rele_session(pipex_iface->multicast_session);
+ if_put(pipex_iface->ifnet_this);
 }
 
 int
@@ -358,7 +360,7 @@ pipex_init_session(struct pipex_session
     MIN(req->pr_local_address.ss_len, sizeof(session->local)));
 #ifdef PIPEX_PPPOE
  if (req->pr_protocol == PIPEX_PROTO_PPPOE)
- session->proto.pppoe.over_ifidx = over_ifp->if_index;
+ session->proto.pppoe.over_iface = if_get(over_ifp->if_index);
 #endif
 #ifdef PIPEX_PPTP
  if (req->pr_protocol == PIPEX_PROTO_PPTP) {
@@ -421,6 +423,13 @@ pipex_init_session(struct pipex_session
 void
 pipex_rele_session(struct pipex_session *session)
 {
+#ifdef PIPEX_PPPOE
+ if (session->protocol == PIPEX_PROTO_PPPOE)
+ if_put(session->proto.pppoe.over_iface);
+#endif
+ if (session->ifnet_this) {
+ if_put(session->ifnet_this);
+ }
  if (session->mppe_recv.old_session_keys)
  pool_put(&mppe_key_pool, session->mppe_recv.old_session_keys);
  pool_put(&pipex_session_pool, session);
@@ -439,6 +448,7 @@ pipex_link_session(struct pipex_session
  return (EEXIST);
 
  session->pipex_iface = iface;
+ session->ifnet_this = if_get(iface->ifnet_this->if_index);
 
  LIST_INSERT_HEAD(&pipex_session_list, session, session_list);
  chain = PIPEX_ID_HASHTABLE(session->session_id);
@@ -655,7 +665,7 @@ pipex_destroy_session(struct pipex_sessi
  }
 
  pipex_unlink_session(session);
- pool_put(&pipex_session_pool, session);
+ pipex_rele_session(session);
 
  return (0);
 }
@@ -919,7 +929,7 @@ pipex_ip_output(struct mbuf *m0, struct
  struct ifnet *ifp;
 
  /* output succeed here as a interface */
- ifp = session->pipex_iface->ifnet_this;
+ ifp = session->ifnet_this;
  ifp->if_opackets++;
  ifp->if_obytes+=m0->m_pkthdr.len;
 
@@ -1038,7 +1048,7 @@ pipex_ppp_input(struct mbuf *m0, struct
 
 #if NBPFILTER > 0
     {
- struct ifnet *ifp = session->pipex_iface->ifnet_this;
+ struct ifnet *ifp = session->ifnet_this;
  if (ifp->if_bpf && ifp->if_type == IFT_PPP)
  bpf_mtap(ifp->if_bpf, m0, BPF_DIRECTION_IN);
     }
@@ -1105,7 +1115,7 @@ pipex_ip_input(struct mbuf *m0, struct p
  int is_idle;
 
  /* change recvif */
- ifp = session->pipex_iface->ifnet_this;
+ ifp = session->ifnet_this;
  m0->m_pkthdr.ph_ifidx = ifp->if_index;
 
  if (ISSET(session->ppp_flags, PIPEX_PPP_INGRESS_FILTER)) {
@@ -1174,7 +1184,7 @@ pipex_ip6_input(struct mbuf *m0, struct
  int len;
 
  /* change recvif */
- ifp = session->pipex_iface->ifnet_this;
+ ifp = session->ifnet_this;
  m0->m_pkthdr.ph_ifidx = ifp->if_index;
 
  /*
@@ -1346,7 +1356,8 @@ pipex_pppoe_lookup_session(struct mbuf *
  PIPEX_DBG((NULL, LOG_DEBUG, "<%s> session not found (id=%d)",
     __func__, pppoe.session_id));
 #endif
- if (session && session->proto.pppoe.over_ifidx != m0->m_pkthdr.ph_ifidx)
+ if (session && session->proto.pppoe.over_iface->if_index !=
+    m0->m_pkthdr.ph_ifidx)
  session = NULL;
 
  return (session);
@@ -1407,19 +1418,12 @@ pipex_pppoe_output(struct mbuf *m0, stru
  pppoe->session_id = htons(session->session_id);
  pppoe->length = htons(len);
 
- m0->m_pkthdr.ph_ifidx = session->proto.pppoe.over_ifidx;
+ ifp = session->proto.pppoe.over_iface;
+ m0->m_pkthdr.ph_ifidx = ifp->if_index;
  m0->m_flags &= ~(M_BCAST|M_MCAST);
-
- ifp = if_get(session->proto.pppoe.over_ifidx);
- if (ifp != NULL) {
- ifp->if_output(ifp, m0, &session->peer.sa, NULL);
- session->stat.opackets++;
- session->stat.obytes += len;
- } else {
- m_freem(m0);
- session->stat.oerrors++;
- }
- if_put(ifp);
+ ifp->if_output(ifp, m0, &session->peer.sa, NULL);
+ session->stat.opackets++;
+ session->stat.obytes += len;
 }
 #endif /* PIPEX_PPPOE */
 
Index: net/pipex_local.h
===================================================================
RCS file: /cvs/src/sys/net/pipex_local.h,v
retrieving revision 1.35
diff -u -p -r1.35 pipex_local.h
--- net/pipex_local.h 18 Jun 2020 14:20:12 -0000 1.35
+++ net/pipex_local.h 25 Jun 2020 16:41:25 -0000
@@ -78,7 +78,7 @@ struct pipex_mppe {
 
 #ifdef PIPEX_PPPOE
 struct pipex_pppoe_session {
- u_int over_ifidx;                    /* [I] ether interface */
+ struct ifnet *over_iface;               /* [I] ether interface */
 };
 #endif /* PIPEX_PPPOE */
 
@@ -183,6 +183,7 @@ struct pipex_session {
  int ip6_prefixlen;   /* [I] remote IPv6 prefixlen */
 
  struct pipex_iface_context* pipex_iface; /* [I] context for interface */
+ struct ifnet *ifnet_this; /* [k] outer interface */
 
  uint32_t ppp_flags; /* [I] configure flags */
 #ifdef PIPEX_MPPE

Reply | Threaded
Open this post in threaded view
|

Re: pipex(4): use reference counters for `ifnet'

Christiano F. Haesbaert
You can.

The idea is that ifindex is always monotonically increased, so to actually
get a new interface you would have to have "overflowed" 65k interfaces,
which is unreal.

So if your interface is gone, you can be sure if_get will give you NULL.

On Thu, Jun 25, 2020, 18:55 Vitaliy Makkoveev <[hidden email]>
wrote:

> Updated diff.
>
> OpenBSD uses 16 bit counter for allocate interface indexes. So we can't
> store index in session and be sure if_get(9) returned `ifnet' is our
> original `ifnet'.
>
> Now each pipex(4) session has it's own reference to `ifnet'. Also pppoe
> related sessions has the reference to it's ethernet interface.
>
> All `ifnet's are obtained by if_get(9) and we use `if_index' from stored
> reference to `ifnet'.
>
> Index: net/if_pppx.c
> ===================================================================
> RCS file: /cvs/src/sys/net/if_pppx.c,v
> retrieving revision 1.90
> diff -u -p -r1.90 if_pppx.c
> --- net/if_pppx.c       24 Jun 2020 08:52:53 -0000      1.90
> +++ net/if_pppx.c       25 Jun 2020 16:41:25 -0000
> @@ -714,15 +714,15 @@ pppx_add_session(struct pppx_dev *pxd, s
>         ifp->if_softc = pxi;
>         /* ifp->if_rdomain = req->pr_rdomain; */
>
> -       error = pipex_link_session(session, &pxi->pxi_ifcontext);
> -       if (error)
> -               goto remove;
> -
>         /* XXXSMP breaks atomicity */
>         NET_UNLOCK();
>         if_attach(ifp);
>         NET_LOCK();
>
> +       error = pipex_link_session(session, &pxi->pxi_ifcontext);
> +       if (error)
> +               goto detach;
> +
>         if_addgroup(ifp, "pppx");
>         if_alloc_sadl(ifp);
>
> @@ -771,7 +771,12 @@ pppx_add_session(struct pppx_dev *pxd, s
>
>         return (error);
>
> -remove:
> +detach:
> +       /* XXXSMP breaks atomicity */
> +       NET_UNLOCK();
> +       if_detach(ifp);
> +       NET_LOCK();
> +
>         if (RBT_REMOVE(pppx_ifs, &pppx_ifs, pxi) == NULL)
>                 panic("%s: inconsistent RB tree", __func__);
>         LIST_REMOVE(pxi, pxi_list);
> @@ -860,13 +865,13 @@ pppx_if_destroy(struct pppx_dev *pxd, st
>         CLR(ifp->if_flags, IFF_RUNNING);
>
>         pipex_unlink_session(session);
> +       pipex_rele_session(session);
>
>         /* XXXSMP breaks atomicity */
>         NET_UNLOCK();
>         if_detach(ifp);
>         NET_LOCK();
>
> -       pipex_rele_session(session);
>         if (RBT_REMOVE(pppx_ifs, &pppx_ifs, pxi) == NULL)
>                 panic("%s: inconsistent RB tree", __func__);
>         LIST_REMOVE(pxi, pxi_list);
> Index: net/pipex.c
> ===================================================================
> RCS file: /cvs/src/sys/net/pipex.c,v
> retrieving revision 1.116
> diff -u -p -r1.116 pipex.c
> --- net/pipex.c 22 Jun 2020 09:38:15 -0000      1.116
> +++ net/pipex.c 25 Jun 2020 16:41:25 -0000
> @@ -148,7 +148,7 @@ pipex_iface_init(struct pipex_iface_cont
>         struct pipex_session *session;
>
>         pipex_iface->pipexmode = 0;
> -       pipex_iface->ifnet_this = ifp;
> +       pipex_iface->ifnet_this = if_get(ifp->if_index);
>
>         if (pipex_rd_head4 == NULL) {
>                 if (!rn_inithead((void **)&pipex_rd_head4,
> @@ -165,6 +165,7 @@ pipex_iface_init(struct pipex_iface_cont
>         session = pool_get(&pipex_session_pool, PR_WAITOK | PR_ZERO);
>         session->is_multicast = 1;
>         session->pipex_iface = pipex_iface;
> +       session->ifnet_this = if_get(ifp->if_index);
>         pipex_iface->multicast_session = session;
>  }
>
> @@ -194,10 +195,11 @@ pipex_iface_stop(struct pipex_iface_cont
>  void
>  pipex_iface_fini(struct pipex_iface_context *pipex_iface)
>  {
> -       pool_put(&pipex_session_pool, pipex_iface->multicast_session);
>         NET_LOCK();
>         pipex_iface_stop(pipex_iface);
>         NET_UNLOCK();
> +       pipex_rele_session(pipex_iface->multicast_session);
> +       if_put(pipex_iface->ifnet_this);
>  }
>
>  int
> @@ -358,7 +360,7 @@ pipex_init_session(struct pipex_session
>                     MIN(req->pr_local_address.ss_len,
> sizeof(session->local)));
>  #ifdef PIPEX_PPPOE
>         if (req->pr_protocol == PIPEX_PROTO_PPPOE)
> -               session->proto.pppoe.over_ifidx = over_ifp->if_index;
> +               session->proto.pppoe.over_iface =
> if_get(over_ifp->if_index);
>  #endif
>  #ifdef PIPEX_PPTP
>         if (req->pr_protocol == PIPEX_PROTO_PPTP) {
> @@ -421,6 +423,13 @@ pipex_init_session(struct pipex_session
>  void
>  pipex_rele_session(struct pipex_session *session)
>  {
> +#ifdef PIPEX_PPPOE
> +       if (session->protocol == PIPEX_PROTO_PPPOE)
> +               if_put(session->proto.pppoe.over_iface);
> +#endif
> +       if (session->ifnet_this) {
> +               if_put(session->ifnet_this);
> +       }
>         if (session->mppe_recv.old_session_keys)
>                 pool_put(&mppe_key_pool,
> session->mppe_recv.old_session_keys);
>         pool_put(&pipex_session_pool, session);
> @@ -439,6 +448,7 @@ pipex_link_session(struct pipex_session
>                 return (EEXIST);
>
>         session->pipex_iface = iface;
> +       session->ifnet_this = if_get(iface->ifnet_this->if_index);
>
>         LIST_INSERT_HEAD(&pipex_session_list, session, session_list);
>         chain = PIPEX_ID_HASHTABLE(session->session_id);
> @@ -655,7 +665,7 @@ pipex_destroy_session(struct pipex_sessi
>         }
>
>         pipex_unlink_session(session);
> -       pool_put(&pipex_session_pool, session);
> +       pipex_rele_session(session);
>
>         return (0);
>  }
> @@ -919,7 +929,7 @@ pipex_ip_output(struct mbuf *m0, struct
>         struct ifnet *ifp;
>
>         /* output succeed here as a interface */
> -       ifp = session->pipex_iface->ifnet_this;
> +       ifp = session->ifnet_this;
>         ifp->if_opackets++;
>         ifp->if_obytes+=m0->m_pkthdr.len;
>
> @@ -1038,7 +1048,7 @@ pipex_ppp_input(struct mbuf *m0, struct
>
>  #if NBPFILTER > 0
>             {
> -               struct ifnet *ifp = session->pipex_iface->ifnet_this;
> +               struct ifnet *ifp = session->ifnet_this;
>                 if (ifp->if_bpf && ifp->if_type == IFT_PPP)
>                         bpf_mtap(ifp->if_bpf, m0, BPF_DIRECTION_IN);
>             }
> @@ -1105,7 +1115,7 @@ pipex_ip_input(struct mbuf *m0, struct p
>         int is_idle;
>
>         /* change recvif */
> -       ifp = session->pipex_iface->ifnet_this;
> +       ifp = session->ifnet_this;
>         m0->m_pkthdr.ph_ifidx = ifp->if_index;
>
>         if (ISSET(session->ppp_flags, PIPEX_PPP_INGRESS_FILTER)) {
> @@ -1174,7 +1184,7 @@ pipex_ip6_input(struct mbuf *m0, struct
>         int len;
>
>         /* change recvif */
> -       ifp = session->pipex_iface->ifnet_this;
> +       ifp = session->ifnet_this;
>         m0->m_pkthdr.ph_ifidx = ifp->if_index;
>
>         /*
> @@ -1346,7 +1356,8 @@ pipex_pppoe_lookup_session(struct mbuf *
>                 PIPEX_DBG((NULL, LOG_DEBUG, "<%s> session not found
> (id=%d)",
>                     __func__, pppoe.session_id));
>  #endif
> -       if (session && session->proto.pppoe.over_ifidx !=
> m0->m_pkthdr.ph_ifidx)
> +       if (session && session->proto.pppoe.over_iface->if_index !=
> +           m0->m_pkthdr.ph_ifidx)
>                 session = NULL;
>
>         return (session);
> @@ -1407,19 +1418,12 @@ pipex_pppoe_output(struct mbuf *m0, stru
>         pppoe->session_id = htons(session->session_id);
>         pppoe->length = htons(len);
>
> -       m0->m_pkthdr.ph_ifidx = session->proto.pppoe.over_ifidx;
> +       ifp = session->proto.pppoe.over_iface;
> +       m0->m_pkthdr.ph_ifidx = ifp->if_index;
>         m0->m_flags &= ~(M_BCAST|M_MCAST);
> -
> -       ifp = if_get(session->proto.pppoe.over_ifidx);
> -       if (ifp != NULL) {
> -               ifp->if_output(ifp, m0, &session->peer.sa, NULL);
> -               session->stat.opackets++;
> -               session->stat.obytes += len;
> -       } else {
> -               m_freem(m0);
> -               session->stat.oerrors++;
> -       }
> -       if_put(ifp);
> +       ifp->if_output(ifp, m0, &session->peer.sa, NULL);
> +       session->stat.opackets++;
> +       session->stat.obytes += len;
>  }
>  #endif /* PIPEX_PPPOE */
>
> Index: net/pipex_local.h
> ===================================================================
> RCS file: /cvs/src/sys/net/pipex_local.h,v
> retrieving revision 1.35
> diff -u -p -r1.35 pipex_local.h
> --- net/pipex_local.h   18 Jun 2020 14:20:12 -0000      1.35
> +++ net/pipex_local.h   25 Jun 2020 16:41:25 -0000
> @@ -78,7 +78,7 @@ struct pipex_mppe {
>
>  #ifdef PIPEX_PPPOE
>  struct pipex_pppoe_session {
> -       u_int    over_ifidx;                    /* [I] ether interface */
> +       struct ifnet *over_iface;               /* [I] ether interface */
>  };
>  #endif /* PIPEX_PPPOE */
>
> @@ -183,6 +183,7 @@ struct pipex_session {
>         int             ip6_prefixlen;   /* [I] remote IPv6 prefixlen */
>
>         struct pipex_iface_context* pipex_iface; /* [I] context for
> interface */
> +       struct ifnet    *ifnet_this;            /* [k] outer interface */
>
>         uint32_t        ppp_flags;              /* [I] configure flags */
>  #ifdef PIPEX_MPPE
>
Reply | Threaded
Open this post in threaded view
|

Re: pipex(4): use reference counters for `ifnet'

Vitaliy Makkoveev


> On 25 Jun 2020, at 20:13, Christiano F. Haesbaert <[hidden email]> wrote:
>
> You can.
>
> The idea is that ifindex is always monotonically increased, so to actually
> get a new interface you would have to have "overflowed" 65k interfaces,
> which is unreal.
>
> So if your interface is gone, you can be sure if_get will give you NULL.

No I can’t :) It’s *not* the operational infinity you talks about.

You can run:
while true ; do ifconfig tun0 create ; ifconfig tun0 destroy ; done

And after couple of minutes you will see is very real to have *counter* be
overflowed. Not the interface count.

>
> On Thu, Jun 25, 2020, 18:55 Vitaliy Makkoveev <[hidden email]>
> wrote:
>
>> Updated diff.
>>
>> OpenBSD uses 16 bit counter for allocate interface indexes. So we can't
>> store index in session and be sure if_get(9) returned `ifnet' is our
>> original `ifnet'.
>>
>> Now each pipex(4) session has it's own reference to `ifnet'. Also pppoe
>> related sessions has the reference to it's ethernet interface.
>>
>> All `ifnet's are obtained by if_get(9) and we use `if_index' from stored
>> reference to `ifnet'.
>>
>> Index: net/if_pppx.c
>> ===================================================================
>> RCS file: /cvs/src/sys/net/if_pppx.c,v
>> retrieving revision 1.90
>> diff -u -p -r1.90 if_pppx.c
>> --- net/if_pppx.c       24 Jun 2020 08:52:53 -0000      1.90
>> +++ net/if_pppx.c       25 Jun 2020 16:41:25 -0000
>> @@ -714,15 +714,15 @@ pppx_add_session(struct pppx_dev *pxd, s
>>        ifp->if_softc = pxi;
>>        /* ifp->if_rdomain = req->pr_rdomain; */
>>
>> -       error = pipex_link_session(session, &pxi->pxi_ifcontext);
>> -       if (error)
>> -               goto remove;
>> -
>>        /* XXXSMP breaks atomicity */
>>        NET_UNLOCK();
>>        if_attach(ifp);
>>        NET_LOCK();
>>
>> +       error = pipex_link_session(session, &pxi->pxi_ifcontext);
>> +       if (error)
>> +               goto detach;
>> +
>>        if_addgroup(ifp, "pppx");
>>        if_alloc_sadl(ifp);
>>
>> @@ -771,7 +771,12 @@ pppx_add_session(struct pppx_dev *pxd, s
>>
>>        return (error);
>>
>> -remove:
>> +detach:
>> +       /* XXXSMP breaks atomicity */
>> +       NET_UNLOCK();
>> +       if_detach(ifp);
>> +       NET_LOCK();
>> +
>>        if (RBT_REMOVE(pppx_ifs, &pppx_ifs, pxi) == NULL)
>>                panic("%s: inconsistent RB tree", __func__);
>>        LIST_REMOVE(pxi, pxi_list);
>> @@ -860,13 +865,13 @@ pppx_if_destroy(struct pppx_dev *pxd, st
>>        CLR(ifp->if_flags, IFF_RUNNING);
>>
>>        pipex_unlink_session(session);
>> +       pipex_rele_session(session);
>>
>>        /* XXXSMP breaks atomicity */
>>        NET_UNLOCK();
>>        if_detach(ifp);
>>        NET_LOCK();
>>
>> -       pipex_rele_session(session);
>>        if (RBT_REMOVE(pppx_ifs, &pppx_ifs, pxi) == NULL)
>>                panic("%s: inconsistent RB tree", __func__);
>>        LIST_REMOVE(pxi, pxi_list);
>> Index: net/pipex.c
>> ===================================================================
>> RCS file: /cvs/src/sys/net/pipex.c,v
>> retrieving revision 1.116
>> diff -u -p -r1.116 pipex.c
>> --- net/pipex.c 22 Jun 2020 09:38:15 -0000      1.116
>> +++ net/pipex.c 25 Jun 2020 16:41:25 -0000
>> @@ -148,7 +148,7 @@ pipex_iface_init(struct pipex_iface_cont
>>        struct pipex_session *session;
>>
>>        pipex_iface->pipexmode = 0;
>> -       pipex_iface->ifnet_this = ifp;
>> +       pipex_iface->ifnet_this = if_get(ifp->if_index);
>>
>>        if (pipex_rd_head4 == NULL) {
>>                if (!rn_inithead((void **)&pipex_rd_head4,
>> @@ -165,6 +165,7 @@ pipex_iface_init(struct pipex_iface_cont
>>        session = pool_get(&pipex_session_pool, PR_WAITOK | PR_ZERO);
>>        session->is_multicast = 1;
>>        session->pipex_iface = pipex_iface;
>> +       session->ifnet_this = if_get(ifp->if_index);
>>        pipex_iface->multicast_session = session;
>> }
>>
>> @@ -194,10 +195,11 @@ pipex_iface_stop(struct pipex_iface_cont
>> void
>> pipex_iface_fini(struct pipex_iface_context *pipex_iface)
>> {
>> -       pool_put(&pipex_session_pool, pipex_iface->multicast_session);
>>        NET_LOCK();
>>        pipex_iface_stop(pipex_iface);
>>        NET_UNLOCK();
>> +       pipex_rele_session(pipex_iface->multicast_session);
>> +       if_put(pipex_iface->ifnet_this);
>> }
>>
>> int
>> @@ -358,7 +360,7 @@ pipex_init_session(struct pipex_session
>>                    MIN(req->pr_local_address.ss_len,
>> sizeof(session->local)));
>> #ifdef PIPEX_PPPOE
>>        if (req->pr_protocol == PIPEX_PROTO_PPPOE)
>> -               session->proto.pppoe.over_ifidx = over_ifp->if_index;
>> +               session->proto.pppoe.over_iface =
>> if_get(over_ifp->if_index);
>> #endif
>> #ifdef PIPEX_PPTP
>>        if (req->pr_protocol == PIPEX_PROTO_PPTP) {
>> @@ -421,6 +423,13 @@ pipex_init_session(struct pipex_session
>> void
>> pipex_rele_session(struct pipex_session *session)
>> {
>> +#ifdef PIPEX_PPPOE
>> +       if (session->protocol == PIPEX_PROTO_PPPOE)
>> +               if_put(session->proto.pppoe.over_iface);
>> +#endif
>> +       if (session->ifnet_this) {
>> +               if_put(session->ifnet_this);
>> +       }
>>        if (session->mppe_recv.old_session_keys)
>>                pool_put(&mppe_key_pool,
>> session->mppe_recv.old_session_keys);
>>        pool_put(&pipex_session_pool, session);
>> @@ -439,6 +448,7 @@ pipex_link_session(struct pipex_session
>>                return (EEXIST);
>>
>>        session->pipex_iface = iface;
>> +       session->ifnet_this = if_get(iface->ifnet_this->if_index);
>>
>>        LIST_INSERT_HEAD(&pipex_session_list, session, session_list);
>>        chain = PIPEX_ID_HASHTABLE(session->session_id);
>> @@ -655,7 +665,7 @@ pipex_destroy_session(struct pipex_sessi
>>        }
>>
>>        pipex_unlink_session(session);
>> -       pool_put(&pipex_session_pool, session);
>> +       pipex_rele_session(session);
>>
>>        return (0);
>> }
>> @@ -919,7 +929,7 @@ pipex_ip_output(struct mbuf *m0, struct
>>        struct ifnet *ifp;
>>
>>        /* output succeed here as a interface */
>> -       ifp = session->pipex_iface->ifnet_this;
>> +       ifp = session->ifnet_this;
>>        ifp->if_opackets++;
>>        ifp->if_obytes+=m0->m_pkthdr.len;
>>
>> @@ -1038,7 +1048,7 @@ pipex_ppp_input(struct mbuf *m0, struct
>>
>> #if NBPFILTER > 0
>>            {
>> -               struct ifnet *ifp = session->pipex_iface->ifnet_this;
>> +               struct ifnet *ifp = session->ifnet_this;
>>                if (ifp->if_bpf && ifp->if_type == IFT_PPP)
>>                        bpf_mtap(ifp->if_bpf, m0, BPF_DIRECTION_IN);
>>            }
>> @@ -1105,7 +1115,7 @@ pipex_ip_input(struct mbuf *m0, struct p
>>        int is_idle;
>>
>>        /* change recvif */
>> -       ifp = session->pipex_iface->ifnet_this;
>> +       ifp = session->ifnet_this;
>>        m0->m_pkthdr.ph_ifidx = ifp->if_index;
>>
>>        if (ISSET(session->ppp_flags, PIPEX_PPP_INGRESS_FILTER)) {
>> @@ -1174,7 +1184,7 @@ pipex_ip6_input(struct mbuf *m0, struct
>>        int len;
>>
>>        /* change recvif */
>> -       ifp = session->pipex_iface->ifnet_this;
>> +       ifp = session->ifnet_this;
>>        m0->m_pkthdr.ph_ifidx = ifp->if_index;
>>
>>        /*
>> @@ -1346,7 +1356,8 @@ pipex_pppoe_lookup_session(struct mbuf *
>>                PIPEX_DBG((NULL, LOG_DEBUG, "<%s> session not found
>> (id=%d)",
>>                    __func__, pppoe.session_id));
>> #endif
>> -       if (session && session->proto.pppoe.over_ifidx !=
>> m0->m_pkthdr.ph_ifidx)
>> +       if (session && session->proto.pppoe.over_iface->if_index !=
>> +           m0->m_pkthdr.ph_ifidx)
>>                session = NULL;
>>
>>        return (session);
>> @@ -1407,19 +1418,12 @@ pipex_pppoe_output(struct mbuf *m0, stru
>>        pppoe->session_id = htons(session->session_id);
>>        pppoe->length = htons(len);
>>
>> -       m0->m_pkthdr.ph_ifidx = session->proto.pppoe.over_ifidx;
>> +       ifp = session->proto.pppoe.over_iface;
>> +       m0->m_pkthdr.ph_ifidx = ifp->if_index;
>>        m0->m_flags &= ~(M_BCAST|M_MCAST);
>> -
>> -       ifp = if_get(session->proto.pppoe.over_ifidx);
>> -       if (ifp != NULL) {
>> -               ifp->if_output(ifp, m0, &session->peer.sa, NULL);
>> -               session->stat.opackets++;
>> -               session->stat.obytes += len;
>> -       } else {
>> -               m_freem(m0);
>> -               session->stat.oerrors++;
>> -       }
>> -       if_put(ifp);
>> +       ifp->if_output(ifp, m0, &session->peer.sa, NULL);
>> +       session->stat.opackets++;
>> +       session->stat.obytes += len;
>> }
>> #endif /* PIPEX_PPPOE */
>>
>> Index: net/pipex_local.h
>> ===================================================================
>> RCS file: /cvs/src/sys/net/pipex_local.h,v
>> retrieving revision 1.35
>> diff -u -p -r1.35 pipex_local.h
>> --- net/pipex_local.h   18 Jun 2020 14:20:12 -0000      1.35
>> +++ net/pipex_local.h   25 Jun 2020 16:41:25 -0000
>> @@ -78,7 +78,7 @@ struct pipex_mppe {
>>
>> #ifdef PIPEX_PPPOE
>> struct pipex_pppoe_session {
>> -       u_int    over_ifidx;                    /* [I] ether interface */
>> +       struct ifnet *over_iface;               /* [I] ether interface */
>> };
>> #endif /* PIPEX_PPPOE */
>>
>> @@ -183,6 +183,7 @@ struct pipex_session {
>>        int             ip6_prefixlen;   /* [I] remote IPv6 prefixlen */
>>
>>        struct pipex_iface_context* pipex_iface; /* [I] context for
>> interface */
>> +       struct ifnet    *ifnet_this;            /* [k] outer interface */
>>
>>        uint32_t        ppp_flags;              /* [I] configure flags */
>> #ifdef PIPEX_MPPE
>>

Reply | Threaded
Open this post in threaded view
|

Re: pipex(4): use reference counters for `ifnet'

Martin Pieuchot
In reply to this post by Vitaliy Makkoveev
On 25/06/20(Thu) 19:56, Vitaliy Makkoveev wrote:
> Updated diff.
>
> OpenBSD uses 16 bit counter for allocate interface indexes. So we can't
> store index in session and be sure if_get(9) returned `ifnet' is our
> original `ifnet'.

Why not?  The point of if_get(9) is to be sure.  If that doesn't work
for whatever reason then the if_get(9) interface has to be fixed.  Which
case doesn't work for you?  Do you have a reproducer?  

How does sessions stay around if their corresponding interface is
destroyed?

> Now each pipex(4) session has it's own reference to `ifnet'. Also pppoe
> related sessions has the reference to it's ethernet interface.
>
> All `ifnet's are obtained by if_get(9) and we use `if_index' from stored
> reference to `ifnet'.

That's not how the API is meant to be used.  All if_get(9) must have a
corresponding if_put(9) in the same context.  We don't store pointers
put interface indexes, meaning:

> Index: net/pipex.c
> ===================================================================
> RCS file: /cvs/src/sys/net/pipex.c,v
> retrieving revision 1.116
> diff -u -p -r1.116 pipex.c
> --- net/pipex.c 22 Jun 2020 09:38:15 -0000 1.116
> +++ net/pipex.c 25 Jun 2020 16:41:25 -0000
> @@ -148,7 +148,7 @@ pipex_iface_init(struct pipex_iface_cont
>   struct pipex_session *session;
>  
>   pipex_iface->pipexmode = 0;
> - pipex_iface->ifnet_this = ifp;
> + pipex_iface->ifnet_this = if_get(ifp->if_index);

`ifnet_this' should be replaced by `ifidx' so this becomes:

        pipex_iface->ifidx = ifp->if_index.

Reply | Threaded
Open this post in threaded view
|

Re: pipex(4): use reference counters for `ifnet'

Vitaliy Makkoveev
On Fri, Jun 26, 2020 at 10:23:42AM +0200, Martin Pieuchot wrote:

> On 25/06/20(Thu) 19:56, Vitaliy Makkoveev wrote:
> > Updated diff.
> >
> > OpenBSD uses 16 bit counter for allocate interface indexes. So we can't
> > store index in session and be sure if_get(9) returned `ifnet' is our
> > original `ifnet'.
>
> Why not?  The point of if_get(9) is to be sure.  If that doesn't work
> for whatever reason then the if_get(9) interface has to be fixed.  Which
> case doesn't work for you?  Do you have a reproducer?  
>
> How does sessions stay around if their corresponding interface is
> destroyed?

We have `pipexinq' and `pipexoutq' which can store pointers to session.
pipexintr() process these queues. pipexintr() and
pipex_destroy_session() are *always* different context. This mean we
*can't* free pipex(4) session without be sure there is no reference to
this session in `pipexinq' or `pipexoutq'. Elsewhere this will cause use
afret free issue. Look please at net/pipex.c:846. The way pppx(4)
destroy sessions is wrong. While pppac(4) destroys sessions by
pipex_iface_fini() it's also wrong. Because we don't check `pipexinq'
and `pipexoutq' state. I'am said it again and again.

Look at net/pipex.c:777

Static int
pipex_ppp_enqueue(struct mbuf *m0, struct pipex_session *session,
    struct mbuf_queue *mq)
{
        m0->m_pkthdr.ph_cookie = session; /* [1] */
        /* XXX need to support other protocols */
        m0->m_pkthdr.ph_ppp_proto = PPP_IP;

        if (mq_enqueue(mq, m0) != 0) /* [2] */
                return (1);

        schednetisr(NETISR_PIPEX); /* [3] */

        return (0);
}

and net/pipex.c:643

Static int
pipex_destroy_session(struct pipex_session *session)
{
        struct radix_node *rn;

        /* remove from radix tree and hash chain */
        NET_ASSERT_LOCKED();

        if (!in_nullhost(session->ip_address.sin_addr)) {
                rn = rn_delete(&session->ip_address, &session->ip_netmask,
                    pipex_rd_head4, (struct radix_node *)session);
                KASSERT(rn != NULL);
        }

        pipex_unlink_session(session);
        pool_put(&pipex_session_pool, session); /* [4] */

        return (0);
}

and net/pipex.c:720

void
pipexintr(void)
{
        struct pipex_session *pkt_session;
        u_int16_t proto;
        struct mbuf *m;
        struct mbuf_list ml;

        /* ppp output */
        mq_delist(&pipexoutq, &ml);
        while ((m = ml_dequeue(&ml)) != NULL) {
                pkt_session = m->m_pkthdr.ph_cookie; /* [5] */
                /* ... */
                pipex_ppp_output(m, pkt_session, proto);
                /* [7] */
        }
        /* ... */
        mq_delist(&pipexinq, &ml);
        while ((m = ml_dequeue(&ml)) != NULL) {
                pkt_session = m->m_pkthdr.ph_cookie; /* [6] */
                /* ... */
                pipex_ppp_input(m, pkt_session, 0);
                /* [8] */
        }
}

and net/pipex.c:808

Static void
pipex_timer(void *ignored_arg)
{
        struct pipex_session *session, *session_tmp;

        timeout_add_sec(&pipex_timer_ch, pipex_prune);

        NET_LOCK();
        /* walk through */
        LIST_FOREACH_SAFE(session, &pipex_session_list, session_list,
            session_tmp) {
                switch (session->state) {
                /* ... */
                case PIPEX_STATE_CLOSED:
                        /*
                         * mbuf queued in pipexinq or pipexoutq may have a
                         * refererce to this session.
                         */
                        /* [9] */
                        if (!mq_empty(&pipexinq) || !mq_empty(&pipexoutq))
                                continue;

                        pipex_destroy_session(session);
                        break;
                /* ... */
        }

1. You pass pointer to mbuf
2. You enqueue this mbuf to `pipexinq' or `pipexoutq'
3. You call task_add();

Is it * always * garanteed that netisr() will be called here? What
will be happened if netisr will be called after [4]?

4. You destroy session.

5. You use memory pointed by `pkt_session' It's netisr context. Are you
shure `ph_cookie' is still valid?
6. You use memory pointed by `pkt_session' It's netisr context. Are you
shure `ph_cookie' is still valid?

Also we are going to move pipexintr() out of KERNEL_LOCK() and
NET_LOCK(). This means pipex_destroy_session() and pipexintr() will be
executed simultaneously. How to protect `ph_cookie'?

I guess to use reference counters for pipex(4) sessions. We will grab
extra reference before [1] and release this reference at [7] or [8].

I see absolutele *no* reason to sleep at [9]. So refcnt_finalize() is
not applicable here. I want to use reference counters in the way of
file(9). This means session can live after userland destroy it because
it has reference in `pipexinq' or `pipexoutq'.

But userland will not only release it's reference to session. It will
destroy `ifnet' too. And you will have session referenced by `pipexinq'
or `pipexoutq' but it's `ifnet' is already destoyed.

The diff I posted should prevent this.

you can run something like the command below:
while true ; do ifconfig tun0 create ; ifconfig tun0 destroy ; done

This command will overflow interface index counters and it will be start
from null. It's ok. It will not overwrite indexes for existing `ifnet's.
But if you are store session index as you suggested to store it's
absolutely *NOT* true that if_get(9) will return NULL if the session you
want to get is done.

The case is:
1. The session is in pipexinq' or `pipexoutq'.
2. pppx(4) or pppac(4) already release its reference and corresponding
        `ifnet'.
3. pipexintr() want to access to this session erferenced by index:
         ifp = if_get(session->if_index);
         `ifp' can be NULL here, you are lucky.
         or NOT NULL here, that means you grab `ifnet' owned by someone
         else.

The way I suggest to go albsolutely exclude this case.

Reply | Threaded
Open this post in threaded view
|

Re: pipex(4): use reference counters for `ifnet'

Martin Pieuchot
On 26/06/20(Fri) 12:35, Vitaliy Makkoveev wrote:

> On Fri, Jun 26, 2020 at 10:23:42AM +0200, Martin Pieuchot wrote:
> > On 25/06/20(Thu) 19:56, Vitaliy Makkoveev wrote:
> > > Updated diff.
> > >
> > > OpenBSD uses 16 bit counter for allocate interface indexes. So we can't
> > > store index in session and be sure if_get(9) returned `ifnet' is our
> > > original `ifnet'.
> >
> > Why not?  The point of if_get(9) is to be sure.  If that doesn't work
> > for whatever reason then the if_get(9) interface has to be fixed.  Which
> > case doesn't work for you?  Do you have a reproducer?  
> >
> > How does sessions stay around if their corresponding interface is
> > destroyed?
>
> We have `pipexinq' and `pipexoutq' which can store pointers to session.
> pipexintr() process these queues. pipexintr() and
> pipex_destroy_session() are *always* different context. This mean we
> *can't* free pipex(4) session without be sure there is no reference to
> this session in `pipexinq' or `pipexoutq'. Elsewhere this will cause use
> afret free issue. Look please at net/pipex.c:846. The way pppx(4)
> destroy sessions is wrong. While pppac(4) destroys sessions by
> pipex_iface_fini() it's also wrong. Because we don't check `pipexinq'
> and `pipexoutq' state. I'am said it again and again.

I understand.  Why is it a problem?  Using reference counting the way
you're suggesting is *one* possible solution to a problem we don't fully
understand.  What are we trying to achieve?  Which problem are we trying
to solve?

Are we trying to move pipexintr() out of KERNEL_LOCK()?  If so are we
trying to guarantee the memory pointed by `ph_cookie' representing a
session pointer is always valid without the KERNEL_LOCK()?

If that's what we're trying to do, shouldn't we start with a baby-step
and trade the KERNEL_LOCK() for another lock first?  I understand it is
not some sexy-mp-rcu-atomic-crazy-refcount-clever solution, but it is
simpler to do.

Since all the packets (mbuf) are being processed with the NET_LOCK(). we
should be able to start trading the KERNEL_LOCK() by the NET_LOCK(), no?
Once that works and we've built knowledge about the existing architecture
we might consider some finer grain locking.

Is there any downside to this suggestion?

Reply | Threaded
Open this post in threaded view
|

Re: pipex(4): use reference counters for `ifnet'

Vitaliy Makkoveev
On Fri, Jun 26, 2020 at 02:29:03PM +0200, Martin Pieuchot wrote:

> On 26/06/20(Fri) 12:35, Vitaliy Makkoveev wrote:
> > On Fri, Jun 26, 2020 at 10:23:42AM +0200, Martin Pieuchot wrote:
> > > On 25/06/20(Thu) 19:56, Vitaliy Makkoveev wrote:
> > > > Updated diff.
> > > >
> > > > OpenBSD uses 16 bit counter for allocate interface indexes. So we can't
> > > > store index in session and be sure if_get(9) returned `ifnet' is our
> > > > original `ifnet'.
> > >
> > > Why not?  The point of if_get(9) is to be sure.  If that doesn't work
> > > for whatever reason then the if_get(9) interface has to be fixed.  Which
> > > case doesn't work for you?  Do you have a reproducer?  
> > >
> > > How does sessions stay around if their corresponding interface is
> > > destroyed?
> >
> > We have `pipexinq' and `pipexoutq' which can store pointers to session.
> > pipexintr() process these queues. pipexintr() and
> > pipex_destroy_session() are *always* different context. This mean we
> > *can't* free pipex(4) session without be sure there is no reference to
> > this session in `pipexinq' or `pipexoutq'. Elsewhere this will cause use
> > afret free issue. Look please at net/pipex.c:846. The way pppx(4)
> > destroy sessions is wrong. While pppac(4) destroys sessions by
> > pipex_iface_fini() it's also wrong. Because we don't check `pipexinq'
> > and `pipexoutq' state. I'am said it again and again.
>
> I understand.  Why is it a problem?  Using reference counting the way
> you're suggesting is *one* possible solution to a problem we don't fully
> understand.  What are we trying to achieve?  Which problem are we trying
> to solve?

Sorry, may be I misunderstand something.

`pipexoutq' case:

1. pppac_start() calls pipex_output()
2. pipex_output() calls pipex_ip_output()
3. pipex_ip_output() calls pipex_ppp_enqueue()
4. pipex_ppp_enqueue() calls schednetisr() which is task_add()

`pipexinq' cases:

1.1. ether_input() calls pipex_pppoe_input()
1.2. gre_input() calls gre_input_1()
         gre_input_1() calls pipex_pptp_input()
1.3. udp_input() calls pipex_l2tp_input()

2. pipex_{pppoe,pptp,l2tp}_input() calls pipex_common_input()
3. pipex_common_input() calls schednetisr() which is task_add()

task_add(9) just schedules the execution of the work specified by `tq'.
So we can do pipex_destroy_session() * between * schednetisr() and
pipexintr(). And we can do this right * now *, with our current locking.
And this is the problem I'am trying to solve.

My apologies if I'am wrong above. Please point me where I'am wrong.

Also before pipex_{pppoe,pptp,l2tp}_input() we call corresponding
pipex_{pptp,l2tp}_lookup_session() to obtain pointer to pipex(4)
session. We should be shure `session' is still walid between
pipex_*_lookup() and pipex_*_input(). It's not required now, but will be
required in future.

> Are we trying to move pipexintr() out of KERNEL_LOCK()?  If so are we
> trying to guarantee the memory pointed by `ph_cookie' representing a
> session pointer is always valid without the KERNEL_LOCK()?
>
> If that's what we're trying to do, shouldn't we start with a baby-step
> and trade the KERNEL_LOCK() for another lock first?  I understand it is
> not some sexy-mp-rcu-atomic-crazy-refcount-clever solution, but it is
> simpler to do.
>
> Since all the packets (mbuf) are being processed with the NET_LOCK(). we
> should be able to start trading the KERNEL_LOCK() by the NET_LOCK(), no?
> Once that works and we've built knowledge about the existing architecture
> we might consider some finer grain locking.
>
> Is there any downside to this suggestion?
>

Reply | Threaded
Open this post in threaded view
|

Re: pipex(4): use reference counters for `ifnet'

Martin Pieuchot
On 26/06/20(Fri) 17:53, Vitaliy Makkoveev wrote:

> On Fri, Jun 26, 2020 at 02:29:03PM +0200, Martin Pieuchot wrote:
> > On 26/06/20(Fri) 12:35, Vitaliy Makkoveev wrote:
> > > On Fri, Jun 26, 2020 at 10:23:42AM +0200, Martin Pieuchot wrote:
> > > > On 25/06/20(Thu) 19:56, Vitaliy Makkoveev wrote:
> > > > > Updated diff.
> > > > >
> > > > > OpenBSD uses 16 bit counter for allocate interface indexes. So we can't
> > > > > store index in session and be sure if_get(9) returned `ifnet' is our
> > > > > original `ifnet'.
> > > >
> > > > Why not?  The point of if_get(9) is to be sure.  If that doesn't work
> > > > for whatever reason then the if_get(9) interface has to be fixed.  Which
> > > > case doesn't work for you?  Do you have a reproducer?  
> > > >
> > > > How does sessions stay around if their corresponding interface is
> > > > destroyed?
> > >
> > > We have `pipexinq' and `pipexoutq' which can store pointers to session.
> > > pipexintr() process these queues. pipexintr() and
> > > pipex_destroy_session() are *always* different context. This mean we
> > > *can't* free pipex(4) session without be sure there is no reference to
> > > this session in `pipexinq' or `pipexoutq'. Elsewhere this will cause use
> > > afret free issue. Look please at net/pipex.c:846. The way pppx(4)
> > > destroy sessions is wrong. While pppac(4) destroys sessions by
> > > pipex_iface_fini() it's also wrong. Because we don't check `pipexinq'
> > > and `pipexoutq' state. I'am said it again and again.
> >
> > I understand.  Why is it a problem?  Using reference counting the way
> > you're suggesting is *one* possible solution to a problem we don't fully
> > understand.  What are we trying to achieve?  Which problem are we trying
> > to solve?
>
> Sorry, may be I misunderstand something.
>
> `pipexoutq' case:
>
> 1. pppac_start() calls pipex_output()
> 2. pipex_output() calls pipex_ip_output()
> 3. pipex_ip_output() calls pipex_ppp_enqueue()
> 4. pipex_ppp_enqueue() calls schednetisr() which is task_add()
>
> `pipexinq' cases:
>
> 1.1. ether_input() calls pipex_pppoe_input()
> 1.2. gre_input() calls gre_input_1()
>          gre_input_1() calls pipex_pptp_input()
> 1.3. udp_input() calls pipex_l2tp_input()
>
> 2. pipex_{pppoe,pptp,l2tp}_input() calls pipex_common_input()
> 3. pipex_common_input() calls schednetisr() which is task_add()
>
> task_add(9) just schedules the execution of the work specified by `tq'.
> So we can do pipex_destroy_session() * between * schednetisr() and
> pipexintr(). And we can do this right * now *, with our current locking.
> And this is the problem I'am trying to solve.
>
> My apologies if I'am wrong above. Please point me where I'am wrong.
>
> Also before pipex_{pppoe,pptp,l2tp}_input() we call corresponding
> pipex_{pptp,l2tp}_lookup_session() to obtain pointer to pipex(4)
> session. We should be shure `session' is still walid between
> pipex_*_lookup() and pipex_*_input(). It's not required now, but will be
> required in future.

Why not iterate over the queues and garbage collect the sessions that
are about to be removed?  That's what the network stack was doing with
mbuf queues prior to if_get(9) when interfaces where destroyed.

Reply | Threaded
Open this post in threaded view
|

Re: pipex(4): use reference counters for `ifnet'

Vitaliy Makkoveev
On Fri, Jun 26, 2020 at 09:15:38PM +0200, Martin Pieuchot wrote:

> On 26/06/20(Fri) 17:53, Vitaliy Makkoveev wrote:
> > On Fri, Jun 26, 2020 at 02:29:03PM +0200, Martin Pieuchot wrote:
> > > On 26/06/20(Fri) 12:35, Vitaliy Makkoveev wrote:
> > > > On Fri, Jun 26, 2020 at 10:23:42AM +0200, Martin Pieuchot wrote:
> > > > > On 25/06/20(Thu) 19:56, Vitaliy Makkoveev wrote:
> > > > > > Updated diff.
> > > > > >
> > > > > > OpenBSD uses 16 bit counter for allocate interface indexes. So we can't
> > > > > > store index in session and be sure if_get(9) returned `ifnet' is our
> > > > > > original `ifnet'.
> > > > >
> > > > > Why not?  The point of if_get(9) is to be sure.  If that doesn't work
> > > > > for whatever reason then the if_get(9) interface has to be fixed.  Which
> > > > > case doesn't work for you?  Do you have a reproducer?  
> > > > >
> > > > > How does sessions stay around if their corresponding interface is
> > > > > destroyed?
> > > >
> > > > We have `pipexinq' and `pipexoutq' which can store pointers to session.
> > > > pipexintr() process these queues. pipexintr() and
> > > > pipex_destroy_session() are *always* different context. This mean we
> > > > *can't* free pipex(4) session without be sure there is no reference to
> > > > this session in `pipexinq' or `pipexoutq'. Elsewhere this will cause use
> > > > afret free issue. Look please at net/pipex.c:846. The way pppx(4)
> > > > destroy sessions is wrong. While pppac(4) destroys sessions by
> > > > pipex_iface_fini() it's also wrong. Because we don't check `pipexinq'
> > > > and `pipexoutq' state. I'am said it again and again.
> > >
> > > I understand.  Why is it a problem?  Using reference counting the way
> > > you're suggesting is *one* possible solution to a problem we don't fully
> > > understand.  What are we trying to achieve?  Which problem are we trying
> > > to solve?
> >
> > Sorry, may be I misunderstand something.
> >
> > `pipexoutq' case:
> >
> > 1. pppac_start() calls pipex_output()
> > 2. pipex_output() calls pipex_ip_output()
> > 3. pipex_ip_output() calls pipex_ppp_enqueue()
> > 4. pipex_ppp_enqueue() calls schednetisr() which is task_add()
> >
> > `pipexinq' cases:
> >
> > 1.1. ether_input() calls pipex_pppoe_input()
> > 1.2. gre_input() calls gre_input_1()
> >          gre_input_1() calls pipex_pptp_input()
> > 1.3. udp_input() calls pipex_l2tp_input()
> >
> > 2. pipex_{pppoe,pptp,l2tp}_input() calls pipex_common_input()
> > 3. pipex_common_input() calls schednetisr() which is task_add()
> >
> > task_add(9) just schedules the execution of the work specified by `tq'.
> > So we can do pipex_destroy_session() * between * schednetisr() and
> > pipexintr(). And we can do this right * now *, with our current locking.
> > And this is the problem I'am trying to solve.
> >
> > My apologies if I'am wrong above. Please point me where I'am wrong.
> >
> > Also before pipex_{pppoe,pptp,l2tp}_input() we call corresponding
> > pipex_{pptp,l2tp}_lookup_session() to obtain pointer to pipex(4)
> > session. We should be shure `session' is still walid between
> > pipex_*_lookup() and pipex_*_input(). It's not required now, but will be
> > required in future.
>
> Why not iterate over the queues and garbage collect the sessions that
> are about to be removed?  That's what the network stack was doing with
> mbuf queues prior to if_get(9) when interfaces where destroyed.
>

Do you mean net/if.c:1185 and below? This is the queues associated with
this `ifp'. But for pipex(4) we should go through all mbufs associated
with pipex(4). This can be heavy if we have hundreds of sessions. Also
this would work until session destruction and `pr_input' are serialized.

Point me please the line in source to see if I'am wrong about `ifnet's
mbuf queues claninig.

Reply | Threaded
Open this post in threaded view
|

Re: pipex(4): use reference counters for `ifnet'

Martin Pieuchot
On 27/06/20(Sat) 01:02, Vitaliy Makkoveev wrote:

> On Fri, Jun 26, 2020 at 09:15:38PM +0200, Martin Pieuchot wrote:
> > On 26/06/20(Fri) 17:53, Vitaliy Makkoveev wrote:
> > > On Fri, Jun 26, 2020 at 02:29:03PM +0200, Martin Pieuchot wrote:
> > > > On 26/06/20(Fri) 12:35, Vitaliy Makkoveev wrote:
> > > > > On Fri, Jun 26, 2020 at 10:23:42AM +0200, Martin Pieuchot wrote:
> > > > > > On 25/06/20(Thu) 19:56, Vitaliy Makkoveev wrote:
> > > > > > > Updated diff.
> > > > > > >
> > > > > > > OpenBSD uses 16 bit counter for allocate interface indexes. So we can't
> > > > > > > store index in session and be sure if_get(9) returned `ifnet' is our
> > > > > > > original `ifnet'.
> > > > > >
> > > > > > Why not?  The point of if_get(9) is to be sure.  If that doesn't work
> > > > > > for whatever reason then the if_get(9) interface has to be fixed.  Which
> > > > > > case doesn't work for you?  Do you have a reproducer?  
> > > > > >
> > > > > > How does sessions stay around if their corresponding interface is
> > > > > > destroyed?
> > > > >
> > > > > We have `pipexinq' and `pipexoutq' which can store pointers to session.
> > > > > pipexintr() process these queues. pipexintr() and
> > > > > pipex_destroy_session() are *always* different context. This mean we
> > > > > *can't* free pipex(4) session without be sure there is no reference to
> > > > > this session in `pipexinq' or `pipexoutq'. Elsewhere this will cause use
> > > > > afret free issue. Look please at net/pipex.c:846. The way pppx(4)
> > > > > destroy sessions is wrong. While pppac(4) destroys sessions by
> > > > > pipex_iface_fini() it's also wrong. Because we don't check `pipexinq'
> > > > > and `pipexoutq' state. I'am said it again and again.
> > > >
> > > > I understand.  Why is it a problem?  Using reference counting the way
> > > > you're suggesting is *one* possible solution to a problem we don't fully
> > > > understand.  What are we trying to achieve?  Which problem are we trying
> > > > to solve?
> > >
> > > Sorry, may be I misunderstand something.
> > >
> > > `pipexoutq' case:
> > >
> > > 1. pppac_start() calls pipex_output()
> > > 2. pipex_output() calls pipex_ip_output()
> > > 3. pipex_ip_output() calls pipex_ppp_enqueue()
> > > 4. pipex_ppp_enqueue() calls schednetisr() which is task_add()
> > >
> > > `pipexinq' cases:
> > >
> > > 1.1. ether_input() calls pipex_pppoe_input()
> > > 1.2. gre_input() calls gre_input_1()
> > >          gre_input_1() calls pipex_pptp_input()
> > > 1.3. udp_input() calls pipex_l2tp_input()
> > >
> > > 2. pipex_{pppoe,pptp,l2tp}_input() calls pipex_common_input()
> > > 3. pipex_common_input() calls schednetisr() which is task_add()
> > >
> > > task_add(9) just schedules the execution of the work specified by `tq'.
> > > So we can do pipex_destroy_session() * between * schednetisr() and
> > > pipexintr(). And we can do this right * now *, with our current locking.
> > > And this is the problem I'am trying to solve.
> > >
> > > My apologies if I'am wrong above. Please point me where I'am wrong.
> > >
> > > Also before pipex_{pppoe,pptp,l2tp}_input() we call corresponding
> > > pipex_{pptp,l2tp}_lookup_session() to obtain pointer to pipex(4)
> > > session. We should be shure `session' is still walid between
> > > pipex_*_lookup() and pipex_*_input(). It's not required now, but will be
> > > required in future.
> >
> > Why not iterate over the queues and garbage collect the sessions that
> > are about to be removed?  That's what the network stack was doing with
> > mbuf queues prior to if_get(9) when interfaces where destroyed.
> >
>
> Do you mean net/if.c:1185 and below? This is the queues associated with
> this `ifp'. But for pipex(4) we should go through all mbufs associated
> with pipex(4). This can be heavy if we have hundreds of sessions. Also
> this would work until session destruction and `pr_input' are serialized.
>
> Point me please the line in source to see if I'am wrong about `ifnet's
> mbuf queues claninig.

Look at r1.329 of net/if.c.  Prior to this change if_detach_queues() was
used to free all mbufs when an interface was removed.  Now lazy freeing
is used everytime if_get(9) rerturns NULL.

This is possible because we store an index and not a pointer directly in
the mbuf.

The advantage of storing a session pointer in `ph_cookie' is that no
lookup is required in pipexintr(), right?  Maybe we could save a ID
instead and do a lookup.  How big can be the `pipex_session_list'?

Reply | Threaded
Open this post in threaded view
|

Re: pipex(4): use reference counters for `ifnet'

Vitaliy Makkoveev
On Sat, Jun 27, 2020 at 12:41:29PM +0200, Martin Pieuchot wrote:

> On 27/06/20(Sat) 01:02, Vitaliy Makkoveev wrote:
> > On Fri, Jun 26, 2020 at 09:15:38PM +0200, Martin Pieuchot wrote:
> > > On 26/06/20(Fri) 17:53, Vitaliy Makkoveev wrote:
> > > > On Fri, Jun 26, 2020 at 02:29:03PM +0200, Martin Pieuchot wrote:
> > > > > On 26/06/20(Fri) 12:35, Vitaliy Makkoveev wrote:
> > > > > > On Fri, Jun 26, 2020 at 10:23:42AM +0200, Martin Pieuchot wrote:
> > > > > > > On 25/06/20(Thu) 19:56, Vitaliy Makkoveev wrote:
> > > > > > > > Updated diff.
> > > > > > > >
> > > > > > > > OpenBSD uses 16 bit counter for allocate interface indexes. So we can't
> > > > > > > > store index in session and be sure if_get(9) returned `ifnet' is our
> > > > > > > > original `ifnet'.
> > > > > > >
> > > > > > > Why not?  The point of if_get(9) is to be sure.  If that doesn't work
> > > > > > > for whatever reason then the if_get(9) interface has to be fixed.  Which
> > > > > > > case doesn't work for you?  Do you have a reproducer?  
> > > > > > >
> > > > > > > How does sessions stay around if their corresponding interface is
> > > > > > > destroyed?
> > > > > >
> > > > > > We have `pipexinq' and `pipexoutq' which can store pointers to session.
> > > > > > pipexintr() process these queues. pipexintr() and
> > > > > > pipex_destroy_session() are *always* different context. This mean we
> > > > > > *can't* free pipex(4) session without be sure there is no reference to
> > > > > > this session in `pipexinq' or `pipexoutq'. Elsewhere this will cause use
> > > > > > afret free issue. Look please at net/pipex.c:846. The way pppx(4)
> > > > > > destroy sessions is wrong. While pppac(4) destroys sessions by
> > > > > > pipex_iface_fini() it's also wrong. Because we don't check `pipexinq'
> > > > > > and `pipexoutq' state. I'am said it again and again.
> > > > >
> > > > > I understand.  Why is it a problem?  Using reference counting the way
> > > > > you're suggesting is *one* possible solution to a problem we don't fully
> > > > > understand.  What are we trying to achieve?  Which problem are we trying
> > > > > to solve?
> > > >
> > > > Sorry, may be I misunderstand something.
> > > >
> > > > `pipexoutq' case:
> > > >
> > > > 1. pppac_start() calls pipex_output()
> > > > 2. pipex_output() calls pipex_ip_output()
> > > > 3. pipex_ip_output() calls pipex_ppp_enqueue()
> > > > 4. pipex_ppp_enqueue() calls schednetisr() which is task_add()
> > > >
> > > > `pipexinq' cases:
> > > >
> > > > 1.1. ether_input() calls pipex_pppoe_input()
> > > > 1.2. gre_input() calls gre_input_1()
> > > >          gre_input_1() calls pipex_pptp_input()
> > > > 1.3. udp_input() calls pipex_l2tp_input()
> > > >
> > > > 2. pipex_{pppoe,pptp,l2tp}_input() calls pipex_common_input()
> > > > 3. pipex_common_input() calls schednetisr() which is task_add()
> > > >
> > > > task_add(9) just schedules the execution of the work specified by `tq'.
> > > > So we can do pipex_destroy_session() * between * schednetisr() and
> > > > pipexintr(). And we can do this right * now *, with our current locking.
> > > > And this is the problem I'am trying to solve.
> > > >
> > > > My apologies if I'am wrong above. Please point me where I'am wrong.
> > > >
> > > > Also before pipex_{pppoe,pptp,l2tp}_input() we call corresponding
> > > > pipex_{pptp,l2tp}_lookup_session() to obtain pointer to pipex(4)
> > > > session. We should be shure `session' is still walid between
> > > > pipex_*_lookup() and pipex_*_input(). It's not required now, but will be
> > > > required in future.
> > >
> > > Why not iterate over the queues and garbage collect the sessions that
> > > are about to be removed?  That's what the network stack was doing with
> > > mbuf queues prior to if_get(9) when interfaces where destroyed.
> > >
> >
> > Do you mean net/if.c:1185 and below? This is the queues associated with
> > this `ifp'. But for pipex(4) we should go through all mbufs associated
> > with pipex(4). This can be heavy if we have hundreds of sessions. Also
> > this would work until session destruction and `pr_input' are serialized.
> >
> > Point me please the line in source to see if I'am wrong about `ifnet's
> > mbuf queues claninig.
>
> Look at r1.329 of net/if.c.  Prior to this change if_detach_queues() was
> used to free all mbufs when an interface was removed.  Now lazy freeing
> is used everytime if_get(9) rerturns NULL.
>
> This is possible because we store an index and not a pointer directly in
> the mbuf.
>
> The advantage of storing a session pointer in `ph_cookie' is that no
> lookup is required in pipexintr(), right?  Maybe we could save a ID
> instead and do a lookup.  How big can be the `pipex_session_list'?
>

It's unlimited. In pppac(4) case you create the only one interface and
you can share it between the count of sessions you wish. In my practice
I had machines with 800+ active ppp interfaces in 2005. We can have
dosens of cores and hundreds gigs of ram now. How big can be real
count of active ppp interfaces on VPN provider's NAS?

I looked at r1.328 of net/if.c.
Imagine we have a lot of connected clients with high traffic. One on them
starts connect/disconnect games. It's not malicious hacker, just mobile
phone in area with low signal. You need:

1. block pipexintr() (and netisr too).
2. block insertion to queues from concurrent threads.
3. Walk through very loaded queues and compare. And most or may be all
   packets are foreign.
4. Repeat it every time you lost connection.

Yes, now it's all serealized. And pipexintr() is already blocked while
we do session destruction(). But what is the reason to make your future
life harder? pipex(4) session already has pointer to it's relared
`ifnet'. `ifnet' already has reference counters. Why don't use them? The
cases you suggested are serious performance impact in high load.

In the way I suggest to use refcounters for pipex(4) session you don't
need to block your packet processing. You don't need to do searchs. You
just need to be sure the `ifnet' you has is still alive. You already has
mechanics for that. Why don't use this? You don't like if_ref() to be
used outside net/if.c? Ok, what's wrong with referenced pointers
obtained by if_get(9)? While session is alive it *uses* it's `ifnet'. It
uses `ifnet' all lifetime *not* only while output. And we should be
shure we didn't destroy `ifnet' while someone uses it. What's wrong with
referenced pointers? The way I wish to go used in file(9). May be it is
totally wrong and we need global in-kernel descriptor table where `fp'
will be referenced by index too :) ?

You said you like to make kernel more MP capable. It's great but before
releasing the locks we need to make straigt logic to be sure the object
we obtained is the object we requested. And this is the problem not only
for pipex(4) but for net/if.c too [1]. The problems are fully idntical.

Also while I was student we had a little internet provider. It was in
2005-2008. And freebsd with mpd was the easiest way to start. So I had a
lot of problems and receive a lot of experience in this area.

1. https://marc.info/?t=159289590100001&r=1&w=2

Reply | Threaded
Open this post in threaded view
|

Re: pipex(4): use reference counters for `ifnet'

Martin Pieuchot
On 27/06/20(Sat) 17:58, Vitaliy Makkoveev wrote:

> > [...]
> > Look at r1.329 of net/if.c.  Prior to this change if_detach_queues() was
> > used to free all mbufs when an interface was removed.  Now lazy freeing
> > is used everytime if_get(9) rerturns NULL.
> >
> > This is possible because we store an index and not a pointer directly in
> > the mbuf.
> >
> > The advantage of storing a session pointer in `ph_cookie' is that no
> > lookup is required in pipexintr(), right?  Maybe we could save a ID
> > instead and do a lookup.  How big can be the `pipex_session_list'?
> >
>
> It's unlimited. In pppac(4) case you create the only one interface and
> you can share it between the count of sessions you wish. In my practice
> I had machines with 800+ active ppp interfaces in 2005. We can have
> dosens of cores and hundreds gigs of ram now. How big can be real
> count of active ppp interfaces on VPN provider's NAS?

With that number of items a linear list might not be the best fit if we
decide to stop using pointers.  So if we want to use a "lookup" a
different data structure might be more appropriate.

> I looked at r1.328 of net/if.c.
> Imagine we have a lot of connected clients with high traffic. One on them
> starts connect/disconnect games. It's not malicious hacker, just mobile
> phone in area with low signal. You need:
>
> 1. block pipexintr() (and netisr too).
> 2. block insertion to queues from concurrent threads.
> 3. Walk through very loaded queues and compare. And most or may be all
>    packets are foreign.
> 4. Repeat it every time you lost connection.
>
> Yes, now it's all serealized. And pipexintr() is already blocked while
> we do session destruction(). But what is the reason to make your future
> life harder? pipex(4) session already has pointer to it's relared
> `ifnet'. `ifnet' already has reference counters. Why don't use them?

We decided as a team to not use reference counting because they are
hard to debug.  Using if_get(9)/if_put(9) allows us to check that our
changes where correct with static analysis tools.

If we start using reference counting for ifps in pipex(4) they we now
have an exception in the network stack.  That means the techniques
applied are not coherent and it makes it harder to work across a huge
amount of code.

But I don't care, if there's a consensus that it is the way to go, then
go for it.

> In the way I suggest to use refcounters for pipex(4) session you don't
> need to block your packet processing. You don't need to do searchs. You
> just need to be sure the `ifnet' you has is still alive. You already has
> mechanics for that. Why don't use this?

The same can be said with the actual machinery.  Using a if_get(9)-like
approach doesn't block packet processing, you don't need to to search,
there's no reference counting bug that can lead to deadlock, it is
consistent with the existing network stack and there are already example
of implementation like: if_get() and rtable_get().

>                                         You don't like if_ref() to be
> used outside net/if.c? Ok, what's wrong with referenced pointers
> obtained by if_get(9)? While session is alive it *uses* it's `ifnet'. It
> uses `ifnet' all lifetime *not* only while output. And we should be
> shure we didn't destroy `ifnet' while someone uses it. What's wrong with
> referenced pointers?

If there's a bug, if the reference counting is messed up, it is hard to
find where that happened.  It is like a use-after-free, where the crash
happens is not where the bug is.  If you see a leak you don't know where
the reference drop is missing.

Sure they are many ways to deal with that, but since we did not embrace
them. I'm not sure that it makes sense to change direction or introduce
a difference now.

>                      The way I wish to go used in file(9). May be it is
> totally wrong and we need global in-kernel descriptor table where `fp'
> will be referenced by index too :) ?

That's not what I'm saying.  if_get(9) already exists and is already
used in a certain way, I don't see why not embrace that and keep the
network stack coherent.

But once again, I don't want to block anyone in its development, if you
and others agree that's the way to go, then let's go this way.