pppx(4): avoid direct usage of pxi owned session.

classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

pppx(4): avoid direct usage of pxi owned session.

Vitaliy Makkoveev
Each `struct pppx_if' holds it's own `pipex_session' and this session is
used directly within ifnet's related handlers pppx_if_start() and
pppx_if_output().

pppx_if_destroy() at first destroys `pipex_session' and calls
if_deatch() which can cause context switch. Hypothetically,
pppx_if_start() or pppx_if_output() can receive cpu at this moment and
start to operate with already destroyed session.

I guess the order of `pppx_if' destruction in pppx_if_destroy() is
right. If we call if_detach() before `pipex_session' destruction, after
context switch caused by this if_detach() this session can be accessed
from network stack, but it's own `ifnet' is already destoyed by
if_detach().

Also pppx_add_session() has the same problem: newly created
`pipex_session' can start to operate with half initilaized ifnet, so
ifnet should be initialized before session creation.

Ifnet should be already exist outside corresponding `pipex_session'
life time. And within ifnet's handlers we should be sure that session is
valid.

Diff below drops direct access of this `pipex_session' in ifnet's
handlers.

In pppx_if_start() we obtain corresponding session by
pipex_lookup_by_session_id(). If pppx_if_start() was called in half of
pppx_if_destroy(), pipex_lookup_by_session_id() will return NULL.
Context switch can't be occured in pppx_if_start().

In pppx_if_output() the only ppp_id is requred.

In pppx_add_session() if_attach() moved before `pipex_session' can be
accessed by pipex(4) internals.

Also this will be useful for future. We are going to move out pipex(4)
related code from pppx(4) and after this diff it will be possible just
remove all duplicating code from pppx_if_destroy() and
pppx_add_session().

Index: sys/net/if_pppx.c
===================================================================
RCS file: /cvs/src/sys/net/if_pppx.c,v
retrieving revision 1.84
diff -u -p -r1.84 if_pppx.c
--- sys/net/if_pppx.c 18 Apr 2020 04:03:56 -0000 1.84
+++ sys/net/if_pppx.c 14 May 2020 12:47:35 -0000
@@ -157,6 +157,9 @@ struct pppx_if {
  struct pppx_dev *pxi_dev;
  struct pipex_session pxi_session;
  struct pipex_iface_context pxi_ifcontext;
+ int pxi_protocol;
+ int pxi_session_id;
+ int pxi_ppp_id;
 };
 
 static inline int
@@ -841,6 +844,9 @@ pppx_add_session(struct pppx_dev *pxd, s
  pxi->pxi_key.pxik_session_id = req->pr_session_id;
  pxi->pxi_key.pxik_protocol = req->pr_protocol;
  pxi->pxi_dev = pxd;
+ pxi->pxi_protocol = req->pr_protocol;
+ pxi->pxi_session_id = req->pr_session_id;
+ pxi->pxi_ppp_id = req->pr_ppp_id;
 
  /* this is safe without splnet since we're not modifying it */
  if (RBT_FIND(pppx_ifs, &pppx_ifs, pxi) != NULL) {
@@ -867,6 +873,11 @@ pppx_add_session(struct pppx_dev *pxd, s
  ifp->if_softc = pxi;
  /* ifp->if_rdomain = req->pr_rdomain; */
 
+ /* XXXSMP breaks atomicity */
+ NET_UNLOCK();
+ if_attach(ifp);
+ NET_LOCK();
+
  /* hook up pipex context */
  chain = PIPEX_ID_HASHTABLE(session->session_id);
  LIST_INSERT_HEAD(chain, session, id_chain);
@@ -886,11 +897,6 @@ pppx_add_session(struct pppx_dev *pxd, s
  if (LIST_NEXT(session, session_list) == NULL)
  pipex_timer_start();
 
- /* XXXSMP breaks atomicity */
- NET_UNLOCK();
- if_attach(ifp);
- NET_LOCK();
-
  if_addgroup(ifp, "pppx");
  if_alloc_sadl(ifp);
 
@@ -1051,25 +1057,34 @@ void
 pppx_if_start(struct ifnet *ifp)
 {
  struct pppx_if *pxi = (struct pppx_if *)ifp->if_softc;
+ struct pipex_session *session;
  struct mbuf *m;
  int proto;
 
  if (!ISSET(ifp->if_flags, IFF_RUNNING))
  return;
 
+ session = pipex_lookup_by_session_id(pxi->pxi_protocol,
+ pxi->pxi_session_id);
+
  for (;;) {
  IFQ_DEQUEUE(&ifp->if_snd, m);
 
  if (m == NULL)
  break;
 
+ if (session == NULL) {
+ m_freem(m);
+ continue;
+ }
+
  proto = *mtod(m, int *);
  m_adj(m, sizeof(proto));
 
  ifp->if_obytes += m->m_pkthdr.len;
  ifp->if_opackets++;
 
- pipex_ppp_output(m, &pxi->pxi_session, proto);
+ pipex_ppp_output(m, session, proto);
  }
 }
 
@@ -1129,7 +1144,7 @@ pppx_if_output(struct ifnet *ifp, struct
  }
  th = mtod(m, struct pppx_hdr *);
  th->pppx_proto = 0; /* not used */
- th->pppx_id = pxi->pxi_session.ppp_id;
+ th->pppx_id = pxi->pxi_ppp_id;
  rw_enter_read(&pppx_devs_lk);
  error = mq_enqueue(&pxi->pxi_dev->pxd_svcq, m);
  if (error == 0) {

Reply | Threaded
Open this post in threaded view
|

Re: pppx(4): avoid direct usage of pxi owned session.

Martin Pieuchot
On 14/05/20(Thu) 15:53, Vitaliy Makkoveev wrote:
> Each `struct pppx_if' holds it's own `pipex_session' and this session is
> used directly within ifnet's related handlers pppx_if_start() and
> pppx_if_output().

I don't see a problem with keeping a reference on a pipex_session inside
the softc.  Is there one?  If so we might want to get rid of the descriptor
completely and always use a lookup function.

Always doing lookups might adds an overhead, so it should be understood.

> pppx_if_destroy() at first destroys `pipex_session' and calls
> if_deatch() which can cause context switch. Hypothetically,
> pppx_if_start() or pppx_if_output() can receive cpu at this moment and
> start to operate with already destroyed session.

What do you mean with 'destroyed session'?  If if_detach() sleeps the
session still exists, it is just no longer linked.  I don't see what bug
exist in the scenario you're describing.

> I guess the order of `pppx_if' destruction in pppx_if_destroy() is
> right. If we call if_detach() before `pipex_session' destruction, after
> context switch caused by this if_detach() this session can be accessed
> from network stack, but it's own `ifnet' is already destoyed by
> if_detach().

if_detach() doesn't destroy anything.  `ifp' is still valid, just no
longer referenced by the global data structures of the network stack.

Concerning the order of operations I'd suggest being coherent with
other pseudo-devices.  So look at vlan_clone_destroy(),
gre_clone_destroy(), etc for example.

> Also pppx_add_session() has the same problem: newly created
> `pipex_session' can start to operate with half initilaized ifnet, so
> ifnet should be initialized before session creation.

Currently the KERNEL_LOCK() is protecting all of that, but yes getting
the order of operations right helps.

> Ifnet should be already exist outside corresponding `pipex_session'
> life time. And within ifnet's handlers we should be sure that session is
> valid.

If we follow other pseudo-device logic we have indeed first a 'clone'
operation that creates and `ifp' then an ioctl like SIOCSVNETID which
links the descriptor to a global list.

Your diff goes in this direction but misses some of the bits related to
the creation of interface like if_addgroup(), bpfattach(), etc.  I'd
suggest you look at net/if_vlan.c :)

> Diff below drops direct access of this `pipex_session' in ifnet's
> handlers.

I don't see the point of this :/

> In pppx_if_start() we obtain corresponding session by
> pipex_lookup_by_session_id(). If pppx_if_start() was called in half of
> pppx_if_destroy(), pipex_lookup_by_session_id() will return NULL.
> Context switch can't be occured in pppx_if_start().

The IFF_RUNNING flag should be used to prevent this scenario.  I'd
suggest imitating the logic in vlan_clone_destroy() in this case and
bring the interface down as a first step of the destroy process.
 
> In pppx_if_output() the only ppp_id is requred.
>
> In pppx_add_session() if_attach() moved before `pipex_session' can be
> accessed by pipex(4) internals.

As said above, please keep all the `ifp' initialization together.

> Also this will be useful for future. We are going to move out pipex(4)
> related code from pppx(4) and after this diff it will be possible just
> remove all duplicating code from pppx_if_destroy() and
> pppx_add_session().

Please yes, remove the duplication before trying to address concurrency
issues.  It will make review easier :)

> Index: sys/net/if_pppx.c
> ===================================================================
> RCS file: /cvs/src/sys/net/if_pppx.c,v
> retrieving revision 1.84
> diff -u -p -r1.84 if_pppx.c
> --- sys/net/if_pppx.c 18 Apr 2020 04:03:56 -0000 1.84
> +++ sys/net/if_pppx.c 14 May 2020 12:47:35 -0000
> @@ -157,6 +157,9 @@ struct pppx_if {
>   struct pppx_dev *pxi_dev;
>   struct pipex_session pxi_session;
>   struct pipex_iface_context pxi_ifcontext;
> + int pxi_protocol;
> + int pxi_session_id;
> + int pxi_ppp_id;
>  };
>  
>  static inline int
> @@ -841,6 +844,9 @@ pppx_add_session(struct pppx_dev *pxd, s
>   pxi->pxi_key.pxik_session_id = req->pr_session_id;
>   pxi->pxi_key.pxik_protocol = req->pr_protocol;
>   pxi->pxi_dev = pxd;
> + pxi->pxi_protocol = req->pr_protocol;
> + pxi->pxi_session_id = req->pr_session_id;
> + pxi->pxi_ppp_id = req->pr_ppp_id;
>  
>   /* this is safe without splnet since we're not modifying it */
>   if (RBT_FIND(pppx_ifs, &pppx_ifs, pxi) != NULL) {
> @@ -867,6 +873,11 @@ pppx_add_session(struct pppx_dev *pxd, s
>   ifp->if_softc = pxi;
>   /* ifp->if_rdomain = req->pr_rdomain; */
>  
> + /* XXXSMP breaks atomicity */
> + NET_UNLOCK();
> + if_attach(ifp);
> + NET_LOCK();
> +
>   /* hook up pipex context */
>   chain = PIPEX_ID_HASHTABLE(session->session_id);
>   LIST_INSERT_HEAD(chain, session, id_chain);
> @@ -886,11 +897,6 @@ pppx_add_session(struct pppx_dev *pxd, s
>   if (LIST_NEXT(session, session_list) == NULL)
>   pipex_timer_start();
>  
> - /* XXXSMP breaks atomicity */
> - NET_UNLOCK();
> - if_attach(ifp);
> - NET_LOCK();
> -
>   if_addgroup(ifp, "pppx");
>   if_alloc_sadl(ifp);
>  
> @@ -1051,25 +1057,34 @@ void
>  pppx_if_start(struct ifnet *ifp)
>  {
>   struct pppx_if *pxi = (struct pppx_if *)ifp->if_softc;
> + struct pipex_session *session;
>   struct mbuf *m;
>   int proto;
>  
>   if (!ISSET(ifp->if_flags, IFF_RUNNING))
>   return;
>  
> + session = pipex_lookup_by_session_id(pxi->pxi_protocol,
> + pxi->pxi_session_id);
> +
>   for (;;) {
>   IFQ_DEQUEUE(&ifp->if_snd, m);
>  
>   if (m == NULL)
>   break;
>  
> + if (session == NULL) {
> + m_freem(m);
> + continue;
> + }
> +
>   proto = *mtod(m, int *);
>   m_adj(m, sizeof(proto));
>  
>   ifp->if_obytes += m->m_pkthdr.len;
>   ifp->if_opackets++;
>  
> - pipex_ppp_output(m, &pxi->pxi_session, proto);
> + pipex_ppp_output(m, session, proto);
>   }
>  }
>  
> @@ -1129,7 +1144,7 @@ pppx_if_output(struct ifnet *ifp, struct
>   }
>   th = mtod(m, struct pppx_hdr *);
>   th->pppx_proto = 0; /* not used */
> - th->pppx_id = pxi->pxi_session.ppp_id;
> + th->pppx_id = pxi->pxi_ppp_id;
>   rw_enter_read(&pppx_devs_lk);
>   error = mq_enqueue(&pxi->pxi_dev->pxd_svcq, m);
>   if (error == 0) {
>

Reply | Threaded
Open this post in threaded view
|

Re: pppx(4): avoid direct usage of pxi owned session.

Vitaliy Makkoveev
Sorry about delay.

> On 20 May 2020, at 10:54, Martin Pieuchot <[hidden email]> wrote:
>
> On 14/05/20(Thu) 15:53, Vitaliy Makkoveev wrote:
>> Each `struct pppx_if' holds it's own `pipex_session' and this session is
>> used directly within ifnet's related handlers pppx_if_start() and
>> pppx_if_output().
>
> I don't see a problem with keeping a reference on a pipex_session inside
> the softc.  Is there one?  If so we might want to get rid of the descriptor
> completely and always use a lookup function.

This diff was for my private question about if_detach(). I was wrong about
if_detach(), so we can drop this diff :)

Since we are going to move pipex(4) internals back, I tried to go in pppac(4)
way. Pseudo-device logic, you pointed, is very similar.

Also as pipex(4) was designed, session should be destroyed by pipex_ioctl()
command PIPEXDSESSION. Corresponding function pipex_close_session() doesn’t
kill session, but places it to queue, and this session will be deleted later
if `pipexinq’ and `pipexoutq’ are empty. This was done to be sure there is
no mbufs in queues with reference to this session. Real pipex(4) destruction
is done by pipex_destroy_session(). pppac(4) hasn’t references to it’s
sessions, so since I tried pppac(4) way did a step to remove reference in
pppx(4) too.

But in fact, pipex_iface_stop() uses direct pipex_destroy_session(). Also
pppx_if_destroy() uses code copypasted from this function. It’s not clean
for me why this is safe.

Also we have disabled in-kernel timer for pppx(4) which wants to destroy
session within pipex(4).

>
> Always doing lookups might adds an overhead, so it should be understood.

pppac_start() does one search per mbuf within pipex_output(). In my diff
ppppx_if_start() does one search per queue processing. pipex_output()
uses radix tree, pipex_lookup_by_session_id() uses hash.

>
>> pppx_if_destroy() at first destroys `pipex_session' and calls
>> if_deatch() which can cause context switch. Hypothetically,
>> pppx_if_start() or pppx_if_output() can receive cpu at this moment and
>> start to operate with already destroyed session.
>
> What do you mean with 'destroyed session'?  If if_detach() sleeps the
> session still exists, it is just no longer linked.  I don't see what bug
> exist in the scenario you're describing.

In fact this code doesn't destroy session, but since it was copypasted
from pipex_destroy_session() (without pool_put()) the meaning of this code
is session destruction.

May be the first step to deduplicate should be to move this code to new
function pipex_session_unlink() and call it within pipex_destroy_session()
and pppx_if_destroy(). Also we can introduce pipex_session_link() for
pipex session creation.

>
>> I guess the order of `pppx_if' destruction in pppx_if_destroy() is
>> right. If we call if_detach() before `pipex_session' destruction, after
>> context switch caused by this if_detach() this session can be accessed
>> from network stack, but it's own `ifnet' is already destoyed by
>> if_detach().
>
> if_detach() doesn't destroy anything.  `ifp' is still valid, just no
> longer referenced by the global data structures of the network stack.
>
> Concerning the order of operations I'd suggest being coherent with
> other pseudo-devices.  So look at vlan_clone_destroy(),
> gre_clone_destroy(), etc for example.
>
>> Also pppx_add_session() has the same problem: newly created
>> `pipex_session' can start to operate with half initilaized ifnet, so
>> ifnet should be initialized before session creation.
>
> Currently the KERNEL_LOCK() is protecting all of that, but yes getting
> the order of operations right helps.
>
>> Ifnet should be already exist outside corresponding `pipex_session'
>> life time. And within ifnet's handlers we should be sure that session is
>> valid.
>
> If we follow other pseudo-device logic we have indeed first a 'clone'
> operation that creates and `ifp' then an ioctl like SIOCSVNETID which
> links the descriptor to a global list.
>
> Your diff goes in this direction but misses some of the bits related to
> the creation of interface like if_addgroup(), bpfattach(), etc.  I'd
> suggest you look at net/if_vlan.c :)
>
>> Diff below drops direct access of this `pipex_session' in ifnet's
>> handlers.
>
> I don't see the point of this :/
>
>> In pppx_if_start() we obtain corresponding session by
>> pipex_lookup_by_session_id(). If pppx_if_start() was called in half of
>> pppx_if_destroy(), pipex_lookup_by_session_id() will return NULL.
>> Context switch can't be occured in pppx_if_start().
>
> The IFF_RUNNING flag should be used to prevent this scenario.  I'd
> suggest imitating the logic in vlan_clone_destroy() in this case and
> bring the interface down as a first step of the destroy process.

pppx_if_start() already checks this flag, but pppx_if_destroy() doesn’t
clear it. I will make diff for it.

>  
>
>> In pppx_if_output() the only ppp_id is requred.
>>
>> In pppx_add_session() if_attach() moved before `pipex_session' can be
>> accessed by pipex(4) internals.
>
> As said above, please keep all the `ifp' initialization together.
>
>> Also this will be useful for future. We are going to move out pipex(4)
>> related code from pppx(4) and after this diff it will be possible just
>> remove all duplicating code from pppx_if_destroy() and
>> pppx_add_session().
>
> Please yes, remove the duplication before trying to address concurrency
> issues.  It will make review easier :)

We stopped at `pppx_ifs_lk’ removal, so let’s continue but after [1]
finished.

1. https://marc.info/?t=158998611500002&r=1&w=2
Reply | Threaded
Open this post in threaded view
|

Re: pppx(4): avoid direct usage of pxi owned session.

Martin Pieuchot
On 24/05/20(Sun) 22:58, Vitaliy Makkoveev wrote:
> > Please yes, remove the duplication before trying to address concurrency
> > issues.  It will make review easier :)
>
> We stopped at `pppx_ifs_lk’ removal, so let’s continue but after [1]
> finished.
>
> 1. https://marc.info/?t=158998611500002&r=1&w=2

Please, re-send the diff that need to be committed.