mountd is immortal

mountd is immortal

David Coppa
It doesn't want to die...

Do you see this problem or is it just mine?
Any clue?

# /etc/rc.d/mountd start
mountd(ok)
# ps akuwwx|grep -v grep|grep mountd
root     15715  0.0  0.0   616   368 ??  Ss     9:39AM    0:00.00 /sbin/mountd
# /etc/rc.d/nfsd start
nfsd(ok)
# ps akuwwx|grep -v grep|grep nfs
root     11051  0.0  0.0   120   156 ??  S      9:39AM    0:00.00 nfsd: server (nfsd)
root      9270  0.0  0.0   120   156 ??  S      9:39AM    0:00.00 nfsd: server (nfsd)
root     10254  0.0  0.0   120   156 ??  S      9:39AM    0:00.00 nfsd: server (nfsd)
root     20969  0.0  0.0   156   228 ??  Ss     9:39AM    0:00.00 /sbin/nfsd -tun 4
root     17010  0.0  0.0   120   156 ??  S      9:39AM    0:00.00 nfsd: server (nfsd)
# /etc/rc.d/nfsd stop
nfsd(ok)
# ps akuwwx|grep -v grep|grep nfs
# time /etc/rc.d/mountd stop
mountd(failed)
    0m30.32s real     0m0.00s user     0m0.01s system
# ps akuwwx|grep -v grep|grep mountd
root     15715  0.0  0.0   616   388 ??  Ss     9:39AM    0:00.00 /sbin/mountd
# pkill -f /sbin/mountd
# ps akuwwx|grep -v grep|grep mountd
root     15715  0.0  0.0   616   388 ??  Ss     9:39AM    0:00.00 /sbin/mountd
# pkill -9 -f /sbin/mountd
# ps akuwwx|grep -v grep|grep mountd
#

Re: mountd is immortal

Robert Nagy
It seems that SIGTERM is not enough for mountd: according to the code,
SIGTERM only sends an RPCMNT_UMNTALL broadcast to the clients.
So I think what we should do in this case is first send a SIGTERM to mountd,
and then SIGKILL it in rc_stop().

On (2011-07-28 09:46), David Coppa wrote:

> It doesn't want to die...
>
> Do you see this problem or is it just mine?
> Any clue?
>
> # /etc/rc.d/mountd start
> mountd(ok)
> # ps akuwwx|grep -v grep|grep mountd
> root     15715  0.0  0.0   616   368 ??  Ss     9:39AM    0:00.00 /sbin/mountd
> # /etc/rc.d/nfsd start
> nfsd(ok)
> # ps akuwwx|grep -v grep|grep nfs
> root     11051  0.0  0.0   120   156 ??  S      9:39AM    0:00.00 nfsd: server (nfsd)
> root      9270  0.0  0.0   120   156 ??  S      9:39AM    0:00.00 nfsd: server (nfsd)
> root     10254  0.0  0.0   120   156 ??  S      9:39AM    0:00.00 nfsd: server (nfsd)
> root     20969  0.0  0.0   156   228 ??  Ss     9:39AM    0:00.00 /sbin/nfsd -tun 4
> root     17010  0.0  0.0   120   156 ??  S      9:39AM    0:00.00 nfsd: server (nfsd)
> # /etc/rc.d/nfsd stop
> nfsd(ok)
> # ps akuwwx|grep -v grep|grep nfs
> # time /etc/rc.d/mountd stop
> mountd(failed)
>     0m30.32s real     0m0.00s user     0m0.01s system
> # ps akuwwx|grep -v grep|grep mountd
> root     15715  0.0  0.0   616   388 ??  Ss     9:39AM    0:00.00 /sbin/mountd
> # pkill -f /sbin/mountd
> # ps akuwwx|grep -v grep|grep mountd
> root     15715  0.0  0.0   616   388 ??  Ss     9:39AM    0:00.00 /sbin/mountd
> # pkill -9 -f /sbin/mountd
> # ps akuwwx|grep -v grep|grep mountd
> #

Re: mountd is immortal

David Coppa
In reply to this post by David Coppa
On Thu, Jul 28, 2011 at 9:46 AM, David Coppa <[hidden email]> wrote:
> It doesn't want to die...

This is because mountd traps the SIGTERM signal and uses it to send an
RPCMNT_UMNTALL broadcast to the clients.

> Do you see this problem or is it just mine?
> Any clue?
>
> # /etc/rc.d/mountd start
> mountd(ok)
> # ps akuwwx|grep -v grep|grep mountd
> root     15715  0.0  0.0   616   368 ??  Ss     9:39AM    0:00.00 /sbin/mountd
> # /etc/rc.d/nfsd start
> nfsd(ok)
> # ps akuwwx|grep -v grep|grep nfs
> root     11051  0.0  0.0   120   156 ??  S      9:39AM    0:00.00 nfsd: server (nfsd)
> root      9270  0.0  0.0   120   156 ??  S      9:39AM    0:00.00 nfsd: server (nfsd)
> root     10254  0.0  0.0   120   156 ??  S      9:39AM    0:00.00 nfsd: server (nfsd)
> root     20969  0.0  0.0   156   228 ??  Ss     9:39AM    0:00.00 /sbin/nfsd -tun 4
> root     17010  0.0  0.0   120   156 ??  S      9:39AM    0:00.00 nfsd: server (nfsd)
> # /etc/rc.d/nfsd stop
> nfsd(ok)
> # ps akuwwx|grep -v grep|grep nfs
> # time /etc/rc.d/mountd stop
> mountd(failed)
>     0m30.32s real     0m0.00s user     0m0.01s system
> # ps akuwwx|grep -v grep|grep mountd
> root     15715  0.0  0.0   616   388 ??  Ss     9:39AM    0:00.00 /sbin/mountd
> # pkill -f /sbin/mountd
> # ps akuwwx|grep -v grep|grep mountd
> root     15715  0.0  0.0   616   388 ??  Ss     9:39AM    0:00.00 /sbin/mountd
> # pkill -9 -f /sbin/mountd
> # ps akuwwx|grep -v grep|grep mountd
> #

Re: mountd is immortal

David Coppa
In reply to this post by Robert Nagy
On Thu, 28 Jul 2011, Robert Nagy wrote:

> It seems that SIGTERM is not enough for mountd, according to the code
> SIGTERM only sends a RPCMNT_UMNTALL broadcast to the clients.
> So I think what we should do in this case is to first send a SIGTERM to mountd,
> and then SIGKILL it in rc_stop().

Something like this? The sleep is just paranoia; I don't know if it's useful...

Index: mountd
===================================================================
RCS file: /cvs/src/etc/rc.d/mountd,v
retrieving revision 1.1
diff -u -p -r1.1 mountd
--- mountd 8 Jul 2011 00:54:04 -0000 1.1
+++ mountd 28 Jul 2011 08:15:37 -0000
@@ -6,4 +6,10 @@ daemon="/sbin/mountd"
 
 . /etc/rc.d/rc.subr
 
+rc_stop() {
+ pkill -f "^${pexp}"
+ sleep 1
+ pkill -9 -f "^${pexp}"
+}
+
 rc_cmd $1

Re: mountd is immortal

Antoine Jacoutot-7
On Thu, 28 Jul 2011, David Coppa wrote:

> On Thu, 28 Jul 2011, Robert Nagy wrote:
>
> > It seems that SIGTERM is not enough for mountd, according to the code
> > SIGTERM only sends a RPCMNT_UMNTALL broadcast to the clients.
> > So I think what we should do in this case is to first send a SIGTERM to mountd,
> > and then SIGKILL it in rc_stop().
>
> Something like this? the sleep is just paranoia, don't know if it's useful...

Why not use rc_post for SIGKILL?

> Index: mountd
> ===================================================================
> RCS file: /cvs/src/etc/rc.d/mountd,v
> retrieving revision 1.1
> diff -u -p -r1.1 mountd
> --- mountd 8 Jul 2011 00:54:04 -0000 1.1
> +++ mountd 28 Jul 2011 08:15:37 -0000
> @@ -6,4 +6,10 @@ daemon="/sbin/mountd"
>  
>  . /etc/rc.d/rc.subr
>  
> +rc_stop() {
> + pkill -f "^${pexp}"
> + sleep 1
> + pkill -9 -f "^${pexp}"
> +}
> +
>  rc_cmd $1
>
>

--
Antoine

Re: mountd is immortal

David Coppa
On Thu, Jul 28, 2011 at 10:22 AM, Antoine Jacoutot
<[hidden email]> wrote:

> On Thu, 28 Jul 2011, David Coppa wrote:
>
>> On Thu, 28 Jul 2011, Robert Nagy wrote:
>>
>> > It seems that SIGTERM is not enough for mountd, according to the code
>> > SIGTERM only sends a RPCMNT_UMNTALL broadcast to the clients.
>> > So I think what we should do in this case is to first send a SIGTERM to mountd,
>> > and then SIGKILL it in rc_stop().
>>
>> Something like this? the sleep is just paranoia, don't know if it's useful...
>
> Why not use rc_post for SIGKILL?

It doesn't work.

Re: mountd is immortal

Robert Nagy
In reply to this post by Antoine Jacoutot-7
On (2011-07-28 10:22), Antoine Jacoutot wrote:

> On Thu, 28 Jul 2011, David Coppa wrote:
>
> > On Thu, 28 Jul 2011, Robert Nagy wrote:
> >
> > > It seems that SIGTERM is not enough for mountd, according to the code
> > > SIGTERM only sends a RPCMNT_UMNTALL broadcast to the clients.
> > > So I think what we should do in this case is to first send a SIGTERM to mountd,
> > > and then SIGKILL it in rc_stop().
> >
> > Something like this? the sleep is just paranoia, don't know if it's useful...
>
> Why not use rc_post for SIGKILL?

Because rc_post() does not get called if rc_stop() fails, and it does
fail because mountd does not die after the SIGTERM.

Re: mountd is immortal

Ingo Schwarze
In reply to this post by Antoine Jacoutot-7
Hi Antoine,

Antoine Jacoutot wrote on Thu, Jul 28, 2011 at 10:22:56AM +0200:
> On Thu, 28 Jul 2011, David Coppa wrote:
>> On Thu, 28 Jul 2011, Robert Nagy wrote:
 
>>> It seems that SIGTERM is not enough for mountd, according to the code
>>> SIGTERM only sends a RPCMNT_UMNTALL broadcast to the clients.
>>> So I think what we should do in this case is to first send a SIGTERM
>>> to mountd, and then SIGKILL it in rc_stop().

>> Something like this? the sleep is just paranoia, don't know
>> if it's useful...

> Why not use rc_post for SIGKILL?

Because

  rc_do rc_wait stop || rc_exit failed

is called before rc_post.

When the daemon refuses to die, the post-mortem action will not
even be attempted.
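
That control flow can be illustrated with a minimal sketch. The names
below (daemon_gone, daemon_stuck, post_action, run_stop) are hypothetical
stand-ins, not the actual rc.subr code; only the "cmd || fail" shape is
taken from the line quoted above.

```sh
# Hypothetical stand-ins for "did the daemon go away?" checks.
daemon_gone()  { return 0; }	# daemon exited after the stop signal
daemon_stuck() { return 1; }	# daemon is still running

# Hypothetical stand-in for the post-stop hook.
post_action()  { echo "post"; }

# run_stop CHECK: same shape as "rc_do rc_wait stop || rc_exit failed".
# If CHECK fails, we bail out before the post hook is reached.
run_stop() {
	"$1" || { echo "failed"; return 1; }
	post_action
	echo "ok"
}

run_stop daemon_stuck || true	# prints "failed"; post_action never runs
run_stop daemon_gone		# prints "post", then "ok"
```

So anything that must happen when the daemon refuses to die has to sit
before the failing check, not in the post hook.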

>> Index: mountd
>> ===================================================================
>> RCS file: /cvs/src/etc/rc.d/mountd,v
>> retrieving revision 1.1
>> diff -u -p -r1.1 mountd
>> --- mountd 8 Jul 2011 00:54:04 -0000 1.1
>> +++ mountd 28 Jul 2011 08:15:37 -0000
>> @@ -6,4 +6,10 @@ daemon="/sbin/mountd"
>>  
>>  . /etc/rc.d/rc.subr
>>  
>> +rc_stop() {
>> + pkill -f "^${pexp}"
>> + sleep 1
>> + pkill -9 -f "^${pexp}"
>> +}
>> +
>>  rc_cmd $1

I worry more that fixed-time sleeps often prove too short,
not so much that it might be useless, but I don't see a better
option right now.  Sorry, I can't test or look in more detail
right now.

Yours,
  Ingo
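
One possible alternative to a fixed sleep, sketched under assumptions:
poll for the process and escalate to SIGKILL only after a bounded wait.
stop_with_timeout and its 30-second default are made up for illustration
and are not part of rc.subr; a real rc script would also have to look up
the PID itself (for example with pgrep).

```sh
# stop_with_timeout PID [TIMEOUT_SECONDS]
# Send SIGTERM, poll once per second until the process is gone, and
# send SIGKILL only if it is still there when the timeout runs out.
stop_with_timeout() {
	_pid=$1
	_timeout=${2:-30}	# arbitrary default, an assumption
	_i=0

	# If the signal cannot be delivered, the process is already gone.
	kill -TERM "$_pid" 2>/dev/null || return 0

	while [ "$_i" -lt "$_timeout" ]; do
		# kill -0 delivers nothing; it only tests for existence.
		kill -0 "$_pid" 2>/dev/null || return 0
		sleep 1
		_i=$((_i + 1))
	done

	# Still alive after the grace period: force it.
	kill -KILL "$_pid" 2>/dev/null || true
	return 0
}
```

This returns as soon as the daemon exits, so a generous timeout costs
nothing in the common case. It does not settle whether killing mountd
in the middle of its broadcast is acceptable.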

Re: mountd is immortal

Antoine Jacoutot-7
In reply to this post by Robert Nagy
On Thu, 28 Jul 2011, Robert Nagy wrote:

> Because rc_post() does not get called if rc_stop() fails and it does
> because mountd does not die after the SIGTERM.

Ah that's right, I forgot about that check.

--
Antoine

Re: mountd is immortal

Robert Nagy
In reply to this post by Ingo Schwarze
On (2011-07-28 10:30), Ingo Schwarze wrote:

> Hi Antoine,
>
> Antoine Jacoutot wrote on Thu, Jul 28, 2011 at 10:22:56AM +0200:
> > On Thu, 28 Jul 2011, David Coppa wrote:
> >> On Thu, 28 Jul 2011, Robert Nagy wrote:
>  
> >>> It seems that SIGTERM is not enough for mountd, according to the code
> >>> SIGTERM only sends a RPCMNT_UMNTALL broadcast to the clients.
> >>> So I think what we should do in this case is to first send a SIGTERM
> >>> to mountd, and then SIGKILL it in rc_stop().
>
> >> Something like this? the sleep is just paranoia, don't know
> >> if it's useful...
>
> > Why not use rc_post for SIGKILL?
>
> Because
>
>   rc_do rc_wait stop || rc_exit failed
>
> is called before rc_post.
>
> When the daemon refuses to die, the post-mortem action will not
> even be attempted.
>
> >> Index: mountd
> >> ===================================================================
> >> RCS file: /cvs/src/etc/rc.d/mountd,v
> >> retrieving revision 1.1
> >> diff -u -p -r1.1 mountd
> >> --- mountd 8 Jul 2011 00:54:04 -0000 1.1
> >> +++ mountd 28 Jul 2011 08:15:37 -0000
> >> @@ -6,4 +6,10 @@ daemon="/sbin/mountd"
> >>  
> >>  . /etc/rc.d/rc.subr
> >>  
> >> +rc_stop() {
> >> + pkill -f "^${pexp}"
> >> + sleep 1
> >> + pkill -9 -f "^${pexp}"
> >> +}
> >> +
> >>  rc_cmd $1
>
> I worry more that fixed-time sleeps often prove to short,
> not so much that it might be useless, but i don't see a better
> option right now.  Sorry, I can't test or look in more detail
> right now.
>
> Yours,
>   Ingo
>

I am not sure whether we want to sleep or not. Theo, what do you think?

Re: mountd is immortal

Mark Kettenis
In reply to this post by David Coppa
> Date: Thu, 28 Jul 2011 10:16:00 +0200
> From: David Coppa <[hidden email]>
>
> On Thu, 28 Jul 2011, Robert Nagy wrote:
>
> > It seems that SIGTERM is not enough for mountd, according to the code
> > SIGTERM only sends a RPCMNT_UMNTALL broadcast to the clients.
> > So I think what we should do in this case is to first send a SIGTERM to mountd,
> > and then SIGKILL it in rc_stop().
>
> Something like this? the sleep is just paranoia, don't know if it's useful...

Well, that sleep makes some sense at least; you want to give the
daemon some time to clean up.  The question is whether a single second
is enough for that...

> Index: mountd
> ===================================================================
> RCS file: /cvs/src/etc/rc.d/mountd,v
> retrieving revision 1.1
> diff -u -p -r1.1 mountd
> --- mountd 8 Jul 2011 00:54:04 -0000 1.1
> +++ mountd 28 Jul 2011 08:15:37 -0000
> @@ -6,4 +6,10 @@ daemon="/sbin/mountd"
>  
>  . /etc/rc.d/rc.subr
>  
> +rc_stop() {
> + pkill -f "^${pexp}"
> + sleep 1
> + pkill -9 -f "^${pexp}"
> +}
> +
>  rc_cmd $1

Re: mountd is immortal

Gregory Edigarov-2
In reply to this post by Robert Nagy
On Thu, 28 Jul 2011 10:33:21 +0200
Robert Nagy <[hidden email]> wrote:

> On (2011-07-28 10:30), Ingo Schwarze wrote:
> > Hi Antoine,
> >
> > Antoine Jacoutot wrote on Thu, Jul 28, 2011 at 10:22:56AM +0200:
> > > On Thu, 28 Jul 2011, David Coppa wrote:
> > >> On Thu, 28 Jul 2011, Robert Nagy wrote:
> >  
> > >>> It seems that SIGTERM is not enough for mountd, according to
> > >>> the code SIGTERM only sends a RPCMNT_UMNTALL broadcast to the
> > >>> clients. So I think what we should do in this case is to first
> > >>> send a SIGTERM to mountd, and then SIGKILL it in rc_stop().
> >
> > >> Something like this? the sleep is just paranoia, don't know
> > >> if it's useful...
> >
> > > Why not use rc_post for SIGKILL?
> >
> > Because
> >
> >   rc_do rc_wait stop || rc_exit failed
> >
> > is called before rc_post.
> >
> > When the daemon refuses to die, the post-mortem action will not
> > even be attempted.
> >
> > >> Index: mountd
> > >> ===================================================================
> > >> RCS file: /cvs/src/etc/rc.d/mountd,v
> > >> retrieving revision 1.1
> > >> diff -u -p -r1.1 mountd
> > >> --- mountd 8 Jul 2011 00:54:04 -0000 1.1
> > >> +++ mountd 28 Jul 2011 08:15:37 -0000
> > >> @@ -6,4 +6,10 @@ daemon="/sbin/mountd"
> > >>  
> > >>  . /etc/rc.d/rc.subr
> > >>  
> > >> +rc_stop() {
> > >> + pkill -f "^${pexp}"
> > >> + sleep 1
> > >> + pkill -9 -f "^${pexp}"
> > >> +}
> > >> +
> > >>  rc_cmd $1
> >
> > I worry more that fixed-time sleeps often prove to short,
> > not so much that it might be useless, but i don't see a better
> > option right now.  Sorry, I can't test or look in more detail
> > right now.
> >
> > Yours,
> >   Ingo
> >
>
> I am not sure that we want to sleep or not. Theo what do you think?
>
I am not Theo, but I think we had better sleep here, just as a
precaution to make sure all clients have unmounted.

Re: mountd is immortal

Robert Nagy
In reply to this post by Mark Kettenis
On (2011-07-28 11:17), Mark Kettenis wrote:

> > Date: Thu, 28 Jul 2011 10:16:00 +0200
> > From: David Coppa <[hidden email]>
> >
> > On Thu, 28 Jul 2011, Robert Nagy wrote:
> >
> > > It seems that SIGTERM is not enough for mountd, according to the code
> > > SIGTERM only sends a RPCMNT_UMNTALL broadcast to the clients.
> > > So I think what we should do in this case is to first send a SIGTERM to mountd,
> > > and then SIGKILL it in rc_stop().
> >
> > Something like this? the sleep is just paranoia, don't know if it's useful...
>
> Well, that sleep makes some sense at least; you want to give the
> daemon some time to clean up.  The question is whether a single second
> is enough for that...

Well, mountd actually dies about 1.5-2 minutes after being sent a SIGTERM...
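
A rough way to measure that delay on a test machine, assuming the PID of
the running mountd is known. measure_exit and the elapsed variable are
hypothetical helper names, and the 180-second cap is an arbitrary choice.

```sh
# measure_exit PID [MAX_SECONDS]
# Send SIGTERM, then count whole seconds until the process disappears
# or MAX_SECONDS is reached. The count is left in $elapsed.
measure_exit() {
	_pid=$1
	_max=${2:-180}		# arbitrary cap, an assumption
	elapsed=0

	kill -TERM "$_pid" 2>/dev/null || return 0	# no such process

	while kill -0 "$_pid" 2>/dev/null && [ "$elapsed" -lt "$_max" ]; do
		sleep 1
		elapsed=$((elapsed + 1))
	done

	if kill -0 "$_pid" 2>/dev/null; then
		echo "still running after ${elapsed}s"
	else
		echo "exited after ${elapsed}s"
	fi
}
```

Something like measure_exit "$(pgrep -x mountd)" would then report how
long the daemon takes to exit on its own, with one-second resolution.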

Re: mountd is immortal

David Coppa
On Thu, 28 Jul 2011, Robert Nagy wrote:

> On (2011-07-28 11:17), Mark Kettenis wrote:
> > > Date: Thu, 28 Jul 2011 10:16:00 +0200
> > > From: David Coppa <[hidden email]>
> > >
> > > On Thu, 28 Jul 2011, Robert Nagy wrote:
> > >
> > > > It seems that SIGTERM is not enough for mountd, according to the code
> > > > SIGTERM only sends a RPCMNT_UMNTALL broadcast to the clients.
> > > > So I think what we should do in this case is to first send a SIGTERM to mountd,
> > > > and then SIGKILL it in rc_stop().
> > >
> > > Something like this? the sleep is just paranoia, don't know if it's useful...
> >
> > Well, that sleep makes some sense at least; you want to give the
> > daemon some time to clean up.  The question is whether a single second
> > is enough for that...
>
> Well mountd actually dies about 1.5-2 minutes after sending it a SIGTERM...

It died one time with:

"Cannot register service: RPC: Timed out"

(this message is from '/sbin/mountd -d')

But it's probably unrelated to SIGTERM: it has now been up for around
fifteen minutes and I've already sent it a dozen kills...

Re: mountd is immortal

Benny Lofgren
In reply to this post by Robert Nagy
On 2011-07-28 11.29, Robert Nagy wrote:

> On (2011-07-28 11:17), Mark Kettenis wrote:
>>> Date: Thu, 28 Jul 2011 10:16:00 +0200
>>> From: David Coppa <[hidden email]>
>>> On Thu, 28 Jul 2011, Robert Nagy wrote:
>>>> It seems that SIGTERM is not enough for mountd, according to the code
>>>> SIGTERM only sends a RPCMNT_UMNTALL broadcast to the clients.
>>>> So I think what we should do in this case is to first send a SIGTERM to mountd,
>>>> and then SIGKILL it in rc_stop().
>>> Something like this? the sleep is just paranoia, don't know if it's useful...
>>
>> Well, that sleep makes some sense at least; you want to give the
>> daemon some time to clean up.  The question is whether a single second
>> is enough for that...
>
> Well mountd actually dies about 1.5-2 minutes after sending it a SIGTERM...

Oh darn, how much I hate RPC...

The thing is, mountd executes this when it receives a SIGTERM:

        if (gotterm) {
                (void) clnt_broadcast(RPCPROG_MNT, RPCMNT_VER1,
                    RPCMNT_UMNTALL, xdr_void, (caddr_t)0, xdr_void,
                    (caddr_t)0, umntall_each);
                exit(0);
        }

Now, clnt_broadcast() sends broadcasts blindly to the local net, and
waits with a rather long, hardcoded timeout for answers that it may
or may not get.

If it gets at least one answer, the umntall_each() function returns 1
which makes sure it doesn't wait for other answers, thus exiting quickly.

If however, and this is perhaps the most common use case, there are no
other mountd listeners on the local net, it waits until its (arbitrarily
chosen, it seems) time is up and then exits, at which time mountd itself
also promptly exits.

I question the need for the clnt_broadcast() call to be there at all. If
my (admittedly cursory) analysis is correct, it only reaches other mountd
daemons in the neighborhood, it causes minute-long exit delays in very
common usage scenarios and mountd strangely makes no other effort to
contact the clients that may actually be associated with it.

To remove the call would certainly make mountd exit promptly, but someone
with more insight into the magic of RPC than me needs to weigh in on
potential regressions first...

In any case, my gut feeling is that to kludgily "solve" the problem with
an arbitrary sleep and then a SIGKILL in the rc script is wrong, wrong...


Regards,
/Benny

--
internetlabbet.se     / work:   +46 8 551 124 80      / "Words must
Benny Lofgren        /  mobile: +46 70 718 11 90     /   be weighed,
                    /   fax:    +46 8 551 124 89    /    not counted."
                   /    email:  benny -at- internetlabbet.se

Re: mountd is immortal

David Coppa
On Fri, Jul 29, 2011 at 12:46 PM, Benny Lofgren <[hidden email]> wrote:

> To remove the call would certainly make mountd exit promptly, but someone
> with more insight into the magic of RPC than me needs to weigh in on
> potential regressions first...

FreeBSD removed it.
NetBSD, on the other hand, still has send_umntall, like OpenBSD.