long hangs with heavy IO (was: should 'make -j8 build' work?)

long hangs with heavy IO (was: should 'make -j8 build' work?)

Jan Stary
On Feb 08 08:25:49, Stuart Henderson wrote:
> On 2012-02-07, Joe Gidi <[hidden email]> wrote:
> > In every case, when the box hangs, I'm unable to break into ddb.
>
> How long do you leave it when it hangs? There have been occasions
> where a box appears to hang but then recovers.
>
> Are you using softdep?

This interests me. The OP's problem was what appeared
to be a hang during "make -j8 build". I have experienced
something similar while doing heavy IO operations.

If I cvs up many repositories at once, while
dump|restoring a few filesystems at once,
the box *sometimes* seems totally unresponsive,
only to react to my keyboard stroke _a_long_time_
later; sometimes, it is unresponsive locally,
but can be ssh'd to - but the login process doesn't
make it to actually spawning a shell.

Yes, I am using softdep, almost everywhere.

Stuart, could you please elaborate on how this (possibly) happens,
and what is the role of softdep in it?

        Thank you for your time

                Jan


Re: long hangs with heavy IO (was: should 'make -j8 build' work?)

Benny Lofgren
On 2012-02-08 10.34, Jan Stary wrote:

> On Feb 08 08:25:49, Stuart Henderson wrote:
>> On 2012-02-07, Joe Gidi <[hidden email]> wrote:
>>> In every case, when the box hangs, I'm unable to break into ddb.
>>
>> How long do you leave it when it hangs? There have been occasions
>> where a box appears to hang but then recovers.
>>
>> Are you using softdep?
>
> This interests me. The OP's problem was what appeared
> to be a hang during "make -j8 build". I have experienced
> something similar while doing heavy IO operations.
>
> If I cvs up many repositories at once, while
> dump|restoring a few filesystems at once,
> the box *sometimes* seems totally unresponsive,
> only to react to my keyboard stroke _a_long_time_
> later; sometimes, it is unresponsive locally,
> but can be ssh'd to - but the login process doesn't
> make it to actually spawning a shell.
>
> Yes, I am using softdep, almost everywhere.
>
> Stuart, could you please elaborate on how this (possibly) happens,
> and what is the role of softdep in it?

I've seen this too, under similar circumstances.

I haven't investigated it further, but I suspect the following. When a
file system in softdep mode needs to write out a whole bunch of metadata
at once, it does so in one go, without releasing locks and/or enabling
interrupts as it works its way through committing dirty buffers to
disk. I believe that happens at 30s intervals, or sooner if lots of
metadata-altering activity makes the buffers fill up prematurely.

Whatever the underlying cause, it's of course not a desired effect,
since it suspends virtually any other activity in the box, whether
disk related or not, for a loooong time.

A work-around for this is probably to not use softdep, but this moves
the performance penalty elsewhere which may or may not be acceptable
to a specific use case.
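
Before trying that work-around it helps to know which file systems
actually have softdep enabled. What follows is only a minimal sketch: it
assumes mount(8) prints lines of the form
"/dev/sd0a on / type ffs (local, softdep)", and the names softdep_mounts,
MOUNT_LINE and SAMPLE are made up for illustration. The sample text is
invented, not output from any machine in this thread.

```python
import re

# Assumed line format, in the style of OpenBSD mount(8) output:
#   /dev/sd0a on / type ffs (local, softdep)
MOUNT_LINE = re.compile(
    r"^(?P<dev>\S+) on (?P<mnt>\S+) type (?P<fstype>\S+) \((?P<opts>[^)]*)\)"
)


def softdep_mounts(mount_output):
    """Return the mount points whose option list includes 'softdep'."""
    found = []
    for line in mount_output.splitlines():
        m = MOUNT_LINE.match(line.strip())
        if m is None:
            continue  # skip lines that do not look like mount entries
        opts = [o.strip() for o in m.group("opts").split(",")]
        if "softdep" in opts:
            found.append(m.group("mnt"))
    return found


# Hypothetical sample text for illustration only.
SAMPLE = """\
/dev/sd0a on / type ffs (local)
/dev/sd0d on /usr type ffs (local, nodev, softdep)
/dev/sd0e on /home type ffs (local, nodev, nosuid, softdep)
"""

print(softdep_mounts(SAMPLE))
```

On a live system the text could come from
subprocess.run(["mount"], capture_output=True, text=True).stdout
instead of SAMPLE. Removing the softdep option from the matching
/etc/fstab entries and remounting would then be the work-around to test.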

I think in the long run OpenBSD's i/o scheduling and file system
options might need an overhaul, but that's a different discussion.

(For example, I'd love to see Jeff Roberson's and Kirk McKusick's
work on soft update journaling that went into FreeBSD 9 in OpenBSD
as well. Had I the time I'd look into it myself (it's a *lot* of work
from what little I've seen of it, but no doubt it would be FUN work)
but alas I don't at the moment, so all I can do is post this wish. :-)


Regards,
/Benny

--
internetlabbet.se     / work:   +46 8 551 124 80      / "Words must
Benny Lofgren        /  mobile: +46 70 718 11 90     /   be weighed,
                    /   fax:    +46 8 551 124 89    /    not counted."
                   /    email:  benny -at- internetlabbet.se


Re: long hangs with heavy IO (was: should 'make -j8 build' work?)

Chris Cappuccio
Benny Lofgren [[hidden email]] wrote:
>
> (For example, I'd love to see Jeff Roberson's and Kirk McKusick's
> work on soft update journaling that went into FreeBSD 9 in OpenBSD
> as well. Had I the time I'd look into it myself (it's a *lot* of work
> from what little I've seen of it, but no doubt it would be FUN work)
> but alas I don't at the moment, so all I can do is post this wish. :-)
>

It might be FUN to use when it's actually working. But if porting it over and teasing out the bugs was all that much fun, you'd think someone would have reaped those rewards by now. Actually, data loss is really not much fun. Softupdates is one of the worst, because while it has 'worked' for years, it has had major bugs for a long time, complete with hard to reproduce, hard to diagnose problems. If this make -j8 hang is another softdep problem, that's just another testament to how much FUN it really is. If George Soros were to fund OpenBSD development, and all the developers could permanently live in a palace in the Swiss Alps, softupdates work might be considered fun. In that case, OpenBSD would end up with its own custom modern filesystem, written by someone who didn't kill their wife. I think the Soros/Swiss Alps idea is an excellent one.


Re: long hangs with heavy IO (was: should 'make -j8 build' work?)

Benny Lofgren
On 2012-02-09 00.38, Chris Cappuccio wrote:
> Benny Lofgren [[hidden email]] wrote:
>> (For example, I'd love to see Jeff Roberson's and Kirk McKusick's
>> work on soft update journaling that went into FreeBSD 9 in OpenBSD
>> as well. Had I the time I'd look into it myself (it's a *lot* of work
>> from what little I've seen of it, but no doubt it would be FUN work)
>> but alas I don't at the moment, so all I can do is post this wish. :-)
>>
>
> It might be FUN to use when it's actually working. But if porting it over and teasing out the bugs was all that much fun, you'd think someone would have reaped those rewards by now. Actually, data loss is really not much fun. Softupdates is one of the worst, because while it has 'worked' for years, it has had major bugs for a long time, complete with hard to reproduce, hard to diagnose problems. If this make -j8 hang is another softdep problem, that's just another testament to how much FUN it really is. If George Soros were to fund OpenBSD development, and all the developers could permanently live in a palace in the Swiss Alps, softupdates work might be considered fun. In that case, OpenBSD would end up with its own custom modern filesystem, written by someone who didn't kill their wife. I think the Soros/Swiss Alps idea is an excellent one.

Well, according to the OP the make problem turned out to be hardware
related, as you may have seen by now.

You are certainly entitled to your opinion and whatever banter you feel
the need to dish out to get your point across, but I actually DO like to
work on complex file system code (although I've hardly touched any in
a decade or so) so yes, I would consider it a fun and rewarding task.
Really. :-)

I've run very large, very heavily utilized softdep enabled filesystems
for years and years and have never, not once, lost data. That's anecdotal
evidence for sure, and I don't doubt there are or have been nasty bugs in
the code, but in my opinion the current OpenBSD ffs/ffs2 implementation
is nonetheless *very* stable and mature.

My servers very rarely crash - they run OpenBSD after all - but when they
do it's frustrating to wait for hours for the fsck runs to complete (my file
systems are usually rather big), so I'd love to have a journaling or
logging file system with matching stability to choose from in OpenBSD.

But since one hasn't magically materialized yet I've begun to look around
for likely candidates for implementation in OpenBSD, and the most likely
route I've found so far is journaling softdep.

I've never pretended to have the final answer to anything, but if *I*
were to try to implement something I'd probably look to journaling softdep
first, because I think it's got potential and might well be the path of
least resistance to achieving a working port.

Also I'm of course not expecting anyone else to do a single minute's worth
of free work to satisfy MY needs. I tried to word my mail carefully to avoid
people getting that impression, but maybe I failed.


Regards,
/Benny



Re: long hangs with heavy IO (was: should 'make -j8 build' work?)

Juan Francisco Cantero Hurtado
On Thu, 09 Feb 2012 04:59:36 +0100, Benny Lofgren <[hidden email]>  
wrote:

> On 2012-02-09 00.38, Chris Cappuccio wrote:
>> Benny Lofgren [[hidden email]] wrote:
>>> (For example, I'd love to see Jeff Roberson's and Kirk McKusick's
>>> work on soft update journaling that went into FreeBSD 9 in OpenBSD
>>> as well. Had I the time I'd look into it myself (it's a *lot* of work
>>> from what little I've seen of it, but no doubt it would be FUN work)
>>> but alas I don't at the moment, so all I can do is post this wish. :-)
>>>
>>
>> It might be FUN to use when it's actually working. But if porting it  
>> over and teasing out the bugs was all that much fun, you'd think someone
>> would have reaped those rewards by now. Actually, data loss is really  
>> not much fun. Softupdates is one of the worst, because while it has  
>> 'worked' for years, it has had major bugs for a long time, complete  
>> with hard to reproduce, hard to diagnose problems. If this make -j8  
>> hang is another softdep problem, that's just another testament to how  
>> much FUN it really is. If George Soros were to fund OpenBSD  
>> development, and all the developers could permanently live in a palace  
>> in the Swiss Alps, softupdates work might be considered fun. In that  
>> case, OpenBSD would end up with its own custom modern filesystem,  
>> written by someone who didn't kill their wife. I think the
>> Soros/Swiss Alps idea is an excellent one.
>
> Well, according to the OP the make problem turned out to be hardware
> related, as you may have seen by now.
>
> You are certainly entitled to your opinion and whatever banter you feel
> the need to dish out to get your point across, but I actually DO like to
> work on complex file system code (although I've hardly touched any in
> a decade or so) so yes, I would consider it a fun and rewarding task.
> Really. :-)
>
> I've run very large, very heavily utilized softdep enabled filesystems
> for years and years and have never, not once, lost data. That's anecdotal
> evidence for sure, and I don't doubt there are or have been nasty bugs in
> the code, but in my opinion the current OpenBSD ffs/ffs2 implementation
> is nonetheless *very* stable and mature.
>
> My servers very rarely crash - they run OpenBSD after all - but when they
> do it's frustrating to wait for hours for the fsck runs to complete (my file
> systems are usually rather big), so I'd love to have a journaling or
> logging file system with matching stability to choose from in OpenBSD.
>
> But since one hasn't magically materialized yet I've begun to look around
> for likely candidates for implementation in OpenBSD, and the most likely
> route I've found so far is journaling softdep.

Take a look at hammer2
http://www.shiningsilence.com/dbsdlog/2012/02/09/9173.html . This is a
modern FS and the developers want to build a version more portable than
the original hammer.

But you're right, the FS will not be implemented magically and without a
lot of effort.

>
> I've never pretended to have the final answer to anything, but if *I*
> were to try to implement something I'd probably look to journaling softdep
> first, because I think it's got potential and might well be the path of
> least resistance to achieving a working port.
>
> Also I'm of course not expecting anyone else to do a single minute's worth
> of free work to satisfy MY needs. I tried to word my mail carefully to avoid
> people getting that impression, but maybe I failed.
>
>
> Regards,
> /Benny
>


--
Juan Francisco Cantero Hurtado http://juanfra.info


Re: long hangs with heavy IO (was: should 'make -j8 build' work?)

Benny Lofgren
On 2012-02-11 03.00, Juan Francisco Cantero Hurtado wrote:

> On Thu, 09 Feb 2012 04:59:36 +0100, Benny Lofgren <[hidden email]>
> wrote:
>> On 2012-02-09 00.38, Chris Cappuccio wrote:
>>> Benny Lofgren [[hidden email]] wrote:
>> But since one hasn't magically materialized yet I've begun to look around
>> for likely candidates for implementation in OpenBSD, and the most likely
>> route I've found so far is journaling softdep.
>
> Take a look at hammer2
> http://www.shiningsilence.com/dbsdlog/2012/02/09/9173.html . This is a
> modern FS and the developers want to build a version more portable than
> the original hammer.
>
> But you're right, the FS will not be implemented magically and without a
> lot of effort.

Thanks for the tip, but... Nah! I need something mature and proven,
not something where the "ToDo" list reads longer than Tolstoy's "War
and Peace"... Nor the feature list, for that matter. I'm perfectly
content with ordinary ufs/ffs/ffs2 functionality, with the "small"
addition to my wishlist of journaling, to cut down recovery times.

Robust stability really is the key feature of any file system in my
opinion. I'm not much into the Linux mentality of experimentation and
always gravitating towards the coolest, latest greatest new technology
that someone just thought up. I'd rather go with the time-honoured,
trusted and proven.

Not that there's something inherently wrong with technological progress,
far from it! But I personally deal with systems that are in production
24/7 and I can't afford even the remotest possibility of something as
crucial as the file system fucking up on me.

It's not called "bleeding edge" for nothing, you know. :-)


Regards,
/Benny - the trailing edge guy
