Dell 1855 Blade Perc 4\IM (LSI) controller problem

Bob Bostwick (Lists)
I installed 3.8 on an 1855 with no problems about two weeks ago.  All my
apps worked, and I had no problems until I rebooted the box (not the first
reboot, and not because of an issue of any kind).  Upon reboot, it fails
to boot right away, showing the "now trying bsd.old....etc" messages
before finally booting to bsd.  Towards the end of the boot process the
following message appears.

sd0(mpt0:0:0): mpt0: timeout request index = 0xfe, seq = 0x000000ce
mpt0: Status 0x80000000, Mask 0x00000001, Doorbell 0x24000000
mpt0: request state: On Chip
panic: cannot read disk label, 0x400/0xd00, error 5
Stopped at      Debugger+0x4:     leave

Then the message to run ps before reporting this.

The only problem is that the machine is unresponsive at this point.
The only keyboard I can connect is USB (this is a blade), and it does
not respond.  I'm not sure how to get the diagnostic information that
would be useful.

I did google this, and the only fixes I found were updating to -current
(but that was in 3.5) and one person who said that getting cooling to his
box fixed the problem.  I know cooling is not an issue, not even close.
Any advice would be appreciated.


Re: Dell 1855 Blade Perc 4\IM (LSI) controller problem

JR Dalrymple
Bob Bostwick (Lists) wrote:

>I installed 3.8 on an 1855 with no problems about two weeks ago.  All my
>apps worked, and had no problems until I rebooted the box (not the first
>reboot, and not because of an issue of any kind.)  Upon reboot, it fails
>to boot right away, with the "now trying bsd.old....etc" messages
>finally booting to bsd.  Towards the end of the boot process the
>following message appears.
>
>sd0(mpt0:0:0): mpt0: timeout request index = 0xfe, seq = 0x000000ce
>mpt0: Status 0x80000000, Mask 0x00000001, Doorbell 0x24000000
>mpt0: request state: On Chip
>panic: cannot read disk label, 0x400/0xd00, error 5
>Stopped at      Debugger+0x4:     leave
>
>Then the message to run ps before reporting this.
>
>The only problem is that the machine is un-responsive at this point.
>The only keyboard I can connect is USB (this is a blade) and is not
>responsive.  I'm not sure how to get the diagnostic information that
>would be useful.
>
>I did google this, and the only fix's I found were updating to -current
>(but that was in 3.5) and one person said that getting cooling to his
>box fixed the problem.  I know cooling is not an issue, not even close.
>Any advice would be appreciated.
>
>  
>
this might be worth looking at:

<https://support.dell.com/support/edocs/systems/pe1855/en/UG/kd470c2a.htm#wp1054079>


Re: Dell 1855 Blade Perc 4\IM (LSI) controller problem

Bob Bostwick (Lists)
>>-----Original Message-----
>>From: Marco Peereboom [mailto:[hidden email]]
>>Sent: Friday, January 06, 2006 4:10 PM
>>To: Bob Bostwick (Lists)
>>Subject: Re: Dell 1855 Blade Perc 4\IM (LSI) controller problem
>>
>>That sounds like overheating to me.  Have you tried powering off the
>>box for an hour or so?
>>

The room temp is 68F (and does not fluctuate more than 2 degrees), and
the fans on the blade server move massive amounts of air.  The blade
chassis is not even half full yet (4 of 10), and none of the other
blades are having problems.  Still, overheating sounds possible, as the
problem is intermittent, happening roughly 40% of the time.  My problem
with heat being the cause is that it would most likely happen while
running, whereas the problem usually occurs after the server has been
off for a while, then powered up.  While the blade is powered off the
fans are still running, as this is in a blade enclosure.

I got a serial connection to the DRAC now, but cannot get OBSD to send
its boot screen to it.

#cat /etc/boot.conf
boot hd0a:/bsd
set tty com0

I also changed /etc/ttys tty00 to
tty00   "/usr/libexec/getty std.115200" vt100   on  secure

I'm guessing that this is not all I have to do to set the port to
115200, but haven't found another place to change it...yet.  I know the
default is 9600, but 115200 is what the DRAC is set to, and any other
setting causes connection problems with the DRAC. (Yes I changed it in
the BIOS, but 115200 is the only setting I have found to be reliable.)
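
For reference, a minimal sketch of a boot.conf that sets the serial
speed before switching the console.  It assumes the boot(8) "stty" and
"set tty" commands behave as documented and that com0 is the port the
DRAC redirects; none of this is confirmed in this thread.  The boot
loader runs boot.conf commands in order, so a "boot" line placed first,
as in the file shown above, would start the kernel before "set tty" is
ever read.

```
stty com0 115200
set tty com0
boot hd0a:/bsd
```

With the speed set in the boot loader and std.115200 in /etc/ttys, both
ends should agree on 115200.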

I can connect to the DRAC from serial, and see the POST, but once OBSD
starts booting I see nothing.  I'm sure I just missed a step (that's why
I haven't replied yet, I'm still trying to get that working).  If I
could get the console to work when the problem occurs I may be able to
retrieve some useful information.

BTW, OBSD ROCKS on this system, it's sofa king fast!


Regards,

Bob Bostwick

>>On Fri, Jan 06, 2006 at 11:41:12AM -0600, Bob Bostwick (Lists) wrote:
>>> I installed 3.8 on an 1855 with no problems about two weeks ago.  All my
>>> apps worked, and had no problems until I rebooted the box (not the first
>>> reboot, and not because of an issue of any kind.)  Upon reboot, it fails
>>> to boot right away, with the "now trying bsd.old....etc" messages
>>> finally booting to bsd.  Towards the end of the boot process the
>>> following message appears.
>>>
>>> sd0(mpt0:0:0): mpt0: timeout request index = 0xfe, seq = 0x000000ce
>>> mpt0: Status 0x80000000, Mask 0x00000001, Doorbell 0x24000000
>>> mpt0: request state: On Chip
>>> panic: cannot read disk label, 0x400/0xd00, error 5
>>> Stopped at      Debugger+0x4:     leave
>>>
>>> Then the message to run ps before reporting this.
>>>
>>> The only problem is that the machine is un-responsive at this point.
>>> The only keyboard I can connect is USB (this is a blade) and is not
>>> responsive.  I'm not sure how to get the diagnostic information that
>>> would be useful.
>>>
>>> I did google this, and the only fix's I found were updating to -current
>>> (but that was in 3.5) and one person said that getting cooling to his
>>> box fixed the problem.  I know cooling is not an issue, not even close.
>>> Any advice would be appreciated.