PERC4/DC Error

Tom Geman
I have a backup server (Dell PowerEdge 1850) attached to a Dell PowerVault
220S.  Its only function is to back up remote servers throughout the day
via rsync.

The 1850 uses RAID 1 via the embedded RAID controller (PERC 4e/Si, ami0).  
On this RAID 1 is a generic install of OpenBSD plus the rsync package.  The
storage is connected via the expansion RAID controller (PERC 4/DC, ami1),
and utilizes RAID 5 across 4 SCSI disks.

Unfortunately I am having a recurring problem: the connection with the Dual
Channel RAID controller hangs, and I am unable to access the disks.  There
is no kernel panic; I can still log in and do anything except access ami1.

I have tried 4 different snapshots from October, and an install from the 3.8
CD, all ending with the same result.  The hang takes anywhere from 12 to 48
hours to appear.  Also, each time it hangs I can't do a proper shutdown, as
the command "shutdown -h now" never completes.  In the meantime I just
aggressively monitor its status and cold reboot it each time it hangs.
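
The manual monitoring could be partly automated with a probe that gives up after a deadline. What follows is only a minimal sketch, assuming Python is installed on the box (it is not part of the base install); the function name and the timeout values are made up for illustration. It starts a command and reports whether it exited in time. It deliberately does not wait for a timed-out child, since a process blocked on the hung controller may never exit.

```python
import subprocess


def completes_in_time(cmd, timeout_s):
    """Return True if cmd exits within timeout_s seconds, else False.

    Only completion is checked, not the exit status of the command.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    try:
        proc.wait(timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        # Request a kill but do not wait for it: a child stuck in disk
        # wait on a hung controller might never be reaped.
        proc.kill()
        return False
```

A probe command might be something like "dd if=/dev/rsd1c of=/dev/null bs=512 count=1" passed as a list of arguments with a timeout of 30 seconds or so; both the device node and the timeout are assumptions, not something tested on this machine.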

Is there anything I can do to improve system stability?  Is there any
further information I can provide that would give developers insight into
the problem?

Thanks.

ERROR LOGGED TO /var/log/messages
(the same error is logged every time, though the ccb number varies;
sometimes it is "... ccb 58")

Nov  3 01:08:17 backup /bsd: ami1: timeout ccb 126
Nov  3 01:08:33 backup last message repeated 2 times
Nov  3 01:08:33 backup /bsd: ses0: status read error
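
For anyone wanting to watch for this automatically, the timeout lines are easy to pick out of the log. The sketch below is illustrative only: the regular expression is derived from the three lines above rather than from the driver source, and the function name is invented.

```python
import re

# Matches kernel lines such as "... /bsd: ami1: timeout ccb 126".
AMI_TIMEOUT = re.compile(r"/bsd: (ami\d+): timeout ccb (\d+)")


def find_ami_timeouts(log_text):
    """Return (controller, ccb) pairs for every ami timeout line in log_text."""
    return [(m.group(1), int(m.group(2)))
            for m in AMI_TIMEOUT.finditer(log_text)]


sample = """\
Nov  3 01:08:17 backup /bsd: ami1: timeout ccb 126
Nov  3 01:08:33 backup last message repeated 2 times
Nov  3 01:08:33 backup /bsd: ses0: status read error
"""
print(find_ami_timeouts(sample))  # [('ami1', 126)]
```

Note that syslog collapses repeats into "last message repeated 2 times", so counting matches undercounts the actual number of timeouts.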

DMESG (from snapshot Oct 31)

OpenBSD 3.8-current (GENERIC) #203: Fri Oct 21 12:35:57 MDT 2005
    [hidden email]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Intel(R) Xeon(TM) CPU 3.00GHz ("GenuineIntel" 686-class) 3 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,SBF,SSE3,MWAIT,CNXT-ID
real mem  = 1073065984 (1047916K)
avail mem = 972574720 (949780K)
using 4278 buffers containing 53755904 bytes (52496K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+(00) BIOS, date 09/22/05, BIOS32 rev. 0 @ 0xffe90
pcibios0 at bios0: rev 2.1 @ 0xf0000/0x10000
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfb140/272 (15 entries)
pcibios0: PCI Interrupt Router at 000:31:0 ("Intel 82801EB/ER LPC" rev 0x00)
pcibios0: PCI bus #9 is the last bus
bios0: ROM list: 0xc0000/0xb000! 0xcb000/0x1000 0xcc000/0x1000 0xcd000/0x2200 0xcf800/0x2600 0xec000/0x4000!
ipmi0 at mainbus0: version 1.5 interface KCS iobase 0xca8/8 spacing 4
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 0 function 0 "Intel E7710 SMCH" rev 0x09
ppb0 at pci0 dev 2 function 0 "Intel E7710 MCH PCIE" rev 0x09
pci1 at ppb0 bus 1
ppb1 at pci1 dev 0 function 0 "Intel IOP331 Channel 0" rev 0x06
pci2 at ppb1 bus 2
ami0 at pci2 dev 14 function 0 "Dell PERC 4e/Di" rev 0x06: irq 7 Dell 16c/32b
ami0: FW 521S, BIOS vH430, 256MB RAM
ami0: 1 channels, 0 FC loops, 1 logical drives
scsibus0 at ami0: 40 targets
sd0 at scsibus0 targ 0 lun 0: <AMI, Host drive #00, > SCSI2 0/direct fixed
sd0: 69880MB, 69880 cyl, 64 head, 32 sec, 512 bytes/sec, 143114240 sec total
scsibus1 at ami0: 16 targets
safte0 at scsibus1 targ 6 lun 0: <PE/PV, 1x2 SCSI BP, 1.0> SCSI2 3/processor fixed
ppb2 at pci1 dev 0 function 2 "Intel IOP331 Channel 1" rev 0x06
pci3 at ppb2 bus 3
ami1 at pci3 dev 11 function 0 "Symbios Logic MegaRAID" rev 0x01: irq 3 Dell 518/64b/lhc
ami1: FW 351S, BIOS v1.10, 128MB RAM
ami1: 2 channels, 0 FC loops, 1 logical drives
scsibus2 at ami1: 40 targets
sd1 at scsibus2 targ 0 lun 0: <AMI, Host drive #00, > SCSI2 0/direct fixed
sd1: 419700MB, 419700 cyl, 64 head, 32 sec, 512 bytes/sec, 859545600 sec total
scsibus3 at ami1: 16 targets
scsibus4 at ami1: 16 targets
ses0 at scsibus4 targ 6 lun 0: <DELL, PV22XS, E.17> SCSI3 3/processor fixed
ppb3 at pci0 dev 4 function 0 "Intel E7710 MCH PCIE" rev 0x09
pci4 at ppb3 bus 4
ppb4 at pci0 dev 5 function 0 "Intel E7710 MCH PCIE" rev 0x09
pci5 at ppb4 bus 5
ppb5 at pci5 dev 0 function 0 "Intel PCIE-PCIE" rev 0x09
pci6 at ppb5 bus 6
em0 at pci6 dev 7 function 0 "Intel PRO/1000MT (82541GI)" rev 0x05: irq 11, address 00:14:22:17:c9:76
ppb6 at pci5 dev 0 function 2 "Intel PCIE-PCIE" rev 0x09
pci7 at ppb6 bus 7
em1 at pci7 dev 8 function 0 "Intel PRO/1000MT (82541GI)" rev 0x05: irq 3, address 00:14:22:17:c9:77
ppb7 at pci0 dev 6 function 0 "Intel E7710 MCH PCIE" rev 0x09
pci8 at ppb7 bus 8
uhci0 at pci0 dev 29 function 0 "Intel 82801EB/ER USB" rev 0x02: irq 11
usb0 at uhci0: USB revision 1.0
uhub0 at usb0
uhub0: Intel UHCI root hub, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1 at pci0 dev 29 function 1 "Intel 82801EB/ER USB" rev 0x02: irq 10
usb1 at uhci1: USB revision 1.0
uhub1 at usb1
uhub1: Intel UHCI root hub, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2 at pci0 dev 29 function 2 "Intel 82801EB/ER USB" rev 0x02: irq 7
usb2 at uhci2: USB revision 1.0
uhub2 at usb2
uhub2: Intel UHCI root hub, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
ehci0 at pci0 dev 29 function 7 "Intel 82801EB/ER USB" rev 0x02: irq 5
usb3 at ehci0: USB revision 2.0
uhub3 at usb3
uhub3: Intel EHCI root hub, rev 2.00/1.00, addr 1
uhub3: 6 ports with 6 removable, self powered
ppb8 at pci0 dev 30 function 0 "Intel 82801BA AGP" rev 0xc2
pci9 at ppb8 bus 9
vga1 at pci9 dev 13 function 0 "ATI Radeon VE QY" rev 0x00
wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
ichpcib0 at pci0 dev 31 function 0 "Intel 82801EB/ER LPC" rev 0x02
pciide0 at pci0 dev 31 function 1 "Intel 82801EB/ER IDE" rev 0x02: DMA, channel 0 configured to compatibility, channel 1 configured to compatibility
atapiscsi0 at pciide0 channel 0 drive 0
scsibus5 at atapiscsi0: 2 targets
cd0 at scsibus5 targ 0 lun 0: <HL-DT-ST, CD-ROM GCR-8240N, 1.06> SCSI0 5/cdrom removable
cd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2
pciide0: channel 1 disabled (no drives)
isa0 at ichpcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard, using wsdisplay0
pcppi0 at isa0 port 0x61
midi0 at pcppi0: <PC speaker>
spkr0 at pcppi0
sysbeep0 at pcppi0
npx0 at isa0 port 0xf0/16: using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
biomask ffed netmask ffed ttymask ffef
pctr: user-level cycle counter enabled
uhub4 at uhub3 port 3
uhub4: Dell product 0xa001, rev 2.00/0.00, addr 2
uhub4: 2 ports with 2 removable, self powered, multiple transaction translators
dkcsum: sd0 matches BIOS drive 0x80
dkcsum: sd1 matches BIOS drive 0x81
root on sd0a
rootdev=0x400 rrootdev=0xd00 rawdev=0xd02
WARNING: / was not properly unmounted
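
As a sanity check, the sd0 and sd1 geometry lines in the dmesg above are internally consistent: with 64 heads and 32 sectors per track, each cylinder is exactly 1 MB of 512-byte sectors. A small sketch of the arithmetic follows (the helper names are illustrative, and the geometry is whatever the controller reports, not the physical layout of the disks):

```python
def sectors_from_geometry(cylinders, heads, sectors_per_track):
    """Total sector count implied by a cylinder/head/sector geometry."""
    return cylinders * heads * sectors_per_track


def size_mb(total_sectors, bytes_per_sector=512):
    """Size in units of 1024*1024 bytes, as printed by dmesg."""
    return total_sectors * bytes_per_sector // (1024 * 1024)


sd0 = sectors_from_geometry(69880, 64, 32)
sd1 = sectors_from_geometry(419700, 64, 32)
print(sd0, size_mb(sd0))  # 143114240 69880
print(sd1, size_mb(sd1))  # 859545600 419700
```

Both sector totals and both sizes match what the kernel printed, so the logical drives at least attach with sane sizes.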

COMMAND bioctl -h ami0

Volume  Status     Size           Device
ami0 0 Online              68.2G sd0     RAID1
      0 Online              68.4G 0:0.0   safte0 <MAXTOR  ATLAS10K5_73SCA JNZM>
      1 Online              68.4G 0:1.0   safte0 <MAXTOR  ATLAS10K5_73SCA JNZM>

COMMAND bioctl -h ami1
(before hang)

Volume  Status     Size           Device
ami1 0 Online               410G sd1     RAID5
      0 Online               137G 1:0.0   ses0   <MAXTOR  ATLAS10K5_146SCAJNZM>
      1 Online               137G 1:1.0   ses0   <MAXTOR  ATLAS10K5_146SCAJNZM>
      2 Online               137G 1:2.0   ses0   <MAXTOR  ATLAS10K5_146SCAJNZM>
      3 Online               137G 1:3.0   ses0   <MAXTOR  ATLAS10K5_146SCAJNZM>
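
The volume sizes reported here line up with the dmesg figures and with ordinary RAID arithmetic. A rough sketch, using the rounded per-disk sizes that bioctl prints, so small differences (411 against 410) are expected; the function name is illustrative:

```python
def raid5_usable(disk_size, disks):
    """Usable capacity of a RAID 5 set: one disk's worth goes to parity."""
    if disks < 3:
        raise ValueError("RAID 5 needs at least 3 disks")
    return disk_size * (disks - 1)


print(raid5_usable(137, 4))        # 411, close to the 410G volume
print(round(419700 / 1024, 1))     # 409.9, sd1 size from dmesg in GB
print(round(69880 / 1024, 1))      # 68.2, sd0 size from dmesg in GB
```

The sd0 figure matches the 68.2G RAID 1 volume, and the sd1 figure rounds to the 410G RAID 5 volume, so the array configuration itself looks as intended.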

COMMAND bioctl -h ami1
(if tried after the hang)

bioctl : BIOCINQ : Invalid argument.


Re: PERC4/DC Error

Marco Peereboom
I'll start looking into this ASAP.

On Thu, Nov 03, 2005 at 02:17:12PM -0700, Tom Geman wrote:

> [...]
> Unfortunately I am having a recurring problem: the connection with the
> Dual Channel RAID controller hangs, and I am unable to access the disks.
> There is no kernel panic; I can still log in and do anything except
> access ami1.
>
> [...]
>
> Nov  3 01:08:17 backup /bsd: ami1: timeout ccb 126
> Nov  3 01:08:33 backup last message repeated 2 times
> Nov  3 01:08:33 backup /bsd: ses0: status read error