Urgent problem with an arc RAID controller

Federico Giannici
Last night we had some problems with an Areca ARC-1220 RAID controller
(arc driver) installed in an OpenBSD 4.4 amd64 system.

After we replaced a couple of HDs the system restarted (in rebuilding
mode), but there was a problem: the RAID disk no longer boots!

The system sees the controller, which starts up correctly and seems to
work OK. The controller BIOS seems OK too, apart from being in "Rebuild"
state, but this has already happened a couple of times before with no
particular problem.

But this time, after the controller starts up, when the PC BIOS should
boot from the disk, the following message is printed on the screen and
nothing more happens:

     Using drive 0, partition 3.
     Loading...

Looking at the controller "System events" I found a message that I have
never seen before:

     Rebuild LBA

It seems that the controller messed up something in the OpenBSD boot
sequence, but I cannot work out exactly what. I have made many attempts
with fdisk, disklabel and installboot, but with no success. I'm wary of
experimenting too much with those commands, because I'm really scared of
destroying all the data on the disk.

Can anybody tell me what the exact problem is, and maybe suggest the
correct commands to fix it?

Thanks.

--
___________________________________________________
     __
    |-                      [hidden email]
    |ederico Giannici      http://www.neomedia.it
___________________________________________________

Re: Urgent problem with an arc RAID controller

Noah Pugsley
If you can boot from an install and see your partitions and data, back
your shit up NOW. Worry about the repair later.

Once you are ready to repair you need to include a lot more info if you
want help. If you have properly backed up you might just save yourself
some time and reinstall. What exactly did you 'attempt' with fdisk,
disklabel and installboot?

Cheers,
noah

Re: Urgent problem with an arc RAID controller

Chris Cappuccio
In reply to this post by Federico Giannici
Assuming the Areca controller's virtual disk shows up as sd0, you can reinstall the MBR and boot blocks by:

1. Boot bsd.rd   (from CD perhaps?)
2. fdisk -i sd0  (MBR)
3. mount /dev/sd0a /mnt
4. installboot /mnt/boot /usr/mdec/biosboot sd0  (Boot blocks)

Of course, I'm assuming here that your DOS disk partition was created with the installer's defaults.  The disklabel should still show up in the same place this way.

Did you already reinstall the MBR with fdisk at some point?

It sounds like the rebuild process changed the size of the virtual disk (which seems unlikely to me, but I guess it's possible).

--
Trying to bring taste and skill into a branch of artistic endeavor which had sunk to the lowest possible depths.

Re: Urgent problem with an arc RAID controller

Federico Giannici
Chris Cappuccio wrote:
> Assuming the Areca controller's virtual disk shows up as sd0, you can reinstall the MBR and boot blocks by:
>
> 1. Boot bsd.rd   (from CD perhaps?)
> 2. fdisk -i sd0  (MBR)
> 3. mount /dev/sd0a /mnt
> 4. installboot /mnt/boot /usr/mdec/biosboot sd0  (Boot blocks)

I think that I already issued both the "fdisk -i" and "installboot"
commands. Anyway, I may have run them incorrectly because I was
panicking...

Now I have attached another disk, onto which I copied the root partition
with dump/restore (the other partitions with all the data are mounted
from the original disk), so the server is alive again and I can think
more lucidly...

I think that at least part of the problem is in the fdisk partitioning,
due to the large size of the disk.

The system is a RAID 0+1 with six 1TB disks, so it appears as a 3TB disk.
The last (and biggest) partition is formatted as FFS2.


First of all, here are the relevant parts of dmesg:

arc0 at pci3 dev 14 function 0 "Areca ARC-1220" rev 0x00: apic 4 int 16 (irq 7)
arc0: 8 ports, 256MB SDRAM, firmware V1.46 2009-01-06
scsibus0 at arc0: 16 targets, initiator 16
sd0 at scsibus0 targ 0 lun 0: <Areca, ARC-1220-VOL#00, R001> SCSI3 0/direct fixed
sd0: 2861022MB, 44966 cyl, 511 head, 255 sec, 512 bytes/sec, 5859374592 sec total


Here is the output of "fdisk sd0" from before the problem (taken after I
installed the system):

Disk: sd0       geometry: 364729/255/63 [1564407296 Sectors]
Offset: 0       Signature: 0xAA55
             Starting         Ending         LBA Info:
  #: id      C   H   S -      C   H   S [       start:        size ]
-------------------------------------------------------------------------------
  0: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
  1: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
  2: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
*3: A6      0   1   1 -  97379 165  59 [          63:  1564404026 ] OpenBSD
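
The sector count in the fdisk header above can be cross-checked against the
"sec total" value in the dmesg output with a little arithmetic. The sketch
below assumes (nothing in this thread confirms it) that fdisk holds the
disk size in a 32-bit field: 5859374592 reduced modulo 2^32 is exactly the
1564407296 that fdisk prints. The name wrap32 is made up for illustration,
and the shell is assumed to do 64-bit arithmetic.

```shell
#!/bin/sh
# Reduce a sector count modulo 2^32, as a 32-bit field would store it.
# wrap32 is a hypothetical helper name, not an OpenBSD tool.
wrap32() {
    echo $(( $1 % 4294967296 ))
}

total=5859374592          # "sec total" from dmesg
wrap32 "$total"           # prints 1564407296, the count in the fdisk header
```

For reference, the start and size fields of an MBR partition entry are 32
bits wide, so with 512-byte sectors an MBR partition cannot describe
anything beyond 2^32 sectors (2 TiB), which is less than this volume.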


And here is how it appears now:

Disk: sd0       geometry: 26157922/7/32 [1564407296 Sectors]
Offset: 0       Signature: 0xAA55
             Starting         Ending         LBA Info:
  #: id      C   H   S -      C   H   S [       start:        size ]
-------------------------------------------------------------------------------
  0: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
  1: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
  2: 00      0   0   0 -      0   0   0 [           0:           0 ] unused
*3: A6      0   1  32 - 6983946   5  25 [          63:  1564404026 ] OpenBSD

I don't remember if the LBA part was already this way or if I set it
this way. I tried to set the CHS parameters but had a lot of problems:
even though fdisk accepted the parameters and said it had written them
to the disk, afterwards they were still the same (or at any rate
different from the ones I had set)!
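
The two fdisk listings can be compared by converting their CHS columns to
LBA with the standard formula LBA = (C * heads + H) * sectors + (S - 1),
taking heads and sectors from each "geometry:" line. This is only a
sketch, and chs2lba is a hypothetical helper name. Under both geometries
the start and end come out as sectors 63 and 1564404088, which agrees with
the LBA column (63 + 1564404026 - 1), so the two tables appear to describe
the same partition expressed in different geometries.

```shell
#!/bin/sh
# Convert a CHS address to an LBA sector number.
# usage: chs2lba C H S heads sectors_per_track
chs2lba() {
    echo $(( ($1 * $4 + $2) * $5 + $3 - 1 ))
}

# Old listing, geometry 364729/255/63
chs2lba 0 1 1 255 63            # start of partition 3
chs2lba 97379 165 59 255 63     # end of partition 3

# New listing, geometry 26157922/7/32
chs2lba 0 1 32 7 32             # start of partition 3
chs2lba 6983946 5 25 7 32       # end of partition 3
```

If that reading is right, the LBA values were not changed by the attempts
described above; only the geometry used to display them differs.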


Here is the "disklabel sd0" output:

# Inside MBR partition 3: type A6 start 63 size 1564404026
# /dev/rsd0c:
type: SCSI
disk: SCSI disk
label: ARC-1220-VOL
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 255
sectors/cylinder: 16065
cylinders: 364729
total sectors: 5859374592
rpm: 10000
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0           # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

16 partitions:
#                size           offset  fstype [fsize bsize  cpg]
   a:        104872257               63  4.2BSD   2048 16384    1
   b:         41945715        104872320    swap
   c:       5859374592                0  unused      0     0
   d:        104872320        146818035  4.2BSD   2048 16384    1
   e:        104872320        251690355  4.2BSD   2048 16384    1
   f:       5502811917        356562675  4.2BSD   2048 16384    1
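
The label above can be sanity-checked with plain arithmetic. The sketch
below (check_label and its variable names are illustrative; the numbers
are copied from the label) walks partitions a, b, d, e and f in order,
checks that each one starts where the previous one ends, and prints where
each one ends.

```shell
#!/bin/sh
# Check that the disklabel partitions are contiguous.
# Rows are "name size offset", copied from the disklabel output.
check_label() {
    expected=63                      # offset of partition a
    while read -r name size offset; do
        if [ "$offset" != "$expected" ]; then
            echo "gap or overlap before $name"
            return 1
        fi
        expected=$(( offset + size ))
        echo "$name ends at $expected"
    done <<EOF
a 104872257 63
b 41945715 104872320
d 104872320 146818035
e 104872320 251690355
f 5502811917 356562675
EOF
}

check_label
```

With these numbers the partitions are contiguous and f ends at 5859374592,
which is the "total sectors" value. The MBR partition reported in the
first comment line of the label ends at 63 + 1564404026 = 1564404089, so f
extends well past it.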


Now, what do you suggest to get the disk into a consistent state and make
it boot correctly?

Thanks.


