Softraid data recovery

Softraid data recovery

Steven Surdock
I have a simple RAID1 configuration on wd0, wd1.  I was in the process of performing a rebuild on wd1, as it failed during some heavy reads.  During the rebuild wd0 went into a failure state.  After some troubleshooting I decided to reboot and now my RAID disk, sd1, is unavailable.  Disks wd0 and wd1 don't show any errors, but I have a replacement disk.  I have backups for the critical data and I'd like to try and recover as much recent data as possible.  My thought was to create a disk image of the "/home/public" data and mount it using vnconfig, but I seem to be having issues with the appropriate 'dd' command to do that.

How can I recover as much data as possible off the failed RAID array?
If I recreate the array, "bioctl -c 1 -l /dev/wd0d,/dev/wd1d softraid0", will the existing data be preserved?
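
(Before recreating the array, the current softraid state can be checked without writing anything; a minimal sketch, assuming the device names above:)

root@host# dmesg | grep -i softraid     (any chunk or volume errors logged at boot)
root@host# bioctl sd1                   (volume and chunk status, if the volume still attaches)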

root@host# disklabel wd0
# /dev/rwd0c:
type: ESDI
disk: ESDI/IDE disk
label: WDC WD4001FAEX-0
duid: acce36f25df51c8c
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 255
sectors/cylinder: 16065
cylinders: 486401
total sectors: 7814037168
boundstart: 64
boundend: 4294961685
drivedata: 0

16 partitions:
#                size           offset  fstype [fsize bsize   cpg]
  c:       7814037168                0  unused
  d:       7814037104               64    RAID

root@host# more /var/backups/disklabel.sd1.backup
# /dev/rsd1c:
type: SCSI
disk: SCSI disk
label: SR RAID 1
duid: 8ec2330eabf7cd26
flags:
bytes/sector: 512
sectors/track: 63
tracks/cylinder: 255
sectors/cylinder: 16065
cylinders: 486401
total sectors: 7814036576
boundstart: 64
boundend: 7814036576
drivedata: 0

16 partitions:
#                size           offset  fstype [fsize bsize   cpg]
  a:       2147488704               64  4.2BSD   8192 65536     1 # /home/public/
  c:       7814036576                0  unused
  d:       5666547712       2147488768  4.2BSD   8192 65536     1 # /home/Backups/

Re: Softraid data recovery

Aaron Mason
On Tue, Oct 15, 2019 at 7:34 AM Steven Surdock
<[hidden email]> wrote:

>
> I have a simple RAID1 configuration on wd0, wd1.  I was in the process of performing a rebuild on wd1, as it failed during some heavy reads.  ...
>
> How can I recover as much data as possible off the failed RAID array?
> If I recreate the array, "bioctl -c 1 -l /dev/wd0d,/dev/wd1d softraid0", will the existing data be preserved?
>

I think at this point you're far better off restoring from backup.
You do have a backup, right?

As for the disks, ddrescue would be a better option than dd - it'll
keep trying if it encounters another URE whereas dd will up and quit.
Expect it to take several days on disks that big - it's designed to be
gentle to dying disks.

--
Aaron Mason - Programmer, open source addict
I've taken my software vows - for beta or for worse
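
(GNU ddrescue is available as an OpenBSD package; a minimal sketch of what Aaron suggests, assuming the package is named ddrescue, with the image and map file names made up for illustration:)

root@host# pkg_add ddrescue
root@host# ddrescue -r3 /dev/rwd0d wd0d.img wd0d.map
    (-r3 retries bad areas three times; the map file records progress so an interrupted copy can be resumed)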

Re: Softraid data recovery

Steven Surdock
> -----Original Message-----
> From: Aaron Mason <[hidden email]>
> Sent: Monday, October 14, 2019 7:13 PM
> To: Steven Surdock <[hidden email]>
> Cc: [hidden email]
> Subject: Re: Softraid data recovery
>
> On Tue, Oct 15, 2019 at 7:34 AM Steven Surdock <ssurdock@engineered-
> net.com> wrote:
> >
> > I have a simple RAID1 configuration on wd0, wd1.  ...
> >
> > How can I recover as much data as possible off the failed RAID array?
> > If I recreate the array, "bioctl -c 1 -l /dev/wd0d,/dev/wd1d softraid0", will the existing data be preserved?
> >
>
> I think at this point you're far better off restoring from backup.
> You do have a backup, right?
>
> As for the disks, ddrescue would be a better option than dd - it'll keep
> trying if it encounters another URE whereas dd will up and quit.
> Expect it to take several days on disks that big - it's designed to be
> gentle to dying disks.

I believe the disks are mostly healthy.  In fact, I've made several attempts at dd'ing the data from wd0 with no read issues.  It takes about 12 hours to read 1 TB.  I suspect I'm not aligning sectors properly, so the filesystem is not readable.  I've tried making an image of /home/public (which is _mostly_ backed up), but fsck doesn't see a reasonable filesystem after I vnconfig the image.  So, if anyone has some insight on 'dd if=/dev/wd0d of=public.img bs=512 count=5666547712 skip=xx', it would be great.
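
(Two notes on that command, as a hedged aside: count=5666547712 is the size of the d partition, /home/Backups, in the saved sd1 label; /home/public is the a partition at 2147488704 sectors. The missing skip is the softraid data offset at the head of the RAID chunk, 528 sectors, the same offset the marc.info post cited at the end of this thread relies on, plus the partition's offset within the sd1 label. A sketch, with image names and mount point made up for illustration:)

root@host# dd if=/dev/wd0d of=public.img bs=512 skip=592 count=2147488704 conv=noerror,sync
    (skip 592 = 528 softraid metadata + 64, the offset of 'a' in the sd1 label)
root@host# dd if=/dev/wd0d of=backups.img bs=512 skip=2147489296 count=5666547712 conv=noerror,sync
    (skip 2147489296 = 528 + 2147488768, the offset of 'd')
root@host# vnconfig vnd0 public.img
root@host# fsck -n /dev/vnd0c      (the image is a bare filesystem, so check the whole-image 'c' partition; use fsck_ffs if fsck cannot work out the type)
root@host# mount -o ro /dev/vnd0c /mnt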

Re: Softraid data recovery

Patrick Dohman-4
In reply to this post by Steven Surdock

> On Oct 14, 2019, at 3:04 PM, Steven Surdock wrote:
>
> root@host# more /var/backups/disklabel.sd1.backup
> ...
>
> 16 partitions:
> #                size           offset  fstype [fsize bsize   cpg]
>  a:       2147488704               64  4.2BSD   8192 65536     1 # /home/public/
>  c:       7814036576                0  unused
>  d:       5666547712       2147488768  4.2BSD   8192 65536     1 # /home/Backups/
>


A combination of revised partition lettering and a custom fstab may allow the partitions to be mounted without the softraid device.

For example:

$ cat /etc/fstab
/dev/wd0a  /home ffs rw,nodev,nosuid 1 2
/dev/wd0d  /home/Backups/ ffs rw,nodev,nosuid 1 2

The device naming may take some massaging to work...
man fstab & disklabel for more info.

Regards
Patrick

Re: Softraid data recovery

Karel Gardas
In reply to this post by Steven Surdock


On 2019-10-15 04:26, Steven Surdock wrote:
> I believe the disks are mostly healthy.

I seriously doubt that. What's the output from smartctl -a for both
drives? I can't imagine why you would get failures on heavy reads on one
drive and then later failures on the other and yet have nothing show up
in the SMART info as some kind of error. Another possibility is that your
SATA cables are just too old and fragile, but smartctl will tell you that too.

Re: Softraid data recovery

Steven Surdock
> -----Original Message-----
> From: Karel Gardas <[hidden email]>
> Sent: Tuesday, October 15, 2019 5:31 AM
> To: Steven Surdock <[hidden email]>
> Cc: [hidden email]
> Subject: Re: Softraid data recovery
>
>
>
> On 2019-10-15 04:26, Steven Surdock wrote:
> > I believe the disks are mostly healthy.
>
> I seriously doubt that. What's the output from smartctl -a for both
> drives? I can't imagine why you would get failures on heavy reads on one
> drive and then later failures on the other and yet have nothing show up
> in the SMART info as some kind of error. Another possibility is that your
> SATA cables are just too old and fragile, but smartctl will tell you that too.

root@host# smartctl -a /dev/wd0c
smartctl 7.0 2018-12-30 r4883 [i386-unknown-openbsd6.5] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Black
Device Model:     WDC WD4001FAEX-00MJRA0
Serial Number:    WD-WCC131134311
LU WWN Device Id: 5 0014ee 2090b4beb
Firmware Version: 01.01L01
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Tue Oct 15 07:40:39 2019 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (46080) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 497) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x70b5) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   151   151   021    Pre-fail  Always       -       11425
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       24
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   030   030   000    Old_age   Always       -       51197
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       24
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       12
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       13
194 Temperature_Celsius     0x0022   104   100   000    Old_age   Always       -       48
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       9
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       9
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       9

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

root@host# smartctl -a /dev/wd1c
smartctl 7.0 2018-12-30 r4883 [i386-unknown-openbsd6.5] (local build)
Copyright (C) 2002-18, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Black
Device Model:     WDC WD4003FZEX-00Z4SA0
Serial Number:    WD-WMC5D0D50MLK
LU WWN Device Id: 5 0014ee 0598032b8
Firmware Version: 01.01A01
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2 (minor revision not indicated)
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Tue Oct 15 07:40:55 2019 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                                        was completed without error.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (43080) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 466) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x7035) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       6
  3 Spin_Up_Time            0x0027   144   144   021    Pre-fail  Always       -       11766
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       9
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   064   064   000    Old_age   Always       -       26484
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       9
 16 Unknown_Attribute       0x0022   013   187   000    Old_age   Always       -       243933147908
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       3
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       7
194 Temperature_Celsius     0x0022   105   100   000    Old_age   Always       -       47
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       4
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       6

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Re: Softraid data recovery

Karel Gardas
On 2019-10-15 13:44, Steven Surdock wrote:
> Model Family:     Western Digital Black
> Device Model:     WDC WD4001FAEX-00MJRA0
> 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
> 197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       9
> 198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       9
> 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
> 200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       9

Looks like 9 bad sectors which can't be remapped for whatever reason.
The UDMA_CRC error count is 0, so your SATA cable looks fine.
The drive is a bit strange in that it still claims a raw read error rate of 0.

> Model Family:     Western Digital Black
> Device Model:     WDC WD4003FZEX-00Z4SA0
> Serial Number:    WD-WMC5D0D50MLK
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
>    1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       6
> 196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
> 197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
> 198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       4
> 199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
> 200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       6

Looks like 4 uncorrectable sectors, while 6 raw read errors happened.

You can run 'smartctl -t long <drive>' to learn more about your two drives
(followed by 'smartctl -a' once the long test completes), but I still consider
both drives to be happily dying.
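
(A sketch of that sequence; wd0 reported a 497 minute extended self-test time above, wd1 466 minutes:)

root@host# smartctl -t long /dev/wd0c
    (wait for the test to finish, then:)
root@host# smartctl -l selftest /dev/wd0c
root@host# smartctl -a /dev/wd0c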

Re: Softraid data recovery

Steven Surdock
> -----Original Message-----
> From: Karel Gardas <[hidden email]>
> Sent: Wednesday, October 16, 2019 11:26 AM
> To: Steven Surdock <[hidden email]>
> Cc: [hidden email]
> Subject: Re: Softraid data recovery
>
> On 2019-10-15 13:44, Steven Surdock wrote:
> > ...
>
> ...
>
> Looks like 4 uncorrectable sectors, while 6 raw read errors happened.
>
> You can run 'smartctl -t long <drive>' to learn more about your two drives
> (followed by 'smartctl -a' once the long test completes), but I still consider
> both drives to be happily dying.

Considered and working to replace.  I'm still working on recovering as much data as possible.  As noted, one partition is backups, but I had some scripts on there that I did not back up.  Thanks.

Re: Softraid data recovery

Steven Surdock
In reply to this post by Aaron Mason
> -----Original Message-----
> From: Aaron Mason <[hidden email]>
> Sent: Monday, October 14, 2019 7:13 PM
> To: Steven Surdock <[hidden email]>
> Cc: [hidden email]
> Subject: Re: Softraid data recovery
>
> On Tue, Oct 15, 2019 at 7:34 AM Steven Surdock <ssurdock@engineered-
> net.com> wrote:
> >
...
> >
> > How can I recover as much data as possible off the failed RAID array.
> > If I recreate the array, "bioctl -c 1 -l /dev/wd0d,/dev/wd1d
> softraid0", will the existing data be preserved?
> >
...
Based on the information found here: https://marc.info/?l=openbsd-misc&m=136553269631163&w=2 I was able to successfully create a disk image from the failing drive.

$ dd if=/dev/wd0d of=raid.img conv=noerror,sync skip=528
$ vnconfig vnd0 raid.img
$ fsck /dev/vnd0a
$ fsck /dev/vnd0d
$ mount /dev/vnd0a /home/public
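
(The skip=528 works because softraid keeps its metadata and boot area in the first 528 sectors of each RAID chunk, so everything from sector 528 of wd0d onward is a byte-for-byte copy of the sd1 volume, including its disklabel; that is why vnd0a and vnd0d line up with the original /home/public and /home/Backups partitions. A slightly more cautious variant of the same steps, checking the label and mounting read-only; the mount point is illustrative:)

$ vnconfig vnd0 raid.img
$ disklabel vnd0               (should match the saved sd1 label above)
$ fsck -n /dev/vnd0a           (preview only; rerun without -n to actually repair)
$ fsck -n /dev/vnd0d
$ mount -o ro /dev/vnd0a /mnt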