RAID 5 toast?

zmrzlina · Oct 13, 2006

I have a RAID 5 setup with four SCSI disks. Two have failed. Am I right
that the data is completely unrecoverable? Was just wondering
if it's possible to run some kind of utility on at least one of the disks
that might allow it to "un-fail" long enough for me to get the partition
mounted to copy a few of the files off of it.

thanks

Odie Ferrous · Oct 13, 2006

I have a RAID 5 setup with four SCSI disks. Two have failed. Am I
right
that the data is completely unrecoverable? Was just wondering
if it's possible to run some kind of utility on at least one of the
disks
that might allow it to "un-fail" long enough for me to get the
partition
mounted to copy a few of the files off of it.

thanks

Having two drives fail at the same time is rare and normally points to a
power supply or raid adaptor problem.

If the data is critical, you could always send the array to me. I don't
believe you will be able to run any utility on the array to get the data
back.

I'm UK-based, so fairly local to cz.

Odie

Arno Wagner · Oct 13, 2006

Previously zmrzlina said:
I have a RAID 5 setup with four SCSI disks. Two have failed. Am I
right
that the data is completely unrecoverable? Was just wondering
if it's possible to run some kind of utility on at least one of the
disks
that might allow it to "un-fail" long enough for me to get the
partition
mounted to copy a few of the files off of it.

You have 2/3 of every stripe left. The rest is only present if
you get one of the failed disks to work again.
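To illustrate why a second failure is fatal, here is a minimal Python sketch of RAID 5 parity. It assumes plain XOR parity over one stripe and ignores chunk size and parity rotation; the chunk contents and the function names are made up for the example.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_stripe(chunks):
    """Fill in at most one missing chunk (None) of a RAID 5 stripe.

    The missing chunk, data or parity, is the XOR of all the others.
    """
    missing = [i for i, chunk in enumerate(chunks) if chunk is None]
    if len(missing) > 1:
        raise ValueError(f"{len(missing)} chunks missing; parity covers only one")
    rebuilt = list(chunks)
    if missing:
        present = [chunk for chunk in chunks if chunk is not None]
        rebuilt[missing[0]] = xor_blocks(present)
    return rebuilt


# Hypothetical stripe on a four-disk array: three data chunks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = data + [xor_blocks(data)]

# One missing disk: the stripe can be rebuilt from the other three.
one_gone = [stripe[0], None, stripe[2], stripe[3]]
print(rebuild_stripe(one_gone) == stripe)
```

With two chunks set to None the function raises, which is the situation described above: the surviving chunks no longer determine the missing ones.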

Best bet for "unfailing" is professional data recovery, see, e.g.,
Odie's posting. I also agree that having 2/4 failed simultaneously
without some common, and possibly external, problem is highly
unlikely.

Arno

sean · Oct 13, 2006

Try Spinrite; you can get it from Http://grc.com. I have used it on
many disks and it has been able to recover most of them. I have never
tried it on a drive from a raid array, but what do you have to lose at
this point?

Arno Wagner · Oct 13, 2006

Previously sean said:
Try Spinrite you can get it from Http://grc.com I have used this on
many disks and it has and been able to recover most of them. Never
tried it on a drive from a raid array, but what do you have to lose at
this point.

He could well make recovery more difficult.

Arno

Folkert Rienstra · Oct 13, 2006

sean said:
Try Spinrite you can get it from Http://grc.com I have used this on
many disks and it has and been able to recover most of them.

Never tried it on a drive from a raid array,

Obviously, since it only works on formatted harddrives.

but what do you have to lose at this point.

Whatever SpinRite costs these days?

Folkert Rienstra · Oct 13, 2006

I have a RAID 5 setup with four SCSI disks.

Two have failed.

Define 'failed'.

zmrzlina · Oct 16, 2006

Folkert said:
Define 'failed'.

System Type: SCSI/IDE Board
CPU: AMD-K6(tm)-III Processor 451 MHz
Memory: 256 MB
SCSI ID: 7
NIC Status: 100 Mbps
Main Board: Cyclone 1
Power Supply: AT

Running a proprietary version of linux.

Previously during bootup, eight SCSI drives (Seagate Barracuda
ST1181677LCV) were detected
and assigned ids
/dev/sd[abcd]1 were part of a RAID 5 setup md0
/dev/sd[efgh]1 were part of a RAID 5 setup md1

Now during bootup it detects only six drives. It assigns
/dev/sd[cdef]1 to md1 correctly, for md0 it gives this message:

Sep 13 14:21:06 kernel: considering sdb2 ...
Sep 13 14:21:06 kernel: adding sdb2 ...
Sep 13 14:21:06 kernel: adding sda2 ...
Sep 13 14:21:06 kernel: created md0
Sep 13 14:21:06 kernel: bind
Sep 13 14:21:06 kernel: bind
Sep 13 14:21:06 kernel: running:
Sep 13 14:21:06 kernel: now!
Sep 13 14:21:06 kernel: sdb2's event counter: 0000001a
Sep 13 14:21:06 kernel: sda2's event counter: 0000001a
Sep 13 14:21:06 kernel: md: device name has changed from sdc2 to sdb2
since last import!
Sep 13 14:21:06 kernel: md: device name has changed from sdb2 to sda2
since last import!
Sep 13 14:21:06 kernel: md0: removing former faulty sda2!
Sep 13 14:21:06 kernel: md: md0: raid array is not clean -- starting
background reconstruction
Sep 13 14:21:06 kernel: md0: max total readahead window set to 384k
Sep 13 14:21:06 kernel: md0: 3 data-disks, max readahead per data-disk:
128k
Sep 13 14:21:06 kernel: raid5: device sdb2 operational as raid disk 2
Sep 13 14:21:06 kernel: raid5: device sda2 operational as raid disk 1
Sep 13 14:21:06 kernel: raid5: not enough operational devices for md0
(2/4 failed)
Sep 13 14:21:06 kernel: RAID5 conf printout:
Sep 13 14:21:06 kernel: --- rd:4 wd:2 fd:2
Sep 13 14:21:06 kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sda2
Sep 13 14:21:06 kernel: disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdb2
Sep 13 14:21:06 kernel: raid5: failed to run raid set md0
Sep 13 14:21:06 kernel: pers->run() failed ...
Sep 13 14:21:06 kernel: do_md_run() returned -22

In the monitoring software for the system I find this:

SCSI0-1 177 GB SEAGATE ST1181677LCV 173,144MB 0MB Attention
SCSI0-2 177 GB SEAGATE ST1181677LCV 173,144MB 0MB Attention
SCSI0-4 177 GB SEAGATE ST1181677LCV 173,144MB 0MB In Use
SCSI0-5 177 GB SEAGATE ST1181677LCV 173,144MB 0MB In Use
SCSI0-6 177 GB SEAGATE ST1181677LCV 173,144MB 0MB In Use
SCSI0-15 177 GB SEAGATE ST1181677LCV 173,144MB 0MB In Use

Makes me think that SCSI0-0 and SCSI0-3 (the first and fourth drives)
are not being seen by the OS.

I should mention that the disk failures could have happened months
apart from each other, since the system was not being adequately
monitored.

Someone mentioned spinrite. Buying a copy of this would be no problem,
but would it really help? If the SCSI BIOS doesn't even detect the
drive, is it worth buying any repair software?

Thanks in advance

Arno Wagner · Oct 16, 2006

System Type: SCSI/IDE Board
CPU: AMD-K6(tm)-III Processor 451 MHz
Memory: 256 MB
SCSI ID: 7
NIC Status: 100 Mbps
Main Board: Cyclone 1
Power Supply: AT

Running a proprietary version of linux.

Previously during bootup, eight SCSI drives (Seagate Barracuda
ST1181677LCV) were detected
and assigned ids
/dev/sd[abcd]1 were part of a RAID 5 setup md0
/dev/sd[efgh]1 were part of a RAID 5 setup md1

Now during bootup it detects only six drives. It assigns
/dev/sd[cdef]1 to md1 correctly, for md0 it gives this message:

Sep 13 14:21:06 kernel: considering sdb2 ...
Sep 13 14:21:06 kernel: adding sdb2 ...
Sep 13 14:21:06 kernel: adding sda2 ...
Sep 13 14:21:06 kernel: created md0
Sep 13 14:21:06 kernel: bind
Sep 13 14:21:06 kernel: bind
Sep 13 14:21:06 kernel: running:
Sep 13 14:21:06 kernel: now!
Sep 13 14:21:06 kernel: sdb2's event counter: 0000001a
Sep 13 14:21:06 kernel: sda2's event counter: 0000001a
Sep 13 14:21:06 kernel: md: device name has changed from sdc2 to sdb2
since last import!
Sep 13 14:21:06 kernel: md: device name has changed from sdb2 to sda2
since last import!
Sep 13 14:21:06 kernel: md0: removing former faulty sda2!
Sep 13 14:21:06 kernel: md: md0: raid array is not clean -- starting
background reconstruction
Sep 13 14:21:06 kernel: md0: max total readahead window set to 384k
Sep 13 14:21:06 kernel: md0: 3 data-disks, max readahead per data-disk:
128k
Sep 13 14:21:06 kernel: raid5: device sdb2 operational as raid disk 2
Sep 13 14:21:06 kernel: raid5: device sda2 operational as raid disk 1
Sep 13 14:21:06 kernel: raid5: not enough operational devices for md0
(2/4 failed)
Sep 13 14:21:06 kernel: RAID5 conf printout:
Sep 13 14:21:06 kernel: --- rd:4 wd:2 fd:2
Sep 13 14:21:06 kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sda2
Sep 13 14:21:06 kernel: disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdb2
Sep 13 14:21:06 kernel: raid5: failed to run raid set md0
Sep 13 14:21:06 kernel: pers->run() failed ...
Sep 13 14:21:06 kernel: do_md_run() returned -22

In the monitoring software for the system I find this:

SCSI0-1 177 GB SEAGATE ST1181677LCV 173,144MB 0MB Attention
SCSI0-2 177 GB SEAGATE ST1181677LCV 173,144MB 0MB Attention
SCSI0-4 177 GB SEAGATE ST1181677LCV 173,144MB 0MB In Use
SCSI0-5 177 GB SEAGATE ST1181677LCV 173,144MB 0MB In Use
SCSI0-6 177 GB SEAGATE ST1181677LCV 173,144MB 0MB In Use
SCSI0-15 177 GB SEAGATE ST1181677LCV 173,144MB 0MB In Use

Makes me think that SCSI0-0 and SCSI0-3 (the first and fourth drives)
are not being seen by the OS.

I should mention that the disk failures could have happened months
apart from each other, since the system was not being adequately
monitored.

Someone mentioned spinrite. Buying a copy of this would be no problem,
but would it really help? If the SCSI BIOS doesn't even detect the
drive, is it worth buying any repair software?

Spinrite will not help.

As far as I can see, the two disks are not detected at all,
and the monitoring software just remembers them or deduces their
existence from the RAID superblock.

Best bet for a quick diagnosis is a different Linux computer with
a known-working SCSI adapter. Put each disk in there and see whether
it is detected. If they are, let's go from there. If they are not,
get a diagnosis and recovery quote from a professional data recovery
service. Note that you may need to recover a specific one
of the disks, namely the one that failed second. To find out
which that was and when it failed, you should look into the
system log file, e.g. /var/log/syslog or /var/log/messages.
Since I have never had a failed disk in a RAID, I don't know
exactly what the message looks like, but look for the failed
disk's device name (sda or the like).
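A rough Python sketch of that log search follows. The exact wording of the kernel messages is not known here, so the keyword list is an assumption taken from phrases that appear in the logs posted in this thread ("faulty", "unavailable", "I/O error"); the function name is made up.

```python
import re

# Assumed keywords, borrowed from messages quoted in this thread.
FAULT_WORDS = ("faulty", "unavailable", "I/O error")


def fault_lines(log_lines, devices):
    """Return log lines that mention one of the devices and a fault keyword."""
    dev_re = re.compile(r"\b(" + "|".join(map(re.escape, devices)) + r")\d*\b")
    hits = []
    for line in log_lines:
        if dev_re.search(line) and any(word in line for word in FAULT_WORDS):
            hits.append(line.rstrip())
    return hits


sample = [
    "Sep 13 14:21:06 kernel: created md0",
    "Sep 13 14:21:06 kernel: md0: removing former faulty sda2!",
    "Sep 13 14:21:06 kernel: raid5: device sdb2 operational as raid disk 2",
    "Oct 30 16:52:52 kernel: md0: former device sdb2 is unavailable,",
]
for hit in fault_lines(sample, ["sda", "sdb"]):
    print(hit)
```

On a real system the lines would come from the log file itself, for example `open("/var/log/messages")`, and the earliest timestamp per device gives the order in which the disks dropped out.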

Arno

Folkert Rienstra · Oct 17, 2006

Folkert said:
Define 'failed'.

System Type: SCSI/IDE Board
CPU: AMD-K6(tm)-III Processor 451 MHz
Memory: 256 MB
SCSI ID: 7
NIC Status: 100 Mbps
Main Board: Cyclone 1
Power Supply: AT

Running a proprietary version of linux.

Previously during bootup, eight SCSI drives (Seagate Barracuda
ST1181677LCV) were detected
and assigned ids
/dev/sd[abcd]1 were part of a RAID 5 setup md0
/dev/sd[efgh]1 were part of a RAID 5 setup md1

Now during bootup it detects only six drives.

It assigns /dev/sd[cdef]1 to md1 correctly,

That's debatable.

for md0 it gives this message:

Sep 13 14:21:06 kernel: considering sdb2 ...
Sep 13 14:21:06 kernel: adding sdb2 ...
Sep 13 14:21:06 kernel: adding sda2 ...

Sep 13 14:21:06 kernel: created md0

Sep 13 14:21:06 kernel: bind
Sep 13 14:21:06 kernel: bind
Sep 13 14:21:06 kernel: running:
Sep 13 14:21:06 kernel: now!
Sep 13 14:21:06 kernel: sdb2's event counter: 0000001a
Sep 13 14:21:06 kernel: sda2's event counter: 0000001a
Sep 13 14:21:06 kernel: md: device name has changed from sdc2 to sdb2 since last import!
Sep 13 14:21:06 kernel: md: device name has changed from sdb2 to sda2 since last import!
Sep 13 14:21:06 kernel: md0: removing former faulty sda2!
Sep 13 14:21:06 kernel: md: md0: raid array is not clean -- starting background reconstruction
Sep 13 14:21:06 kernel: md0: max total readahead window set to 384k
Sep 13 14:21:06 kernel: md0: 3 data-disks, max readahead per data-disk: 128k
Sep 13 14:21:06 kernel: raid5: device sdb2 operational as raid disk 2
Sep 13 14:21:06 kernel: raid5: device sda2 operational as raid disk 1
Sep 13 14:21:06 kernel: raid5: not enough operational devices for md0 (2/4 failed)

Unfortunately this rather 'fails' at defining 'failed'.

Sep 13 14:21:06 kernel: RAID5 conf printout:
Sep 13 14:21:06 kernel: --- rd:4 wd:2 fd:2
Sep 13 14:21:06 kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sda2
Sep 13 14:21:06 kernel: disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdb2
Sep 13 14:21:06 kernel: raid5: failed to run raid set md0
Sep 13 14:21:06 kernel: pers->run() failed ...
Sep 13 14:21:06 kernel: do_md_run() returned -22

In the monitoring software for the system I find this:

SCSI0-1 177 GB SEAGATE ST1181677LCV 173,144MB 0MB Attention
SCSI0-2 177 GB SEAGATE ST1181677LCV 173,144MB 0MB Attention
SCSI0-4 177 GB SEAGATE ST1181677LCV 173,144MB 0MB In Use
SCSI0-5 177 GB SEAGATE ST1181677LCV 173,144MB 0MB In Use
SCSI0-6 177 GB SEAGATE ST1181677LCV 173,144MB 0MB In Use
SCSI0-15 177 GB SEAGATE ST1181677LCV 173,144MB 0MB In Use

Makes me think that SCSI0-0 and SCSI0-3 (the first and fourth drives)
are not being seen by the OS.

Which still does not define 'failed'.

I should mention that the disk failures could have happened months
apart from each other, since the system was not being adequately
monitored.

Someone mentioned spinrite. Buying a copy of this would be no problem,
but would it really help?

That question was answered.

If the SCSI BIOS doesn't even detect the drive,

Now _that_ would define 'failed'.

timeOday · Oct 21, 2006

Odie said:
Having two drives fail at the same time is rare and normally points to a
power supply or raid adaptor problem.

Whoever said they'd failed at the same time?

CJT · Oct 21, 2006

timeOday said:
Whoever said they'd failed at the same time?

Are you suggesting operator error? Because once the first one fails,
it's incumbent on the operator to replace it. :-)

zmrzlina · Oct 30, 2006

Me again. Just so it's clear, operator failure among myself and others
in not monitoring the logs was to blame for the initial drive failure
not being spotted immediately. So yes, I'm a muppet, worthy of scorn.

I removed all four drives from the computer in question, attached them
to another linux PC with a known working SCSI card, and ran SeaTools on
them (they are seagate drives). One drive is toast - it's not detected
at boot by the SCSI bios. The other three pass all the SeaTools
advanced tests (which takes ~ 2 hours to run) without a single failure.

Now when I replace the four drives in the original PC, three of the
drives (sda sdb sdc below) are detected. Can't RAID5 on four drives
operate in degraded mode with three working drives?

There's something going wrong to do with superblocks, but I'm not sure
what.

Finally one more tidbit. This system is weird in that there is no
command line interface - everything is done through a proprietary web
interface, and thus some instructions I'm reading about on google (i.e.
mdadm) are not an option. Is it possible to remove the three working
drives and attach them to another linux PC with a hardware RAID card
while keeping any data in the (degraded) RAID 5 intact?

Startup messages follow. It is md0 I am concerned with.
I'm sorry for the length, but I want to include everything in case
something is important

<4> Oct 30 16:52:52 kernel: scsi : 1 host.
<3> Oct 30 16:52:52 kernel: Vendor: SEAGATE Model: ST1181677LCV
Rev: 0002
<4> Oct 30 16:52:52 kernel: Type: Direct-Access
ANSI SCSI revision: 03
<3> Oct 30 16:52:52 kernel: Detected scsi disk sda at scsi0, channel 0,
id 0, lun 0
<3> Oct 30 16:52:52 kernel: Vendor: SEAGATE Model: ST1181677LCV
Rev: 0002
<4> Oct 30 16:52:52 kernel: Type: Direct-Access
ANSI SCSI revision: 03
<3> Oct 30 16:52:52 kernel: Detected scsi disk sdb at scsi0, channel 0,
id 1, lun 0
<3> Oct 30 16:52:52 kernel: Vendor: SEAGATE Model: ST1181677LCV
Rev: 0002
<4> Oct 30 16:52:52 kernel: Type: Direct-Access
ANSI SCSI revision: 03
<3> Oct 30 16:52:52 kernel: Detected scsi disk sdc at scsi0, channel 0,
id 2, lun 0
<3> Oct 30 16:52:52 kernel: Vendor: SEAGATE Model: ST1181677LCV
Rev: 0002
<4> Oct 30 16:52:52 kernel: Type: Direct-Access
ANSI SCSI revision: 03
<3> Oct 30 16:52:52 kernel: Detected scsi disk sdd at scsi0, channel 0,
id 4, lun 0
<3> Oct 30 16:52:52 kernel: Vendor: SEAGATE Model: ST1181677LCV
Rev: 0002
<4> Oct 30 16:52:52 kernel: Type: Direct-Access
ANSI SCSI revision: 03
<3> Oct 30 16:52:52 kernel: Detected scsi disk sde at scsi0, channel 0,
id 5, lun 0
<3> Oct 30 16:52:52 kernel: Vendor: SEAGATE Model: ST1181677LCV
Rev: 0002
<4> Oct 30 16:52:52 kernel: Type: Direct-Access
ANSI SCSI revision: 03
<3> Oct 30 16:52:52 kernel: Detected scsi disk sdf at scsi0, channel 0,
id 6, lun 0
<3> Oct 30 16:52:52 kernel: Vendor: SEAGATE Model: ST1181677LCV
Rev: 0002
<4> Oct 30 16:52:52 kernel: Type: Direct-Access
ANSI SCSI revision: 03
<3> Oct 30 16:52:52 kernel: Detected scsi disk sdg at scsi0, channel 0,
id 15, lun 0
<4> Oct 30 16:52:52 kernel: scsi : detected 7 SCSI disks total.
<6> Oct 30 16:52:52 kernel: sym53c875E-0-<0,*>: FAST-20 WIDE SCSI 40.0
MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 kernel: SCSI device sda: hdwr sector= 512 bytes.
Sectors= 354600001 [173144 MB] [173.1 GB]
<4> Oct 30 16:52:52 kernel: sdb: Spinning up
disk...<6>sym53c875E-0-<1,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns,
offset 16)
<4> Oct 30 16:52:52 kernel: ..<6>sym53c875E-0-<1,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 kernel: .<6>sym53c875E-0-<1,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 kernel: ready
<4> Oct 30 16:52:52 kernel: SCSI device sdb: hdwr sector= 512 bytes.
Sectors= 354600001 [173144 MB] [173.1 GB]
<4> Oct 30 16:52:52 kernel: sdc: Spinning up
disk...<6>sym53c875E-0-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns,
offset 16)
<4> Oct 30 16:52:52 kernel: ..<6>sym53c875E-0-<2,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 kernel: .<6>sym53c875E-0-<2,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 last message repeated 15 times
<4> Oct 30 16:52:52 kernel: ready
<4> Oct 30 16:52:52 kernel: SCSI device sdc: hdwr sector= 512 bytes.
Sectors= 354600001 [173144 MB] [173.1 GB]
<4> Oct 30 16:52:52 kernel: sdd: Spinning up
disk...<6>sym53c875E-0-<4,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns,
offset 16)
<4> Oct 30 16:52:52 kernel: ..<6>sym53c875E-0-<4,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 kernel: .<6>sym53c875E-0-<4,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 last message repeated 16 times
<4> Oct 30 16:52:52 kernel: ready
<4> Oct 30 16:52:52 kernel: SCSI device sdd: hdwr sector= 512 bytes.
Sectors= 354600001 [173144 MB] [173.1 GB]
<4> Oct 30 16:52:52 kernel: sde: Spinning up
disk...<6>sym53c875E-0-<5,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns,
offset 16)
<4> Oct 30 16:52:52 kernel: ..<6>sym53c875E-0-<5,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 kernel: .<6>sym53c875E-0-<5,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 last message repeated 15 times
<4> Oct 30 16:52:52 kernel: ready
<4> Oct 30 16:52:52 kernel: SCSI device sde: hdwr sector= 512 bytes.
Sectors= 354600001 [173144 MB] [173.1 GB]
<4> Oct 30 16:52:52 kernel: sdf: Spinning up
disk...<6>sym53c875E-0-<6,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns,
offset 16)
<4> Oct 30 16:52:52 kernel: ..<6>sym53c875E-0-<6,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 kernel: .<6>sym53c875E-0-<6,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 last message repeated 15 times
<4> Oct 30 16:52:52 kernel: ready
<4> Oct 30 16:52:52 kernel: SCSI device sdf: hdwr sector= 512 bytes.
Sectors= 354600001 [173144 MB] [173.1 GB]
<4> Oct 30 16:52:52 kernel: sdg: Spinning up
disk...<6>sym53c875E-0-<15,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns,
offset 16)
<4> Oct 30 16:52:52 kernel: ..<6>sym53c875E-0-<15,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 kernel: .<6>sym53c875E-0-<15,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 last message repeated 16 times
<4> Oct 30 16:52:52 kernel: ready
<4> Oct 30 16:52:52 kernel: SCSI device sdg: hdwr sector= 512 bytes.
Sectors= 354600001 [173144 MB] [173.1 GB]
<6> Oct 30 16:52:52 kernel: Intel(R) PRO/1000 Network Driver - version
4.3.15
<6> Oct 30 16:52:52 kernel: Copyright (c) 1999-2002 Intel Corporation.
<4> Oct 30 16:52:52 kernel: NIC: Adding device 12098086
<4> Oct 30 16:52:52 kernel: PCI latency timer (CFLT) is unreasonably
low at 0. Setting to 32 clocks.
<4> Oct 30 16:52:52 kernel: eepro100.c:v1.09j-t 9/29/99 Donald Becker
http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
<4> Oct 30 16:52:52 kernel: eepro100.c: $Revision: 1.26 $ 2000/05/31
Modified by Andrey V. Savochkin and others (cb,rk)
<6> Oct 30 16:52:52 kernel: eth0: Intel PCI EtherExpress Pro100
82559ER, 00:80:A1:42:F1:E3, IRQ 10.
<6> Oct 30 16:52:52 kernel: Board assembly 000000-000, Physical
connectors present: RJ45
<6> Oct 30 16:52:52 kernel: Primary interface chip i82555 PHY #1.
<6> Oct 30 16:52:52 kernel: General self-test: passed.
<6> Oct 30 16:52:52 kernel: Serial sub-system self-test: passed.
<6> Oct 30 16:52:52 kernel: Internal registers self-test: passed.
<6> Oct 30 16:52:52 kernel: ROM checksum self-test: passed
(0xdbd8681d).
<6> Oct 30 16:52:52 kernel: Receiver lock-up workaround activated.
<4> Oct 30 16:52:52 kernel: eepro100.c:v1.09j-t 9/29/99 Donald Becker
http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
<4> Oct 30 16:52:52 kernel: eepro100.c: $Revision: 1.26 $ 2000/05/31
Modified by Andrey V. Savochkin and others (cb,rk)
<6> Oct 30 16:52:52 kernel: Partition check:
<6> Oct 30 16:52:52 kernel: sda:msdos_partition magic chk 55 aa
<4> Oct 30 16:52:52 kernel: sda1 sda2
<6> Oct 30 16:52:52 kernel: sdb:SCSI disk error : host 0 channel 0 id
1 lun 0 return code = 28000002
<4> Oct 30 16:52:52 kernel: Info fld=0x0, Current sd08:10: sns = f0 4
<4> Oct 30 16:52:52 kernel: ASC=15 ASCQ= 1
<4> Oct 30 16:52:52 kernel: Raw sense data:0xf0 0x00 0x04 0x00 0x00
0x00 0x00 0x0a 0x00 0x00 0x00 0x00 0x15 0x01 0x01 0x00
<4> Oct 30 16:52:52 kernel: scsidisk I/O error: dev 08:10, sector 0
<4> Oct 30 16:52:52 kernel: unable to read partition table
<6> Oct 30 16:52:52 kernel: sdc:msdos_partition magic chk 55 aa
<4> Oct 30 16:52:52 kernel: sdc1 sdc2
<6> Oct 30 16:52:52 kernel: sdd:msdos_partition magic chk 55 aa
<4> Oct 30 16:52:52 kernel: sdd1 sdd2
<6> Oct 30 16:52:52 kernel: sde:msdos_partition magic chk 55 aa
<4> Oct 30 16:52:52 kernel: sde1 sde2
<6> Oct 30 16:52:52 kernel: sdf:msdos_partition magic chk 55 aa
<4> Oct 30 16:52:52 kernel: sdf1 sdf2
<6> Oct 30 16:52:52 kernel: sdg:msdos_partition magic chk 55 aa
<4> Oct 30 16:52:52 kernel: sdg1 sdg2
<4> Oct 30 16:52:52 kernel: md.c: sizeof(mdp_super_t) = 4096
<5> Oct 30 16:52:52 kernel: RAMDISK: Compressed image found at block 0
<6> Oct 30 16:52:52 kernel: autodetecting RAID arrays
<4> Oct 30 16:52:52 kernel: (read) sda2's sb offset: 177220928 [events:
00000019]
<4> Oct 30 16:52:52 kernel: (read) sdc2's sb offset: 177220928 [events:
0000001a]
<4> Oct 30 16:52:52 kernel: (read) sdd2's sb offset: 177220928 [events:
0000001f]
<4> Oct 30 16:52:52 kernel: (read) sde2's sb offset: 177220928 [events:
0000001f]
<4> Oct 30 16:52:52 kernel: (read) sdf2's sb offset: 177220928 [events:
0000001f]
<4> Oct 30 16:52:52 kernel: (read) sdg2's sb offset: 177220928 [events:
0000001f]
<4> Oct 30 16:52:52 kernel: autorun ...
<4> Oct 30 16:52:52 kernel: considering sdg2 ...
<4> Oct 30 16:52:52 kernel: adding sdg2 ...
<4> Oct 30 16:52:52 kernel: adding sdf2 ...
<4> Oct 30 16:52:52 kernel: adding sde2 ...
<4> Oct 30 16:52:52 kernel: adding sdd2 ...
<4> Oct 30 16:52:52 kernel: created md1
<4> Oct 30 16:52:52 kernel: bind
<4> Oct 30 16:52:52 kernel: bind
<4> Oct 30 16:52:52 kernel: bind
<4> Oct 30 16:52:52 kernel: bind
<4> Oct 30 16:52:52 kernel: running:
<4> Oct 30 16:52:52 kernel: now!
<4> Oct 30 16:52:52 kernel: sdg2's event counter: 0000001f
<4> Oct 30 16:52:52 kernel: sdf2's event counter: 0000001f
<4> Oct 30 16:52:52 kernel: sde2's event counter: 0000001f
<4> Oct 30 16:52:52 kernel: sdd2's event counter: 0000001f
<4> Oct 30 16:52:52 kernel: md: device name has changed from sdf2 to
sdg2 since last import!
<4> Oct 30 16:52:52 kernel: md: device name has changed from sde2 to
sdf2 since last import!
<4> Oct 30 16:52:52 kernel: md: device name has changed from sdd2 to
sde2 since last import!
<4> Oct 30 16:52:52 kernel: md: device name has changed from sdc2 to
sdd2 since last import!
<6> Oct 30 16:52:52 kernel: md1: max total readahead window set to 384k

<6> Oct 30 16:52:52 kernel: md1: 3 data-disks, max readahead per
data-disk: 128k
<6> Oct 30 16:52:52 kernel: raid5: device sdg2 operational as raid disk
3
<6> Oct 30 16:52:59 /bin/cron[65]: (CRON) STARTUP (fork ok)
<6> Oct 30 16:52:52 kernel: raid5: device sdf2 operational as raid disk
2
<6> Oct 30 16:52:52 kernel: raid5: device sde2 operational as raid disk
1
<6> Oct 30 16:52:52 kernel: raid5: device sdd2 operational as raid disk
0
<6> Oct 30 16:52:52 kernel: raid5: allocated 4293kB for md1
<4> Oct 30 16:52:52 kernel: raid5: raid level 5 set md1 active with 4
out of 4 devices, algorithm 0
<4> Oct 30 16:52:52 kernel: RAID5 conf printout:
<4> Oct 30 16:52:52 kernel: --- rd:4 wd:4 fd:0
<4> Oct 30 16:52:52 kernel: disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sdd2
<4> Oct 30 16:52:52 kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sde2
<4> Oct 30 16:52:52 kernel: disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdf2
<4> Oct 30 16:52:52 kernel: disk 3, s:0, o:1, n:3 rd:3 us:1 dev:sdg2
<4> Oct 30 16:52:52 kernel: RAID5 conf printout:
<4> Oct 30 16:52:52 kernel: --- rd:4 wd:4 fd:0
<4> Oct 30 16:52:52 kernel: disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sdd2
<4> Oct 30 16:52:52 kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sde2
<4> Oct 30 16:52:52 kernel: disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdf2
<4> Oct 30 16:52:52 kernel: disk 3, s:0, o:1, n:3 rd:3 us:1 dev:sdg2
<6> Oct 30 16:52:52 kernel: md: updating md1 RAID superblock on device
<4> Oct 30 16:52:52 kernel: sdg2 [events: 00000020](write) sdg2's sb
offset: 177220928
<4> Oct 30 16:52:52 kernel: sdf2 [events: 00000020](write) sdf2's sb
offset: 177220928
<4> Oct 30 16:52:52 kernel: sde2 [events: 00000020](write) sde2's sb
offset: 177220928
<4> Oct 30 16:52:52 kernel: sdd2 [events: 00000020](write) sdd2's sb
offset: 177220928
<4> Oct 30 16:52:52 kernel: .
<4> Oct 30 16:52:52 kernel: considering sdc2 ...
<4> Oct 30 16:52:52 kernel: adding sdc2 ...
<4> Oct 30 16:52:52 kernel: adding sda2 ...
<4> Oct 30 16:52:52 kernel: created md0
<4> Oct 30 16:52:52 kernel: bind
<4> Oct 30 16:52:52 kernel: bind
<4> Oct 30 16:52:52 kernel: running:
<4> Oct 30 16:52:52 kernel: now!
<4> Oct 30 16:52:52 kernel: sdc2's event counter: 0000001a
<4> Oct 30 16:52:52 kernel: sda2's event counter: 00000019
<3> Oct 30 16:52:52 kernel: md: superblock update time inconsistency --
using the most recent one
<4> Oct 30 16:52:52 kernel: freshest: sdc2
<4> Oct 30 16:52:52 kernel: md0: kicking faulty sda2!
<4> Oct 30 16:52:52 kernel: unbind
<4> Oct 30 16:52:52 kernel: export_rdev(sda2)
<4> Oct 30 16:52:52 kernel: md0: former device sdb2 is unavailable,
removing from array!
<3> Oct 30 16:52:52 kernel: md: md0: raid array is not clean --
starting background reconstruction
<6> Oct 30 16:52:52 kernel: md0: max total readahead window set to 384k

<6> Oct 30 16:52:52 kernel: md0: 3 data-disks, max readahead per
data-disk: 128k
<6> Oct 30 16:52:52 kernel: raid5: device sdc2 operational as raid disk
2
<3> Oct 30 16:52:52 kernel: raid5: not enough operational devices for
md0 (3/4 failed)
<4> Oct 30 16:52:52 kernel: RAID5 conf printout:
<4> Oct 30 16:52:52 kernel: --- rd:4 wd:1 fd:3
<4> Oct 30 16:52:52 kernel: disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdc2
<1> Oct 30 16:52:52 kernel: raid5: failed to run raid set md0
<4> Oct 30 16:52:52 kernel: pers->run() failed ...
<4> Oct 30 16:52:52 kernel: do_md_run() returned -22
<4> Oct 30 16:52:52 kernel: unbind
<4> Oct 30 16:52:52 kernel: export_rdev(sdc2)
<6> Oct 30 16:52:52 kernel: md0 stopped.
<4> Oct 30 16:52:52 kernel: ... autorun DONE.
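The sdb read error in the messages above includes raw sense data, which can be decoded by hand. The Python sketch below assumes the fixed sense format (response code 0x70 or 0x71, sense key in byte 2, ASC and ASCQ in bytes 12 and 13); the function name and the short sense-key table are only for illustration.

```python
# Subset of SCSI sense keys; anything else is reported as "other".
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
}


def decode_fixed_sense(raw):
    """Pull sense key, ASC and ASCQ out of fixed-format sense data."""
    response_code = raw[0] & 0x7F  # top bit is the "valid" flag
    if response_code not in (0x70, 0x71):
        raise ValueError(f"not fixed-format sense data: 0x{raw[0]:02x}")
    key = raw[2] & 0x0F
    return {
        "sense_key": key,
        "sense_key_name": SENSE_KEYS.get(key, "other"),
        "asc": raw[12],
        "ascq": raw[13],
    }


# Bytes copied from the "Raw sense data" line for sdb.
raw = bytes.fromhex("f0 00 04 00 00 00 00 0a 00 00 00 00 15 01 01 00")
print(decode_fixed_sense(raw))
```

This agrees with the kernel's own summary (sns = f0 4, ASC=15 ASCQ= 1): sense key 4 is a hardware error, and ASC/ASCQ 15h/01h is, as far as I can tell from the SCSI tables, a mechanical positioning error. That would explain why sdb's partition table cannot be read even though the drive is detected.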

Steve Cousins · Oct 30, 2006

Now when I replace the four drives in the original PC, three of the
drives (sda sdb sdc below) are detected. Can't RAID5 on four drives
operate in degraded mode with three working drives?

There's something going wrong to do with superblocks, but I'm not sure
what.

Finally one more tidbit. This system is weird in that there is no
command line interface - everything is done through a proprietary web
interface, and thus some instructions I'm reading about on google (i.e.
mdadm) are not an option. Is it possible to remove the three working
drives and attach them to another linux PC with a hardware RAID card
while keeping any data in the (degraded) RAID 5 intact?

Startup messages follow. It is md0 I am concerned with.

Since it looks like this is a md array I would send your messages to the
Linux-raid list ([email protected]). I believe they will be
able to help you very quickly. Since three out of the four drives are
ok then yes, you can move them to another PC and use mdadm to create the
RAID set so you can back them up and then add a drive to make the array
redundant again. Make sure you consult the list for details though
because it can be a bit touchy. It is a very active and helpful list.
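As a rough idea of what that could look like, the Python sketch below only builds the mdadm command lines and prints them; it does not run anything. It assumes assembling the existing array with `--assemble --force --run` rather than creating a new one, and the device names are placeholders that would differ on the rescue machine.

```python
import shlex


def plan_degraded_assembly(md_device, members, total_disks):
    """Build mdadm command lines for inspecting and assembling a degraded RAID 5.

    Nothing is executed here; the commands are returned as argument lists.
    """
    if len(members) < total_disks - 1:
        raise ValueError(
            f"RAID 5 with {total_disks} disks needs at least "
            f"{total_disks - 1} members, got {len(members)}"
        )
    examine = [["mdadm", "--examine", member] for member in members]
    assemble = ["mdadm", "--assemble", "--force", "--run", md_device, *members]
    return examine, assemble


# Placeholder names: three surviving members of a four-disk array.
examine_cmds, assemble_cmd = plan_degraded_assembly(
    "/dev/md0", ["/dev/sdb2", "/dev/sdc2", "/dev/sdd2"], total_disks=4
)
for cmd in examine_cmds:
    print(shlex.join(cmd))
print(shlex.join(assemble_cmd))
```

The `--examine` output shows each member's superblock, including its event count, which is worth checking before forcing anything.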

Good luck,

Steve

Folkert Rienstra · Oct 30, 2006

Me again. Just so it's clear, operator failure among myself and others
in not monitoring the logs was to blame for the initial drive failure
not being spotted immediately. So yes, I'm a muppet, worthy of scorn.

I removed all four drives from the computer in question, attached them
to another linux PC with a known working SCSI card, and ran SeaTools on
them (they are seagate drives).

One drive is toast - it's not detected
at boot by the SCSI bios. The other three pass all the SeaTools
advanced tests (which takes ~ 2 hours to run) without a single failure.

Did that include writing? If so, you know what that means don't you?

Now when I replace

Presumably you meant 'place'.

the four drives in the original PC, three of
the drives (sda sdb sdc below) are detected.

Presumably you mean that the Meta data is detected, right?

Can't RAID5 on four drives operate in degraded mode with three working drives?

As said before the sequence in which they 'died' is crucial.
If the still working drive 'died' first then the data (assuming it wasn't written to
with SeaTools) on it is old. That may well be reflected in the RAID Meta data.

There's something going wrong to do with superblocks, but I'm not sure what.

See above.

Finally one more tidbit. This system is weird in that there is no
command line interface - everything is done through a proprietary web
interface, and thus some instructions I'm reading about on google

(i.e. mdadm) are not an option.

Is it possible to remove the three working drives and attach
them to another linux PC with a hardware RAID card
while keeping any data in the (degraded) RAID 5 intact?

Sure, if it has the same controller.
Which obviously doesn't make a difference for you.

Why a hardware RAID card when mdadm is (apparently) a linux software
RAID tool?

Startup messages follow. It is md0 I am concerned with.
I'm sorry for the length, but I want to include everything in case
something is important

[snip]

Arno Wagner · Oct 30, 2006

Me again. Just so it's clear, operator failure among myself and others
in not monitoring the logs was to blame for the initial drive failure
not being spotted immediately. So yes, I'm a muppet, worthy of scorn.

I removed all four drives from the computer in question, attached them
to another linux PC with a known working SCSI card, and ran SeaTools on
them (they are seagate drives). One drive is toast - it's not detected
at boot by the SCSI bios. The other three pass all the SeaTools
advanced tests (which takes ~ 2 hours to run) without a single failure.

Now when I replace the four drives in the original PC, three of the
drives (sda sdb sdc below) are detected. Can't RAID5 on four drives
operate in degraded mode with three working drives?

It can, and at least on Linux software RAID, it can also be brought
up that way. I have done it.

There's something going wrong to do with superblocks, but I'm not sure
what.

See below.

Finally one more tidbit. This system is weird in that there is no
command line interface - everything is done through a proprietary web
interface, and thus some instructions I'm reading about on google (i.e.
mdadm) are not an option. Is it possible to remove the three working
drives and attach them to another linux PC with a hardware RAID card
while keeping any data in the (degraded) RAID 5 intact?

If the card is compatible, yes. You could also move the card
with the drives.

Startup messages follow. It is md0 I am concerned with.
I'm sorry for the length, but I want to include everything in case
something is important

I think I understand the superblock thing: One disk (sda) has an old
event counter and is therefore kicked. Another (sdb) cannot be read at
all and is removed. That leaves one disk (sdc) operational in the
array, so 3 out of 4 disks count as failed. Even a forced assembly
(which mdadm supports) that takes sda back in would still give just
2/4 disks operational, which is not enough.

If I read this correctly, you have one array with all 4 disks present
(i.e. no problem) named md1, and one with 2 disks present (md0) that
have different event counts. Unless you find a 3rd disk for md0,
your data is gone.
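
The event-counter reading above can be checked mechanically. This is a
rough Python sketch, not what the kernel actually runs: the "(read)"
lines are copied from the startup messages with the line wraps removed,
the md0/md1 membership is hard-coded from the "adding ..." lines rather
than read from the superblocks, and the helper names are hypothetical.

```python
import re

# "(read)" lines from the startup messages, unwrapped.
LOG = """\
(read) sda2's sb offset: 177220928 [events: 00000019]
(read) sdc2's sb offset: 177220928 [events: 0000001a]
(read) sdd2's sb offset: 177220928 [events: 0000001f]
(read) sde2's sb offset: 177220928 [events: 0000001f]
(read) sdf2's sb offset: 177220928 [events: 0000001f]
(read) sdg2's sb offset: 177220928 [events: 0000001f]
"""

EVENT_RE = re.compile(r"\(read\) (\w+)'s sb offset: \d+\s+\[events:\s*([0-9a-f]+)\]")

def event_counters(log):
    """Map each device to its event counter (logged in hex)."""
    return {dev: int(count, 16) for dev, count in EVENT_RE.findall(log)}

def stale_members(counters, members):
    """Members whose counter is behind the freshest member of the array."""
    freshest = max(counters[m] for m in members)
    return sorted(m for m in members if counters[m] < freshest)

def can_start_raid5(total_disks, usable_disks):
    """RAID 5 tolerates exactly one missing member."""
    return usable_disks >= total_disks - 1

counters = event_counters(LOG)
md0 = ["sda2", "sdc2"]                   # assumed from the "adding" lines
md1 = ["sdd2", "sde2", "sdf2", "sdg2"]   # assumed from the "adding" lines

print(stale_members(counters, md0))   # ['sda2']
print(stale_members(counters, md1))   # []
print(can_start_raid5(4, len(md1)))   # True
print(can_start_raid5(4, len(md0)))   # False, even with sda2 forced back in
```

With sda2 kicked, md0 is down to one usable member, and forcing sda2
back in only brings it to two, which matches the conclusion above.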

Side note: Linux assembles RAID arrays by a sort of globally unique
identifier. If all parts are present (or enough for reassembly), the
array is started. If not, it is left entirely untouched.

Arno

<4> Oct 30 16:52:52 kernel: scsi : 1 host.
<3> Oct 30 16:52:52 kernel: Vendor: SEAGATE Model: ST1181677LCV
Rev: 0002
<4> Oct 30 16:52:52 kernel: Type: Direct-Access
ANSI SCSI revision: 03
<3> Oct 30 16:52:52 kernel: Detected scsi disk sda at scsi0, channel 0,
id 0, lun 0
<3> Oct 30 16:52:52 kernel: Vendor: SEAGATE Model: ST1181677LCV
Rev: 0002
<4> Oct 30 16:52:52 kernel: Type: Direct-Access
ANSI SCSI revision: 03
<3> Oct 30 16:52:52 kernel: Detected scsi disk sdb at scsi0, channel 0,
id 1, lun 0
<3> Oct 30 16:52:52 kernel: Vendor: SEAGATE Model: ST1181677LCV
Rev: 0002
<4> Oct 30 16:52:52 kernel: Type: Direct-Access
ANSI SCSI revision: 03
<3> Oct 30 16:52:52 kernel: Detected scsi disk sdc at scsi0, channel 0,
id 2, lun 0
<3> Oct 30 16:52:52 kernel: Vendor: SEAGATE Model: ST1181677LCV
Rev: 0002
<4> Oct 30 16:52:52 kernel: Type: Direct-Access
ANSI SCSI revision: 03
<3> Oct 30 16:52:52 kernel: Detected scsi disk sdd at scsi0, channel 0,
id 4, lun 0
<3> Oct 30 16:52:52 kernel: Vendor: SEAGATE Model: ST1181677LCV
Rev: 0002
<4> Oct 30 16:52:52 kernel: Type: Direct-Access
ANSI SCSI revision: 03
<3> Oct 30 16:52:52 kernel: Detected scsi disk sde at scsi0, channel 0,
id 5, lun 0
<3> Oct 30 16:52:52 kernel: Vendor: SEAGATE Model: ST1181677LCV
Rev: 0002
<4> Oct 30 16:52:52 kernel: Type: Direct-Access
ANSI SCSI revision: 03
<3> Oct 30 16:52:52 kernel: Detected scsi disk sdf at scsi0, channel 0,
id 6, lun 0
<3> Oct 30 16:52:52 kernel: Vendor: SEAGATE Model: ST1181677LCV
Rev: 0002
<4> Oct 30 16:52:52 kernel: Type: Direct-Access
ANSI SCSI revision: 03
<3> Oct 30 16:52:52 kernel: Detected scsi disk sdg at scsi0, channel 0,
id 15, lun 0
<4> Oct 30 16:52:52 kernel: scsi : detected 7 SCSI disks total.
<6> Oct 30 16:52:52 kernel: sym53c875E-0-<0,*>: FAST-20 WIDE SCSI 40.0
MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 kernel: SCSI device sda: hdwr sector= 512 bytes.
Sectors= 354600001 [173144 MB] [173.1 GB]
<4> Oct 30 16:52:52 kernel: sdb: Spinning up
disk...<6>sym53c875E-0-<1,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns,
offset 16)
<4> Oct 30 16:52:52 kernel: ..<6>sym53c875E-0-<1,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 kernel: .<6>sym53c875E-0-<1,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 kernel: ready
<4> Oct 30 16:52:52 kernel: SCSI device sdb: hdwr sector= 512 bytes.
Sectors= 354600001 [173144 MB] [173.1 GB]
<4> Oct 30 16:52:52 kernel: sdc: Spinning up
disk...<6>sym53c875E-0-<2,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns,
offset 16)
<4> Oct 30 16:52:52 kernel: ..<6>sym53c875E-0-<2,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 kernel: .<6>sym53c875E-0-<2,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 last message repeated 15 times
<4> Oct 30 16:52:52 kernel: ready
<4> Oct 30 16:52:52 kernel: SCSI device sdc: hdwr sector= 512 bytes.
Sectors= 354600001 [173144 MB] [173.1 GB]
<4> Oct 30 16:52:52 kernel: sdd: Spinning up
disk...<6>sym53c875E-0-<4,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns,
offset 16)
<4> Oct 30 16:52:52 kernel: ..<6>sym53c875E-0-<4,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 kernel: .<6>sym53c875E-0-<4,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 last message repeated 16 times
<4> Oct 30 16:52:52 kernel: ready
<4> Oct 30 16:52:52 kernel: SCSI device sdd: hdwr sector= 512 bytes.
Sectors= 354600001 [173144 MB] [173.1 GB]
<4> Oct 30 16:52:52 kernel: sde: Spinning up
disk...<6>sym53c875E-0-<5,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns,
offset 16)
<4> Oct 30 16:52:52 kernel: ..<6>sym53c875E-0-<5,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 kernel: .<6>sym53c875E-0-<5,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 last message repeated 15 times
<4> Oct 30 16:52:52 kernel: ready
<4> Oct 30 16:52:52 kernel: SCSI device sde: hdwr sector= 512 bytes.
Sectors= 354600001 [173144 MB] [173.1 GB]
<4> Oct 30 16:52:52 kernel: sdf: Spinning up
disk...<6>sym53c875E-0-<6,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns,
offset 16)
<4> Oct 30 16:52:52 kernel: ..<6>sym53c875E-0-<6,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 kernel: .<6>sym53c875E-0-<6,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 last message repeated 15 times
<4> Oct 30 16:52:52 kernel: ready
<4> Oct 30 16:52:52 kernel: SCSI device sdf: hdwr sector= 512 bytes.
Sectors= 354600001 [173144 MB] [173.1 GB]
<4> Oct 30 16:52:52 kernel: sdg: Spinning up
disk...<6>sym53c875E-0-<15,*>: FAST-20 WIDE SCSI 40.0 MB/s (50 ns,
offset 16)
<4> Oct 30 16:52:52 kernel: ..<6>sym53c875E-0-<15,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 kernel: .<6>sym53c875E-0-<15,*>: FAST-20 WIDE SCSI
40.0 MB/s (50 ns, offset 16)
<4> Oct 30 16:52:52 last message repeated 16 times
<4> Oct 30 16:52:52 kernel: ready
<4> Oct 30 16:52:52 kernel: SCSI device sdg: hdwr sector= 512 bytes.
Sectors= 354600001 [173144 MB] [173.1 GB]
<6> Oct 30 16:52:52 kernel: Intel(R) PRO/1000 Network Driver - version
4.3.15
<6> Oct 30 16:52:52 kernel: Copyright (c) 1999-2002 Intel Corporation.
<4> Oct 30 16:52:52 kernel: NIC: Adding device 12098086
<4> Oct 30 16:52:52 kernel: PCI latency timer (CFLT) is unreasonably
low at 0. Setting to 32 clocks.
<4> Oct 30 16:52:52 kernel: eepro100.c:v1.09j-t 9/29/99 Donald Becker
http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
<4> Oct 30 16:52:52 kernel: eepro100.c: $Revision: 1.26 $ 2000/05/31
Modified by Andrey V. Savochkin and others (cb,rk)
<6> Oct 30 16:52:52 kernel: eth0: Intel PCI EtherExpress Pro100
82559ER, 00:80:A1:42:F1:E3, IRQ 10.
<6> Oct 30 16:52:52 kernel: Board assembly 000000-000, Physical
connectors present: RJ45
<6> Oct 30 16:52:52 kernel: Primary interface chip i82555 PHY #1.
<6> Oct 30 16:52:52 kernel: General self-test: passed.
<6> Oct 30 16:52:52 kernel: Serial sub-system self-test: passed.
<6> Oct 30 16:52:52 kernel: Internal registers self-test: passed.
<6> Oct 30 16:52:52 kernel: ROM checksum self-test: passed
(0xdbd8681d).
<6> Oct 30 16:52:52 kernel: Receiver lock-up workaround activated.
<4> Oct 30 16:52:52 kernel: eepro100.c:v1.09j-t 9/29/99 Donald Becker
http://cesdis.gsfc.nasa.gov/linux/drivers/eepro100.html
<4> Oct 30 16:52:52 kernel: eepro100.c: $Revision: 1.26 $ 2000/05/31
Modified by Andrey V. Savochkin and others (cb,rk)
<6> Oct 30 16:52:52 kernel: Partition check:
<6> Oct 30 16:52:52 kernel: sda:msdos_partition magic chk 55 aa
<4> Oct 30 16:52:52 kernel: sda1 sda2
<6> Oct 30 16:52:52 kernel: sdb:SCSI disk error : host 0 channel 0 id
1 lun 0 return code = 28000002
<4> Oct 30 16:52:52 kernel: Info fld=0x0, Current sd08:10: sns = f0 4
<4> Oct 30 16:52:52 kernel: ASC=15 ASCQ= 1
<4> Oct 30 16:52:52 kernel: Raw sense data:0xf0 0x00 0x04 0x00 0x00
0x00 0x00 0x0a 0x00 0x00 0x00 0x00 0x15 0x01 0x01 0x00
<4> Oct 30 16:52:52 kernel: scsidisk I/O error: dev 08:10, sector 0
<4> Oct 30 16:52:52 kernel: unable to read partition table
<6> Oct 30 16:52:52 kernel: sdc:msdos_partition magic chk 55 aa
<4> Oct 30 16:52:52 kernel: sdc1 sdc2
<6> Oct 30 16:52:52 kernel: sdd:msdos_partition magic chk 55 aa
<4> Oct 30 16:52:52 kernel: sdd1 sdd2
<6> Oct 30 16:52:52 kernel: sde:msdos_partition magic chk 55 aa
<4> Oct 30 16:52:52 kernel: sde1 sde2
<6> Oct 30 16:52:52 kernel: sdf:msdos_partition magic chk 55 aa
<4> Oct 30 16:52:52 kernel: sdf1 sdf2
<6> Oct 30 16:52:52 kernel: sdg:msdos_partition magic chk 55 aa
<4> Oct 30 16:52:52 kernel: sdg1 sdg2
<4> Oct 30 16:52:52 kernel: md.c: sizeof(mdp_super_t) = 4096
<5> Oct 30 16:52:52 kernel: RAMDISK: Compressed image found at block 0
<6> Oct 30 16:52:52 kernel: autodetecting RAID arrays
<4> Oct 30 16:52:52 kernel: (read) sda2's sb offset: 177220928 [events:
00000019]
<4> Oct 30 16:52:52 kernel: (read) sdc2's sb offset: 177220928 [events:
0000001a]
<4> Oct 30 16:52:52 kernel: (read) sdd2's sb offset: 177220928 [events:
0000001f]
<4> Oct 30 16:52:52 kernel: (read) sde2's sb offset: 177220928 [events:
0000001f]
<4> Oct 30 16:52:52 kernel: (read) sdf2's sb offset: 177220928 [events:
0000001f]
<4> Oct 30 16:52:52 kernel: (read) sdg2's sb offset: 177220928 [events:
0000001f]
<4> Oct 30 16:52:52 kernel: autorun ...
<4> Oct 30 16:52:52 kernel: considering sdg2 ...
<4> Oct 30 16:52:52 kernel: adding sdg2 ...
<4> Oct 30 16:52:52 kernel: adding sdf2 ...
<4> Oct 30 16:52:52 kernel: adding sde2 ...
<4> Oct 30 16:52:52 kernel: adding sdd2 ...
<4> Oct 30 16:52:52 kernel: created md1
<4> Oct 30 16:52:52 kernel: bind
<4> Oct 30 16:52:52 kernel: bind
<4> Oct 30 16:52:52 kernel: bind
<4> Oct 30 16:52:52 kernel: bind
<4> Oct 30 16:52:52 kernel: running:
<4> Oct 30 16:52:52 kernel: now!
<4> Oct 30 16:52:52 kernel: sdg2's event counter: 0000001f
<4> Oct 30 16:52:52 kernel: sdf2's event counter: 0000001f
<4> Oct 30 16:52:52 kernel: sde2's event counter: 0000001f
<4> Oct 30 16:52:52 kernel: sdd2's event counter: 0000001f
<4> Oct 30 16:52:52 kernel: md: device name has changed from sdf2 to
sdg2 since last import!
<4> Oct 30 16:52:52 kernel: md: device name has changed from sde2 to
sdf2 since last import!
<4> Oct 30 16:52:52 kernel: md: device name has changed from sdd2 to
sde2 since last import!
<4> Oct 30 16:52:52 kernel: md: device name has changed from sdc2 to
sdd2 since last import!
<6> Oct 30 16:52:52 kernel: md1: max total readahead window set to 384k

<6> Oct 30 16:52:52 kernel: md1: 3 data-disks, max readahead per
data-disk: 128k
<6> Oct 30 16:52:52 kernel: raid5: device sdg2 operational as raid disk
3
<6> Oct 30 16:52:59 /bin/cron[65]: (CRON) STARTUP (fork ok)
<6> Oct 30 16:52:52 kernel: raid5: device sdf2 operational as raid disk
2
<6> Oct 30 16:52:52 kernel: raid5: device sde2 operational as raid disk
1
<6> Oct 30 16:52:52 kernel: raid5: device sdd2 operational as raid disk
0
<6> Oct 30 16:52:52 kernel: raid5: allocated 4293kB for md1
<4> Oct 30 16:52:52 kernel: raid5: raid level 5 set md1 active with 4
out of 4 devices, algorithm 0
<4> Oct 30 16:52:52 kernel: RAID5 conf printout:
<4> Oct 30 16:52:52 kernel: --- rd:4 wd:4 fd:0
<4> Oct 30 16:52:52 kernel: disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sdd2
<4> Oct 30 16:52:52 kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sde2
<4> Oct 30 16:52:52 kernel: disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdf2
<4> Oct 30 16:52:52 kernel: disk 3, s:0, o:1, n:3 rd:3 us:1 dev:sdg2
<4> Oct 30 16:52:52 kernel: RAID5 conf printout:
<4> Oct 30 16:52:52 kernel: --- rd:4 wd:4 fd:0
<4> Oct 30 16:52:52 kernel: disk 0, s:0, o:1, n:0 rd:0 us:1 dev:sdd2
<4> Oct 30 16:52:52 kernel: disk 1, s:0, o:1, n:1 rd:1 us:1 dev:sde2
<4> Oct 30 16:52:52 kernel: disk 2, s:0, o:1, n:2 rd:2 us:1 dev:sdf2
<4> Oct 30 16:52:52 kernel: disk 3, s:0, o:1, n:3 rd:3 us:1 dev:sdg2
<6> Oct 30 16:52:52 kernel: md: updating md1 RAID superblock on device
<4> Oct 30 16:52:52 kernel: sdg2 [events: 00000020](write) sdg2's sb
offset: 177220928
<4> Oct 30 16:52:52 kernel: sdf2 [events: 00000020](write) sdf2's sb
offset: 177220928
<4> Oct 30 16:52:52 kernel: sde2 [events: 00000020](write) sde2's sb
offset: 177220928
<4> Oct 30 16:52:52 kernel: sdd2 [events: 00000020](write) sdd2's sb
offset: 177220928
<4> Oct 30 16:52:52 kernel: .
<4> Oct 30 16:52:52 kernel: considering sdc2 ...
<4> Oct 30 16:52:52 kernel: adding sdc2 ...
<4> Oct 30 16:52:52 kernel: adding sda2 ...
<4> Oct 30 16:52:52 kernel: created md0
<4> Oct 30 16:52:52 kernel: bind
<4> Oct 30 16:52:52 kernel: bind
<4> Oct 30 16:52:52 kernel: running:
<4> Oct 30 16:52:52 kernel: now!
<4> Oct 30 16:52:52 kernel: sdc2's event counter: 0000001a
<4> Oct 30 16:52:52 kernel: sda2's event counter: 00000019
<3> Oct 30 16:52:52 kernel: md: superblock update time inconsistency --
using the most recent one
<4> Oct 30 16:52:52 kernel: freshest: sdc2
<4> Oct 30 16:52:52 kernel: md0: kicking faulty sda2!
<4> Oct 30 16:52:52 kernel: unbind
<4> Oct 30 16:52:52 kernel: export_rdev(sda2)
<4> Oct 30 16:52:52 kernel: md0: former device sdb2 is unavailable,
removing from array!
<3> Oct 30 16:52:52 kernel: md: md0: raid array is not clean --
starting background reconstruction
<6> Oct 30 16:52:52 kernel: md0: max total readahead window set to 384k

zmrzlina · Oct 31, 2006

Thanks to everyone for your help.
