RAID Problem diagnostic help desperately needed.

Phil Ellett

Hi folks,

I have been trying for over three months to work out the source of my
problems using hardware RAID 1 with a Promise TX2000 controller, two 120GB
IBM/Hitachi drives and an Asus A7M266-D motherboard with a single XP2400+
chip and 1GB RAM.

The symptoms in ALL cases are that the RAID array fails and goes into a
CRITICAL state.

The following are my observations and conclusions and I would welcome ANY
comments.

(O = Observation, C = Conclusion)

O1: I have managed to get the array to fail simply by completing the
"Define Array" section of the RAID BIOS: the failure occurs during the
disk-to-disk copying stage of the setup procedure. (The problem happens
with both the shipped 2.0.0.28 and the latest 2.0.0.33 BIOS versions.)

C1: Problem is OS and/or driver independent

O2: Problem still exists when card is installed in another machine.

C2: Problem is psu/motherboard/memory/processor etc. independent

O3: Drives pass "Drive Fitness Test" software testing either on independent
machines or with the Promise card in situ. The test software detects the
separate channels on the controller, allowing each drive to be tested
independently.

C3: Fault does not lie with drives.

O4: In one test (off-site) the RAID array failed using a round ATA100 cable
but appeared to work fine with old standard 40-way ATA33 flat cables and two
(non-matching) older 40 and 60GB IBM drives.

C4: Problem relates to corruption of data over IDE cables.

O5: Test repeated (on-site) with the same 40-way ATA33 flat cables as above
but with the IBM/Hitachi 120GB 120GX drives. Problem STILL present.

C5: Problem may relate to a transfer speed reached by the new 120GX drives
but not by the older drives, which are slower. The other possibility
(although I think unlikely) is noise on the electrical circuit present
on-site but not at the off-site test location. This, however, does not
explain the problem exhibited with ATA100 round cables at the off-site
location.
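
The elimination reasoning in O1 to O5 can be written down as a small table and checked mechanically. The following is only a rough sketch in Python: the part names and the way each test is encoded are one reading of the observations above (for example, it assumes the 120GX drives were fitted for O1 and O2, which the post does not state outright), not measured data. The round-cable run in O4 is left out because the post does not say which drives were fitted.

```python
# Each test lists which of the ORIGINAL parts were in use and whether the
# array went CRITICAL. This encoding is an assumed reading of O1-O5.
TESTS = [
    {"name": "O1 RAID BIOS copy, no OS loaded",
     "parts": {"card", "120GX drives", "original PC"}, "failed": True},
    {"name": "O2 card moved to another machine",
     "parts": {"card", "120GX drives"}, "failed": True},
    {"name": "O4 off-site, 40-way flat cable, older drives",
     "parts": {"card"}, "failed": False},
    {"name": "O5 on-site, 40-way flat cable",
     "parts": {"card", "120GX drives"}, "failed": True},
]


def common_to_all_failures(tests):
    """Return the parts that were present in every failing test."""
    failing = [t["parts"] for t in tests if t["failed"]]
    return set.intersection(*failing) if failing else set()


if __name__ == "__main__":
    print(sorted(common_to_all_failures(TESTS)))
    # prints ['120GX drives', 'card']
```

Under these assumptions both the card and the 120GX drives are present in every failing run, so the tests listed so far do not separate the two. The only passing run changed the drives and the cable speed at the same time.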


The only sensible conclusion I can come up with from the above observations
is that my Promise TX2000 card has an intermittent fault causing data
corruption that appears to be worse with higher data transfer rates and
non-flat cables. I am therefore assuming I have a faulty card and hence will
be returning it. Has anyone else come across similar problems on this card
(or that of any other manufacturer)? I.e. is such a fault (manufacturing
fault?) possible, such that replacing the card with another one will fix the
problem?

Any advice or ideas for other tests to accurately confirm which component is
at fault would be GREATLY appreciated. This is my first (and at the current
rate) LAST venture into using RAID!

Thanks in advance,

Phil.
Sheffield.
 
Have you tried standard (not round) 80-wire cables? Many round cables
are noisier and less reliable than the flat ones. Also, make sure the
cables are not too long. 40-wire cables should force the drives to
operate in ATA33 mode, limiting the speed and making the link less
sensitive to errors, crosstalk and noise from outside the cables. It
could be that the card can't handle the speed of the new drives. It could
also be that your cables aren't really ATA133. The folding done on some
cheap round cables reduces the effectiveness of the crosstalk prevention
provided by the extra wires.

Jim
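
To make the cable point concrete, here is a minimal sketch of how the cable type caps the Ultra DMA mode. The rates are the nominal UDMA burst rates, and the function name `negotiated_mode` is made up for illustration. Real mode selection is done by the controller BIOS and driver (including cable detection) and is simplified here to "take the lowest limit". The example values assume an ATA133 controller and ATA100 drives.

```python
# Nominal Ultra DMA burst rates in MB/s, keyed by UDMA mode number.
UDMA_RATES = {0: 16.7, 1: 25.0, 2: 33.3, 3: 44.4, 4: 66.7, 5: 100.0, 6: 133.0}

# A 40-conductor cable is limited to UDMA mode 2 (ATA33); modes 3 and up
# need the 80-conductor cable with its extra ground wires.
MAX_MODE_40_WIRE = 2
MAX_MODE_80_WIRE = 6


def negotiated_mode(controller_max, drive_max, conductors):
    """Simplified model: the link runs at the lowest of the three limits."""
    cable_max = MAX_MODE_80_WIRE if conductors == 80 else MAX_MODE_40_WIRE
    return min(controller_max, drive_max, cable_max)


if __name__ == "__main__":
    # Assumed example: ATA133 controller (mode 6), ATA100 drives (mode 5).
    for conductors in (40, 80):
        mode = negotiated_mode(6, 5, conductors)
        print(conductors, "wire cable: UDMA mode", mode, "at", UDMA_RATES[mode], "MB/s")
```

In this model the 40-wire cable drops an ATA100 drive to a third of its nominal rate, which is why a setup that only works on 40-wire cables points at a signal integrity or speed-related fault rather than at the data itself.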
 
Phil Ellett wrote:

O4: In one test (off-site) RAID array failed using round ATA100 cable but
appears to work fine with old standard 40 way ATA33 flat cables and two
(non-matching) older 40 and 60Gb IBM drives.

C4: Problem relates to corruption of data over IDE cables.

O5: Test repeated (on-site) with same 40 way ATA33 flat cables as above
but with IBM/Hitachi 120Gb 120GX drives. Problem STILL present.

Have you tried 80-wire flat cables on the newer drives? And I wouldn't rule
out a flaky drive. I have a 15 gig WD drive that would pass any test done
on it, but shortly after installing any OS on it the OS would crap out due
to files disappearing. Happened with Windows and Linux, and replacing the
drive fixed the problem. With RAID you get to guess which drive is the
problem!
 
Problem still exists with the Promise-supplied 80-wire flat cables. The
Promise card is ATA133 and as far as I know the drives are ATA100, therefore
the card should NOT be too slow for the drives.

Thanks.
 
Not a problem of the card or drives being too slow (in theory). The
only time it has worked is when it was slowed down to ATA33 speed by
using a 40-wire cable. It has never worked with any 80-wire cable
setup. I would replace the Promise card.

JT
 