RAID disk "degraded" but diagnostics say they are happy

  • Thread starter: 116e32s

116e32s

I have a PC set up with RAID1 by somebody else.
It has an Asus P6T SE board with ICH10R.
Upon booting, the BIOS whinges that a disk is degraded, giving its
serial number. Then Windows also pops up a helpful alert that the disk
on port 3 is degraded and that a human should replace the offending item.

Instead, I went into the BIOS, changed the SATA setting from RAID to
IDE, and ran the Seagate disk tools on both disks. SMART reads okay on
both, so I ran the long diagnostic. Both disks pass.
So maybe I have an intermittent fault, either drive or controller?
They have done 33,000 hours (roughly 3.8 years of continuous
spinning), so perhaps it's time for a trade-in...
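For reference, the same health check the Seagate tools perform can be
scripted. A minimal sketch in Python, assuming smartmontools is
installed and the two drives show up as /dev/sda and /dev/sdb
(placeholder device names):

    # Minimal sketch: ask each drive for its overall SMART health
    # verdict, the same self-assessment "smartctl -H" reports.
    # Assumes smartmontools is installed; device names are placeholders.
    import subprocess

    for dev in ("/dev/sda", "/dev/sdb"):
        result = subprocess.run(
            ["smartctl", "-H", dev],
            capture_output=True, text=True,
        )
        status = "PASSED" if "PASSED" in result.stdout else "check output"
        print(dev, "->", status)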
 

"Smart" Diagnostic hours - bloody humbug. Got an old Seagate 200G,
maybe three times that. I forget. Been replaced with two SSDs on
either side of a token of more modern T-class plattered drives now
available. As a backup of one of the SSDs, the Seagate is fine,
functionally not going anywhere else anytime too soon.

How do I know these things? I don't; but I'll be goddamned before I
let some diagnostic horseshit tell me the drive's no good. The
old-school approach was usually some form of churn routine, writing
and verifying sectors (over repeated iterations) to look for
fault-prone indications. _Factory-level_ initialization was another
option: a Low Level Format, once specific to a manufacturer and
sometimes to the drive series (I'm not entirely sure what newer "LLF"
routines intend to accomplish).
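
For the curious, that churn idea reduces to something like the toy
sketch below. It targets a scratch file, since pointing an actual
churn at a raw device is destructive; the file path, pattern, and pass
counts are all arbitrary choices:

    # Toy sketch of a write-and-verify "churn": write a known pattern,
    # read it back, and flag any mismatch. DESTRUCTIVE if aimed at a
    # real device, so this version uses a scratch file as a stand-in.
    import os

    TARGET = "/tmp/scratch.bin"   # stand-in for the raw disk
    BLOCK = b"\xA5" * 4096        # alternating-bit test pattern
    PASSES = 2
    BLOCKS = 1024                 # ~4 MiB of scratch space

    for p in range(PASSES):
        with open(TARGET, "wb") as f:
            for _ in range(BLOCKS):
                f.write(BLOCK)
        with open(TARGET, "rb") as f:
            for i in range(BLOCKS):
                if f.read(len(BLOCK)) != BLOCK:
                    print(f"pass {p}: mismatch at block {i}")
                    break
            else:
                print(f"pass {p}: verified clean")

    os.remove(TARGET)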

A drive that won't come cleanly out of a RAID routine, though, doesn't
in my book automatically deserve a trip to the store for a new one.
Not without having reformatted it first for at least a cursory
inspection of its actual operating characteristics, apart from
whatever the RAID software has layered on top - and without ruling out
the controller chips and whatever quirks are idiomatic to them.

By way of illustration, I had two Silicon Technology PCI HDD
controllers, and one of them, the older purchase, would randomly lock
up during transfers and sometimes cause inexcusable delays in software
operation.

Who's to say those diagnostics are indicative of anything - aside from
whatever else, God forbid, MS purports? I'm not even going to tell you
why I don't run RAID, in any mode, on anything I particularly care (or
dare) to build. In the end I'll take a simple format/fdisk routine and
a careful inspection of anything subsequently amiss or suspect.

Granted, a looney PCI controller isn't something for hard-and-fast
rules. There aren't any. The only diagnostics available are the ones
you reach behind and pull from the seat of your pants.

Knock on wood: I'm still not over, nor entirely comfortable since,
upgrading to another series of PCI controller from the same
manufacturer. But hey, if that's about the only band playing, where
else ya gonna go dance?
 

RAID is not a backup system.

Make a backup of the remaining healthy disk.

Then, you can attempt a rebuild if you want. The rebuild will clone
the good disk to the bad disk (in RAID1 mirror terms).

An array can degrade if a disk's internal error-recovery time exceeds
the timeout of the RAID hardware. This is where disks with TLER come
in. Some (more expensive for no good reason) WD disks have the
recovery time truncated, so that a disk having trouble reading a
sector won't exceed the response timeout of the RAID hardware and
driver. A regular disk can take as long as 15 seconds when attempting
to read a bad sector; the RAID timeout is less than that, while a disk
with TLER enabled might give up after about 5 seconds. Declaring an
issue early means fewer attempts were made to recover the sector
(which is bad), but at least the array doesn't drop to a degraded
state unnecessarily (which is good).
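
On drives that support it, this recovery limit is exposed through SCT
Error Recovery Control, which smartmontools can read and set. A
minimal sketch (values are in tenths of a second; /dev/sda is a
placeholder):

    # Minimal sketch: inspect and cap the SCT Error Recovery Control
    # (TLER-style) timeouts via smartctl. Units are tenths of a second,
    # so 70 means 7.0 s. /dev/sda is a placeholder device name.
    import subprocess

    dev = "/dev/sda"

    # Read the current read/write recovery limits, if the drive
    # supports SCT ERC at all.
    subprocess.run(["smartctl", "-l", "scterc", dev])

    # Cap both limits at 7 s, so a struggling read gets reported before
    # the RAID driver's own timeout drops the whole disk from the array.
    subprocess.run(["smartctl", "-l", "scterc,70,70", dev])

Note that on many drives the setting doesn't survive a power cycle, so
it has to be reapplied at each boot.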

Paul
 

RAID can and does hiccup from time to time. It's hard to keep the two
disks synchronized all of the time. One disk might run into a weak
sector (still readable, just requiring more retries) while the other
is just fine, and to the RAID software that looks like a disk going
bad. That's why drives specially designed for RAID are sold: they have
higher tolerances for intermittent timing problems.
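
To put rough numbers on that race (both figures below are assumptions
for illustration, not the specs of any particular product):

    # Toy model of the timeout race: a weak-but-readable sector that
    # needs many retries can outlast the RAID controller's patience.
    CONTROLLER_TIMEOUT_S = 10.0  # assumed RAID-driver limit
    RETRY_TIME_S = 1.5           # assumed time per read retry

    def dropped_from_array(retries_needed: int) -> bool:
        # The controller gives up on the disk once cumulative retry
        # time passes its own timeout, even though the sector would
        # eventually have read successfully.
        return retries_needed * RETRY_TIME_S > CONTROLLER_TIMEOUT_S

    for retries in (2, 6, 12):
        verdict = ("dropped (degraded)" if dropped_from_array(retries)
                   else "still in array")
        print(f"{retries} retries -> {verdict}")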

Yousuf Khan
 
I don't know how many times I have run into cases where the SATA cable
was of the non-locking type, or where other parts in use, like the
hard drive or motherboard, were made before locking SATA cables and
sockets became the "standard". Add a little vibration of some kind and
arrays start showing up as degraded.

Then again, many of the people I assist are still using equipment made
at the dawn of SATA...
 
In the last episode of <[email protected]>,
Mark F said:
I thought the main difference for drives intended for RAID was that
the internal error recovery defaulted to time-limited techniques, so
an error would be returned in a couple of seconds (or less), rather
than after a 5-minute (or longer) recovery attempt.

They're supposedly better able to handle the vibration involved with
being installed in an array of similar drives, although whether that's
true or not is a matter of some debate.
 