Question on IDE RAID 1

Bob Rafuse

I have a P4P800 that I have successfully migrated from a standalone IDE
to an IDE RAID 1 configuration (see earlier thread). Everything is
working fine, but I have a question:

How can I tell if/when one of the drives in the IDE RAID 1 goes bad
and needs replacing? Do I need to run the VIA RAID Tool at startup?
Will that tell me? Or does the VIA XP driver write an event to the XP
event log?

Any pointers, suggestions and/or dope-slaps anyone can provide would
be greatly appreciated.
 
"Bob Rafuse" said:
I have a P4P800 that I have successfully migrated from a standalone IDE
to an IDE RAID 1 configuration (see earlier thread). Everything is
working fine, but I have a question:

How can I tell if/when one of the drives in the IDE RAID 1 goes bad
and needs replacing? Do I need to run the VIA RAID Tool at startup?
Will that tell me? Or does the VIA XP driver write an event to the XP
event log?

Any pointers, suggestions and/or dope-slaps anyone can provide would
be greatly appreciated.

Bob

The first warning you will get is in the BIOS. The BIOS
should have RAID code, and when that code is loaded at
POST, it checks the connected drives for the special
reserved sector containing RAID ID info. If it finds
one drive of a mirrored pair but cannot find the other,
it should mark the array as "busted".

The BIOS should stop at that point.
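
As a rough illustration of the check described above, here is a minimal Python sketch of how a two-drive mirror could be classified from the number of members found at startup. The names `ArrayState` and `classify_mirror` are made up for illustration; this is not the actual RAID BIOS logic, only the kind of decision it appears to be making.

```python
from enum import Enum


class ArrayState(Enum):
    HEALTHY = "healthy"    # every member of the mirror was detected
    DEGRADED = "degraded"  # at least one member found, at least one missing
    FAILED = "failed"      # no member found at all


def classify_mirror(members_found: int, members_expected: int = 2) -> ArrayState:
    """Classify a mirror by how many of its member drives were detected.

    Hypothetical helper: a real RAID BIOS reads its own on-disk metadata,
    which is not modelled here.
    """
    if members_found < 0 or members_found > members_expected:
        raise ValueError("members_found must be between 0 and members_expected")
    if members_found == members_expected:
        return ArrayState.HEALTHY
    if members_found >= 1:
        return ArrayState.DEGRADED
    return ArrayState.FAILED


if __name__ == "__main__":
    for found in (2, 1, 0):
        print(found, "drive(s) found:", classify_mirror(found).value)
```

A drive that is merely slow to spin up would look the same as a missing drive in this model, which matches the nuisance errors mentioned next.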

For example, you may get the occasional nuisance error
report from the RAID BIOS, say if one of the drives in
the mirror is not detected within the timeout period.
In a case like that, I think people repair the problem
by deleting the array and creating the array again.
Creating the array should copy the data from the drive
that was first detected to the drive that was late
starting up. A good RAID controller will allow the rebuild
to occur while the array is being used.

The RAID BIOS or OS software is going to try to do
things such that the drives continue to be exact mirrors.
The info stored in the reserved sector should be
maintained by that software, to keep track of what
state the array is currently in. (And failure of the
reserved sector is a possible failure mechanism for
the drive, say if there is a power failure while the
heads of the disks are positioned over the reserved
sector. Sometimes the only thing missing in a failure
event is the reserved sector.)

You should be offered the option to boot with just
one drive of the mirrored set available. Once booted,
any RAID tools provided for use with the OS can be used.
You may be able to rebuild the array while the system is
running, and return to a proper mirrored pair again.

The time to test how a RAID mirror works is before you
have the data on it. Create a mirror, then disconnect
both drives, format one of the drives (on an ordinary
IDE interface somewhere) to simulate a problem, connect
the drives to the RAID controller, and practice repairing
the problem. You don't want to be learning how to do
maintenance on the drive when your life's work is stored
on there, you have no backup, and you cannot figure out
which drive is which. (Placing sticky labels on the drives
would be a good idea.)

For those reasons, I would rather have a good backup on
a removable device, than have a mirror. I like to have
a single drive on a computer, to get the best theoretical
reliability numbers possible, without going to the
complexity of using a mirror or RAID5. Operator error
represents a significant danger to your reliability numbers,
so knowing how to use the RAID array is important.

Having a RAID does not eliminate the need for backups. You
could have a power supply failure, say the +12V goes too
high, and burns out the motors on all disk drives at the
same time. Or say a lightning strike takes out the computer.
A backup on a removable piece of media would be invaluable
in a situation like that, whether you have a mirror or not.

You might also purchase a UPS, the kind that will allow an
orderly shutdown if you are away from the computer. The
array can be desynchronized if the power goes off in the
middle of a write operation, and I doubt the array will
detect it. (The data on the two disks would diverge, and
the drives might not be exact mirrors any more.) For that
reason, it is a good idea to equip a RAID computer with a UPS.
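
To make the idea of a desynchronized mirror concrete, here is a minimal Python sketch that compares two disk images block by block and reports the first block where they differ. It assumes you have taken an image file of each member drive with the array offline (for example with the drives attached to an ordinary IDE port); the function name `first_divergent_block` and the image file names are hypothetical, and controller metadata such as the reserved sector may legitimately differ between members.

```python
def first_divergent_block(path_a, path_b, block_size=64 * 1024):
    """Return the index of the first differing block, or None if identical.

    Images of different length are reported as diverging at the block
    where the shorter one ends.
    """
    index = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a = a.read(block_size)
            block_b = b.read(block_size)
            if block_a != block_b:
                return index
            if not block_a:
                # Both files reached end-of-file without a mismatch.
                return None
            index += 1


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python compare.py disk0.img disk1.img
    result = first_divergent_block(sys.argv[1], sys.argv[2])
    print("mirrors match" if result is None else f"first mismatch in block {result}")
```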

What the mirror could buy you is better "uptime", say if
you are doing professional Photoshop work on the machine,
and a disk goes out on you in the middle of the day. You
could continue to run with just the one disk, until your
current project is finished. Maybe that is worth something.
For other more casual uses of the computer, I'd prefer the
simplicity of using a single drive, plus frequent incremental
and total backups. A failure is a more abrupt event, but
with your spare blank drive in hand, and last night's
backup, you can be back up in a couple of hours.

(Also, on the topic of backups: in my years of watching other
people using computers, I've seen multiple occurrences of
people who were taking backup snapshots without ever testing
that the restore function works. Typically, a tape drive that
never gets cleaned is involved. The shocked looks on their
faces are priceless when they discover their backup software
doesn't do restores properly, or none of the tapes in the
current rotation has any data on it.)
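
One way to test a restore is to restore the backup to a scratch directory and compare it against the original files by checksum. The following is a minimal Python sketch of that comparison, not a feature of any particular backup program; the function names and the example paths are assumptions for illustration.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of one file, read in chunks so large files fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): file_digest(path)
        for path in root.rglob("*")
        if path.is_file()
    }


def restore_mismatches(original_root, restored_root):
    """Return sorted relative paths that are missing, extra, or different."""
    original = tree_digests(original_root)
    restored = tree_digests(restored_root)
    names = original.keys() | restored.keys()
    return sorted(n for n in names if original.get(n) != restored.get(n))


if __name__ == "__main__":
    # Hypothetical paths: the live data and a test restore of last night's backup.
    problems = restore_mismatches("D:/data", "E:/restore_test")
    print("restore verified" if not problems else f"mismatched files: {problems}")
```

Files that changed after the backup was taken will also show up as mismatches, so this works best on data that was idle in between.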

Paul
 
As per Paul +

I have not yet come across a built-in RAID controller that provides for
automatic notification of drive failure by, e.g., email. Add-in cards often
do, and that is part of their selling point.
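
Where the controller offers no notification, a small script can fill the gap, provided there is some way to read the array state. The Python sketch below shows only the alerting half; how the state is obtained is deliberately left out, because that depends on the vendor's tools. The names `needs_alert`, `build_alert` and `send_alert`, the state strings, the sender address and the SMTP host are all assumptions for illustration.

```python
import smtplib
from email.message import EmailMessage


def needs_alert(previous_state, current_state):
    """Alert only when the array enters a new non-healthy state."""
    return current_state != "healthy" and current_state != previous_state


def build_alert(array_name, state, recipient, sender="raid-monitor@localhost"):
    """Build the notification email for a state change."""
    message = EmailMessage()
    message["Subject"] = f"RAID alert: {array_name} is {state}"
    message["From"] = sender
    message["To"] = recipient
    message.set_content(
        f"Array {array_name} reported state '{state}'.\n"
        "Check the drives and replace the failed member if necessary.\n"
    )
    return message


def send_alert(message, smtp_host="localhost"):
    """Send the message through an SMTP server (assumed to be reachable)."""
    with smtplib.SMTP(smtp_host) as server:
        server.send_message(message)


if __name__ == "__main__":
    # The previous and current states would come from the vendor's tool or log.
    if needs_alert("healthy", "degraded"):
        print(build_alert("mirror0", "degraded", "admin@example.com"))
```

Alerting only on a state change keeps a degraded array from sending the same email on every poll.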

With the Intel ICH5R RAID, an icon appears in the task bar that flashes,
letting you know there has been a RAID disc failure. An easy way to simulate
one is to pull the power on the drive (if your drive is PATA, do not pull out
the IDE cable, and do this at your own risk). Whammo, drive failure. The Intel
ICH5R does rebuild RAID 1 volumes at Windows run time, but it is very slow -
about 1 minute per GB, and last I did it, the system was not usable during
that time.
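
For a rough sense of scale, the 1 minute per GB figure can be turned into an estimate of how long the machine would be tied up. The function name below is made up, and the rate is only the observation quoted above; actual rebuild speed will depend on the controller, the drives and the system load.

```python
def estimate_rebuild_minutes(capacity_gb, minutes_per_gb=1.0):
    """Estimate mirror rebuild time from drive capacity and an observed rate."""
    if capacity_gb < 0 or minutes_per_gb < 0:
        raise ValueError("capacity and rate must be non-negative")
    return capacity_gb * minutes_per_gb


if __name__ == "__main__":
    # An 80 GB drive at about 1 minute per GB: roughly 80 minutes.
    minutes = estimate_rebuild_minutes(80)
    print(f"about {minutes:.0f} minutes ({minutes / 60:.1f} hours)")
```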

It *is* very important to know what happens. The usual way to fix a failed
RAID is to plug in a replacement drive - the controller should automatically
do the rest - ICH5R does. You can check the current drive, reformat it and
plug it back in if no faults are found - your decision.

Too often people confuse the purpose of RAID 1. I will say it again, it is
*not* a data backup mechanism, it is a data protection mechanism for drive
failure scenarios only. To this end, it is common to find that a RAID 1 (or
5...) controller has marked a disc defective and broken the RAID pair
automatically and for you to not be able to find a fault with the drive.
Why? Perhaps the drive took 1 millisecond longer to respond than the timeout
settings allowed. Perhaps the drive allocated or needs to allocate an
alternate sector...
 
Paul,

"Paul" said:
For example, you may get the occasional nuisance error
report from the RAID BIOS, say if one of the drives in
the mirror is not detected within the timeout period.
In a case like that, I think people repair the problem
by deleting the array and creating the array again.
Creating the array should copy the data from the drive
that was first detected to the drive that was late
starting up. A good RAID controller will allow the rebuild
to occur while the array is being used.

With the VIA RAID BIOS, I believe I can just specify the "working"
disk as the new Source and a new disk as the new Mirror and the set
would be rebuilt. I haven't tried that though.

"Paul" said:
The time to test how a RAID mirror works is before you
have the data on it. Create a mirror, then disconnect
both drives, format one of the drives (on an ordinary
IDE interface somewhere) to simulate a problem, connect
the drives to the RAID controller, and practice repairing
the problem. You don't want to be learning how to do
maintenance on the drive when your life's work is stored
on there...

That's a good idea. I think I'll pick up another 80GB HD and create a
mirror on the secondary channel and try to force an issue so I'll know
what to do in the event of a real incident. Thanks.

"Paul" said:
You might also purchase a UPS, the kind that will allow an
orderly shutdown if you are away from the computer.

Already got one, thanks.

"Paul" said:
What the mirror could buy you is better "uptime", say if
you are doing professional Photoshop work on the machine,
and a disk goes out on you in the middle of the day. You
could continue to run with just the one disk, until your
current project is finished. Maybe that is worth something.
For other more casual uses of the computer, I'd prefer the
simplicity of using a single drive, plus frequent incremental
and total backups. A failure is a more abrupt event, but
with your spare blank drive in hand, and last night's
backup, you can be back up in a couple of hours.

I'm anal about performing backups. I wanted the mirror for two reasons
(well three if you count 'pure geek factor' ;-)): I do a lot of work
on large media and data files and wanted to be able to carry on in the
event of a disk failure. I also do a fair bit of database work and
use this PC as the server, so in both cases uptime is pretty
important... and well worth the "pittance" of another HD.

Thanks for the info!
 
Mercury,

"Mercury" said:
I have not yet come across a built-in RAID controller that provides for
automatic notification of drive failure by, e.g., email. Add-in cards often
do, and that is part of their selling point.

I've dealt with server setups that e-mailed in the event of RAID disk
fails, but IIRC it was a special app running that monitored the
hardware RAID controller.

I was just curious what I should look for with the IDE RAID, having
never used or set up a mobo IDE RAID before.

"Mercury" said:
With the Intel ICH5R RAID, an icon appears in the task bar that flashes,
letting you know there has been a RAID disc failure. An easy way to simulate
one is to pull the power on the drive (if your drive is PATA, do not pull out
the IDE cable, and do this at your own risk). Whammo, drive failure. The Intel
ICH5R does rebuild RAID 1 volumes at Windows run time, but it is very slow -
about 1 minute per GB, and last I did it, the system was not usable during
that time.

Good to know, thanks.

"Mercury" said:
It *is* very important to know what happens. The usual way to fix a failed
RAID is to plug in a replacement drive - the controller should automatically
do the rest - ICH5R does. You can check the current drive, reformat it and
plug it back in if no faults are found - your decision.

As I mentioned in my reply to Paul, that's a good idea. I'm going to
pick up a clone of my 80GB drive and perform a dry run on a secondary
RAID. (Gawd, I can't believe how cheap HDs are these days!)

"Mercury" said:
Too often people confuse the purpose of RAID 1. I will say it again, it is
*not* a data backup mechanism, it is a data protection mechanism for drive
failure scenarios only.

Yep. I have a plethora of DVD-RWs with my "real" backups on them.
The mirror was mainly for uptime in the event of disk failure during
long media file processing or the occasional database work.

Thanks!
 