RAID and NTFS Question

  • Thread starter: john_20_28_2000

john_20_28_2000

Hi, I have a Windows 2000 Server (standard). It has a system disk and a
logical RAID set in addition. A few weeks ago, one of the drives in
the three of the RAID set died (not the system disk). RAID kept the
drives working and the other two drives performed as they are supposed
to.

What happened, however, was that the server got rebooted and when the
server started up NTFS or Win2k could not figure out where to put
certain files. Therefore, it created a bunch of dir.000x folders and
put the data in those folders. It wasn't all of them, just about 29
folders. Unfortunately, one of the folders was the Exchange priv.edb
mdbdata folder. But I digress.

Here is my question/statement: If RAID 5 works fine, but NTFS can't
keep up with "something" (I don't know what), then what good is the RAID 5?
Or, am I doing something fundamentally wrong? Is there some reason
that the system wouldn't be able to reference the files on the system
disk because one of the RAID 5 drives died? I thought that was the
whole point of having it.

Thank you for comments.
 
I'll see if I can find something on this later this evening as this will
take a bit of digging.

Was this a software RAID in Windows or a hardware RAID? Who is the RAID
card's vendor? Dynamic disks? How is the caching configured?

PS: Drive failure on a RAID array should be treated as an emergency. RAID
is supposed to protect you from failure, but not work as an extended crutch.
The drive should have been replaced within hours or days. With software
RAID there can sometimes be problems with caching and generally this is to
be avoided like the plague.
 
Here is my question/statement: If RAID 5 works fine, but NTFS can't
keep up with "something" (I don't know what), then what good is the RAID 5?
Or, am I doing something fundamentally wrong? Is there some reason
that the system wouldn't be able to reference the files on the system
disk because one of the RAID 5 drives died? I thought that was the
whole point of having it.

It is; it's supposed to work. It should (probably) get
slower when you lose 1 of 3 but it should operate.

NTFS is at another logical level and shouldn't care if
the disks "work".

Same for AD, which is higher yet again.

Since you didn't say "what files", perhaps those are the
files for replication of SysVol -- do you have (replicated)
DFS on this server? Either way the FRS has to replicate
at least SysVol.

Service pack / hot fix level?

Check the KB?
 
Sorry to hear about your troubles. I have to agree, though, that
software RAID should be avoided at all costs.

Newer IDE-based motherboards offer RAID support in hardware,
and I can vouch for both HighPoint and ITE RAID in a mirror config.

One thing to watch with the lower-cost hardware RAID controllers:
if one of your disks develops bad sectors, you might have a
problem figuring out which of the 2 disks in a mirror is bad.
Typically there is no provision in the driver to check for bad sectors
on a disk, so you have to disband the array, run chkdsk, hope SMART
on the disks updates if bad sectors are found, and then re-create the
mirror.

Josh.
 
Hi, I have a Windows 2000 Server (standard). It has a system disk and a
logical RAID set in addition. A few weeks ago, one of the drives in
the three of the RAID set died (not the system disk). RAID kept the
drives working and the other two drives performed as they are supposed
to.

In my experience with soft RAID-5 on a three drive system, you will see
a SIGNIFICANT performance hit if one drive dies. Almost to the point
that the system will crawl like a snail.

What happened, however, was that the server got rebooted and when the
server started up NTFS or Win2k could not figure out where to put
certain files. Therefore, it created a bunch of dir.000x folders and
put the data in those folders. It wasn't all of them, just about 29
folders. Unfortunately, one of the folders was the Exchange priv.edb
mdbdata folder. But I digress.

It looks like you had two problems: disk corruption caused by some issue,
and then a drive fault. If you don't check the RAID status before you
reboot then, again, you are asking for trouble.

Here is my question/statement: If RAID 5 works fine, but NTFS can't
keep up with "something" (I don't know what), then what good is the RAID 5?

RAID-5, when all disks are working, is a performance boost for multiple
requests of the drive system - meaning that more requests for data
spread across the drives can be sustained than on a Single disk. NTFS
has nothing to do with it.
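
To picture why requests spread across the drives, here is a rough sketch of how logical blocks could map onto a three-drive RAID-5 set. The `locate` function and the rotation order are assumptions for illustration only; a real controller's layout may differ.

```python
from typing import Tuple


def locate(block: int, drives: int = 3) -> Tuple[int, int]:
    """Map a logical block number to (stripe, drive) with rotating parity."""
    data_per_stripe = drives - 1              # one block per stripe holds parity
    stripe, index = divmod(block, data_per_stripe)
    # Assumed rotation: parity starts on the last drive and moves back one
    # drive per stripe. Real controllers may rotate differently.
    parity_drive = (drives - 1 - stripe) % drives
    # Data blocks fill the drives that are not holding parity, in order.
    data_drives = [d for d in range(drives) if d != parity_drive]
    return stripe, data_drives[index]


if __name__ == "__main__":
    for block in range(6):
        stripe, drive = locate(block)
        print(f"logical block {block}: stripe {stripe}, drive {drive}")
```

Consecutive blocks land on different drives, so independent requests can be serviced by different spindles at the same time. None of this is visible to NTFS, which only sees one logical volume.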

Or, am I doing something fundamentally wrong? Is there some reason
that the system wouldn't be able to reference the files on the system
disk because one of the RAID 5 drives died? I thought that was the
whole point of having it.

The point of using RAID-5 as opposed to RAID-1 is performance for random
reads in a multi-task environment. In addition to performance, you get
the benefit of being able to have any one drive go bad - hardware wise -
and not lose all the data in most cases. Drive corruption, because of a
drive "going" bad, can and does happen, even without RAID.
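
As a minimal sketch of why one dead drive is survivable, here is the XOR parity idea behind RAID-5 on a single stripe. The block contents and the helper name `xor_blocks` are made up for illustration; this is not how any particular controller is implemented.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length byte blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical three-drive set: two data blocks plus parity.
data_a = b"EXCHANGE"
data_b = b"PRIV.EDB"
parity = xor_blocks(data_a, data_b)

# Suppose the drive holding data_b dies: rebuild it from the two survivors.
rebuilt_b = xor_blocks(data_a, parity)
assert rebuilt_b == data_b

# The same works if the drive holding data_a is the one that fails.
assert xor_blocks(data_b, parity) == data_a
```

Note that this only covers a drive that is missing outright. If bad data gets written to a surviving drive, parity faithfully reproduces the bad data, which is why corruption is a separate problem from the drive fault.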

So, when was your last backup to something other than the 3 drives?

If the server was properly configured, you would have at least run NT
Backup for the Exchange service once a night and stored that backup on a
separate disk (at least).
 
But why would Windows 2000 not "understand" what happened and just
function as normal? I thought it wouldn't know anything about the
RAID, since it is hardware, but that it simply accesses the drives. That it
wouldn't care how many drives I had, just that the data was accessible.
Instead, when Win2k booted, it started talking about inconsistencies
and moved the files into the dir.000x folders.
 
You almost certainly have multiple or more serious
problems than just the RAID failure.
 
It is Adaptec hardware, stripe size 8 KB, 3 stripes, rebuild rate 170,
cache size 64 MB. WriteThru, Read Ahead, Cached I/O. Not sure what you
mean by dynamic. They are hot-swappable, if that is what you mean.
 
I would agree with the two-problem analysis. It sounds like Windows
recovered just fine when you had the drive failure; it did work for quite
some time after the failure. Also, as was mentioned, some RAID controllers
don't do sufficient checking and corruption on one disk can lead to
corruption of the entire volume. It is possible that the problem that led
to the drive failure was part and parcel of the issue that you are dealing
with now.

I know that's not a lot of help, but I would suggest that you look to your
last backups for the files and do full checks on the remaining volume to
repair any errors that are there and block off corrupted sectors.
 
But why would Windows 2000 not "understand" what happened and just
function as normal? I thought it wouldn't know anything about the
RAID, since it is hardware, but that it simply accesses the drives. That it
wouldn't care how many drives I had, just that the data was accessible.
Instead, when Win2k booted, it started talking about inconsistencies
and moved the files into the dir.000x folders.

As I said yesterday, you have two problems: one is the drive fault, the
other is corruption.
 