Win2K Server defrag problem

  • Thread starter: Brian Steele

Brian Steele

Situation:
Compaq DL380 server with four drives (2x18GB, 2x9GB), hardware-mirrored.
The 18GB mirror is split into three partitions, with two combined into one
dynamic volume (D:). Formatted NTFS, compression enabled.

Problem:
When I try to defrag the D: volume, the defrag process seems to start
properly, but shortly afterwards the system grinds to a halt and nothing but
a cold boot restores operation. Running defrag in analyze mode seems to work
OK, and in fact running it after the cold boot shows that some of the
existing fragmentation was cleared up before the system hung.

CHKDSK /F against the D: volume does not report any errors.

Any ideas on what might be causing this problem?


Regards,
Brian
 
When you say 'grinds to a halt' does that mean that the mouse and keyboard
are non-responsive or that performance is completely tanked?

Have you tried running the defrag operation w/Taskman running to see which
process(es) are using the CPU?

When you run Defrag, there are basically 2 steps:
1) The defrag utility reads the cluster bitmap, which will tell it which
clusters are in use, which are free (it's used to create the image in the
utility of free/busy clusters).

2) Using that info, it basically just tells the OS to grab a cluster and
move its contents. The filesystem actually does the moves.
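
To make the two steps concrete, here is a toy model in Python. It is only a sketch under stated assumptions: the volume is a plain list of booleans, each file is a list of cluster numbers, and the helper names (`build_bitmap`, `find_free_run`, `defrag_file`) are made up for illustration. On a real system the defragger gets the bitmap and requests the moves through filesystem control codes (FSCTL_GET_VOLUME_BITMAP and FSCTL_MOVE_FILE), and the filesystem driver does the actual work.

```python
from typing import Dict, List, Optional


def build_bitmap(files: Dict[str, List[int]], total_clusters: int) -> List[bool]:
    """Step 1 (toy version): build the in-use/free map for the whole volume."""
    bitmap = [False] * total_clusters
    for clusters in files.values():
        for cluster in clusters:
            bitmap[cluster] = True
    return bitmap


def find_free_run(bitmap: List[bool], length: int) -> Optional[int]:
    """Return the start of the first run of `length` free clusters, or None."""
    run_start, run_len = 0, 0
    for index, in_use in enumerate(bitmap):
        if in_use:
            run_len = 0
            continue
        if run_len == 0:
            run_start = index
        run_len += 1
        if run_len == length:
            return run_start
    return None


def is_contiguous(clusters: List[int]) -> bool:
    """True when every cluster directly follows the previous one."""
    return all(b == a + 1 for a, b in zip(clusters, clusters[1:]))


def defrag_file(bitmap: List[bool], files: Dict[str, List[int]], name: str) -> bool:
    """Step 2 (toy version): move one fragmented file into a free run.

    Returns True if the file was moved, False if it was already contiguous
    or no free run large enough exists.
    """
    clusters = files[name]
    if is_contiguous(clusters):
        return False
    start = find_free_run(bitmap, len(clusters))
    if start is None:
        return False
    for offset, old in enumerate(clusters):
        bitmap[start + offset] = True   # destination cluster becomes in use
        bitmap[old] = False             # source cluster is released
    files[name] = list(range(start, start + len(clusters)))
    return True
```

For example, on a 12-cluster volume with file "a" at clusters [0, 3, 5] and file "b" at [1, 2], calling `defrag_file` on "a" relocates it to clusters 6 through 8, while "b" is left alone because it is already contiguous. Note that every move is a separate read and write, which is why a defrag pass puts so much I/O load on the storage driver stack.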

So...it could be a problem with a driver that isn't dealing well with the
stress load of the high IOs. Some virus scanners don't deal well with a
defrag operation (they detect that a file was touched and kick off a scan
for each cluster). So, that could be something too. Taskman would tell you
who/what is using the CPU and you may be able to make a diagnosis from
there.

Pat
 
Answers:

"Grinds to a halt" = mouse and keyboard become non-responsive. Ok, the
mouse continues working for a while, but everything else comes to a stop.
CPU usage drops to zero, and CTRL-ALT-DELETE to bring up the Task Manager
does not work. Trying to stop the defragger also does not work. The only
thing that does work after this point is reached is cold-booting the system.

The virus scanner that I'm using on the system is TrendMicro's
ServerProtect. I tried stopping the ServerProtect service and got the same
results I was getting before.

I'm trying to use the freeware "contig" utility from Sysinternals, but it's
taking forever to do each file (particularly on the "scanning disk" part),
CPU usage pegs at 100% and the server's performance is being significantly
affected. Well, at least it doesn't hang to the point that the server stops
responding... :-(.


Regards,
Brian
 
So, the good news is that it sounds like things aren't hung. The bad news
is that the reason for the non-responsiveness is that there is a driver in
the kernel that is likely queuing quite badly. If you want to debug the
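issue, you can call MS-Support. The most likely scenario is that you will
need to repro the problem, then force a manual blue screen (hold the right
CTRL key and press SCROLL LOCK twice; this requires the CrashOnCtrlScroll
registry value to be enabled beforehand) to create a memory dump, which
they can analyze to tell you what is queuing.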

Or (and this is what I would recommend) contact the controller card vendor
and see if there are updated drivers and/or firmware.


Pat
 