Detecting drive temporarily offline

  • Thread starter: _R

_R

I occasionally hear a drive sound like it's going offline and back
online in one of the systems nearby. Problem is, I can't tell which
computer, drive, or power cable is the culprit. Event viewer has no
clues. Is there a monitoring utility that will catch this?
 
_R said:
I occasionally hear a drive sound like it's going
offline and back online in one of the systems nearby.
Problem is, I can't tell which computer, drive, or power
cable is the culprit. Event viewer has no clues.

Likely because the drive itself is doing it.

Is there a monitoring utility that will catch this?

It might show up in Everest's SMART data.

Whether it does depends on the drive and
whether it records the spindown in its SMART data.
 
Previously _R said:
I occasionally hear a drive sound like it's going offline and back
online in one of the systems nearby. Problem is, I can't tell which
computer, drive, or power cable is the culprit. Event viewer has no
clues. Is there a monitoring utility that will catch this?

More likely a recalibration. No way to tell easily, since
the disk does not announce this to the world.

Arno
 
I occasionally hear a drive sound like it's going offline and back
online in one of the systems nearby. Problem is, I can't tell which
computer, drive, or power cable is the culprit. Event viewer has no
clues. Is there a monitoring utility that will catch this?

Windows? Power Options | Turn off hard disks after x minutes/hours
 
More likely a recalibration. No way to tell easily, since
the disk does not announce this to the world.

Arno

In 2003 I was using WD drives. After numerous spin-downs on RAID
systems they finally had to announce a BIOS problem. Wasn't easy
for them, evidently.

That particular sound gets my attention now. In the case of the WD,
the errors would turn up in the event log as delayed writes or
complete write failures. It seems like some temporary polling mechanism
could detect if that's happening. It would probably turn up within a
day or so.

Maybe I have to get up to speed on how to access drives' SMART data.
 
_R said:
In 2003 I was using WD drives. After numerous spin-downs on RAID
systems they finally had to announce a BIOS problem. Wasn't easy
for them, evidently.

That particular sound gets my attention now. In the case of the WD,
the errors would turn up in the event log as delayed writes or
complete write failures. It seems like some temporary polling mechanism
could detect if that's happening. It would probably turn up within a
day or so.

Maybe I have to get up to speed on how to access drives' SMART data.

Just use Everest.
http://www.lavalys.com/products/overview.php?pid=1&lang=en
 
Previously _R said:
More likely a recalibration. No way to tell easily, since
the disk does not announce this to the world.

Arno

In 2003 I was using WD drives. After numerous spin-downs on RAID
systems they finally had to announce a BIOS problem. Wasn't easy
for them, evidently.

That particular sound gets my attention now. In the case of the WD,
the errors would turn up in the event log as delayed writes or
complete write failures. It seems like some temporary polling mechanism
could detect if that's happening. It would probably turn up within a
day or so.

Maybe I have to get up to speed on how to access drives' SMART data.

Hmm. If it is this type of problem, SMART might not help, since it will
likely not be accessible during the spin-down either. I think
you may have to detect the disk being unresponsive. How to do that
depends on your OS and the disk load.

On the other hand, the Start_Stop_Count attribute in SMART may show
increases that should not be there. You can poll this with some sort of
cron job or polling daemon. The easiest way to get SMART data for
further processing is with the command-line smartmontools. You
may even be able to configure smartd (part of smartmontools) to
do this monitoring for you.

Arno
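
A minimal sketch of the Start_Stop_Count polling suggested above, assuming
smartmontools is installed and smartctl is on the PATH. The function names,
the device names and the 60-second interval are placeholders, not anything
from this thread. It reads the attribute table from "smartctl -A" and prints
a line whenever the raw count changes between polls.

```python
import subprocess
import time


def parse_start_stop_count(smartctl_output):
    """Return the raw Start_Stop_Count from `smartctl -A` text, or None."""
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows have ten columns; the raw value is the last one.
        if len(fields) >= 10 and fields[1] == "Start_Stop_Count":
            try:
                return int(fields[9])
            except ValueError:
                return None
    return None


def read_start_stop_count(device):
    """Run `smartctl -A` on one device and extract the counter."""
    result = subprocess.run(
        ["smartctl", "-A", device], capture_output=True, text=True
    )
    return parse_start_stop_count(result.stdout)


def watch(devices, interval_seconds=60):
    """Poll each device forever and report any change in the counter."""
    last_seen = {}
    while True:
        for device in devices:
            count = read_start_stop_count(device)
            if count is None:
                continue  # attribute missing or drive did not answer
            if device in last_seen and count != last_seen[device]:
                stamp = time.strftime("%Y-%m-%d %H:%M:%S")
                print(f"{stamp} {device}: {last_seen[device]} -> {count}")
            last_seen[device] = count
        time.sleep(interval_seconds)
```

Calling watch(["/dev/sda", "/dev/sdb"]) with your own device names and
leaving it running for a day should show which drive, if any, is racking up
unexpected start/stop cycles. Not every drive reports this attribute, so a
drive that never shows up here is not necessarily in the clear.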
 
Peter said:
Windows? Power Options | Turn off hard disks after x minutes/hours

What are the chances of a drive timing out after several minutes to an hour
of inactivity and then immediately being hit by a request that fires it up again?
 
Hmm. If it is this type of problem, SMART might not help, since
it will likely not be accessible during the spin-down as well.

That shouldn't be a problem, because the spindown appears to
be only a short-term one. If it isn't, access to the SMART
data should spin the drive up again, and even if it doesn't, you will know
which drive has spun down because its SMART data won't be available.

I think you may have to detect the disk being unresponsive.
How to do that depends on your OS and the disk load.

Nope, just the SMART data not being available is all the info you need.
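
A rough sketch of that "is the SMART data available?" check, again assuming
smartctl from smartmontools is on the PATH. The helper names and the
30-second timeout are made up for illustration. It leans on smartctl's
exit status being a bitmask in which bit 1 means the device could not be
opened (or is in a low-power mode) and bit 2 means a SMART or ATA command
failed. It also times the call, since a drive that has to spin up first
should answer noticeably more slowly.

```python
import subprocess
import time

DEVICE_OPEN_FAILED = 0x02    # bit 1 of smartctl's exit status
SMART_COMMAND_FAILED = 0x04  # bit 2 of smartctl's exit status


def smart_unavailable(returncode):
    """True if the exit status says the drive could not be queried."""
    return bool(returncode & (DEVICE_OPEN_FAILED | SMART_COMMAND_FAILED))


def probe(device, timeout_seconds=30):
    """Return (unavailable, elapsed_seconds) for one SMART query."""
    started = time.monotonic()
    try:
        result = subprocess.run(
            ["smartctl", "-A", device],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
        unavailable = smart_unavailable(result.returncode)
    except subprocess.TimeoutExpired:
        # No answer at all within the timeout counts as unavailable.
        unavailable = True
    return unavailable, time.monotonic() - started
```

Running probe() against each drive every minute or so and logging any
result that is unavailable or unusually slow would narrow down the
culprit, provided the poll happens to land while the drive is down.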
 