Drive issues or just paranoia

  • Thread starter: John

John

Hi there,

I just put two new Seagate Constellation ES 500GB drives into my
system and was monitoring their SMART data (via Everest) to see if
anything looked amiss. I am noticing that the Raw Read Error Rates
look substantially worse than my other drives (full SMART report
follows, please excuse the poor formatting). At first, the boot drive
(9WJ1BS99) showed 100 for both "Value" and "Worse", but after about a
day it dropped to the values listed below. The second drive is pretty
much empty (formatted but only a few files on it) and has always shown
the numbers listed. Other drives that I've used all had RRER's in the
mid to high 90's or 100.

Also, the Seek Error Rate on the second drive seems to be
considerably worse than that of the first drive.

I am also curious about the Hardware ECC Recovered numbers. My other
drives do not report on this number so I can't compare them. From
what I've read on the net, they don't look bad but I was wondering.

I tried running HD Tach a couple of times to see if the numbers
changed. The RRER's never went above mid 70's. I also tried running
SeaTools and both drives passed with no problems reported.

Anyhoo.. I would really appreciate it if people who knew about these
things could take a look at the SMART reports and let me know if I
have a genuine worry or am just being overly concerned. Both drives
are still under warranty and I haven't put much on either yet so this
would be the time to replace them if they look bad.

Many thanks.


=== SMART REPORTS ====

[ ST3500514NS (9WJ1BS99) ]
ID  Attribute                           Thresh  Value  Worse  Data       Status
01  Raw Read Error Rate                 44      66     63     1248295    OK: Value is normal
03  Spinup Time                         0       97     97     0          OK: Always passes
04  Start/Stop Count                    20      100    100    11         OK: Value is normal
05  Reallocated Sector Count            36      100    100    0          OK: Value is normal
07  Seek Error Rate                     30      100    253    14330      OK: Value is normal
09  Power-On Time Count                 0       100    100    37         OK: Always passes
0A  Spinup Retry Count                  97      100    100    0          OK: Value is normal
0C  Power Cycle Count                   20      100    100    14         OK: Value is normal
B8  End-to-End Error                    99      100    100    0          OK: Value is normal
BB  Reported Uncorrectable Errors       0       100    100    0          OK: Always passes
BC  Command Timeout                     0       100    100    0          OK: Always passes
BD  High Fly Writes                     0       100    100    0          OK: Always passes
BE  Airflow Temperature                 45      74     72     471400474  OK: Value is normal
BF  Mechanical Shock                    0       100    100    0          OK: Always passes
C0  Power-Off Retract Count             0       100    100    6          OK: Always passes
C1  Load/Unload Cycle Count             0       100    100    14         OK: Always passes
C2  Temperature                         0       26     40     26         OK: Always passes
C3  Hardware ECC Recovered              0       54     24     1248295    OK: Always passes
C5  Current Pending Sector Count        0       100    100    0          OK: Always passes
C6  Offline Uncorrectable Sector Count  0       100    100    0          OK: Always passes
C7  Ultra ATA CRC Error Rate            0       200    200    0          OK: Always passes

[ ST3500514NS (9WJ1BRV8) ]
ID  Attribute                           Thresh  Value  Worse  Data       Status
01  Raw Read Error Rate                 44      78     63     73295541   OK: Value is normal
03  Spinup Time                         0       98     97     0          OK: Always passes
04  Start/Stop Count                    20      100    100    12         OK: Value is normal
05  Reallocated Sector Count            36      100    100    0          OK: Value is normal
07  Seek Error Rate                     30      61     60     1425386    OK: Value is normal
09  Power-On Time Count                 0       100    100    37         OK: Always passes
0A  Spinup Retry Count                  97      100    100    0          OK: Value is normal
0C  Power Cycle Count                   20      100    100    14         OK: Value is normal
B8  End-to-End Error                    99      100    100    0          OK: Value is normal
BB  Reported Uncorrectable Errors       0       100    100    0          OK: Always passes
BC  Command Timeout                     0       100    100    0          OK: Always passes
BD  High Fly Writes                     0       100    100    0          OK: Always passes
BE  Airflow Temperature                 45      75     71     454557721  OK: Value is normal
BF  Mechanical Shock                    0       100    100    0          OK: Always passes
C0  Power-Off Retract Count             0       100    100    6          OK: Always passes
C1  Load/Unload Cycle Count             0       100    100    15         OK: Always passes
C2  Temperature                         0       25     40     25         OK: Always passes
C3  Hardware ECC Recovered              0       51     31     73295541   OK: Always passes
C5  Current Pending Sector Count        0       100    100    0          OK: Always passes
C6  Offline Uncorrectable Sector Count  0       100    100    0          OK: Always passes
C7  Ultra ATA CRC Error Rate            0       200    200    0          OK: Always passes
 
John said:
I just put two new Seagate Constellation ES 500GB drives into my
system and was monitoring their SMART data (via Everest) to see if
anything looked amiss. I am noticing that the Raw Read Error Rates
look substantially worse than my other drives (full SMART report
follows, please excuse the poor formatting).

That's normal. Seagate drives handle that field very differently from everyone else.

At first, the boot drive (9WJ1BS99) showed 100 for both "Value" and
"Worse", but after about a day it dropped to the values listed below.
The second drive is pretty much empty (formatted but only a few files
on it) and has always shown the numbers listed. Other drives that
I've used all had RRER's in the mid to high 90's or 100.

See above.

Also, the Seek Error Rate on the second drive
seems to be considerably worse than the first drive.
I am also curious about the Hardware ECC Recovered numbers.
My other drives do not report on this number so I can't compare them.

Yeah, that's one downside of SMART: it varies quite a bit by manufacturer.

From what I've read on the net, they don't look bad but I was wondering.

Dangerous business; you can end up with a meltdown between the ears.

I tried running HD Tach a couple of times to see if the numbers changed.
The RRER's never went above mid 70's. I also tried running SeaTools
and both drives passed with no problems reported.
Anyhoo.. I would really appreciate it if people who knew about these
things could take a look at the SMART reports and let me know if I
have a genuine worry or am just being overly concerned.

The latter. Seagate drives do those fields differently.

You can see that by using Google to look at SMART reports on Seagate drives.

Both drives are still under warranty and I haven't put much on either
yet so this would be the time to replace them if they look bad.

They're fine, it's just a Seagate quirk.
Many thanks.


 
John said:
Hi there,
I just put two new Seagate Constellation ES 500GB drives into my
system and was monitoring their SMART data (via Everest) to see if
anything looked amiss. I am noticing that the Raw Read Error Rates
look substantially worse than my other drives (full SMART report
follows, please excuse the poor formatting). At first, the boot drive
(9WJ1BS99) showed 100 for both "Value" and "Worse", but after about a
day it dropped to the values listed below. The second drive is pretty
much empty (formatted but only a few files on it) and has always shown
the numbers listed. Other drives that I've used all had RRER's in the
mid to high 90's or 100.

Don't worry about it. If you do, for example, a lot of seeks,
the raw read error rate will be bad. That is normal. Also,
the raw read error rate is a pretty obscure field that
some people here have gone to great lengths to understand.

Also, the Seek Error Rate on the second drive seems to be
considerably worse than the first drive.

Indeed.

I am also curious about the Hardware ECC Recovered numbers. My other
drives do not report on this number so I can't compare them. From
what I've read on the net, they don't look bad but I was wondering.

They actually look pretty bad.

I tried running HD Tach a couple of times to see if the numbers
changed. The RRER's never went above mid 70's. I also tried running
SeaTools and both drives passed with no problems reported.
Anyhoo.. I would really appreciate it if people who knew about these
things could take a look at the SMART reports and let me know if I
have a genuine worry or am just being overly concerned. Both drives
are still under warranty and I haven't put much on either yet so this
would be the time to replace them if they look bad.
Many thanks.

I think something is wrong here. A marginal PSU, wobbly
mounting, or strong vibration could all be potential
sources.

Arno




 
You could download and install HD Sentinel, then go to
Configuration/Advanced Options and set
Health Calculation Method to Analyze Data Field (More Strict,
recommended for servers).

The Health % readings for the drives may then be below 100%, which would
indicate a not so perfect state. On the SMART tab you can see if it
thinks anything's bad.

If you pay for it, it can initiate the long self-test on the drives. But you
can probably download a Seagate utility, or just use Western Digital's
Data Lifeguard Diagnostic for Windows.

Plus whatever Arno says!
--
Ed Light

 
I just put two new Seagate Constellation ES 500GB drives into my
system and was monitoring their SMART data (via Everest) to see if
anything looked amiss. I am noticing that the Raw Read Error Rates
look substantially worse than my other drives (full SMART report
follows, please excuse the poor formatting). At first, the boot drive
(9WJ1BS99) showed 100 for both "Value" and "Worse", but after about a
day it dropped to the values listed below. The second drive is pretty
much empty (formatted but only a few files on it) and has always shown
the numbers listed. Other drives that I've used all had RRER's in the
mid to high 90's or 100.
Also, the Seek Error Rate on the second drive also seems to be
considerably worse than the first drive.
I am also curious about the Hardware ECC Recovered numbers.
=== SMART REPORTS ====

[ ST3500514NS (9WJ1BS99) ]
ID  Attribute               Thresh  Value  Worse  Data
01  Raw Read Error Rate     44      66     63     1248295
07  Seek Error Rate         30      100    253    14330
C3  Hardware ECC Recovered  0       54     24     1248295

[ ST3500514NS (9WJ1BRV8) ]
ID  Attribute               Thresh  Value  Worse  Data
01  Raw Read Error Rate     44      78     63     73295541
07  Seek Error Rate         30      61     60     1425386
C3  Hardware ECC Recovered  0       51     31     73295541

Although they are counterintuitive, the Seek Error Rate values for
both drives represent a perfect score.

I believe the relationship between the raw and normalised values of
the SER attribute is given by ...

normalised SER = -10 log10 (lifetime seek errors / lifetime seeks)

The total number of seek errors is recorded in the uppermost 16 bits
of the raw SER value, while the number of seeks is stored in the lower
32 bits. You will need a SMART utility that reports all 48 bits,
preferably in hexadecimal.

BTW, in the above formula, if the drive has recorded no errors, then
you will still need to set the number of errors to 1, otherwise the
result will be undefined (the logarithm of zero).

If we use your second drive as an example, then ...

normalised value of SER = -10 log10 (1/1425386) = 61.5
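
A minimal Python sketch of the decoding described above. It assumes a base-10 logarithm and the 16-bit/32-bit split of the 48-bit raw value as stated in this post; the function names are made up for illustration and are not part of any SMART utility.

```python
import math

def split_raw_ser(raw):
    """Split a 48-bit raw Seek Error Rate value into (seek_errors, seeks).

    Assumes the layout described in the post: errors in the uppermost
    16 bits, seeks in the lower 32 bits.
    """
    seek_errors = (raw >> 32) & 0xFFFF
    seeks = raw & 0xFFFFFFFF
    return seek_errors, seeks

def normalised_ser(raw):
    """Return -10 * log10(errors / seeks), treating zero errors as one."""
    seek_errors, seeks = split_raw_ser(raw)
    if seeks == 0:
        return None  # no seeks recorded, nothing to compute
    seek_errors = max(seek_errors, 1)  # avoid log10(0)
    return -10 * math.log10(seek_errors / seeks)

# Second drive (9WJ1BRV8): raw value 1425386, no errors recorded
print(split_raw_ser(1425386))             # (0, 1425386)
print(round(normalised_ser(1425386), 1))  # 61.5
```

The integer part of the result matches the 61 that the second drive reports as its current value.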

The reason that the first drive is still showing values of 100 and 253
for the current and worst values is that the data are not considered
to be significant until the drive has recorded 1 million seeks. When
this target is reached, the values should drop to 60 and 60, assuming
there have been no errors.
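
To see where the figure of 60 comes from, here is a small Python check. It assumes the same formula and the 1 million seek threshold mentioned above; the constant and function name are illustrative only, and the 100.0 placeholder simply stands in for the drive's initial reading.

```python
import math

# Seeks needed before the value is considered significant (assumed from the post)
SIGNIFICANCE_THRESHOLD = 1_000_000

def expected_ser(seeks, seek_errors=0):
    """Normalised Seek Error Rate expected under the assumptions above."""
    if seeks < SIGNIFICANCE_THRESHOLD:
        return 100.0  # drive still shows its initial value
    seek_errors = max(seek_errors, 1)  # zero errors is treated as one
    # Same as -10 * log10(seek_errors / seeks), written as a difference of logs
    return 10 * (math.log10(seeks) - math.log10(seek_errors))

print(expected_ser(14330))      # first drive, still below the threshold
print(expected_ser(1_000_000))  # the point at which the value becomes significant
```

With no errors, 1 million seeks gives 10 * log10(1000000) = 60, and the value then rises slowly as more error-free seeks accumulate.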

The raw values of the RRER and HER attributes represent a sector
count, not an error count. This figure rolls over to 0 once the count
reaches about 250 million. I suspect that the normalised values of
each attribute are recalculated when this occurs. This suggests that
these two attributes are updated according to a rolling average rather
than on a lifetime basis. I'm betting that the normalised values are
also logarithmic, but I'm not certain how they are calculated.

- Franc Zabkar
 