Weird chkdsk issue

  • Thread starter: teckytim
I have a weird problem when I run chkdsk /r on an NTFS partition (Win XP
SP2). I get a message like:

"Windows replaced bad clusters in file 733
of name \FOLDER\FOLDER\FILE.EXT."

It's weird because chkdsk also reports:

"Windows has checked the file system and found no problem."
And "0 KB in bad sectors."

I guess the sectors are bad but readable, not bad enough to be set aside.
Each time, chkdsk makes several attempts to read them and then rewrite them.
By running "chkdsk /r /v", you'll perhaps get more information.
thought it was corrupt files. This seemed likely as I restored the
files to another disk from an old backup and got the same chkdsk
behaviour. Interestingly chkdsk /f reports no file problems (on either
disk) and the files seem to me perfectly OK.

chkdsk /f doesn't search for bad sectors. It doesn't read the files but the
directories and other metadata.
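
To illustrate the difference, a surface-style check of a single file amounts to reading every block and noting where the OS reports an I/O error. The following is a rough sketch only, assuming Python is available; the function name `scan_file` and the 4096-byte block size are illustrative choices, and this is not a description of how chkdsk itself works.

```python
import os

def scan_file(path, block_size=4096):
    """Read a file block by block.

    Returns (blocks_read_ok, failed_offsets), where failed_offsets lists
    the byte offsets at which the OS raised an I/O error.
    """
    failed = []
    blocks_ok = 0
    size = os.path.getsize(path)
    # buffering=0 avoids Python-level buffering; the OS cache may still
    # satisfy reads, so a clean result is not proof the media is healthy.
    with open(path, "rb", buffering=0) as f:
        offset = 0
        while offset < size:
            try:
                f.seek(offset)
                f.read(block_size)
                blocks_ok += 1
            except OSError:
                failed.append(offset)
            offset += block_size
    return blocks_ok, failed
```

An empty `failed_offsets` list only says the file could be read at the moment of the scan; it says nothing about sectors the drive recovered internally with retries.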
 
Rod Speed wrote:

(edited for brevity)
Yeah, bet that's it.
Perhaps


Not with a folder full of very large files, those get deleted
because there isn't enough room for them in the bin.

I always configure against that
Not if it just makes it too easy to delete a folder by not asking for confirmation of that delete.

I always configure against that.

But if I were in your shoes I'd find this story hard to swallow as
well.
 
Cl.Massé said:
I guess the sectors are bad but readable, not bad enough to be set aside.

That was one of my early suspicions. Unfortunately there is some
evidence against it or at least it is more complicated than that
because the files have now become affected.
Each time, chkdsk makes several attempts to read them and then rewrite them.
By running "chkdsk /r /v", you'll perhaps get more information.

I didn't think /v made the output any more verbose on NTFS with /r /f.
I'll try it and post new info if any.
behaviour. Interestingly chkdsk /f reports no file problems (on either
disk) and the files seem to me perfectly OK.

chkdsk /f doesn't search for bad sectors. It doesn't read the files but the
directories and other metadata.

I guess I was unclear. Copying the questionable files to a new disk
yielded the identical chkdsk /r output on *both* disks, identifying the
copied files as suspect. Very weird IMHO.
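
One way to confirm that the copy on the second disk really is byte-identical to the original is to hash both files and compare the digests. A minimal sketch, assuming Python is available; the helper names `sha256_of` and `same_content` are hypothetical and not part of any tool mentioned in this thread.

```python
import hashlib

def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks
    so that very large images do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_content(path_a, path_b):
    """True if both files produce the same SHA-256 digest."""
    return sha256_of(path_a) == sha256_of(path_b)
```

Matching digests would show only that the copy is faithful to the source. They would not show that the source itself is undamaged, which is why a check against a known-good backup or the application's own verify function is still needed.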
 
I guess the sectors are bad but readable, not bad enough to be set aside.

Unlikely given that restoring that file from the
backup produces the same result with just that file.
Each time, chkdsk makes several attempts to read them and then rewrite
them. By running "chkdsk /r /v", you'll perhaps get more information.
chkdsk /f doesn't search for bad sectors. It doesn't
read the files but the directories and other metadata.

Doesn't explain that cluster error message.
 
Rod said:
Not even possible.

Sure it is. By making sure there is always ample room.
Not even possible.

Flat wrong. You can choose to disable or enable delete confirmation.
Regardless of your choice delete confirmation is by default with
shift-delete in windows explorer.
In spades when it's MUCH more likely than some ghost in the machine.

No ghost in the machine. Some freaky damage perhaps. You have your
speculation I have mine. Neither has enough evidence to be proven or
discounted. But yes, of course it isn't the most convincing example,
because it easily invites your kind of speculation, being so rare and
freakish. But this whole incident is an OT distraction.
From my perspective the chkdsk issue remains unanswered. As a side
note it exists on at least 2 separate ATA disks and I've never seen
anything like it on enterprise storage even when the same files or
types or sizes and file systems are involved. Frankly I don't buy
speculation that it is a harmless chkdsk quirk and basically nothing
went wrong. Doesn't jibe with the history and behavior IMHO. Although
I appreciate all the stabs taken at the problem.
 
teckytim said:
Sure it is. By making sure there is always ample room.


Flat wrong. You can choose to disable or enable delete confirmation.
Regardless of your choice delete confirmation is by default with
shift-delete in windows explorer.


No ghost in the machine. Some freaky damage perhaps. You have your
speculation I have mine. Neither has enough evidence to be proven or
discounted. But yes, of course it isn't the most convincing example,
because it easily invites your kind of speculation, being so rare and
freakish. But this whole incident is an OT distraction.

From my perspective the chkdsk issue remains unanswered. As a side
note it exists on at least 2 separate ATA disks and I've never seen
anything like it on enterprise storage even when the same files or
types or sizes and file systems are involved. Frankly I don't buy
speculation that it is a harmless chkdsk quirk and basically nothing
went wrong. Doesn't jibe with the history and behavior IMHO. Although
I appreciate all the stabs taken at the problem.
Hi,

I forgot to ask the obvious: Are the files with this problem still
good? (i.e. do the Trueimage archives check out if you run Acronis
verify archive on them?).

Maybe the PATA interface/controller is the problem. A hardware failure
might have caused some form of corruption when writing large files (bad
cache ram perhaps?) that chkdsk can see but can't fix.

I now think you should just chuck the PATA and go all SCSI. :)

In contrast to some of the people on this board, I try to use enterprise
grade equipment because I'm very lazy about these kinds of puzzles, and I
just want things to work. SCSI has never taxed my competence the way
IDE has... :)
 
teckytim said:
Rod Speed wrote
Sure it is. By making sure there is always ample room.

Not even possible with a folder full of drive images.
Flat wrong.
Nope.

You can choose to disable or enable delete confirmation.

Sometimes you can, sometimes you can't.
Regardless of your choice delete confirmation is
by default with shift-delete in windows explorer.

That ain't the only way to delete folders.
No ghost in the machine.

Yep, just the usual problem between the chair and the keyboard.
Some freaky damage perhaps.

MUCH more likely to be the usual problem between the chair and the keyboard.
You have your speculation I have mine.

And I have the evidence that the problem is MUCH more
likely to have been between the chair and the keyboard than
some ghost in the machine which no one else has reported.
Neither has enough evidence to be proven or discounted.
Wrong.

But yes, of course it isn't the most convincing example, because
it easily invites your kind of speculation, being so rare and freakish.

That's evidence that your original claim is very unlikely indeed.
But this whole incident is an OT distraction.
Nope.

From my perspective the chkdsk issue remains unanswered.

Sure, and it's unlikely to be relevant to what is being discussed in this subthread.
As a side note it exists on at least 2 separate ATA disks and I've
never seen anything like it on enterprise storage even when the
same files or types or sizes and file systems are involved.

It won't be due to the 'non enterprise storage'.
Frankly I don't buy speculation that it is a harmless
chkdsk quirk and basically nothing went wrong.

You have always been, and always will be, completely and utterly irrelevant.

What you might or might not buy in spades.

There's a wealth of evidence of chkdsk quirks with NTFS file systems.
Doesn't jibe with the history and behavior IMHO.

'Course it does, most obviously that you get the
same result after a restore to a different drive.

That means it can't be some problem with bads on one drive.
 
David Flory said:
Hi,

I forgot to ask the obvious: Are the files with this problem still
good? (i.e. do the Trueimage archives check out if you run Acronis
verify archive on them?).

He already said that they are.
Maybe the PATA interface/controller is the problem.

Hard to see why that would affect just one file so repeatably.
A hardware failure might have caused some form of corruption when writing large files (bad cache
ram perhaps?) that chkdsk can see but can't fix.

chkdsk can't even check for file corruption of that sort.
I now think you should just chuck the PATA and go all SCSI. :)
In contrast to some of the people on this board,

It isn't a board.
I try to use enterprise grade equipment because I'm very lazy about these kinds of puzzles, and I
just want things to work. SCSI has never taxed my competence the way IDE has... :)

Plenty of others have had exactly the opposite result.
 
Hi,

I forgot to ask the obvious: Are the files with this problem still
good? (i.e. do the Trueimage archives check out if you run Acronis
verify archive on them?).

Acronis verify says it's OK. I was able to restore but did not
examine what was restored with any depth.

Clearly something did happen to the file that was not related to chkdsk
choking on the local filesystem. There were previous OK chkdsks and
copying over the file to a different filesystem yielded the same /r
problem.

Interestingly I recently did a chkdsk /r /v. It did not report any
extra verbose output (as I suspected). But now the disk image is not
identified as suspect (although the vmdisks previously identified still
are).
Maybe the PATA interface/controller is the problem. A hardware failure
might have caused some form of corruption when writing large files (bad
cache ram perhaps?) that chkdsk can see but can't fix.

The problem is so rare, minor, and poorly identified that IMHO blaming
anything too specifically is pure conjecture.

Disks are pretty sketchy devices. The frequency of soft errors on
modern high capacity disks is mind-boggling. That puts a lot of
responsibility on the ECC mechanisms. IMHO one has to wonder about
these things in a first-to-market & purely $/MB climate. But it's hard
to do anything more than speculate without good reporting and
diagnostic mechanisms.
I now think you should just chuck the PATA and go all SCSI. :)

I did for a while and was very happy as my old ata/PS headaches
disappeared. I went back for price reasons and the many Usenet
self-proclaimed *experts* who are always jumping up and down screaming
that PS is equally or at least sufficiently reliable. Now the old PS
headaches are back.
In contrast to some of the people on this board, I try to use enterprise
grade equipment because I'm very lazy about these kinds of puzzles, and I
just want things to work. SCSI has never taxed my competence the way
IDE has... :)

Well I don't mean to represent that all enterprise storage is
perfect. There are and have been a lot of real dogs posing as ES. But
with the last couple dozen models I worked really closely with, if I
had to generalize, problems are typically more severe or obvious with
ES. My ATA disks basically worked a long time. But I always found it
far more likely that they would slowly creep down a worrisome slope and
I'd decommission them for reliability issues rather than total outright
failure.

Frankly I'd much rather have a drive die and simply replace it and
restore from backup than worry and wonder about strange behaviour. I
don't find it consoling when a data-related malfunction is *minor.*
My problem now is that, on paper at least, it seemed stupid to go SCSI
for the LAN's usage and volume size- making more sense to put the
extra money in better backup.


But of course this is all just an OT distraction. If there *is* an
actual chkdsk & NTFS quirk someone should be able to post the actual
known bug/issue rather than just _ass_uming because none of us know how
to troubleshoot it.
 
But of course this is all just an OT distraction. If there *is* an
actual chkdsk & NTFS quirk someone should be able to post the actual
known bug/issue rather than just _ass_uming because none of us know how
to troubleshoot it.

Hmm. I wonder (more speculation!) if this bug could be related to the
fact that you are using both Windows 2000 and XP. As you know, NTFS was
updated slightly with XP, and the differences (along with very large
files) might be setting off an NTFS/chkdsk glitch. If it doesn't
actually cause corruption, it's likely that few people would have heard
about it. Maybe MSDN has something.

BR,

Dave

PS--I was joking a bit about the "all SCSI" comment! I'm a heavy SATA
user because I need the cheap storage, and SCSI is too expensive for
that purpose. I also don't have anything critical, though. I still
don't trust the cheap SATA drives enough for server use, although they
are great for keeping my huge media collection instantly available (in
case I feel like watching that 1.5GB 1982 MTV VHS rip).
 
Acronis verify says it's OK. I was able to restore but did not
examine what was restored with any depth.
Clearly something did happen to the file that was not related to
chkdsk choking on the local filesystem. There were previous OK
chkdsks and copying over the file to a different filesystem yielded the
same /r problem.
Interestingly I recently did a chkdsk /r /v. It did not report any
extra verbose output (as I suspected). But now the disk image is not
identified as suspect (although the vmdisks previously identified still are).
The problem is so rare, minor, and poorly identified that
IMHO blaming anything too specifically is pure conjecture.

Nope, not when no one else has reported what you claim to have seen.

MUCH more likely to be a problem between the chair and the keyboard.
Disks are pretty sketchy devices.

Like hell they are.
The frequency of soft errors on modern high capacity disks is mind-boggling.

Soft errors are completely irrelevant. It's unrecoverable hard errors that matter.

And all modern drives record those events when they occur.
That puts a lot of responsibility on the ECC mechanisms.

Nope, it's just basic mathematics.
IMHO one has to wonder about these things
in a first-to-market & purely $/MB climate.

Nope, that's the whole point of the ECC mechanism and the
SMART system, it keeps track of unrecoverable read errors.
But it's hard to do anything more than speculate
without good reporting and diagnostic mechanisms.

We've had those for years and years now.
I did for a while and was very happy as my old ata/PS headaches
disappeared. I went back for price reasons and the many Usenet
self-proclaimed *experts* who are always jumping up and down
screaming that PS is equally or at least sufficiently reliable.
Now the old PS headaches are back.

Nope. chkdsk has problems with SCSI drives too.
Well I don't mean to represent that all enterprise storage is
perfect. There are and have been a lot of real dogs posing
as ES. But with the last couple dozen models I worked really
closely with, if I had to generalize, problems are typically more
severe or obvious with ES. My ATA disks basically worked a
long time. But I always found it far more likely that they would
slowly creep down a worrisome slope and I'd decommission
them for reliability issues rather than total outright failure.

How odd that we don't get that effect.
Frankly I'd much rather have a drive die and simply replace it and
restore from backup than worry and wonder about strange behaviour.

How odd that we don't get strange behaviour.
I don't find it consoling when a data-related malfunction is *minor.*

Your problem. And it's got nothing to do with the
drive, everything to do with the complexity of NTFS.
My problem now is that, on paper at least, it seemed stupid
to go SCSI for the LAN's usage and volume size- making
more sense to put the extra money in better backup.

Yep, SCSI is WAY past its use-by date for personal desktop systems.
But of course this is all just an OT distraction. If there *is*
an actual chkdsk & NTFS quirk someone should be able
to post the actual known bug/issue rather than just
_ass_uming because none of us know how to troubleshoot it.

No assumption whatsoever. Even someone as stupid as you should
be able to turn up plenty of examples of warts with chkdsk and NTFS.

It would in fact be a hell of a lot more surprising if there
weren't given the complexity of what NTFS is doing.
 
David said:
Hmm. I wonder (more speculation!) if this bug could be related to the
fact that you are using both Windows 2000 and XP. As you know, NTFS was
updated slightly with XP, and the differences (along with very large
files) might be setting off an NTFS/chkdsk glitch. If it doesn't
actually cause corruption, it's likely that few people would have heard
about it. Maybe MSDN has something.

None of these are new. If the fault lies squarely on the shoulders
of NTFS and chkdsk this problem shouldn't be unheard of. There would
have to be a KB or something around. My search isn't over...
 
Rod Speed wrote:

(OT ignorant copouts edited for brevity)
No assumption whatsoever. Even someone as stupid as you should
be able to turn up plenty of examples of warts with chkdsk and NTFS.

Still with the copouts I see. If you have evidence of NTFS/chkdsk
failure in this particular way, post it.
It would in fact be a hell of a lot more surprising if there
weren't given the complexity of what NTFS is doing.

That's just stupid.


Oh and BTW. Remember this position?

[David A. Flory]
BTW, Chkdsk can report errors where there aren't any.

[Rod Speed]
That's file system errors tho, not bad clusters.


Have a nice day.
 
teckytim said:
Acronis verify says it's OK. I was able to restore but did not
examine what was restored with any depth.

Clearly something did happen to the file that was not related to chkdsk
choking on the local filesystem. There were previous OK chkdsks and
copying over the file to a different filesystem yielded the same /r
problem.

Interestingly I recently did a chkdsk /r /v. It did not report any
extra verbose output (as I suspected). But now the disk image is not
identified as suspect (although the vmdisks previously identified still
are).


The problem is so rare, minor, and poorly identified that IMHO blaming
anything too specifically is pure conjecture.
Disks are pretty sketchy devices.
The frequency of soft errors on modern high capacity disks is mind-boggling.

What makes you think they weren't previously?
That puts a lot of responsibility on the ECC mechanisms. IMHO one
has to wonder about these things in a first-to-market & purely $/MB
climate.
But it's hard to do anything more than speculate without good reporting
and diagnostic mechanisms.

Exactly. Previously these were not available and you were blissfully
unaware of what was going on.

[snip]
 
teckytim said:
Rod Speed wrote
Still with the copouts I see.

Still with your pathetic excuse for bullshit, we can all see.
If you have evidence of NTFS/chkdsk
failure in this particular way, post it.

Go and **** yourself.
That's just stupid.

Never ever could bullshit its way out of a wet paper bag.
Oh and BTW. Remember this position?
[David A. Flory]
BTW, Chkdsk can report errors where there aren't any.
[Rod Speed]
That's file system errors tho, not bad clusters.

So what, ****wit child ?
Have a nice day.

Have a shitty one yourself.
 
On Mon, 18 Dec 2006 20:02:02 +0100, "Folkert Rienstra"

(edited for brevity)
What makes you think they weren't previously?

I didn't intend to imply any comparison or history. I guess this
comment is also misleading as I'm more concerned about drive
electronics than disk surfaces per se.
Exactly. Previously these were not available and you were blissfully
unaware of what was going on.

Although I don't want to go further OT I don't think SMART is the
great salvation it purports to be.

But more importantly, are you aware of instances where chkdsk
*erroneously* reports bad clusters? Or is it safe to assume that in my
case no bad sectors were reported because the drive remapped them
(rather than chkdsk)? Or that chkdsk erroneously *believes* these
sectors were remapped by the drive?

Thanks
 
teckytim said:
Nope.

Nope.
I didn't intend to imply any comparison or history.
I guess this comment is also misleading as I'm more
concerned about drive electronics than disk surfaces per se.
Waffle.

Nope, it's just basic mathematics.

We've had those for years now.
Although I don't want to go further OT
Copout.

I don't think SMART is the great salvation it purports to be.

Only a fool has ever claimed it's anything even remotely
resembling anything like any 'great salvation'.

It does however accurately report the number of soft errors and hard errors.
But more importantly, are you aware of instances
where chkdsk *erroneously* reports bad clusters?

Even you should be able to find plenty of examples using google and groups.google.
Or is it safe to assume that in my case no bad sectors were
reported because the drive remapped them (rather than chkdsk)?

Nope, because reallocated sectors will show up in the
SMART report and that doesn't explain why restoring the file
to a different hard drive produces the same chkdsk report.

chkdsk DOES NOT AND CANNOT ANALYSE FILES FOR
PROBLEMS, ALL IT CAN EVER DO IS ANALYSE AND
OPTIONALLY REPAIR PROBLEMS WITH THE FILE SYSTEMS.
Or that chkdsk erroneously *believes* these
sectors were remapped by the drive?

chkdsk has no way of knowing that since by definition
the drive will only reallocate bad sectors on writes and
after that has been done, it's invisible to chkdsk.
 