Hi,
I forgot to ask the obvious: Are the files with this problem still
good? (i.e. do the Trueimage archives check out if you run Acronis
verify archive on them?).
Acronis verify says it's OK. I was able to restore but did not
examine what was restored with any depth.
Clearly something did happen to the file that was not related to chkdsk
choking on the local filesystem. There were previous OK chkdsks and
copying the file over to a different filesystem yielded the same /r
problem.
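For what it's worth, one way to rule the copy step in or out is to hash the original and the copy and compare the results. This is only a rough sketch in Python, not anything Acronis or chkdsk does: the function names and the command-line usage are made up for illustration. A matching hash only shows that both copies read back the same bytes; it says nothing about whether those bytes were already damaged before the copy.

```python
import hashlib
import sys


def file_sha256(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks so that
    multi-gigabyte disk images do not have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, copy):
    """True if both files hash to the same value (hypothetical helper)."""
    return file_sha256(original) == file_sha256(copy)


if __name__ == "__main__":
    # Usage (paths are placeholders): python compare.py <original> <copy>
    if len(sys.argv) == 3:
        same = copies_match(sys.argv[1], sys.argv[2])
        print("identical" if same else "DIFFERENT")
```

Running it a few times against the same large file might also be telling: if the hash of one file changes between runs, the read path (cable, controller, cache) looks more suspect than the filesystem.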
Interestingly I recently did a chkdsk /r /v. It did not report any
extra verbose output (as I suspected). But now the disk image is not
identified as suspect (although the vmdisks previously identified still
are).
Maybe the PATA interface/controller is the problem. A hardware failure
might have caused some form of corruption when writing large files (bad
cache ram perhaps?) that chkdsk can see but can't fix.
The problem is so rare, minor, and poorly identified that IMHO blaming
anything too specifically is pure conjecture.
Disks are pretty sketchy devices. The frequency of soft errors on
modern high capacity disks is mind-boggling. That puts a lot of
responsibility on the ECC mechanisms. IMHO one has to wonder about
these things in a first-to-market & purely $/MB climate. But it's hard
to do anything more than speculate without good reporting and
diagnostic mechanisms.
I now think you should just chuck the PATA and go all SCSI.
I did for a while and was very happy as my old ata/PS headaches
disappeared. I went back for price reasons and the many Usenet
self-proclaimed *experts* who are always jumping up and down screaming
that PS is equally or at least sufficiently reliable. Now the old PS
headaches are back.
In contrast to some of the people on this board, I try to use enterprise
grade equipment because I'm very lazy about these kinds of puzzles, and I
just want things to work. SCSI has never taxed my competence the way
IDE has...
Well I don't mean to represent that all enterprise storage is
perfect. There are and have been a lot of real dogs posing as ES. But
with the last couple dozen models I worked really closely with, if I
had to generalize, problems are typically more severe or obvious with
ES. My ATA disks basically worked a long time. But I always found it
far more likely that they would slowly creep down a worrisome slope and
I'd decommission them for reliability issues rather than total outright
failure.
Frankly I'd much rather have a drive die and simply replace it and
restore from backup than worry and wonder about strange behaviour. I
don't find it consoling when a data-related malfunction is *minor.*
My problem now is that, on paper at least, it seemed stupid to go SCSI
for the LAN's usage and volume size; it made more sense to put the
extra money into better backup.
But of course this is all just an OT distraction. If there *is* an
actual chkdsk & NTFS quirk someone should be able to post the actual
known bug/issue rather than just _ass_uming because none of us know how
to troubleshoot it.