File-Compare "fc" falsely reports mismatch between identical files

  • Thread starter: Rich Pasco
I must say, I've enjoyed reading through this thread and am delighted to find
a board without the standard "You don't agree with me so you're an idiot"
comments that are so prevalent everywhere else.

Rich stated something in this post that reminded me of a comment that
billious made that might help explain part of what is going on here. Then
again, maybe not.

Rich Pasco said:
The files are identical, as confirmed by the "COMP" command (which
is *only* binary). In fact they came to exist by the COPY command:
copy IMG_0001.dcm.raw file.bin

billious said:
COPY for instance appears to
recognise ^Z as end-of-file and will terminate copying a binary file at the
first one encountered

Now if this binary file does have a ^Z in it, could the copy command be
creating a shortened version of the file?
This doesn't explain why comp and fc produce different results though. Just
thought I'd throw that observation out there.
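
To make the ^Z question concrete, here is a minimal Python sketch of the two behaviours being discussed. It is only a model, not how COPY is actually implemented; the names copy_binary and copy_text_mode and the sample bytes are made up for illustration.

```python
CTRL_Z = b"\x1a"  # the ^Z byte that text-mode tools may treat as end-of-file


def copy_binary(data: bytes) -> bytes:
    """Model of a /B-style copy: every byte is duplicated."""
    return bytes(data)


def copy_text_mode(data: bytes) -> bytes:
    """Model of an /A-style copy: stop at the first ^Z, if there is one."""
    end = data.find(CTRL_Z)
    return data if end == -1 else data[:end]


# Hypothetical 9-byte "binary" payload with a ^Z at offset 6.
sample = b"DICM\x00\x01\x1a\x02\x03"
print(len(copy_binary(sample)))     # 9
print(len(copy_text_mode(sample)))  # 6
```

If a copy had been made the second way, the result would be shorter than the source, which is the kind of difference a binary compare ought to flag.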
 
Vizoere said:
I must say, I've enjoyed reading through this thread and am delighted to find
a board without the standard "You don't agree with me so you're an idiot"
comments that are so prevalent everywhere else.

Rich stated something in this post that reminded me of a comment that
billious made that might help explain part of what is going on here. Then
again, maybe not.

Now if this binary file does have a ^Z in it, could the copy command be
creating a shortened version of the file?
This doesn't explain why comp and fc produce different results though. Just
thought I'd throw that observation out there.

Oh, no! COPY! There's another can of worms!

You're quite correct, of course. COPY seems to create a truncated version
when fed with a "binary" file, such as MPEG.MPG, requiring the "/B" switch,
but appears to work happily with "ASCII" files. The question that arises
from there is: what does COPY use to determine whether the file is "ASCII" or
"binary"? In the early days, ASCII would be roughly "all bytes have
top-bit=0" - but with non-English extensions for accented characters and
later Unicode, what is the difference between "ASCII text" (as appears in
the COPY /? "documentation"), "non-ASCII text" (like a text file containing
accented characters or Unicode) and "binary"? And is the difference now
significant?

I'd suggest that the position-sensitive /A switches available for COPY which
appear to control recognition and regeneration of ^Zs in COPY's append-mode
are now pretty well anachronistic.

When I use COPY, I want a copy, byte-for-byte; hence a copy should simply
duplicate the file (given that I have not issued instructions to modify this
behaviour via control-switches) - or stated more simply, the default SHOULD
be /B. Naturally, this has the potential to break batches based on prior
assumptions, so it really can't be arbitrarily introduced at this late stage
of the game. Yet another aspect of the software application of phlogiston
theory to me, though.

---

I believe that the files in question were known to be identical, and the
problem arose because FC A B consistently gave one result whereas FC B A
consistently gave another. I'd have expected FC to yield the same result
consistently, independent of parameter order, and to me this points to a
fundamental flaw in FC, regardless of the fact that running FC in the
correct mode for "binary" files "corrects" the problem.

There's also the interesting contribution from Alex K. Angelopoulos - that
there's a fix to FC available. Reading between the lines, this appears to be
scheduled for XPSP4, IIRC. This fix is more than a year old but appears to
address a problem where the differences between files were NOT detected if
the only differences occurred at a byte position that is some multiple of
128. This problem I have not (knowingly) encountered. This fix, if that is
all it fixes, would not seem designed to cure the apparent
parameter-dependency, though. Perhaps Rich could re-test with the original
files (but I have grave doubts that the problem would be magically
rectified).
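
For what it's worth, a plain byte-for-byte comparison has no reason to depend on argument order. Here is a rough Python sketch of that expectation. It is not FC's actual algorithm, and first_difference is a hypothetical helper, but it shows a compare that gives the same answer for (A, B) and (B, A), including when the only difference sits at offset 128.

```python
def first_difference(a: bytes, b: bytes):
    """Return the offset of the first differing byte, or None if identical.

    A length mismatch counts as a difference at the shorter length.
    """
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


# Two hypothetical 256-byte buffers that differ only at offset 128.
a = bytes(range(256))
changed = bytearray(a)
changed[128] ^= 0xFF
b = bytes(changed)

print(first_difference(a, b), first_difference(b, a))  # 128 128
print(first_difference(a, a))                          # None
```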
 
Dos 6.22 Help

The /B switch specifies that the command interpreter is to read the
number of bytes specified by the file size in the directory. The /B
switch is the default value for COPY unless COPY is combining files.
 
Binary is for block things like files. The text mode is for stream-type
devices like ports, e.g. copy com1 test.txt
 
Vizoere said:
I must say, I've enjoyed reading through this thread and am delighted to find
a board without the standard "You don't agree with me so you're an idiot"
comments that are so prevalent everywhere else.

Rich stated something in this post that reminded me of a comment that
billious made that might help explain part of what is going on here. Then
again, maybe not.

Now if this binary file does have a ^Z in it, could the copy command be
creating a shortened version of the file?
This doesn't explain why comp and fc produce different results though. Just
thought I'd throw that observation out there.

No, the second file is not a truncated version of the first.
As I noted, the "COMP" command indicated they were identical.
If they were of different lengths, it would have said so instead.

- Rich
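
The reasoning in that reply (check sizes first, then bytes) can be sketched in Python as below. This is an illustration under assumptions, not COMP's real implementation: comp_like and its three return strings are hypothetical, and the file names are reused from the thread only as an example.

```python
import os
import tempfile


def comp_like(path_a: str, path_b: str) -> str:
    """Hypothetical COMP-style check: compare sizes first, then contents."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return "different sizes"
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            block_a = fa.read(4096)
            block_b = fb.read(4096)
            if block_a != block_b:
                return "different contents"
            if not block_a:  # both files exhausted without a mismatch
                return "identical"


with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "IMG_0001.dcm.raw")
    full_copy = os.path.join(tmp, "file.bin")
    short_copy = os.path.join(tmp, "short.bin")
    payload = b"DICM\x00\x01\x1a\x02\x03"
    for path, data in ((original, payload),
                       (full_copy, payload),
                       (short_copy, payload[:6])):
        with open(path, "wb") as handle:
            handle.write(data)
    print(comp_like(original, full_copy))   # identical
    print(comp_like(original, short_copy))  # different sizes
```

A truncated copy would be caught by the size check before any bytes were compared, which matches the point that an "identical" result rules out truncation.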
 