"preparing to copy" what is it doing? just copy my files already!

Jarkko Mattinen

What is the purpose of the "preparing to copy" stage of a file copy?
Is it checking for free space or what? Is there a way to skip it by either
diddling with the registry or is there perhaps a util out there that is a
smarter copy command?

When copying a few files it is no biggie, but the other day I was moving a
folder with about 20Gb consisting of about 900000 various sized files.
The "preparing to copy" took nearly 1(!) hour and the copying took a whopping
4 hours!! for 20 damn Gb!??!?

Is it checking for free space or what? Does it REALLY need to examine every
single one of the files before a copy?

And another thing. Why does it take less than 3 minutes to burn a CD with a
700MB totally legally acquired DivX movie, but copying the same file back to
disk takes nearly 6 minutes? From the same cd-drive that created it, which
is capable of 52x speed.


JM
-
 
Jarkko Mattinen said:
What is the purpose of the "preparing to copy" stage of a file copy?

Beats me. If I were writing the function, I'd use such a stage to map out a
plan that would optimize disk accesses (head movement) during the actual
copying process. Since developing that plan would read in the file metadata
information that you'd need anyway during the copy operation, any disk
access time you spend there will be saved later (as long as that information
remains in memory for that later use), so any benefit that the devised plan
confers is pretty much cost-free.
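
To make the idea concrete, here is a minimal Python sketch of what such a
planning pass could look like. It is not what Windows actually does (nobody
in this thread knows that); build_copy_plan and fits_on are hypothetical
names, and sorting by inode number is only an assumed, rough stand-in for
on-disk order (on some platforms the directory scan reports it as zero, in
which case the sort falls back to path order).

```python
import os
import shutil


def build_copy_plan(root):
    """Scan root once; return (plan, total_bytes).

    plan is a list of (inode, path, size) tuples sorted by inode number.
    Sorting by inode is only a rough, assumed stand-in for on-disk order.
    """
    plan = []
    total_bytes = 0
    pending = [root]
    while pending:
        directory = pending.pop()
        with os.scandir(directory) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    pending.append(entry.path)
                elif entry.is_file(follow_symlinks=False):
                    # One metadata read per file, kept in memory for the copy.
                    info = entry.stat(follow_symlinks=False)
                    plan.append((info.st_ino, entry.path, info.st_size))
                    total_bytes += info.st_size
    plan.sort()
    return plan, total_bytes


def fits_on(destination, total_bytes):
    """Return True if destination's volume reports enough free space."""
    return total_bytes <= shutil.disk_usage(destination).free
```

A copy routine would call build_copy_plan(source) first, check
fits_on(destination, total_bytes), and then copy the files in plan order.
The scan touches every file's metadata once, which is why it grows with the
number of files rather than with their total size.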

Jarkko Mattinen said:
Is it checking for free space or what? Is there a way to skip it by either
diddling with the registry or is there perhaps a util out there that is a
smarter copy command?

If it's what I described above, it's the smartest file-oriented copy
operation you're likely to be able to perform. So since skipping it would
only make performance worse, that's not an option I'd expect would be
provided.

Jarkko Mattinen said:
When copying a few files it is no biggie, but the other day I was moving a
folder with about 20Gb consisting of about 900000 various sized files.
The "preparing to copy" took nearly 1(!) hour and the copying took a whopping
4 hours!! for 20 damn Gb!??!?

Let's see. 900 K files in 20 GB (I assume you meant GB rather than Gb)
works out to about 22 KB average per file. Suppose you're using a 7200 rpm
IDE drive, and every file is internally contiguous on the disk (rather than
fragmented into multiple pieces, which is likely in a FAT file system and
possible even with NTFS). If the files are randomly spread around the disk,
it will take about 13 ms to access each file (roughly an average seek plus
half a rotation), yielding a best-case data rate of about 1.7 MB/sec. That
works out to over 3 hours total right there for 20 GB worth of files.
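
The arithmetic above can be checked with a few lines of Python. This is only
a sketch of the same back-of-the-envelope estimate: estimate_copy is a
hypothetical helper, and it assumes every file costs exactly one 13 ms access
with the transfer time itself ignored.

```python
def estimate_copy(total_bytes, file_count, access_ms):
    """Return (avg_file_bytes, bytes_per_second, hours) for a seek-bound copy.

    Assumes each file costs exactly one access of access_ms milliseconds and
    that transfer time is negligible next to it.
    """
    avg_file_bytes = total_bytes / file_count
    total_seconds = file_count * access_ms / 1000
    bytes_per_second = total_bytes / total_seconds
    return avg_file_bytes, bytes_per_second, total_seconds / 3600


avg, rate, hours = estimate_copy(20e9, 900_000, 13)
print(f"{avg / 1e3:.1f} KB per file, {rate / 1e6:.2f} MB/s, {hours:.2f} hours")
```

With the figures from the post this gives about 22.2 KB per file, 1.71
MB/sec, and 3.25 hours, matching the estimate.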

As I noted, the above is a best-case scenario. Among other things, it
assumes:

1) that you're copying the files to a different disk rather than to another
location on the same disk, so that you can be writing to the output disk
while reading the next file from the input disk. If you're copying to the
same disk, then a naive implementation would approximately double the
required time, since the writing would take nearly as much time as the
reading.

2) that you're somehow batch-updating the output-side directory structure
rather than performing a directory write and FAT write (in addition to the
file data write itself) for each file copied, though if you have the output
disk's write-back cache enabled it will do this for you to a large extent.

3) that you've previously scanned in the input-side directory structure as
described earlier, else you'd need some extra disk activity there as well.

In other words, if you're copying the files to the same disk, a naive
implementation would take around 6 hours, and the performance you're seeing
is quite good. If you're copying the files to a different physical disk,
then what you're seeing isn't impressive but still not all that much worse
than you should expect.
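
For comparison, a short Python sketch that sets the observed 4 hours against
both cases. naive_copy_hours is a hypothetical helper, and the factor of two
for a same-disk copy is the rough doubling assumed above, not a measurement.

```python
def naive_copy_hours(file_count, access_ms, same_disk):
    """Hours for a seek-bound copy; same-disk copies pay the access cost twice."""
    accesses_per_file = 2 if same_disk else 1
    return file_count * access_ms * accesses_per_file / 1000 / 3600


observed_hours = 4.0
for same_disk in (False, True):
    estimate = naive_copy_hours(900_000, 13, same_disk)
    label = "same disk" if same_disk else "different disk"
    ratio = observed_hours / estimate
    print(f"{label}: estimate {estimate:.2f} h, observed/estimate = {ratio:.2f}")
```

A ratio below 1 means the real copy beat the naive estimate; a ratio a
little above 1 means it was somewhat slower than the best case.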

Jarkko Mattinen said:
Is it checking for free space or what? Does it REALLY need to examine every
single one of the files before a copy?

Only if you want the copy operation to perform better than a simple-minded
implementation would.

Jarkko Mattinen said:
And another thing. Why does it take less than 3 minutes to burn a CD with a
700MB totally legally acquired DivX movie, but copying the same file back to
disk takes nearly 6 minutes? From the same cd-drive that created it, which
is capable of 52x speed.

Assuming that your hard drive isn't significantly fragmented (which would
make the copy to it write the file in many small pieces rather than a few
large ones), one possibility is that your special-purpose CD burning
software is optimized for streaming data to the CD (in fact, it almost has
to be, given how the burning process works) but the (normal file system)
functions used to read the CD are not similarly optimized for streaming data
from the CD.
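
To put the two timings on the same scale as the drive's rating, here is a
small Python sketch that converts them into multiples of 1x CD speed. It
assumes the usual 1x data rate of 150 KiB/sec and that the "700MB" file is
700 MiB; cd_speed_multiple is a hypothetical helper.

```python
CD_1X_BYTES_PER_SECOND = 150 * 1024  # 1x CD-ROM data rate: 150 KiB/s


def cd_speed_multiple(size_bytes, seconds):
    """Average transfer rate expressed as a multiple of 1x CD speed."""
    return size_bytes / seconds / CD_1X_BYTES_PER_SECOND


movie_bytes = 700 * 1024 * 1024  # assumes the "700MB" file is 700 MiB
burn = cd_speed_multiple(movie_bytes, 3 * 60)
read_back = cd_speed_multiple(movie_bytes, 6 * 60)
print(f"burn: about {burn:.1f}x, read back: about {read_back:.1f}x")
```

On those assumptions the burn averaged roughly 26x and the read back roughly
13x, so neither operation came close to 52x on average, and the read really
was running at about half the rate of the write.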

- bill
 