Question about using a large number of small files in Win2K.

  • Thread starter: Ken Varn
We have a system that can generate over 1 million files per day at roughly
15-20 KB per file. Each day we create a new sub-directory for that day's
files, and a process periodically deletes the oldest directory when the
drive becomes full. The problem is that when the last file in a
sub-directory is deleted, the file system appears to "hang" for anywhere
from 15 seconds to 2 minutes before the delete completes. We contacted
Microsoft about this, and it appears to be unavoidable due to the
architecture of the NTFS MFT. So we implemented a workaround of never
removing the last file and re-using the directory instead. However, I am
concerned about fragmentation. After running the system for over 2 weeks, I
tried to run a defrag analysis. The analysis will not run.
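
For reference, the generate/purge cycle is roughly like the sketch below.
The root path, the free-space threshold, and the date-named directory scheme
are only illustrative, not our actual code; the key point is that the purge
step always leaves one file behind so the directory is never emptied and can
be re-used.

import os
import shutil
import time

# Illustrative values only -- the real paths and thresholds are site-specific.
ROOT = r"D:\capture"            # volume holding the daily sub-directories
MIN_FREE_BYTES = 5 * 2**30      # purge when less than ~5 GB is free

def todays_dir():
    """Create (or reuse) the sub-directory for today's files."""
    path = os.path.join(ROOT, time.strftime("%Y%m%d"))
    os.makedirs(path, exist_ok=True)
    return path

def purge_oldest_day():
    """Delete the oldest day's files when the drive fills up.

    To avoid the long pause NTFS hits when the last file in a directory
    is removed, one file is left behind and the (now almost empty)
    directory is kept around for re-use.
    """
    days = sorted(d for d in os.listdir(ROOT)
                  if os.path.isdir(os.path.join(ROOT, d)))
    if not days:
        return
    oldest = os.path.join(ROOT, days[0])
    names = sorted(os.listdir(oldest))
    for name in names[:-1]:          # leave the last file in place
        os.remove(os.path.join(oldest, name))

def maintain():
    free = shutil.disk_usage(ROOT).free
    if free < MIN_FREE_BYTES:
        purge_oldest_day()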

I would appreciate it if someone could give some insight into the possible
ramifications of re-using directories that contain a large number of small
files. So far performance does not seem to be degraded, but I am concerned
that the Defrag utility cannot run an analysis on the partition.

--
-----------------------------------
Ken Varn
Senior Software Engineer
Diebold Inc.

EmailID = varnk
Domain = Diebold.com
-----------------------------------
 
You're going to need a bigger drive. Defrag needs free space to move files
here and there and back. Use partitions to give the server room to work;
maybe let the partitions fill, but not the drive.

You can also use Remote Storage to move files off the drive.
 
I could understand that if I were attempting an actual defrag, but I am
merely trying to run the defrag analysis, and the analysis will not run.
Does the analysis also need a lot of drive space? The drive is 250 GB, with
17 GB free. How much space does it need?

--
-----------------------------------
Ken Varn
Senior Software Engineer
Diebold Inc.

EmailID = varnk
Domain = Diebold.com
-----------------------------------
Benn Wolff said:
You're going to need a bigger drive. Defrag needs free space to move files
here and there and back. Use partitions to give the server room to work;
maybe let the partitions fill, but not the drive.

You can also use Remote Storage to move files off the drive.

 
Sorry, I take that back. It is now working with 17 GB free.

However, getting back to the original question: are there any ramifications
to our file-use method (see original post)? This is a 24/7 unattended
system, so running a defrag periodically is not very practical, and we want
to avoid fragmentation as much as possible.

--
-----------------------------------
Ken Varn
Senior Software Engineer
Diebold Inc.

EmailID = varnk
Domain = Diebold.com
-----------------------------------
 

I've seen an apples-to-apples test of an application like yours on a Unix
box (SunOS 4.1.3) and on NTFS. NTFS would have huge pauses, and the Unix
file system's performance was faster across the board.

Re: defrag: contact the companies that sell defrag tools and see if they
have a technical KB article on your problem. You might learn something
about it.

This is a little OT, but I'm a big fan of NTFS file compression. We were
getting several GB of ASCII numbers each day that would compress by 95%
(20:1). The CPU cost of compressing and uncompressing the data stream was
minimal, and it effectively sped up the I/O rate to and from the RAID array
several times over. That had been the bottleneck in our workflow.

This won't fix your directory problem, but it might change the fragmentation
picture. If your data is like ours, 20 KB would compress into 1 KB and each
file would fit in a single cluster, which would make defrag irrelevant. That
is a huge simplification, of course.
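
If you want to try it, NTFS compression can be turned on per directory with
the compact tool that ships with Windows; files created in that directory
afterwards inherit the compressed attribute. A minimal sketch follows (the
target path is just an example, and calling the command-line tool from a
script is only one way to set the attribute):

import subprocess

# Example path -- substitute the directory that holds the day's files.
target = r"D:\capture\20031015"

# /c enables NTFS compression; /s applies it to the directory and
# everything already in it. New files then inherit the attribute.
subprocess.run(["compact", "/c", "/s:" + target], check=True)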
 