FileInfo performance

peter.rietmann

Hi,
I have an application that records the paths of images in a database
table, while the images themselves are stored on the file system.
Unfortunately, an import job copied over one million images into one of
the directories by mistake. The database holds approximately 300,000
images.

I have created a job that recursively scans my image directories and
checks each file name against the database; if the image is not in the
database, I delete it from the hard drive. This works OK, but one
directory has more than one million files (16 GB) in it, and the
application crashes when trying to access the files in that directory.

Here is an example of the code that I use:

private void ScanImageDirectory(string sourceDir)
{
    DirectoryInfo di = new DirectoryInfo(sourceDir);
    FileInfo[] fi = di.GetFiles();

    foreach (FileInfo file in fi)
    {
        if (!FileExistsInDatabase(file.Name))
        {
            file.Delete();
        }
    }

    string[] subdirEntries = Directory.GetDirectories(sourceDir);
    foreach (string subdir in subdirEntries)
    {
        ScanImageDirectory(subdir);
    }
}

Does anyone have an idea how to work with such large directories so that
I can complete my task?

Thanks in advance.
 
What is the exception (and details) that is thrown?
 
Have you tried using the search pattern parameter of GetFiles to limit the
number of files returned? While not elegant, it may get your work done.

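// Elided list: include every character a file name can start with.
string theAlphabet = "A,B,C,D,E,F,G,...1,2,3...7,8,9,0";
foreach (string letter in theAlphabet.Split(','))
{
    // Fetch only the files whose names start with this character.
    foreach (FileInfo file in di.GetFiles(letter + "*"))
    {
        // check the file against the database and delete it if needed
    }
}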

Good Luck
 
The error and stack trace that I get are:

The specified network name is no longer available.
at System.IO.__Error.WinIOError(Int32 errorCode, String str)
at System.IO.Directory.InternalGetFileDirectoryNames(String fullPath, String userPath, Boolean file)
at System.IO.Directory.InternalGetFiles(String path, String userPath, String searchPattern)
at System.IO.DirectoryInfo.GetFiles(String searchPattern)
at ImmoImageChecker.ImmoImageChecker.ScanImageDirectory(String sourceDir, Boolean Recurse)
 
Thanks Nick, I have tried your method:

FileInfo[] fi = di.GetFiles(filter);
// The error still occurs.

string[] strArray = Directory.GetFiles(dir, filter);
// This method works OK.

Hopefully my images will get back to normal now.

Thanks for the suggestion.
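
For readers following along, here is a minimal sketch of how the original cleanup loop might look when it works on the plain string paths returned by Directory.GetFiles(dir, filter) instead of FileInfo objects. The class name OrphanCleaner, the method DeleteOrphans, and the existsInDatabase delegate (a stand-in for the poster's FileExistsInDatabase) are assumptions made for illustration, not code from the thread.

```csharp
using System;
using System.IO;

static class OrphanCleaner
{
    // Deletes the files in 'dir' that match 'filter' and are unknown to the database.
    // 'existsInDatabase' is a hypothetical stand-in for FileExistsInDatabase.
    // Returns the number of files deleted.
    public static int DeleteOrphans(string dir, string filter, Predicate<string> existsInDatabase)
    {
        int deleted = 0;

        // Directory.GetFiles returns full paths as strings, so no FileInfo is created.
        string[] paths = Directory.GetFiles(dir, filter);
        foreach (string path in paths)
        {
            // The database check uses the bare file name, as in the original code.
            string name = Path.GetFileName(path);
            if (!existsInDatabase(name))
            {
                File.Delete(path);
                deleted++;
            }
        }
        return deleted;
    }
}
```

Calling it once per prefix (for example "A*", then "B*") keeps each returned array small, which matches the filter approach suggested earlier in the thread.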

 
Nick, I have tried your method, but it seems that even if I use a very
specific search string that would only return a few files, the error
still occurs. I will try the FindFirstFile method.
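
As a managed alternative to calling FindFirstFile directly, the sketch below streams file names with Directory.EnumerateFiles, which yields entries one at a time instead of building a million-element array. It assumes .NET 4.0 or later, which is newer than the framework shown in the stack trace. The names StreamingCleaner and knownNames are hypothetical, and knownNames is assumed to be a set of image file names loaded once from the database. Nothing in the thread confirms that this avoids the network error; it only keeps memory use flat.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class StreamingCleaner
{
    // Walks 'dir' lazily and deletes every file whose name is not in 'knownNames'.
    // 'knownNames' is a hypothetical set of file names loaded once from the database.
    // Returns the number of files deleted.
    public static int DeleteOrphans(string dir, ICollection<string> knownNames)
    {
        int deleted = 0;

        // EnumerateFiles returns paths as they are discovered rather than all at once.
        foreach (string path in Directory.EnumerateFiles(dir))
        {
            if (!knownNames.Contains(Path.GetFileName(path)))
            {
                File.Delete(path);
                deleted++;
            }
        }
        return deleted;
    }
}
```

Passing a HashSet<string> built with StringComparer.OrdinalIgnoreCase would make each lookup a constant-time, case-insensitive check, which suits Windows file names and avoids one database query per file.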
 
At the OS level, it still has to browse through 1,000,000 names to find
those that match...

I would try the "use Win32 API" suggestion (though I'm not sure how it works
internally), or more likely just change the storage structure to avoid so
many files in a single directory (Windows Explorer, for example, would
probably also choke on this).
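
One way the suggested change of storage structure could look is to spread images over subdirectories named after the first characters of each file name. The sketch below is only an illustration under that assumption: ImageLayout, BucketedPath, the two-character bucket, and the "_" fallback are all made up here, and a hash of the name would distribute files more evenly if many names share a prefix.

```csharp
using System;
using System.IO;

static class ImageLayout
{
    // Returns the path of 'fileName' inside a bucket subdirectory of 'root'.
    // The bucket is the lower-cased first two characters of the name without
    // its extension; shorter names go into a "_" bucket.
    public static string BucketedPath(string root, string fileName)
    {
        string stem = Path.GetFileNameWithoutExtension(fileName).ToLowerInvariant();
        string bucket = stem.Length >= 2 ? stem.Substring(0, 2) : "_";
        return Path.Combine(Path.Combine(root, bucket), fileName);
    }
}
```

With this layout a file named Photo123.jpg would be stored in a subdirectory called ph under the image root, so no single directory has to hold every image.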
 