counting files that contain a certain string in the filename?

  • Thread starter: Andy B

Andy B

I need to count the files in a certain directory whose names end with the
string "-Contract Template.xml". How would I do this?
 
Try something like:

Dim _files = Directory.GetFiles(_path)

Dim _count = 0

For Each _file In _files
  If _file.EndsWith("-Contract Template.xml") Then _count += 1
Next
 
Or you could do it without the loop:

/////////////////////////
Dim fileCount As Integer = Directory.GetFiles(path, "*-Contract Template.xml").Length
/////////////////////////

You also have the option to search sub directories if need be:

/////////////////////////
Dim totalFileCount As Integer = Directory.GetFiles(path, "*-Contract Template.xml", SearchOption.AllDirectories).Length
/////////////////////////

Thanks,

Seth Rowe [MVP]
 
Yes, correct, Seth, but you could get bitten by the undesirable 3-character
extension behaviour that is described in the documentation for the
Directory.GetFiles method.

Given a directory with files named:

A123-Contract Template.xml
A123-Contract Template.xmla

using:

Directory.GetFiles(path, "*-Contract Template.xml").Length

would result in a count of 2, whereas using a loop will get the correct
count.


 
Ahh, very interesting. I guess I would have known about that one if I
ever read the documentation :-)

That also explains why you posted the For...Next loop and not a sample
using the overload; with your experience, I knew you would know about
passing in the search pattern. Thanks again for explaining why the
overload is dangerous and why you posted what you did.

Thanks,

Seth Rowe [MVP]
 
This is right. I needed something using the EndsWith method, since I wanted
to ignore files with extensions like xmla and other oddballs that could pose
a security problem or be in an invalid file format. Only files that end in
"-Contract Template.xml" are valid filenames. The next part of validating a
file for loading is to make sure that the files with the
"-Contract Template.xml" ending conform to a particular XML schema that is
created with a dataset. Just having a particular ending isn't enough. Since I
have the filename test done (it did work, btw), it's time to move on to the
test of whether the XML is valid against the schema. Another post said to
just try loading the files into a dataset in a try...catch block. Is this
right, or is there a better way to do it?
 
.NET actually exposes a class / method to validate an Xml file against
a schema. It's been a long, long time since I've used it, but it's
called something like XmlSchemaValidation. If you can't find it let me
know and I'll try to find the project I used it on. This should be
substantially lighter weight than creating a dataset/datatable and
using a try...catch block.

Thanks,

Seth Rowe [MVP]
 
Or maybe it was XmlValidatingReader, I don't really remember :-(

Thanks,

Seth Rowe [MVP]
http://sethrowe.blogspot.com/
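
For reference, here is a minimal sketch of validating a file against a schema without loading a dataset. It uses XmlReader with XmlReaderSettings rather than the XmlValidatingReader mentioned above, which has been marked obsolete since .NET 2.0. The module name, the function name and both path parameters are made up for illustration, and it assumes the dataset's schema has already been written out to an .xsd file (for example with DataSet.WriteXmlSchema). The one-line lambda needs VB 2010 or later.

```vbnet
Imports System.Xml
Imports System.Xml.Schema

Module SchemaCheck
    ' Returns True when the file at xmlPath is well-formed and raises no
    ' schema validation errors or warnings against the XSD at xsdPath.
    ' Both parameters are hypothetical inputs supplied by the caller.
    Function IsValidAgainstSchema(xmlPath As String, xsdPath As String) As Boolean
        Dim isValid = True

        Dim settings As New XmlReaderSettings()
        settings.ValidationType = ValidationType.Schema
        ' Passing Nothing uses the targetNamespace declared inside the XSD.
        settings.Schemas.Add(Nothing, xsdPath)
        ' Report warnings too, so a document that matches no schema at all
        ' is flagged instead of passing silently.
        settings.ValidationFlags = settings.ValidationFlags Or
            XmlSchemaValidationFlags.ReportValidationWarnings
        AddHandler settings.ValidationEventHandler,
            Sub(sender, e) isValid = False

        Try
            Using reader = XmlReader.Create(xmlPath, settings)
                ' Reading to the end drives validation of the whole document.
                While reader.Read()
                End While
            End Using
        Catch ex As XmlException
            ' The file is not well-formed XML.
            isValid = False
        End Try

        Return isValid
    End Function
End Module
```

Treating warnings as failures is a deliberate choice in this sketch; whether that is too strict depends on how the contract template schema is written.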
 
Stephany,
I would consider combining the methods:

Dim _files = Directory.GetFiles(_path, "*-Contract Template.xml")
Dim _count = 0

For Each _file In _files
  If _file.EndsWith("-Contract Template.xml") Then _count += 1
Next

This lets the OS do the heavy lifting for the bulk of the file names in a
folder, so the For Each only iterates over a subset of the files on disk.
In other words, Directory.GetFiles will return a smaller array...

I would also consider using LINQ:

Dim _count = Aggregate file In Directory.GetFiles(path, "*-Contract Template.xml") _
             Where file.EndsWith("-Contract Template.xml") _
             Into Count()
 
Well, yes, Jay. You could also consider a myriad of other techniques.
Note that I had said 'try something like', meaning 'similar to' or 'a
variation of'.

I attempted to post a follow-up to another post of mine on another branch of
this thread, but for some reason it doesn't appear in the thread (in my
reader, anyway). It pointed out the gotcha when using the * wildcard
character with Directory.GetFiles(_path, _pattern) when the _pattern
specifies a 3-character extension, e.g., Directory.GetFiles(_path, "*.xml").
This gotcha is spelled out in the documentation for the
Directory.GetFiles(_path, _pattern) method.

I had always interpreted that as meaning that the gotcha applied if one used
the * wildcard character anywhere to the left of the . character. However, a
test I ran a few hours ago shows that it only applies when the pattern
is, in fact, in the form "*.<3 character extension>".

In the OP's case, this renders the loop redundant because:

Directory.GetFiles(_path, "*-Contract Template.xml")

will return the correct result.

If the call was:

Directory.GetFiles(_path, "*.xml")

then the result would be incorrect if, and only if, the directory contained
other files with extensions that start with .xml, e.g., .xmla, .xmlb, etc.
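
A quick way to check this on your own machine is a throwaway console program along these lines. It is only a sketch: the module name and the scratch folder name are made up, it uses the two file names from the earlier example, and it simply prints whatever counts your runtime and file system produce, which may not match what is described above.

```vbnet
Imports System
Imports System.IO

Module WildcardCheck
    Sub Main()
        ' Scratch folder under the temp directory; the name is arbitrary.
        Dim scratch = Path.Combine(Path.GetTempPath(), "GetFilesWildcardCheck")
        Directory.CreateDirectory(scratch)

        ' The two file names from the example earlier in the thread.
        File.WriteAllText(Path.Combine(scratch, "A123-Contract Template.xml"), "")
        File.WriteAllText(Path.Combine(scratch, "A123-Contract Template.xmla"), "")

        ' Count the matches for each pattern discussed above.
        For Each pattern In New String() {"*-Contract Template.xml", "*.xml"}
            Dim matched = Directory.GetFiles(scratch, pattern).Length
            Console.WriteLine("{0} -> {1} file(s)", pattern, matched)
        Next

        ' Clean up the scratch folder and its contents.
        Directory.Delete(scratch, True)
    End Sub
End Module
```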

Even we old hands can stand to be educated sometimes :)

Now, I certainly appreciate your enthusiasm and evangelism, and although it
is a valid technique, don't you think LINQ is a bit over the top for the
task at hand? :)
 
Directory.GetFiles(_path, "*-Contract Template.xml")

If this is a better way of doing things, then I probably will try this way
instead. Makes more sense to use it this way since I have to find all the
files with that partial name and then validate them to make sure they aren't
a fake of some kind.
 
Not sure about everyone else, but I am now thoroughly confused!

I guess I'll just wander back to my web development world and be happy
I rarely have to deal with Directory.GetFiles(...)

:-)

Thanks,

Seth Rowe [MVP]
http://sethrowe.blogspot.com/
 
Sorry to get you confused *sigh*... Didn't want to do anything like that...


 
I would keep the loop for the day when Andy's successor changes the criteria
and breaks the algorithm. Of course there is a chance Andy's successor

I don't consider the LINQ over the top, as I tend to prefer stating what I
want to do (aggregate the count of files) rather than how specifically to do
it... Also, using LINQ it's now easy to add other accumulators, such as the
total size of the files...
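
As a rough sketch of the extra accumulator idea, something like the following would collect the count and the total size in one query. The module name, the variable names and taking the folder from the first command-line argument are all assumptions for illustration; it also switches to DirectoryInfo.GetFiles so each result carries its Length, and it relies on the implicit line continuation available in VB 2010 or later.

```vbnet
Imports System
Imports System.IO
Imports System.Linq

Module ContractStats
    Sub Main(args As String())
        ' Hypothetical: the folder comes from the first command-line argument,
        ' falling back to the current directory.
        Dim folder = If(args.Length > 0, args(0), ".")
        Const suffix As String = "-Contract Template.xml"

        ' One query, two accumulators: the number of matching files
        ' and the sum of their sizes in bytes.
        Dim stats = Aggregate info In New DirectoryInfo(folder).GetFiles("*" & suffix)
                    Where info.Name.EndsWith(suffix)
                    Into FileCount = Count(), TotalBytes = Sum(info.Length)

        Console.WriteLine("{0} file(s), {1} byte(s)", stats.FileCount, stats.TotalBytes)
    End Sub
End Module
```

The result is an anonymous type, so FileCount and TotalBytes are only usable inside the method that runs the query.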

Of course there's the danger of "learn how to use a hammer, and every thing
is a nail"...

--
Hope this helps
Jay B. Harlow
.NET Application Architect, Enthusiast, & Evangelist
T.S. Bradley - http://www.tsbradley.net
 
Jay
Of course there's the danger of "learn how to use a hammer, and every
thing is a nail"...
Very good one. I did not know it; I will certainly use it in the future.

Cor
 
Actually, I won't have a successor...


 