count occurrences of string within a string

  • Thread starter: Dana King

Dana King

I'm looking for some other developers' opinions. I'm trying to find the best
way to count occurrences of a string within a string using VB.NET.

I have tested five methods and found String.Replace to be the fastest and
Regex.Matches.Count to be the slowest. I posted my results and source code
on my web site. If you could take a look and maybe suggest an even faster
method, I'd like to hear from you. Also, if anyone can tell me why Regex is
so much slower than the rest, I'd like to know.
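
For readers without access to the linked source, here is a minimal sketch of what counting with String.Replace usually looks like: remove every occurrence of the pattern and divide the length difference by the pattern length. This is an assumption about the approach, not the code from the article; the name CountByReplace is made up for illustration.

```vb
Imports System

Module ReplaceCountSketch
    ' Hypothetical helper (not from the linked article): counts non-overlapping
    ' occurrences of pattern in source by removing them and comparing lengths.
    Function CountByReplace(ByVal source As String, ByVal pattern As String) As Integer
        ' String.Replace throws on an empty pattern, so guard against it.
        If String.IsNullOrEmpty(source) OrElse String.IsNullOrEmpty(pattern) Then Return 0

        Dim stripped As String = source.Replace(pattern, String.Empty)

        ' Every removed occurrence shortens the string by pattern.Length characters.
        Return (source.Length - stripped.Length) \ pattern.Length
    End Function

    Sub Main()
        Console.WriteLine(CountByReplace("101101", "1"))     ' prints 4
        Console.WriteLine(CountByReplace("abcabcab", "abc")) ' prints 2
    End Sub
End Module
```

Note that this allocates a second string of nearly the same size as the input, which may matter for very large inputs.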

Thanks.

http://www.dotnetmaniac.com/ArticleViewer.aspx?Key={4d9141de-2f82-4490-80ca-c9f725c4e291}
 
Hello Dana,

When looking for only a single character, I can get about twice the speed
using a char loop.

Using For Next and String.Equals...
1's found: 499,898 in 5,000,000 characters.
elpased time: 0.21875

Using Do Loop with InStr function...
1's found: 499,898 in 5,000,000 characters.
elpased time: 0.25

Using Do Loop with String.IndexOf...
1's found: 499,898 in 5,000,000 characters.
elpased time: 0.109375

Using String.Replace...
1's found: 499,898 in 5,000,000 characters.
elpased time: 0.0625

Using RegEx.Matches.Count...
1's found: 499,898 in 5,000,000 characters.
elpased time: 0.828125

Using Char Loop...
1's found: 499,898 in 5,000,000 characters.
elpased time: 0.015625

snippet:
'== Using char loop
Private Sub test6(ByVal totalLength As Integer, ByVal fileContents As String)

    Dim one As Integer = 0                 ' number of "1" characters found
    Dim startTime As Long = Now.Ticks

    ' Compare each character directly against the Char literal "1"c.
    For tCount As Integer = 0 To fileContents.Length - 1
        If fileContents(tCount) = "1"c Then
            one += 1
        End If
    Next

    Dim endTime As Long = Now.Ticks
    Dim elapsedTime As Double = TimeSpan.FromTicks(endTime - startTime).TotalSeconds
    Console.WriteLine("Using Char Loop...")
    Console.WriteLine("1's found: " & Format(one, "#,#") & " in " & Format(totalLength, "#,#") & " characters.")
    Console.WriteLine("elpased time: " & elapsedTime.ToString)
    Console.WriteLine()
    Console.WriteLine()
End Sub
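
One caveat on the numbers: every elapsed time listed above is a whole multiple of 0.015625 s (1/64 s), which suggests Now.Ticks only advanced in steps of that size during these runs, so the fastest tests are measured to within roughly one step. Below is a rough sketch of the same char loop timed with System.Diagnostics.Stopwatch instead. The helper name CountChar and the way the test string is built are assumptions for illustration; the thread does not show how the original input was generated.

```vb
Imports System
Imports System.Diagnostics

Module StopwatchSketch
    ' Hypothetical helper: the same char loop as test6, returning the count.
    Function CountChar(ByVal source As String, ByVal target As Char) As Integer
        Dim count As Integer = 0
        For i As Integer = 0 To source.Length - 1
            If source(i) = target Then count += 1
        Next
        Return count
    End Function

    Sub Main()
        ' Assumed test data: 5,000,000 characters, 500,000 of them "1".
        Dim fileContents As String = New String("0"c, 4500000) & New String("1"c, 500000)

        ' Stopwatch uses a high-resolution counter when the system provides one.
        Dim watch As Stopwatch = Stopwatch.StartNew()
        Dim one As Integer = CountChar(fileContents, "1"c)
        watch.Stop()

        Console.WriteLine("1's found: " & one.ToString("#,#"))
        Console.WriteLine("elapsed time (s): " & watch.Elapsed.TotalSeconds.ToString())
    End Sub
End Module
```

Running each test several times and comparing the spread would also help when the differences are this small.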


-Boo
 
In test1 try this:

For i As Integer = 0 To totalLength - 1
    If fileContents(i) = "1"c Then one += 1
Next


In test3, make sure you specify a Char (i.e. "1"c) and not a String ("1"):

Do
    Result = fileContents.IndexOf("1"c, Start)
    If Result = -1 Then Exit Do
    one += 1
    Start = Result + 1
Loop
 
Dana,

We tested this some years ago; what follows is from memory.

For counting a string within a string, the best method is the VB.NET InStr
function, moving forward through the string.

For counting a single character in a string, the best method is
String.IndexOf("x"c), used the same way.

http://msdn2.microsoft.com/en-us/library/8460tsh1.aspx

I do remember that, when counting strings, InStr is twice as fast as
IndexOf. Regex and String.Split are, as far as I remember, about 100 times
slower. Using InStr with a single character is far slower than IndexOf with
a Char.
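
To make "going forward in the string" concrete, here is a minimal sketch of an InStr counting loop. It is not the code from the original tests; the name CountWithInStr is made up for illustration, and it counts non-overlapping matches with a binary (case-sensitive) comparison.

```vb
Imports System
Imports Microsoft.VisualBasic

Module InStrCountSketch
    ' Hypothetical helper: counts non-overlapping occurrences of pattern in
    ' source by calling InStr again from just past the previous match.
    Function CountWithInStr(ByVal source As String, ByVal pattern As String) As Integer
        If String.IsNullOrEmpty(source) OrElse String.IsNullOrEmpty(pattern) Then Return 0

        Dim count As Integer = 0
        ' InStr positions are 1-based; a result of 0 means "not found".
        Dim position As Integer = InStr(1, source, pattern, CompareMethod.Binary)

        Do While position > 0
            count += 1
            ' Resume the search immediately after the match just counted.
            position = InStr(position + pattern.Length, source, pattern, CompareMethod.Binary)
        Loop

        Return count
    End Function

    Sub Main()
        Console.WriteLine(CountWithInStr("abcabcab", "abc")) ' prints 2
        Console.WriteLine(CountWithInStr("101101", "1"))     ' prints 4
    End Sub
End Module
```

Passing CompareMethod.Binary explicitly avoids depending on the project's Option Compare setting, which could otherwise change both the results and the timing.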

I hope this helps,

Cor