string.StartsWith() duration is proportional to string length!

  • Thread starter: AAA
AAA said:
string.StartsWith() duration is proportional to string length!!!!!

Has anybody besides me come across this?
I posted all the info as a bug report to Microsoft here:

http://connect.microsoft.com/wf/feedback/ViewFeedback.aspx?FeedbackID=360419


I'm just puzzled:
How can this be? Are they copying all the chars to a char array before testing?
How stupid is that?
Here's the output on my machine. The results are consistent over multiple runs.

Warmup = False






4408 1000 False
1416 10000 False
1424 100000 True
1376 1000000 False
1168 10000000 False
1160 100000000 False

What version of the framework are you using? I'm using .NET 3.5, so that's
2.0 SP1 for the runtime.

What culture are you running it in? Try .StartsWith(...,
StringComparison.Ordinal) to do a culture-insensitive comparison.
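
To make the suggestion concrete, here is a minimal sketch of the two overloads side by side. The class and method names (PrefixCheck, StartsWithCulture, StartsWithOrdinal) and the sample string are made up for illustration; only string.StartsWith and StringComparison come from the framework. The one-argument StartsWith(prefix) behaves like the CurrentCulture call shown here.

```csharp
using System;

public static class PrefixCheck
{
    // Culture-sensitive check: applies the current culture's comparison rules.
    // This is what the one-argument StartsWith(prefix) overload does.
    public static bool StartsWithCulture(string text, string prefix)
    {
        return text.StartsWith(prefix, StringComparison.CurrentCulture);
    }

    // Ordinal check: compares the raw UTF-16 code units of the prefix
    // against the start of the string, with no culture rules involved.
    public static bool StartsWithOrdinal(string text, string prefix)
    {
        return text.StartsWith(prefix, StringComparison.Ordinal);
    }

    public static void Main()
    {
        // Hypothetical sample input, not the string from the original test.
        string text = "ccc" + new string('a', 1000);
        Console.WriteLine(StartsWithCulture(text, "ccc"));
        Console.WriteLine(StartsWithOrdinal(text, "ccc"));
    }
}
```

For plain ASCII input like this, both calls should agree on the result; the difference being discussed in this thread is only in how long each one takes.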
 
Jeroen said:
Here's the output on my machine. The results are consistent over
multiple runs.

Warmup = False






4408 1000 False
1416 10000 False
1424 100000 True
1376 1000000 False
1168 10000000 False
1160 100000000 False

What version of the framework are you using? I'm using .NET 3.5, so
that's 2.0 SP1 for the runtime.

What culture are you running it in? Try .StartsWith(...,
StringComparison.Ordinal) to do a culture-insensitive comparison.

I can't see any difference due to string length either. Here's the
result that I get:

41 1000 False
24 10000 False
24 100000 True
25 1000000 False
23 10000000 False
22 100000000 False


I changed the test a bit to rule out some inconsistencies in the
measuring. It does more than a single call for each string and measures
time in seconds instead of ticks (as that obviously varies greatly from
system to system):

....
double[] times = new double[levels];
....
sw.Start();
for (int j = 0; j < 1000000; j++) {
    results[i] = strs[i].StartsWith("ccc");
}
sw.Stop();
// store duration (in seconds) in the times array
times[i] = (double)sw.ElapsedTicks / (double)Stopwatch.Frequency;
....
output = string.Format("{0:N6}\t\t{1}\t\t{2}", times[i], strs[i].Length,
    results[i]);
....

Then I get this result:

0,159523 1000 False
0,159885 10000 False
0,159741 100000 True
0,167306 1000000 False
0,157622 10000000 False
0,153183 100000000 False


I also tried adding StringComparison.Ordinal, with the only difference
that it's about eight times faster:

0,021114 1000 False
0,021875 10000 False
0,024070 100000 True
0,021946 1000000 False
0,019754 10000000 False
0,020303 100000000 False
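
For anyone who wants to repeat the measurement, the fragments above could be assembled into something like the following self-contained sketch. It is not the original test program: the class name StartsWithBenchmark, the string contents, the list of lengths, and the iteration count are all assumptions chosen to keep the run short.

```csharp
using System;
using System.Diagnostics;

public static class StartsWithBenchmark
{
    // Runs StartsWith `iterations` times on one string and returns the
    // elapsed time in seconds (ticks divided by the timer frequency).
    public static double TimeStartsWith(string text, string prefix,
        StringComparison comparison, int iterations)
    {
        bool result = false;
        Stopwatch sw = Stopwatch.StartNew();
        for (int j = 0; j < iterations; j++)
        {
            result = text.StartsWith(prefix, comparison);
        }
        sw.Stop();
        GC.KeepAlive(result); // keep the result in use after the loop
        return (double)sw.ElapsedTicks / (double)Stopwatch.Frequency;
    }

    public static void Main()
    {
        // Assumed lengths and iteration count, not the original test values.
        int[] lengths = { 1000, 10000, 100000, 1000000 };
        const int iterations = 10000;

        foreach (int length in lengths)
        {
            // Assumed content: a run of 'a' characters, so the prefix never matches.
            string text = new string('a', length);

            // One untimed call per mode so JIT compilation is not measured.
            TimeStartsWith(text, "ccc", StringComparison.CurrentCulture, 1);
            TimeStartsWith(text, "ccc", StringComparison.Ordinal, 1);

            double culture = TimeStartsWith(text, "ccc",
                StringComparison.CurrentCulture, iterations);
            double ordinal = TimeStartsWith(text, "ccc",
                StringComparison.Ordinal, iterations);

            // Columns: culture-sensitive seconds, ordinal seconds, string length.
            Console.WriteLine("{0:N6}\t{1:N6}\t{2}", culture, ordinal, length);
        }
    }
}
```

If the first column grows with the string length while the second stays flat, the machine shows the behaviour the original poster describes; if both stay flat, it matches the results above.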
 
Sorry - I just double-checked:
Still happens on my machine!

3339 1000 False
32472 10000 False
313191 100000 True
3114747 1000000 False
31068720 10000000 False
312031152 100000000 False



My culture is "English (United States)".
System.Environment.Version returns "2.0.50727.1433"


It is an XP machine:

OS Name Microsoft Windows XP Professional
Version 5.1.2600 Service Pack 2 Build 2600
OS Manufacturer Microsoft Corporation
System Name XXXXXXXX
System Manufacturer LENOVO
System Model 8811VSD
System Type X86-based PC
Processor x86 Family 6 Model 15 Stepping 6 GenuineIntel ~2393 Mhz
Processor x86 Family 6 Model 15 Stepping 6 GenuineIntel ~2394 Mhz
BIOS Version/Date LENOVO 2JKT36AUS, 5/6/2007
SMBIOS Version 2.4
Windows Directory C:\WINDOWS
System Directory C:\WINDOWS\system32
Boot Device \Device\HarddiskVolume1
Locale United States
Hardware Abstraction Layer Version = "5.1.2600.2180
(xpsp_sp2_rtm.040803-2158)"
User Name XXXX\XXXX
Time Zone Jerusalem Daylight Time
Total Physical Memory 2,048.00 MB
Available Physical Memory 996.95 MB
Total Virtual Memory 2.00 GB
Available Virtual Memory 1.96 GB
Page File Space 3.84 GB
Page File C:\pagefile.sys


M.



Göran Andersson said:
Jeroen said:
Here's the output on my machine. The results are consistent over multiple
runs.

Warmup = False






4408 1000 False
1416 10000 False
1424 100000 True
1376 1000000 False
1168 10000000 False
1160 100000000 False

What version of the framework are you using? I'm using .NET 3.5, so
that's 2.0 SP1 for the runtime.

What culture are you running it in? Try .StartsWith(...,
StringComparison.Ordinal) to do a culture-insensitive comparison.

I can't see any difference due to string length either. Here's the result
that I get:

41 1000 False
24 10000 False
24 100000 True
25 1000000 False
23 10000000 False
22 100000000 False


I changed the test a bit to rule out some inconsistencies in the
measuring. It does more than a single call for each string and measures
time in seconds instead of ticks (as that obviously varies greatly from
system to system):

...
double[] times = new double[levels];
...
sw.Start();
for (int j = 0; j < 1000000; j++) {
    results[i] = strs[i].StartsWith("ccc");
}
sw.Stop();
// store duration (in seconds) in the times array
times[i] = (double)sw.ElapsedTicks / (double)Stopwatch.Frequency;
...
output = string.Format("{0:N6}\t\t{1}\t\t{2}", times[i], strs[i].Length,
    results[i]);
...

Then I get this result:

0,159523 1000 False
0,159885 10000 False
0,159741 100000 True
0,167306 1000000 False
0,157622 10000000 False
0,153183 100000000 False


I also tried adding StringComparison.Ordinal, with the only difference
that it's about eight times faster:

0,021114 1000 False
0,021875 10000 False
0,024070 100000 True
0,021946 1000000 False
0,019754 10000000 False
0,020303 100000000 False
 
Ordinal works great:

3078 1000 False
747 10000 False
2313 100000 True
1062 1000000 False
1080 10000000 False
684 100000000 False


This is a good workaround for me.
Still, I wish it were fixed, because most developers will forget to add the
Ordinal argument.


M.




 
I don't know how I posted the bug to the wrong place at Microsoft:
I'm not using the "WF" thing, so it was just a simple mistake.

Anyway, I re-posted here:

https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=361616


FYI
M.


 