For vs. For Each

  • Thread starter: Guest
A possible explanation would be:
- With for, the element has to be looked up from its index on every pass.
- With for each, only the next element has to be located from the current one,
which may be cheaper in some cases (such as a linked list).

It would be interesting to see whether the difference depends on the
nature of the enumerated collection.
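
To make the linked-list case concrete, here is a minimal sketch. It assumes a modern .NET with LinkedList<T> and LINQ (neither existed when this thread was written), and the class and method names are made up for illustration. LinkedList<T> has no indexer, so the indexed version has to fall back on ElementAt, which walks from the head of the list on every call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LinkedListWalk
{
    // Indexed access: ElementAt(i) restarts from the first node each time,
    // so the loop as a whole does O(n^2) node visits.
    public static long SumByIndex(LinkedList<int> list)
    {
        long sum = 0;
        for (int i = 0; i < list.Count; i++)
        {
            sum += list.ElementAt(i);
        }
        return sum;
    }

    // foreach: the enumerator only steps from the current node to the next,
    // so the loop does O(n) node visits.
    public static long SumByForEach(LinkedList<int> list)
    {
        long sum = 0;
        foreach (int value in list)
        {
            sum += value;
        }
        return sum;
    }

    public static void Main()
    {
        var list = new LinkedList<int>(Enumerable.Range(1, 1000));
        Console.WriteLine(SumByIndex(list));   // 500500
        Console.WriteLine(SumByForEach(list)); // 500500
    }
}
```

Both loops give the same result; only the amount of walking differs. For an array or List<T>, where indexing is a direct lookup, this particular gap should not appear.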

Patrice
--

Fredrik Wahlgren said:
Snipped from this ng:

"I came across a reference on a web site
(http://www.personalmicrocosms.com/html/dotnettips.html#richtextbox_lines )
that said to speed up access to a rich text box's lines you needed to
use a "foreach" loop instead of a "for" loop. This made absolutely no sense
to me, but the author had posted his code and timing results. The "foreach"
(a VB and other languages construct) took 0.01 seconds to access 1000 lines
in a rich text box, whereas the "for" loop (a traditional C++ construct) took
an astounding 25 seconds (on a not very fast PC).

I recreated a test file using the partial source code posted by the author
and verified that there is a SIGNIFICANT performance difference between the
two constructs (although on my PC it was 0.01 seconds vs 3.6 seconds - still
a noticeable delay). Unfortunately, there was no explanation as to why this
was the case, and I couldn't see any reason why one loop construct would
be different. Looking at the generated IL code with Lutz Roeder's Reflector
tool, I see that the real culprit is not the loop structure but the
get_Lines() function, which is pulled out of the loop in the "foreach" loop
and not in the "for" loop code. Which leads me to post this question about
the differences in compiler code generation/optimization, and whether there
is any setting that can change this.

Interestingly, this is true for both Debug and Release builds. The compiler
generated code that called that function twice for each pass of the loop
(once for the loop index check and then again for the length calculation).
Pulling out unnecessary function calls is pretty basic optimization, and I'm
surprised that the compiler didn't detect this.

With the IDE's intellisense and auto completion features, the "for" loop
construct shown in the code below seems like something that someone might
actually code up, and of course who would have figured out that the
get_Lines method would be so performance intensive.

Makes me wonder if there are any other gotchas like this.

Thanks, Mike L.

------------------------------------------------------------------------------

// Simple windows form with a richtextbox control, initialized w/1000 lines
// of text (e.g., "line #101", etc).

private void ForLoopButton_Click(object sender, System.EventArgs e)
{
    Cursor.Current = Cursors.WaitCursor;
    int Len = 0;
    int Start = Environment.TickCount;
    // Slow: Lines is read in the loop condition and again in the body.
    for (int i = 0; i < TheRichTextBox.Lines.Length; i++)
    {
        Len += TheRichTextBox.Lines[i].Length;
    }
    int ElapsedTime = Environment.TickCount - Start;
    ResultsTextBox.Clear();
    ResultsTextBox.Text = "for loop\r\n\r\nElapsed time = " + ((double)
        ElapsedTime / (double) 1000.0).ToString() + " seconds\r\n\r\nResult = " +
        Len.ToString();
    Cursor.Current = Cursors.Arrow;
}

private void ForEachLoopButton_Click(object sender, System.EventArgs e)
{
    Cursor.Current = Cursors.WaitCursor;
    int Len = 0;
    int Start = Environment.TickCount;
    foreach (String Line in TheRichTextBox.Lines)
    {
        Len += Line.Length;
    }
    int ElapsedTime = Environment.TickCount - Start;
    ResultsTextBox.Clear();
    ResultsTextBox.Text = "foreach loop\r\n\r\nElapsed time = " + ((double)
        ElapsedTime / (double) 1000.0).ToString() + " seconds\r\n\r\nResult = " +
        Len.ToString();
    Cursor.Current = Cursors.Arrow;
}

private void ForLoopButton2_Click(object sender, System.EventArgs e)
{
    // Performance results now same as ForEachLoopButton_Click with the changes
    // made.
    Cursor.Current = Cursors.WaitCursor;
    int Len = 0;
    int Start = Environment.TickCount;
    // Read the Lines property once and index the cached array.
    string[] lines = TheRichTextBox.Lines;
    for (int i = 0; i < lines.Length; i++)
    {
        Len += lines[i].Length;
    }
    int ElapsedTime = Environment.TickCount - Start;
    ResultsTextBox.Clear();
    ResultsTextBox.Text = "for loop\r\n\r\nElapsed time = " + ((double)
        ElapsedTime / (double) 1000.0).ToString() + " seconds\r\n\r\nResult = " +
        Len.ToString();
    Cursor.Current = Cursors.Arrow;
}"
 
That's a hard question to answer:

For arrays/vectors/ILists --- collections which can be directly indexed,
it's close to a wash. (* but see below)

For other kinds of collections (hashtables, linked lists) -- collections
which aren't, or aren't easily, indexed -- foreach would be the faster, and
possibly the only, way to iterate through.

In some cases, bad coding just throws everything out the window, such as
the page Fredrik cites, where in the for(), he's evaluating "i <
TheRichTextBox.Lines.Length" each time through the loop -- except
"TheRichTextBox.Lines" builds and returns a string[] each time it's called,
so 99% of the loop's time is being wasted.

And finally, the (*) note I mentioned above: The guys writing the C#
compiler knew that
for (int i = 0; i < coll.Length; ++i)
is a very common idiom in C/C++/C# code, and had the compiler recognize and
optimize it, so right now, for()s in that specific pattern are faster, but
with the next release of the C# compiler, maybe they'll get around to
optimizing foreachs.....
 
Just a point of pedantry really, but I don't believe it's the C#
compiler at all - I believe it's the JIT compiler. That's good, as it
means that VB.NET and MC++ get the same advantages, so long as the JIT
recognises the code for the same kind of loop.
 
Othman said:
<snip for vs. foreach>

It depends on the collection you iterate over.

Possible issues that might have an impact:

- does .Length/.Count read the number from an internal variable, or does
it need to do a lengthy operation to determine it (i.e. split up a big
string into separate lines and count them)?
- can [index] grab the correct value from an internal list, or does it
need to do the same type of work as the .Length/.Count operation does?
- how is the enumerator implemented: does it pre-cache a list (like
splitting up the text) and simply iterate through the results, or does
it have to compute the next result in some way each time?

There are probably many more issues that will have an impact on for vs.
foreach, so the best way to determine this is to read the documentation
for the specific collection class you're using and/or profile it.

Things to consider:
- [index] calls a method, which might call a method on an internal object
- foreach constructs an object for the enumerator

Whether these things matter in your case depends on the collection you use.
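
For the "profile it" part, a rough timing harness might look like the sketch below. It assumes a modern .NET with System.Diagnostics.Stopwatch and List<T>, and the class and method names are invented for the example. A single run like this is noisy (JIT warm-up, GC), so treat the numbers as a hint and repeat the measurement before drawing conclusions.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class LoopTiming
{
    // Indexed loop over a List<int>.
    public static long SumFor(List<int> items)
    {
        long sum = 0;
        for (int i = 0; i < items.Count; i++)
        {
            sum += items[i];
        }
        return sum;
    }

    // Enumerator-based loop over the same list.
    public static long SumForEach(List<int> items)
    {
        long sum = 0;
        foreach (int item in items)
        {
            sum += item;
        }
        return sum;
    }

    // Runs the given body once and reports how long it took.
    public static TimeSpan Time(Func<long> body, out long result)
    {
        Stopwatch watch = Stopwatch.StartNew();
        result = body();
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        List<int> items = Enumerable.Range(0, 100000).ToList();

        TimeSpan forTime = Time(() => SumFor(items), out long forSum);
        TimeSpan forEachTime = Time(() => SumForEach(items), out long forEachSum);

        Console.WriteLine("for:     " + forSum + " in " + forTime.TotalMilliseconds + " ms");
        Console.WriteLine("foreach: " + forEachSum + " in " + forEachTime.TotalMilliseconds + " ms");
    }
}
```

Swapping the List<int> for the collection you actually use is the point of the exercise; the relative cost of the two loops can differ from one collection type to another.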
 
Just my 2c...

The Lines property of TextBox/RichTextBox returns a string[]. This is NOT
the native format for the TextBox, so every time you get the Lines property,
a new array is populated with the contents of the TextBox.

When you use foreach, the code becomes something like this:

string[] lines = textBox.Lines;
IEnumerator e = lines.GetEnumerator();
while (e.MoveNext())
{
    string current = (string) e.Current;

    ...
}

See? You retrieve the Lines array only once. When you use the property in a
for loop, you build the array in every single iteration (and maybe more
than once)!

for (int i = 0; i < textBox.Lines.Length; i++) // first time - each loop
{
    string current = textBox.Lines[i]; // second time - each loop
}

The compiler certainly can't take the textBox.Lines access out of the loop,
because it cannot be sure that the Lines property always returns the same
array.

If the Lines property was something like TextBoxLineCollection, then the
performance would probably be more or less the same (depending on how the
collection would be implemented).
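
The number of rebuilds is easy to count with a stand-in class. FakeTextBox below is hypothetical: it only imitates the described behaviour of building a fresh string[] on every read of Lines and is not the real TextBox implementation.

```csharp
using System;

// Hypothetical stand-in for a control whose Lines property rebuilds its array.
public class FakeTextBox
{
    private readonly string text;

    // How many times the Lines getter has run.
    public int LinesCalls { get; private set; }

    public FakeTextBox(string text)
    {
        this.text = text;
    }

    public string[] Lines
    {
        get
        {
            LinesCalls++;
            return text.Split('\n'); // a new array on every access
        }
    }
}

public static class LinesDemo
{
    // Reads Lines in the loop condition and in the body.
    public static int LengthWithFor(FakeTextBox box)
    {
        int len = 0;
        for (int i = 0; i < box.Lines.Length; i++)
        {
            len += box.Lines[i].Length;
        }
        return len;
    }

    // foreach evaluates the Lines expression once, before iterating.
    public static int LengthWithForEach(FakeTextBox box)
    {
        int len = 0;
        foreach (string line in box.Lines)
        {
            len += line.Length;
        }
        return len;
    }

    // for loop over a cached copy of the array.
    public static int LengthWithCachedFor(FakeTextBox box)
    {
        int len = 0;
        string[] lines = box.Lines;
        for (int i = 0; i < lines.Length; i++)
        {
            len += lines[i].Length;
        }
        return len;
    }

    public static void Main()
    {
        var a = new FakeTextBox("a\nbb\nccc");
        LengthWithFor(a);
        Console.WriteLine(a.LinesCalls); // 7: four condition checks + three body reads

        var b = new FakeTextBox("a\nbb\nccc");
        LengthWithForEach(b);
        Console.WriteLine(b.LinesCalls); // 1

        var c = new FakeTextBox("a\nbb\nccc");
        LengthWithCachedFor(c);
        Console.WriteLine(c.LinesCalls); // 1
    }
}
```

With n lines the naive for loop reads the property 2n + 1 times, and each read is itself proportional to the size of the text, which is consistent with the timings quoted earlier in the thread.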

HTH,
Stefan


Lasse Vagsather Karlsen said:
<snip>

--
Lasse Vagsather Karlsen
http://www.vkarlsen.no/
PGP KeyID: 0x0270466B
 