Spot the Bug: Fun Concurrency Bug

  • Thread starter Thread starter Chris Mullins [MVP - C#]
  • Start date Start date
C

Chris Mullins [MVP - C#]

I hit this bug last night in some test code I had written. It took a few
minutes to figure out what the root cause was, and it left me thinking,
"Wow. That was an interesting one that doesn't come up very often!".

So, for fun, who sees the bug and can explain why it's happening?

List<Thread> threads = new List<Thread>();
for (int i = 0; i < 2; i++)
{
Thread t = new Thread(delegate()
{
Debug.Assert(i != 2);
});

t.Start();
threads.Add(t);
}

foreach (Thread thread in threads)
thread.Join();
 
Chris said:
I hit this bug last night in some test code I had written. It took a few
minutes to figure out what the root cause was, and it left me thinking,
"Wow. That was an interesting one that doesn't come up very often!".

So, for fun, who sees the bug and can explain why it's happening?

Define "bug".

Assuming your main thread doesn't get preempted while creating the
threads, both threads are going to fail the assertion. And since the
main thread is doing so little, it is in fact likely to get all the way
to the first call to Join() before another thread gets to run.

But is that a bug? The Debug class is thread-safe, so having two
threads concurrent fail an assertion shouldn't cause a problem in and of
itself.

Pete
 
Even though it does deal with concurrency, I wouldn't file this under a
concurrency issue, per se. The error comes from using an anonymous method
in the loop and capturing the variable "i".

Because it is used in the anonymous method (and there is only one
instance of it created), when the loop exits, i is equal to 2. Not all the
threads have started up by this point, and then by the time that they do,
the assertion fails.
 
I hit this bug last night in some test code I had written. It took a few
minutes to figure out what the root cause was, and it left me thinking,
"Wow. That was an interesting one that doesn't come up very often!".

So, for fun, who sees the bug and can explain why it's happening?

List<Thread> threads = new List<Thread>();
for (int i = 0; i < 2; i++)
{
Thread t = new Thread(delegate()
{
Debug.Assert(i != 2);
});

t.Start();
threads.Add(t);

}

foreach (Thread thread in threads)
thread.Join();

Hi Chris,
It looks like you're incrementing i to 2 before the threads start.
You can see the same thing with one thread and no anonymous delegate:

private static int _global;
public static void Main()
{
_global = 1;
Thread t = new Thread(MyThreadProc);
t.Start();

_global = 2;
t.Join();
}

public static void MyThreadProc()
{
Debug.Assert(_global != 2);
}
 
It's not a .Net bug or anything like that, but it's certainly a bug in the
sense that "my code doesn't do what I wrote it to do" sense.

The convergence of technology that caused this to happen made for a weird
debugging process. When you run it in the debugger, it works just fine (as
do most race conditions). Inspecting all the variables always shows the
correct results, etc.

I've seen many people do thing like this with Closures, and the fact that
the passed variable changes makes the code very strange to follow...

It was also confusing, as I *expected* the variable passed into the closure
to be passed by value (it's an int, afterall). I expected to get the same
behavior as if I had passed in a constant. The fact that C# boxed the
variable, passed in a reference to it, and then later checked it after the
for-loop had completed (and the original variable was out of scope) and got
the "illegal" variable, really was amusing...

--
Chris

Nicholas Paldino said:
Even though it does deal with concurrency, I wouldn't file this under a
concurrency issue, per se. The error comes from using an anonymous method
in the loop and capturing the variable "i".

Because it is used in the anonymous method (and there is only one
instance of it created), when the loop exits, i is equal to 2. Not all
the threads have started up by this point, and then by the time that they
do, the assertion fails.
 
Well, it's a bug in that the code doesn't do what was originally intended.
Instead an "impossible" assertion fires, and the developer(s) get a very
confused look on their faces for a few minutes...

Your explination is dead on, but the boxing / byref behavior of the variable
passed into the Closure is what made this code behavie in a way other than
was expected. I was expecting the int (a value type) to be passed in by
value, and therefore not change...
 
Chris said:
[...]
It was also confusing, as I *expected* the variable passed into the closure
to be passed by value (it's an int, afterall). I expected to get the same
behavior as if I had passed in a constant. The fact that C# boxed the
variable, passed in a reference to it, and then later checked it after the
for-loop had completed (and the original variable was out of scope) and got
the "illegal" variable, really was amusing...

Maybe someone who has more intimate knowledge can comment, but AFAIK
this isn't a case of the value type being boxed. If it were, you
wouldn't have had the problem, since the boxing still preserves the
value as it was at the moment of boxing, not referencing the variable
itself.

Rather, by capturing the variable in the anonymous method, what is being
used in the method is the variable itself. That's why any change to the
variable that happens before the code that uses it is executed is seen
when that code is executed.

The key here is the "capturing" behavior, and I believe that has nothing
to do with boxing.

I think the variable capturing that happens with anonymous methods is
pretty cool, actually. But I agree it can lead to some non-obvious
results when one is not aware that the capturing is going on, and what
the capturing does.

Pete
 
Chris said:
Well, it's a bug in that the code doesn't do what was originally intended.
Instead an "impossible" assertion fires, and the developer(s) get a very
confused look on their faces for a few minutes...

I'm glad you put "impossible" in quotes. :)
Your explination is dead on, but the boxing / byref behavior of the variable
passed into the Closure is what made this code behavie in a way other than
was expected. I was expecting the int (a value type) to be passed in by
value, and therefore not change...

I think my other post elaborates on this already, but I think it's
important to note that not only is there no boxing, I don't think it's
actually that the variable is being "passed" either. So it's not really
correct to talk about the variable be passed by reference or by value.
It's captured, not passed.

If you did want the value passed by value, you could have achieved that
by using the ParameterizedThreadStart constructor for the Thread
instances, and a delegate that actually does have a parameter. Then you
could literally pass the loop variable in by value and have things work
as you expected.

None of this is meant to downplay the potential for confusion here. I
agree that it can be confusing, if you're not familiar with the variable
capturing behavior. I have the benefit of having already been surprised
by this months ago and having Jon Skeet explain it here in this
newsgroup. :)

Pete
 
Chris Mullins said:
I hit this bug last night in some test code I had written. It took a few
minutes to figure out what the root cause was, and it left me thinking,
"Wow. That was an interesting one that doesn't come up very often!".

So, for fun, who sees the bug and can explain why it's happening?

List<Thread> threads = new List<Thread>();
for (int i = 0; i < 2; i++)
{
Thread t = new Thread(delegate()
{
Debug.Assert(i != 2);
});

t.Start();
threads.Add(t);
}

foreach (Thread thread in threads)
thread.Join();

Posting without reading any responses... "i" is captured and only has a
single "instance", so the assertion will fail if the thread executes
after the loop has "finished".

The solution:

List<Thread> threads = new List<Thread>();
for (int i = 0; i < 2; i++)
{
int j=i; // THIS IS THE CHANGE
Thread t = new Thread(delegate()
{
Debug.Assert(j != 2); // AND THIS
});

t.Start();
threads.Add(t);
}

foreach (Thread thread in threads)
thread.Join();

There's a new instance of "j" each time we go round the loop.
 
Peter and Chris,

There is no boxing here. What is going on here is that the compiler is
generating a class which has a public field "i" which also contains the
method which is passed to the thread for the delegate.

The field is of the same type as the variable, in this case, an int,
which means no boxing occurs.

The reason that it asserts is mentioned in my first post. Because the
scope of i actually contains the loop, there is one instance of the
anonymous class created for the delegate, for all iterations of the loop.

To have a separate instance created every time, you can replace the code
with the following:

List<Thread> threads = new List<Thread>();
for (int i = 0; i < 2; i++)
{
// Reassign i to prevent a shared anonymous method implementation.
int x = i;

Thread t = new Thread(delegate()
{
Debug.Assert(x != 2);
});

t.Start();
threads.Add(t);
}

In the code above, a new instance of the anonymous class will be created
on each iteration through the loop and then assigned to the delegate.

--
- Nicholas Paldino [.NET/C# MVP]
- (e-mail address removed)

Peter Duniho said:
Chris said:
[...]
It was also confusing, as I *expected* the variable passed into the
closure to be passed by value (it's an int, afterall). I expected to get
the same behavior as if I had passed in a constant. The fact that C#
boxed the variable, passed in a reference to it, and then later checked
it after the for-loop had completed (and the original variable was out of
scope) and got the "illegal" variable, really was amusing...

Maybe someone who has more intimate knowledge can comment, but AFAIK this
isn't a case of the value type being boxed. If it were, you wouldn't have
had the problem, since the boxing still preserves the value as it was at
the moment of boxing, not referencing the variable itself.

Rather, by capturing the variable in the anonymous method, what is being
used in the method is the variable itself. That's why any change to the
variable that happens before the code that uses it is executed is seen
when that code is executed.

The key here is the "capturing" behavior, and I believe that has nothing
to do with boxing.

I think the variable capturing that happens with anonymous methods is
pretty cool, actually. But I agree it can lead to some non-obvious
results when one is not aware that the capturing is going on, and what the
capturing does.

Pete
 
Jon,

I do not understand much of threading & was curious
So, I ran this using snippet compiler & couldn't see any difference in
output

<code>
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;

class Sample
{
public static void Main(string[] args)
{
List<Thread> threads = new List<Thread>();
for (int i = 0; i < 2; i++)
{
int j = i;
Thread t = new Thread(delegate()
{
Console.WriteLine("inside thread: " + j);
Debug.Assert(j != 2);
});

Console.WriteLine("starting thread: " + i);
t.Start();
threads.Add(t);

}

foreach (Thread thread in threads)
thread.Join();
}
}
</code>

Could you explain, what is the problem with having to introduce j as a
temp. variable?
What is the problem with the original code (without introduction of
temp. variable inside the loop)?

Thanks
Kalpesh
 
Jon Skeet said:
Posting without reading any responses... "i" is captured and only has a
single "instance", so the assertion will fail if the thread executes
after the loop has "finished".

The solution:

List<Thread> threads = new List<Thread>();
for (int i = 0; i < 2; i++)
{
int j=i; // THIS IS THE CHANGE
Thread t = new Thread(delegate()
{
Debug.Assert(j != 2); // AND THIS
});

t.Start();
threads.Add(t);
}

foreach (Thread thread in threads)
thread.Join();

There's a new instance of "j" each time we go round the loop.

Or:
for (int i = 0; i < 2; i++)
{
Thread t = new Thread((ParameterizedThreadStart)delegate(object
y){ Debug.Assert((int)y != 2); });
t.Start(i);
threads.Add(t);
}

Or in C# V3 :
for (int i = 0; i < 2; i++)
{
Thread t = new Thread(y => { Debug.Assert((int)y != 2); });
t.Start(i);
threads.Add(t);
}


Willy.
 
Kalpesh said:
Could you explain, what is the problem with having to introduce j as a
temp. variable?

I don't think there's a problem introducing j as a temporary variable.
It's a nice solution to the problem.
What is the problem with the original code (without introduction of
temp. variable inside the loop)?

The issue has to do with the order in which the code is executed and
what value each variable holds. The code you posted is likely (but not
guaranteed) to output something like this:

starting thread: 0
starting thread: 1
inside thread: 0
inside thread: 1

However, if you remove the local j variable, and in the anonymous method
just use i, you would get something like this:

starting thread: 0
starting thread: 1
inside thread: 2
inside thread: 2

Again, no guarantees due to thread scheduling, but it's a likely
outcome. And of course, the assertion would fail so in addition to the
output, you'd get that as well.

Hope that helps.

Pete
 
Pete,

Thanks for your reply.
Pardon my ignorance - but it outputs the same thing on the machine
that I am on.

If that's the case, how to really see them outputting different
results (in order to produce the bug)?
Also, in real life case -how to make sure that problems like this dont
pass by & get caught when in production?

Thanks
Kalpesh
 
Could you explain, what is the problem with having to introduce j as a
temp. variable?
What is the problem with the original code (without introduction of
temp. variable inside the loop)?

It's not really a concurrency issue - it's an anonymous method issue.
The two new threads (and the continuing method) are all still using the
same variable.

See http://pobox.com/~skeet/csharp/csharp2/delegates.html for more -
look at the last section (captured variables)
 
Kalpesh said:
Pete,

Thanks for your reply.
Pardon my ignorance - but it outputs the same thing on the machine
that I am on.

What outputs the same thing? What is the actual output? For which
version of the code?
If that's the case, how to really see them outputting different
results (in order to produce the bug)?

The easiest thing would be to include a call to Thread.Sleep() at the
beginning of the anonymous method, to ensure that the rest of the code
in the anonymous method doesn't execute until the main thread has
finished initializing all of the threads.

An alternative would be to use an additional loop. Currently the code
has two: one initializes and starts each thread, while the second waits
for each to complete. You could break the first loop into two different
loops, one of which initializes each thread, another to actually start
each thread. That would reproduce the problem reliably as well.
Also, in real life case -how to make sure that problems like this dont
pass by & get caught when in production?

Well, as a start: very VERY suspicious any time you use a variable
inside an anonymous method that was declared outside that anonymous
method, and the anonymous method is going to be executed asynchronously
relative to the code that's declaring the anonymous method.

The most common reasons for using an anonymous method involve it being
executed asynchronously, so this winds up meaning that you should simply
be suspicious any time a variable is used inside an anonymous method.

One notable exception to this would be using an anonymous method in a
call to Control.Invoke(). But otherwise, asynchronous scenarios are
typical.

By "suspicious", I don't mean you shouldn't do it. I just mean that you
should consider that to be an enormous red flag, requiring very careful
inspection of the code so that you understand exactly what it's doing.

IMHO, if you take the time to really stop and think about what
"capturing" a variable means, then any time you write an anonymous
method this will be very much in the front of your mind. I think
there's a tendency to read and write code in anonymous methods as if the
captured variable exists in the same context as the declaration of the
anonymous method. But it doesn't have to be that way; just remember
that the captured variable's value is used at the same time as the rest
of the code in the anonymous method is _executed_, and you should be
able to avoid these problems.

Pete
 
I'm not sure that Sleep nor a loop would help here - really I suspect
all you need is:

for (int i = 0; i < 2; i++)
{
int j = i;
Thread t = new Thread(delegate()
{
Debug.Assert(j != 2);
});
t.Start();
threads.Add(t);
}

By 1.1 standards it looks the same, but j should (IIRC) now be scoped
(and hence captured) *inside* the loop. On each successive loop, the j
is completely different. Without the capture, of course, there is only
a single j declared on the stack at the top of the method... gotta
love capture ;-p

Marc
 
ahh, frick; sorry Jon - I didn't see your (scarily similar) post.
Deferred posting... oops and apols...
 
Marc said:
I'm not sure that Sleep nor a loop would help here - really I suspect
all you need is:

Help with what? The post to which I replied asked if there was a way to
_ensure_ that the two different versions of the code behaved
differently. I don't know why he's unable to reproduce a difference
without changing the samples; assuming he's running on Windows, it's
hard to imagine a scenario where the thread creating the other threads
gets preempted before the i loop exits.

But taking as granted his statement is true and he does in fact have
trouble seeing the difference, there are ways to hack up the code so
that the timing is more deterministic.

Please don't think that the post to which you replied was intended to
address the original issue. My comments there are _strictly_ with
respect to manipulating the code so that it fails in a more obvious,
reproducible way.

Pete
 
As a follow-on question... where does variable i 'live'. It's local and
a value type, so lives on the local stack...

So... remove the thread.Join() and make the threads take a little longer
e.g.

new Thread( delegate() { Thread.Sleep(5000); Debug.Assert(i!=2); }

The original method will have completed before the thread runs its code.
So where is the 'i' that it's referencing? Won't the stack have changed
by then?
 
Back
Top