The Interlocked on the Edge of Forever

  • Thread starter Thread starter Chris Mullins [MVP]
  • Start date Start date
Stelrad Kypski said:
not obvious to me. what is it?

The reads are done in the same order as the increments, which means
that even without any reordering, it's quite possible to get any of the
four possibilities.
 
Jos Scherders said:
Hi,

One of the posts in this very interesting threat referred to this article
:

http://msdn2.microsoft.com:80/en-us/library/ms686355.aspx

This article includes the following piece of code (to fix the race
condition):

volatile int iValue;
volatile BOOL fValueHasBeenComputed = FALSE;
extern int ComputeValue();

void CacheComputedValue()
{
if (!fValueHasBeenComputed)
{
iValue = ComputeValue();
fValueHasBeenComputed = TRUE;
}
}

BOOL FetchComputedValue(int *piResult)
{
if (fValueHasBeenComputed)
{
*piResult = iValue;
return TRUE;
}

else return FALSE;
}

My (complete ignorent) question is this:

Must fValueHasBeenComputed really be declared volatile for the program to
produce correct results ? Isn't it true that , because iValue is declared
volatile, it is impossible that fValueHasBeenComputed = TRUE and the
result of ComputeValue() is not stored in iValue ?
Of course, because fValueHasBeenComputed is not declared volatile it may
be that FetchComputedValue() returns FALSE when in fact iValue does
contain the computed value but the other case can never occur. E.g.
FetchComputedValue returns TRUE and piResult is incorrect.

Are my assumptions correct ?

I think so, but it wouldn't be very useful. I assume that if
FetchComputedValue() returns FALSE, you would sleep and try again? Only,
the compiler is allowed to determine that FetchComputedValue doesn't depend
on any volatile variables, so it optimizes by hoisting it outside the loop,
and turns this:

int buf;
while (!FetchComputedValue(&buf))
Sleep();

into

int buf;
begin:
BOOL result = FetchComputedValue(&buf);
if (!result) goto end;
Sleep();
goto begin;
end:

into

int buf;
BOOL result = FetchComputedValue(&buf);
if (!result) goto end;
begin:
Sleep();
goto begin;
end:

or
int buf;
if (!FetchComputedValue(&buf)) {
while (true)
Sleep();
}


Whole program optimization (cross-module inlining) makes this a virtual
certainty.
 
| I believe the rules you write are mistaken and not supported by the
| standards or links you post, and in any case, interlocked operations
| only affect a *single* thread, not all threads.

Thanks Barry. Is this documented somewhere? An Interlocked would hardly be
useful if that was the case. They effect other threads by virtue of their
atomic nature *and the barrier they impose. In the case of a shared
variable, only changed in CPU cache, the execution of the normal interlocked
instruction make this var (and all cache memory of current CPU) be
committed into physical memory *before executing the interlocked
instruction. So other threads will "see" changes from interlocked
operations after seeing changes in another shared variable (i.e. release
semantics).

For example:

a++;
InterlockedIncrement (ref b);
c++;

Other threads will only see changes in "a" prior to changes in "b", and will
see changes in "c" after changes in b. Critical section lock algorithms,
such as the "Peterson's algorithm" rely on this behavior or would not work
on SMP hardware. Maybe we are saying the same thing, and I am just not
understanding your intent.
 
William Stacey said:
| I believe the rules you write are mistaken and not supported by the
| standards or links you post, and in any case, interlocked operations
| only affect a *single* thread, not all threads.

Thanks Barry. Is this documented somewhere? An Interlocked would hardly be
useful if that was the case.

No, interlocked is useful in the same way that locking is useful,
despite it being a co-operative thing.
They effect other threads by virtue of their
atomic nature *and the barrier they impose. In the case of a shared
variable, only changed in CPU cache, the execution of the normal interlocked
instruction make this var (and all cache memory of current CPU) be
committed into physical memory *before executing the interlocked
instruction. So other threads will "see" changes from interlocked
operations after seeing changes in another shared variable (i.e. release
semantics).

For example:

a++;
InterlockedIncrement (ref b);
c++;

Other threads will only see changes in "a" prior to changes in "b", and will
see changes in "c" after changes in b. Critical section lock algorithms,
such as the "Peterson's algorithm" rely on this behavior or would not work
on SMP hardware. Maybe we are saying the same thing, and I am just not
understanding your intent.

They will see changes in a, b and c in that order if they *read* them
in that order - but if you do:

int x = c;
int y = b;
int z = a;

then there's no guarantee that you won't see the change to c but not
the change to a, because a could *actually* have been reordered to be
read before c in the first place.

Joe Duffy has replied to the blog post, by the way, confirming this.
 
| Joe Duffy has replied to the blog post, by the way, confirming this.

Thanks. Good to see his reply. So he is saying Interlocked *would prevent
the clr re-order issue? So doing something like:

int a = x;
int b = Interlocked.Read(ref y);

to your example would fix this example in terms of clr reorder issue? As he
said, it is easier to use volatile, but just to finish the thread. Or would
you need Interlocked.Read on both lines?
 
William Stacey said:
| Joe Duffy has replied to the blog post, by the way, confirming this.

Thanks. Good to see his reply. So he is saying Interlocked *would prevent
the clr re-order issue? So doing something like:

int a = x;
int b = Interlocked.Read(ref y);

to your example would fix this example in terms of clr reorder issue? As he
said, it is easier to use volatile, but just to finish the thread. Or would
you need Interlocked.Read on both lines?

I *think* that you only need the one, although you can't do it with an
int because there's no Interlocked.Read(ref int value). It's
unfortunate that the docs for Interlocked.Read imply that all
Interlocked is used for is atomic reads, rather than preventing
reordering.

Personally though, I'd usually use a lower-tech approach, such as
locking, even for a single value. I consider even volatile to be
somewhat high-tech!
 
William said:
| Joe Duffy has replied to the blog post, by the way, confirming this.

Thanks. Good to see his reply. So he is saying Interlocked *would prevent
the clr re-order issue? So doing something like:

int a = x;
int b = Interlocked.Read(ref y);

Interlocked.Read is for reading 64-bit values atomically on 32-bit
platforms. Thread.VolatileRead is what you want here (or preferably mark
y as volatile).
to your example would fix this example in terms of clr reorder issue?

It's not just a CLR issue, it's a CPU issue too - as Joe indicates in
the reply to Jon's blog post, IA64 will reorder reads etc. more than
e.g. x86.
As he
said, it is easier to use volatile, but just to finish the thread. Or would
you need Interlocked.Read on both lines?

Thread.VolatileRead will be a read barrier, and the normal way of
understanding "read barrier" is that it prevents later reads moving
before it, but when you're mixing read-barriers with normal reads, the
barrier-based mental model gets a little shaky. What, in that model,
stops the read of 'x' into local 'a' moving after the call (i.e. in the
other direction)? IMO both x and y should be marked volatile, or both
should be read with Thread.VolatileRead, to nail them down. I'd stay
away from "mix and match" volatile / non-volatile variables.

Even better, use a lock unless performance timing indicates that this
trickery is absolutely necessary.

-- Barry
 
Back
Top