Thread safety of asynchronous sockets; also, documentation vs. reality


chris gouldie via .NET 247

I have a multiple part question about the thread safety of the
Socket class, and how asynchronous socket events occur. The
documentation of this class is misleading, but the behavior
is what I would expect (and want). Here is my interpretation
of the documentation:

When you call BeginReceive, the asynchronous callback you
specified is immediately called. This callback should call the
EndReceive function on the socket, which will block until data is
actually available.

I think my interpretation is a reasonable one given the docs. I will
quote them for you here:

For BeginReceive:

"When your application calls BeginReceive, the system will use a
separate thread to execute the specified callback method, and will
block on EndReceive until the Socket reads data or throws an
exception."

And for EndReceive:

"Before calling BeginReceive, you need to create a callback method
that implements the AsyncCallback delegate. This callback method
executes in a separate thread and is called by the system after
BeginReceive returns. [.....] The EndReceive method will block
until data is available."

For Thread Safety of the Socket class, the docs say:

"Any public static (Shared in Visual Basic) members of this type
are thread safe. Any instance members are not guaranteed to be
thread safe."

Therein lies the problem. First of all, the BeginReceive function
does not behave the way that the docs seem to imply. I do not
receive a callback immediately; I only receive a callback when there
is data to be read from the Socket. This is a Good Thing.

The reason this is a Good Thing is that the Socket class is,
according to the docs, not thread safe. I interpret this to mean
that I must lock around any method calls on an instance of the
class. So, consider the following:

void MyReceiveCallBack(IAsyncResult aResult)
{
    ...
    lock (sock)
    {
        int bytesRead = sock.EndReceive(aResult);
    }
    ...
}

What happens if a thread in my application now calls a method which
contains the following code?
void MySendFunc(byte [] stuffToSend)
{
    ...
    lock (sock)
    {
        aResult = sock.BeginSend(...);
    }
    ...
}

That's right, I can't send until something is read from the socket!

Fortunately, the docs are incorrect, and what actually happens is
that I don't get a callback until there is data to be read. However,
I am still unhappy about having to lock around every method call
on the Socket instance. The docs should tell me HOW the class is
not thread safe. Just telling me that it isn't thread safe doesn't
help me at all. Is there something I am missing here? I'm worried
that someday Microsoft will release a patch to the framework that
causes Socket to behave the way that the docs say it will, and my
code will break. Help! Please!
 
chris gouldie via .NET 247 said:
I have a multiple part question about the thread safety of the
Socket class, and how asynchronous socket events occur. The
documentation of this class is misleading, but the behavior
is what I would expect (and want). Here is my interpretation
of the documentation:

When you call BeginReceive, the asynchronous callback you
specified is immediately called. This callback should call the
EndReceive function on the socket, which will block until data is
actually available.

Nope, your experience is in fact correct: BeginReceive creates its own
thread which will not call your callback until data is available.
I think my interpretation is a reasonable one given the docs. I will
quote them for you here:

For BeginReceive:

"When your application calls BeginReceive, the system will use a
separate thread to execute the specified callback method, and will
block on EndReceive until the Socket reads data or throws an
exception."

It doesn't say that it will call your callback immediately, merely that it
will be called from a separate thread.
And for EndReceive:

"Before calling BeginReceive, you need to create a callback method
that implements the AsyncCallback delegate. This callback method
executes in a separate thread and is called by the system after
BeginReceive returns. [.....] The EndReceive method will block
until data is available."

But doesn't say how long after BeginReceive returns. I can see how it would
be confusing though.
For Thread Safety of the Socket class, the docs say:

"Any public static (Shared in Visual Basic) members of this type
are thread safe. Any instance members are not guaranteed to be
thread safe."

Therein lies the problem. First of all, the BeginReceive function
does not behave the way that the docs seem to imply. I do not
receive a callback immediately; I only receive a callback when there
is data to be read from the Socket. This is a Good Thing.

The reason this is a Good Thing is that the Socket class is,
according to the docs, not thread safe. I interpret this to mean
that I must lock around any method calls on an instance of the
class. So, consider the following:

Not entirely correct. All it means is that responsibility for thread safety
is your problem. You don't have to lock everything, but you must consider
each variable and whether it is likely to be accessed by another thread
while a thread is using it; e.g. is it likely that your Socket class will be
getting called by the UI thread at the same time as the callback is being
executed?

When MS say that a class is not thread safe they simply mean that they have
put no locking and synchronisation code in at all, and that such actions are
completely up to you. A rule of thumb is that static members should be
thread-safe whilst instance members are not.

So, to summarize; your interpretation of the Socket class is the correct
one, the docs are a bit ambiguous. MS will *not* change Socket to "conform"
to the docs since that is not the required behaviour.
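To make the earlier point concrete -- that you only need to guard state that
two threads actually share, not every Socket call -- here is a minimal sketch
(hypothetical names, not code from this thread). The callback calls EndReceive
and re-issues BeginReceive without holding any lock; only the queue of
received messages, which both threads touch, is synchronized.

using System;
using System.Collections.Generic;
using System.Net.Sockets;

class Receiver
{
    // The Socket is touched only here and in the completion callback.
    private readonly Socket sock;
    private readonly byte[] buffer = new byte[4096];

    // The only state shared with other threads, guarded by queueLock.
    private readonly Queue<byte[]> received = new Queue<byte[]>();
    private readonly object queueLock = new object();

    public Receiver(Socket connected)
    {
        sock = connected;
        sock.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, OnReceive, null);
    }

    private void OnReceive(IAsyncResult ar)
    {
        int n = sock.EndReceive(ar);          // no lock around the Socket itself
        if (n > 0)
        {
            byte[] copy = new byte[n];
            Array.Copy(buffer, copy, n);
            lock (queueLock)                  // synchronize only the shared queue
            {
                received.Enqueue(copy);
            }
            sock.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, OnReceive, null);
        }
    }

    // Called from the main/UI thread; it never has to wait for a receive.
    public bool TryDequeue(out byte[] message)
    {
        lock (queueLock)
        {
            if (received.Count > 0)
            {
                message = received.Dequeue();
                return true;
            }
        }
        message = null;
        return false;
    }
}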
 
I am the original poster, and thank you for your response! See my
comments below.

"Before calling BeginReceive, you need to create a callback method
that implements the AsyncCallback delegate. This callback method
executes in a separate thread and is called by the system after
BeginReceive returns. [.....] The EndReceive method will block
until data is available."
But doesn't say how long after BeginReceive returns. I can see how it
would be confusing though.

Well, why not just say it will be called by the system when data has been
read and is available? It's more than confusing, it's downright
misleading, especially when you toss in the statement about how
EndReceive blocks until data is available. If you only call EndReceive in
your receive callback, EndReceive NEVER blocks, because data is always
available at that point, correct? Am I right in guessing that you can
pass null for your callback, and call EndReceive directly after calling
BeginReceive, thereby causing a synchronous read? If that is the case,
the docs make more sense (although they are still a bit misleading), but
why wouldn't you just use a synchronous Socket instead?
Not entirely correct. All it means is that responsibility for thread
safety is your problem. You don't have to lock everything, but you
must consider each variable and whether it is likely to be accessed by
another thread while a thread is using it; e.g. is it likely that your
Socket class will be getting called by the UI thread at the same time
as the callback is being executed?

Sure, I understand what thread safety means, I guess I didn't make it
clear that concurrency is indeed an issue in my project. However, it
might be useful to have a bit more information than just that the class
is not thread safe. It seems weird to me that calling BeginSend during a
Receive callback could cause a problem, for instance, unless the
underlying object has some sort of cross-dependency on the sending and
receiving sides. The very idea that I have to synchronize access to an
object which exposes an asynchronous API.... well just say that out loud
and see if you don't laugh. :) Before you reply, let me say that I do
realize that Socket also exposes a synchronous API. Perhaps that is the
problem. We might do better if we had a Socket class and an AsyncSocket
class.

It sure would be nice if somebody with knowledge of the internals could
clear this up!
When MS say that a class is not thread safe they simply mean that they
have put no locking and synchronisation code in at all, and that such
actions are completely up to you. A rule of thumb is that static
members should be thread-safe whilst instance members are not.

So, to summarize; your interpretation of the Socket class is the
correct one, the docs are a bit ambiguous. MS will *not* change Socket
to "conform" to the docs since that is not the required behaviour.

I want to quibble that it is the *documented* behavior, but instead I
will say thanks very much for your post! :)

Take care,

Christopher
 
Sean Hederman said:
When MS say that a class is not thread safe they simply mean that they have
put no locking and synchronisation code in at all, and that such actions are
completely up to you. A rule of thumb is that static members should be
thread-safe whilst instance members are not.

<snip>

This is actually a bit of a pain, because I suspect that there *is* a
little bit of thread safety which is guaranteed but not documented.
Just enough to not worry about memory barriers. I suspect there's a
write memory barrier on the "main" thread after the call to BeginRead,
and a read memory barrier on the threadpool thread before the callback.
I suspect it would be hard to make things work at all without involving
those memory barriers along the way. Unfortunately, it's not documented
:(
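For what it's worth, if you would rather not rely on those undocumented
barriers, you can fence (or simply lock) the shared state yourself. A rough
sketch of doing it explicitly, with hypothetical fields:

using System;
using System.Net.Sockets;
using System.Threading;

class Session
{
    private Socket sock;
    private byte[] buffer = new byte[4096];
    private int expectedLength;   // state written by one thread, read by the other

    public void StartReceive(Socket connected, int expected)
    {
        sock = connected;
        expectedLength = expected;
        Thread.MemoryBarrier();   // write barrier: publish the fields before the async op starts
        sock.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, OnReceive, null);
    }

    private void OnReceive(IAsyncResult ar)
    {
        Thread.MemoryBarrier();   // read barrier: see the fields the starting thread wrote
        int n = sock.EndReceive(ar);
        if (n < expectedLength)
        {
            // ... not everything has arrived yet; a real program would re-issue
            // BeginReceive here ...
        }
        // A lock around the shared state would give you the same barriers for
        // free, which is usually the simpler option.
    }
}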
 
cg said:
I am the original poster, and thank you for your response! See my
comments below.

"Before calling BeginReceive, you need to create a callback method
that implements the AsyncCallback delegate. This callback method
executes in a separate thread and is called by the system after
BeginReceive returns. [.....] The EndReceive method will block
until data is available."
But doesn't say how long after BeginReceive returns. I can see how it
would be confusing though.

Well, why not just say it will be called by the system when data has been
read and is available? It's more than confusing, it's downright
misleading, especially when you toss in the statement about how
EndReceive blocks until data is available. If you only call EndReceive in
your receive callback, EndReceive NEVER blocks, because data is always
available at that point, correct? Am I right in guessing that you can
pass null for your callback, and call EndReceive directly after calling
BeginReceive, thereby causing a synchronous read? If that is the case,
the docs make more sense (although they are still a bit misleading), but
why wouldn't you just use a synchronous Socket instead?

You are right here. Passing a null in for a callback does mean that you
would have to call EndReceive in your main thread. The advantage here is
that any ref, out, and return values from EndReceive will be in the caller's thread
context, and thus will not need any synchronization.

Think about a scenario such as a server listener. It may want to start the
receive, and also do some preparation work for when the data comes in. Thus,
it uses the synchronous EndReceive, since it doesn't mind blocking once it's
finished its prep work. Another situation could be where the main thread
occasionally polls the delegate to see if it has completed, and if it hasn't
does something else. Never really saw much point in that myself (except for
keeping your thread count down). Finally, you have the "true" asynchronous
situation where there is no blocking on the socket at all, but you may get
synchronisation issues when transferring the received information back to
the calling thread.
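A small sketch of that first scenario (hypothetical names): start the receive
with a null callback, do the preparation work, then block on EndReceive from
the same thread. Everything stays in the caller's context, so nothing needs
synchronizing. The polling case would check ar.IsCompleted instead of calling
EndReceive straight away.

using System;
using System.Net.Sockets;

class Listener
{
    // Start the receive, overlap some preparation work with it, then block for the data.
    static int ReceiveWithPrep(Socket sock, byte[] buffer)
    {
        IAsyncResult ar = sock.BeginReceive(buffer, 0, buffer.Length,
                                            SocketFlags.None, null, null);

        DoPreparationWork();          // hypothetical: work that doesn't need the data yet

        return sock.EndReceive(ar);   // blocks here until data arrives (or the socket closes)
    }

    static void DoPreparationWork()
    {
        // ... allocate buffers, look up session state, etc. ...
    }
}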
Sure, I understand what thread safety means, I guess I didn't make it
clear that concurrency is indeed an issue in my project. However, it
might be useful to have a bit more information than just that the class
is not thread safe. It seems weird to me that calling BeginSend during a
Receive callback could cause a problem, for instance, unless the
underlying object has some sort of cross-dependency on the sending and
receiving sides. The very idea that I have to synchronize access to an
object which exposes an asynchronous API.... well just say that out loud
and see if you don't laugh. :) Before you reply, let me say that I do
realize that Socket also exposes a synchronous API. Perhaps that is the
problem. We might do better if we had a Socket class and an AsyncSocket
class.

All non-thread safe objects are thread safe if called from one thread ;D I
know I'm stating the obvious, but all you need to do is ensure that only one
thread "owns" a Socket. It may be asked to do things to the Socket, but it
should never allow other threads to access the Socket directly. The only
exception would be the completion threads which are allowed to call
EndReceive and that's it. If you can do that, then the Socket will be safe
as houses, and won't require any synchronisation code at all. As for an
AsyncSocket class, it isn't really needed since Begin/EndReceive allow the
non-thread safe socket to operate asynchronously, which is enough for most
purposes.
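A sketch of that ownership rule, again with hypothetical names: other threads
only queue work for the Socket's owner, the owning thread is the only one
that calls Begin*, and the completion thread does nothing beyond its matching
End* call.

using System;
using System.Collections.Generic;
using System.Net.Sockets;

class SocketOwner
{
    private readonly Socket sock;
    private readonly Queue<byte[]> outgoing = new Queue<byte[]>();
    private readonly object gate = new object();

    public SocketOwner(Socket connected)
    {
        sock = connected;
    }

    // Any thread may call this; it never touches the Socket directly.
    public void QueueSend(byte[] data)
    {
        lock (gate)
        {
            outgoing.Enqueue(data);
        }
    }

    // Called only by the owning thread, e.g. from its main loop.
    public void PumpSends()
    {
        byte[] next = null;
        lock (gate)
        {
            if (outgoing.Count > 0)
            {
                next = outgoing.Dequeue();
            }
        }
        if (next != null)
        {
            sock.BeginSend(next, 0, next.Length, SocketFlags.None, OnSent, null);
        }
    }

    private void OnSent(IAsyncResult ar)
    {
        sock.EndSend(ar);   // the completion thread touches the Socket for this call only
    }
}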
It sure would be nice if somebody with knowledge of the internals could
clear this up!


I want to quibble that it is the *documented* behavior, but instead I
will say thanks very much for your post! :)

:-)

My pleasure
 
Jon Skeet said:
<snip>

This is actually a bit of a pain, because I suspect that there *is* a
little bit of thread safety which is guaranteed but not documented.
Just enough to not worry about memory barriers. I suspect there's a
write memory barrier on the "main" thread after the call to BeginRead,
and a read memory barrier on the threadpool thread before the callback.
I suspect it would be hard to make things work at all without involving
those memory barriers along the way. Unfortunately, it's not documented
:(

I suspect you're right, but I hate relying on undocumented features; they
lead to such nasty bugs down the line when they change.
 
Christopher, you seem to have recently faced the same problems I am
facing now, and some of the questions you wrote have been also my
doubts. I also found the documentation confusing. It wouldn't have
been difficult to explain it in a better way.

I posted a question on ...dotnet.languages.csharp, but since what I
was asking is not specifically related to csharp, maybe this is a
better newsgroup.

My question was:

"Each time that the client closes the connection, my ReadCallback gets
executed, but EndReceive returns 0 bytes.

If ReadCallback gets executed, and EndReceive returns 0 bytes, can I
always conclude that the client has closed the connection?

In other words, can I reliably use this criterion to detect (from the
server's point of view) that the connection with a client has been
lost?"

If anyone has a clue on why that criterion should not be used, please
let me know. Thank you.

Mochuelo.


 
If ReadCallback gets executed, and EndReceive returns 0 bytes, can I
always conclude that the client has closed the connection?

Yes. The ReadCallback gets called when there is something to read and when
the connection closes.
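In code, that check is just a guard at the top of the callback. A minimal
sketch, with hypothetical names (a hard reset from the client shows up as a
SocketException from EndReceive rather than a zero-byte read, so that case is
worth handling too):

using System;
using System.Net.Sockets;

class ClientHandler
{
    private readonly byte[] buffer = new byte[4096];

    public void Start(Socket client)
    {
        client.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, ReadCallback, client);
    }

    private void ReadCallback(IAsyncResult ar)
    {
        Socket client = (Socket)ar.AsyncState;
        int bytesRead = client.EndReceive(ar);

        if (bytesRead == 0)
        {
            // A completed receive of zero bytes means the client closed the connection.
            client.Close();
            return;
        }

        // ... process buffer[0..bytesRead) ...
        client.BeginReceive(buffer, 0, buffer.Length, SocketFlags.None, ReadCallback, client);
    }
}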
 
I have been pondering this same set of issues (including the very
misleading documentation), and this discussion has helped to complete my
understanding (particularly on the issue of EndReceive blocking, and why
that would ever occur).

The thread itself, however, still seems to have a few pieces missing. I
now think that I have a fairly complete picture of the matter, and that
picture is presented below for your consideration and comment.

The key point that seems to have been overlooked so far in our
discussion is: Why go to all the trouble of using BeginReceive() /
EndReceive(), callbacks, etc., when you could just create your own
worker thread for each client connection and then use synchronous
methods (i.e., Receive) within that thread for monitoring incoming data
from the client?

I believe that the answer is that the BeginReceive() / EndReceive()
approach requires vastly fewer threads than the
one-worker-thread-per-client approach.

But, how could this be the case if "BeginReceive creates its own thread
which will not call your callback until data is available" (as one
poster stated)? If BeginReceive created its own thread, then there
would be a thread sitting there for the entire duration of the client's
think/type (or other kind of) delay, and when your callback function was
called, it would run on that thread and then would immediately create a
*new* thread before it exited (to wait for any subsequent input).

So, you would still have one thread for each connected client. And, you
would have the *additional* overhead of stopping and starting a thread
for each fragment of input received. Therefore, if a new thread is
being created by BeginReceive(), then why would you want to use
BeginReceive() / EndReceive() (instead of a worker thread that calls the
synchronous Receive() method) if the main point of using the former is
to reduce the number of threads required by a server application?

This is still just a theory (given the lack of good documentation), but
here is what I think is going on:

When you use BeginReceive(), passing it a callback, *no* new thread is
created at that time. Instead, the framework makes a note of the
callback, the port to be monitored, the buffer location, etc., and then
watches that port for input *without* creating an additional thread.
I'm not sure exactly what Framework/OS mechanisms are used to accomplish
this watching, but I don't think it requires its own worker thread, as
clearly the requirements for saving state for this task are
significantly less than those for a typical thread (which needs a stack
for local variables, current program counter location, priority, etc.,
etc.).

When input eventually appears on the port, a thread is allocated *then*
from the existing (reusable) thread pool, and your callback method is
run on that thread. Your callback should be written so that it does not
block. As a result, your callback will complete very rapidly, obtaining
whatever data was read by calling EndReceive(), processing it, calling
BeginReceive() again if there is more input, and then terminating. Each
time your callback completes, the thread on which it is running is
returned to the thread pool.

This means that you are tying up a thread only during the time that your
callback is actually running. Between the increments of client input
data, there is *no* active thread.
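If that theory is right, the server side ends up looking roughly like the
sketch below (hypothetical names): one BeginAccept that is re-armed from its
own callback, one outstanding BeginReceive per client, and no thread
dedicated to any of them between completions.

using System;
using System.Net;
using System.Net.Sockets;

class ChatServer
{
    private readonly Socket listener = new Socket(AddressFamily.InterNetwork,
                                                  SocketType.Stream, ProtocolType.Tcp);

    public void Start(int port)
    {
        listener.Bind(new IPEndPoint(IPAddress.Any, port));
        listener.Listen(100);
        listener.BeginAccept(OnAccept, null);   // no thread is parked waiting for a client
    }

    private void OnAccept(IAsyncResult ar)
    {
        Socket client = listener.EndAccept(ar);
        listener.BeginAccept(OnAccept, null);   // re-arm for the next client

        ClientState state = new ClientState(client);
        client.BeginReceive(state.Buffer, 0, state.Buffer.Length,
                            SocketFlags.None, OnReceive, state);
    }

    private void OnReceive(IAsyncResult ar)
    {
        ClientState state = (ClientState)ar.AsyncState;
        int n = state.Client.EndReceive(ar);
        if (n == 0)
        {
            state.Client.Close();               // client went away
            return;
        }

        // ... handle the chat message in state.Buffer[0..n) ...

        // Re-issue the receive and give the pool thread back; nothing runs for
        // this client again until more data arrives.
        state.Client.BeginReceive(state.Buffer, 0, state.Buffer.Length,
                                  SocketFlags.None, OnReceive, state);
    }

    private class ClientState
    {
        public readonly Socket Client;
        public readonly byte[] Buffer = new byte[4096];
        public ClientState(Socket client) { Client = client; }
    }
}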

So, say that you are running a chat application with thousands of
simultaneous client (user) connections. Let's say that 99 percent of
these users' time is spent thinking and typing, and that it takes only
about one hundredth as long to process each chat message (once they hit
the Enter key) as it took them to think it up and type it. Then, the
BeginReceive() / EndReceive() approach will have one one-hundredth of
the threads running that we would have had if we had allocated a new
thread for each client connection and waited synchronously on that
thread for input to arrive.

That is a huge increase in efficiency, and so that is why people go to
the trouble of using BeginReceive() / EndReceive() rather than just
creating a new thread for each connection.

If the above is correct, I would be interested in hearing what people
might know about the specific Framework / OS mechanisms that are used to
wait for input on a port without creating a separate thread.

Carl
 