Async TCPClients

Shak

Hi all,

Three questions really:

1) The async call to the networkstream's endread() (or even endxxx() in
general) blocks. Async calls are made on the threadpool - aren't we advised
not to cause these to block?

2) You can connect together a binaryreader to a networkstream:

BinaryReader reader = new BinaryReader(stream);

And then read an int from it:

int i = reader.ReadInt32();

And vice versa.

Since there are no calls to beginread/endread, I presume this is a
synchronous call. Is there any way to make it async? Or am I missing the
point?

3) I'm trying to implement the message mechanism described here:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dncscol/html/csharp09182003.asp

In order to answer question 2), I'm going to read the message type
asynchronously, but the rest of the message synchronously in the callback
(by tying together the binaryreader and networkstream). Is it a good idea to
mix the two like this? What happens if the network lags, or bytes go
missing?

Thanks!

Shak
 
Hi Shak,

I've submitted some answers to your questions in-line
1) The async call to the networkstream's endread() (or even endxxx() in
general) blocks. Async calls are made on the threadpool - aren't we
advised
not to cause these to block?

If you are worried about blocking a ThreadPool thread then spawn your own
Thread and use synchronous NetworkStream calls.
But I wouldn't worry about it anyway. Just don't call EndRead while on a
ThreadPool thread because nested ThreadPool calling may cause a deadlock.
i.e. don't execute asynchronously on a ThreadPool thread to make another
asynchronous call on another ThreadPool thread (this may cause a dead-lock).
2) You can connect together a binaryreader to a networkstream: ....
Since there are no calls to [BinaryReader's] beginread/endread, I presume
this [is] a
synchronous call. Is there any way to make it async? Or am I missing the
point?

BinaryReader works synchronously. If it's used in an asynchronous callback
then it's being used asynchronously within the context of your application.
In other words, Async callback --> Sync ReadInt32() means that the
ReadInt32() call is now asynchronously executing on a ThreadPool thread.

If my explanation doesn't work for you then please tell me what
architecture you are trying to achieve.
3) I'm going to read the the messagetype
asynchronously, but the rest of the message synchrounously in the callback
(by tying together the binaryreader and networkstream). Is it a good idea
to do mix the two like this?

I recommend the following approach for a very simple RPC app:

1. Provide an internal AsyncCallback to BeginAcceptTcpClient on
the TcpListener (server-side).
2. In your private AsyncCallback method call EndAcceptTcpClient and
grab the TcpClient that has connected.
3. In a loop call AcceptTcpClient (synchronously) if desired while in the
AsyncCallback after handling the current connection.
4. Authorize the client's remote IP address (TcpClient.Client.RemoteEndPoint)
to access your application.
5. Call TcpClient.GetStream() and wrap the stream with a NegotiateStream for
NTLM or Kerberos authentication (AuthenticateAsServer).
6. The client-side TcpClient must also use a NegotiateStream with the
same arguments (AuthenticateAsClient).
7. Because your TcpClient on the server-side is referenced in a method that
is executing asynchronously you should use synchronous calls to read from
the NegotiateStream. Wrap the NegotiateStream in a BinaryReader and call
its Read* methods.
8. This is beginning to look a lot like Remoting. For this reason it makes
sense to use IMessage implementations as your remote call objects, and the
.NET framework will provide the tools to do so.
9. I have implemented my own Remoting framework for more control and to
gather experience. In my implementation I have a class that handles all
network communication. The class automatically sends an Int32 to the remote
counterpart that is the length of the serialized IMessage implementation
before sending the actual IMessage. In the server app the code reads the
Int32 and then it knows how many bytes it must read to acquire the entire
serialized IMessage. Then it reads the bytes and deserializes the IMessage.
Of course I've used an OO approach that has grown to be much more complex,
but this has been a simplified explanation.
10. Use a RealProxy-derived class on the client-side to communicate with the
server. The RealProxy can create a __TransparentProxy for your server
object on the client machine so that all calls to the server object are
forwarded to your RealProxy, where you can submit them across your custom
remoting framework. After a response is received from the server your
RealProxy just returns the IMethodReturnMessage implementation and the .NET
framework handles the adaptation of the return message to the actual return
value of the method call. Just override the RealProxy.Invoke method, which
encapsulates all of the aforementioned messaging functionality on the
client-side. RealProxy will be impressive if you haven't used it before.
11. As in Remoting, your simple framework should close the connection after
each message is sent. (Actually, the Remoting framework seems to buffer
calls or wait after a call for a short period of time so that the connection
is not opened and closed on all invocations.)
12. Just FYI, it's definitely possible, as I've already accomplished in my
framework, to do all of these steps over a single Socket that remains open
and provides two-way communication. After you get your simple RPC app to
work correctly, try to extend it so that it works two-way through a client
firewall using a single Socket. Just a hint: reads and writes over the Socket
must be synchronized, so using WaitHandles will be useful.
What happens if the network lags, or bytes go missing?

If you are using TCP then you can expect that bytes won't be lost unless
the network fails, in which case the Socket will throw a SocketException.
There is not much you can do to code against faulty hardware or loss of
connectivity other than to try again when you catch a SocketError code that
is not fatal.

HTH
 
Thanks for the comprehensive reply, Dave! I have a few more questions,
inline:

Dave Sexton said:
Hi Shak,

I've submitted some answers to your questions in-line


If you are worried about blocking a ThreadPool thread then spawn your own
Thread and use synchronous NetworkStream calls.
But I wouldn't worry about it anyway. Just don't call EndRead while on a
ThreadPool thread because nested ThreadPool calling may cause a deadlock.
i.e. don't execute asynchronously on a ThreadPool thread to make another
asynchronous call on another ThreadPool thread (this may cause a
dead-lock).

Do you mean NOT to use the Async callback to reestablish the async? At the
moment I'm doing something like this:

//somewhere else
o.BeginAsync(completeAsync, o);
....

void completeAsync(IAsyncResult ar)
{
    Class o = (Class)ar.AsyncState;
    o.EndAsync(ar);
    //do stuff for this async cycle
    //reestablish wait
    o.BeginAsync(completeAsync, o);
}

Which I thought was the established model for eg async reading and writing.
Your point 3 in the RPC method below suggests I should loop with a
synchronous method instead of calling BeginAsync again.
2) You can connect together a binaryreader to a networkstream: ...
Since there are no calls to [BinaryReader's] beginread/endread, I presume
this [is] a
synchronous call. Is there any way to make it async? Or am I missing the
point?

BinaryReader works synchronously. If it's used in an asynchronous
callback then it's being used asynchronously within the context of your
application. In other words, Async callback --> Sync ReadInt32() means
that the ReadInt32() call is now asynchronously executing on a ThreadPool
thread.

If my explanation doesn't work for you then please tell me what
architecture you are trying to achieve.

No, that's fine. What I wanted to do was to create a BinaryReader around a
NetworkStream and then call ReadInt32() and then go away, whether there is
an int to be read or not, and get notified when it would be. The solution
I've come to is to use a networkstream's BeginRead to do the wait and then
wrap a BinaryReader around the same networkstream within the callback, since
I now expect there to be data - pretty much how you describe above. I was
just afraid of blocking the threadpool callback with ReadInt32, but it's
beginning to seem that some kind of blocking is inevitable anyway!
I recommend the following approach for a very simple RPC app:

I appreciate that but I think that's a bit too advanced for now! I will look
to use the various concepts in there eventually so it is useful.
If you are using TCP then you must expect that bytes won't be lost unless
the network fails, in which case the Socket will throw a SocketException.
There is not much you can do to code against faulty hardware or loss of
connectivity other than to try again when you catch a SocketError code
that is not fatal.

HTH

OK. I do have another question though: What happens if, while reading bytes
off of a network stream, more data gets sent? Abstracting away from the
above, say you are expecting a particular message of unknown length. How do
you know when to stop reading bytes? All the examples I've seen seem to wait
till the number of bytes read<=0. Couldn't we get more than one message
accidentally in that scenario? Does the read ever stop at all?

Shak
 
Hi Shak,

Inline:
Do you mean NOT to use the Async callback to reestablish the async?

Actually, I should have written "don't call Begin*" while on a ThreadPool
Thread since it is the Begin* methods that spawn a ThreadPool Thread, not
the End* methods. The reason you shouldn't call into an asynchronous method
from an asynchronous method where both threads will use the ThreadPool is
that it can cause dead-locks because there is a limit to the number of
ThreadPool Threads provided to your application (related to the number of
physical processors). However, for a simple RPC app, especially if there
will be a limited number of client connections, it is acceptable to call
BeginAsync from the AsyncCallback.
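
For a concrete picture of that pattern, here is a minimal sketch of a read loop that re-arms BeginRead from inside its own callback. The class name AsyncReadLoop, the onData delegate and the Finished event are made up for illustration and do not come from any framework. It takes a plain Stream, so a NetworkStream from TcpClient.GetStream() should fit, but treat it as a sketch rather than production code.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical helper: keeps one asynchronous read pending on a stream.
sealed class AsyncReadLoop
{
    private readonly Stream stream;
    private readonly Action<byte[], int> onData;
    private readonly byte[] buffer = new byte[4096];

    // Signaled when the stream ends or fails, so a caller can wait for the loop to stop.
    public readonly ManualResetEvent Finished = new ManualResetEvent(false);

    public AsyncReadLoop(Stream stream, Action<byte[], int> onData)
    {
        this.stream = stream;
        this.onData = onData;
    }

    public void Start()
    {
        stream.BeginRead(buffer, 0, buffer.Length, OnRead, null);
    }

    private void OnRead(IAsyncResult ar)
    {
        int read;
        try
        {
            // EndRead returns immediately here because the callback only
            // runs once the read has completed.
            read = stream.EndRead(ar);
        }
        catch (IOException) { Finished.Set(); return; }
        catch (ObjectDisposedException) { Finished.Set(); return; }

        if (read == 0)
        {
            // Zero bytes means the remote side closed the connection.
            Finished.Set();
            return;
        }

        // Process this chunk; only the first 'read' bytes of the buffer are valid.
        onData(buffer, read);

        try
        {
            // Re-arm the wait as the last step of the callback.
            stream.BeginRead(buffer, 0, buffer.Length, OnRead, null);
        }
        catch (IOException) { Finished.Set(); }
        catch (ObjectDisposedException) { Finished.Set(); }
    }
}
```

A caller would construct it with the stream and a delegate, call Start(), and optionally wait on Finished. Note that the buffer is reused between reads, so onData has to copy anything it wants to keep.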
Your point 3 in the RPC method below suggests I should loop with a
synchronous method instead of calling BeginAsync again.

If you were to use a synchronous loop you could spawn your own Thread using
a ThreadStart object to prevent dead-locks, but there may be a slight
performance cost. I have not compared the two methods. A problem that you
must work out when using a synchronous loop is that one client should not
have to wait for another client's RPC to complete before it can establish a
connection, so some asynchronous code is still required. Preferably, code
not executing on a ThreadPool Thread.
What I wanted to do was to create a BinaryReader around a NetworkStream
and then call ReadInt32() and then go away, whether there is an int to be
read or not, and get notified when it would be. The solution I've come to
is to use a networkstream's BeginRead to do the wait and then wrap a
BinaryReader around the same networkstream within the callback, since I
now expect there to be data - pretty much how you describe above.

The problem I had with using a BinaryReader.Read* call to block is that it
doesn't work! The BinaryReader doesn't seem to want to wait. You have to
block, as you said, before using the BinaryReader. I use the
TcpClient.Client Socket's Poll method in my framework instead of calling
Stream.ReadBytes.
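
As a rough illustration of the Poll idea, the sketch below waits on the TcpClient's underlying Socket for a limited time and only then hands the stream to a BinaryReader. PollingReader and TryReadInt32 are names invented for this example. One caveat: Poll only reports that at least one byte is readable, so ReadInt32 may still have to wait for the remaining bytes of the value.

```csharp
using System.IO;
using System.Net.Sockets;

// Hypothetical helper, not part of the framework described in this thread.
static class PollingReader
{
    // Returns true and sets 'value' if data arrived within the timeout;
    // returns false if nothing became readable in time.
    public static bool TryReadInt32(TcpClient client, int timeoutMicroseconds, out int value)
    {
        value = 0;
        Socket socket = client.Client;

        // Socket.Poll takes its timeout in microseconds.
        if (!socket.Poll(timeoutMicroseconds, SelectMode.SelectRead))
            return false;

        // Readable with nothing available means the peer closed the connection.
        if (socket.Available == 0)
            throw new EndOfStreamException("The remote side closed the connection.");

        // The reader is deliberately not disposed: disposing it would also
        // close the underlying NetworkStream.
        BinaryReader reader = new BinaryReader(client.GetStream());
        value = reader.ReadInt32();
        return true;
    }
}
```

Calling it in a loop with a modest timeout gives the thread a chance to check a shutdown flag between polls.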
I was just afraid of blocking the threadpool callback with ReadInt32,
but it's beginning to seem that some kind of blocking is inevitable
anyway!

Yea, you have to assume that you'll never know how long the delay will be
before a client attempts to connect, and therefore some Thread, somewhere,
must block until it's activated by a client.
I appreciate that but I think that's a bit too advanced for now! I will
look to use the various concepts in there eventually so it is useful.
NP
OK. I do have another question though: What happens if, while reading
bytes off of a network stream more data gets sent? Abstracting away from
the above, say you are expecting a particular message of unknown length.
How do you know when to stop reading bytes? All the examples I've seen
seem to wait till the number of bytes read<=0. Couldn't we get more than
one message accidentally in that scenario? Does the read ever stop at all?

A few things here...

1. Only one client will use any given socket at one time. Therefore, all
asynchronous calls received by the server on a Socket will originate from
the same object instance on the client.
2. If more than one message is sent to the server asynchronously there must
be a mechanism to buffer the messages on the client-side so they are not
mixed together. In TCP you can't sign each packet. The best way around
this in a simple RPC app is to only allow synchronous calls to be made by
the client, enforced on the client, or else you'll have to create a buffer.
3. Many examples and protocols expect the Socket to be closed when the call
is complete. In other words, the client sends a message and closes the
Socket, and the server reads until NetworkStream.Read() returns 0,
indicating that the Socket has been closed and reading is complete. Not a
very robust solution. Your app would have to open and close a Socket on
every call to the server, and the server would have no way of responding to
the client. Some protocols have an end-of-message mark that is sent after
the entire message. This works for simple, text-only protocols where the
range of input is limited and controlled.
4. In my original post I mentioned that my framework sends an Int32 that
describes the message length before it sends each serialized message.
This way the server expects exactly 4 bytes describing the length of the
remaining message, which can be at most 2147483647 bytes (Int32.MaxValue).
The socket does not need to be closed in order to
signal to the reader that the end of the message has been reached.
5. The client must always know the length of the message, in my example. If
you are using serialized objects for communication then it's possible to
just serialize the object and take the Stream's length as the length of the
message. Write the length to the NetworkStream as an Int32 and then write
the MemoryStream that contains the serialized message. If you do not know
the length of the message then you must use one of the methods I mentioned
in point 3 above.
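
To make the end-of-message mark from point 3 concrete, here is a small sketch that reads one message up to a terminator byte. DelimitedReader is a hypothetical name, and the choice of ASCII text with a single-byte terminator is an assumption made for the example; it only suits protocols where the terminator can never appear inside a message.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper for a simple text protocol with a one-byte terminator.
static class DelimitedReader
{
    // Returns the next message without its terminator, or null if the
    // stream ended cleanly before any byte of a new message arrived.
    public static string ReadMessage(Stream stream, byte terminator)
    {
        MemoryStream message = new MemoryStream();
        int b;

        // ReadByte returns -1 at end of stream (connection closed).
        while ((b = stream.ReadByte()) != -1)
        {
            if (b == terminator)
                return Encoding.ASCII.GetString(message.ToArray());

            message.WriteByte((byte)b);
        }

        if (message.Length == 0)
            return null;

        throw new EndOfStreamException("Stream ended before the terminator was seen.");
    }
}
```

Reading a byte at a time is slow on a raw NetworkStream, so wrapping the stream in a BufferedStream would be a reasonable refinement.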

Serialization is quite easy. Here's a simple example that I have written
for you that illustrates a client transport mechanism:

using System;
using System.Collections.Generic;
using System.Runtime.Serialization.Formatters.Binary;
using System.IO;
using System.Net.Sockets;

static class NetworkTransport
{
    public static void SendMessage(Message message, NetworkStream transportStream)
    {
        if (message == null)
            throw new ArgumentNullException("message");

        // I recommend encapsulating the transportStream
        // within a class instead of using a static class
        // as in this example
        if (transportStream == null)
            throw new ArgumentNullException("transportStream");

        // create the stream that will contain the serialized message
        using (MemoryStream serializedMessageStream = new MemoryStream(1024))
        {
            BinaryFormatter formatter = new BinaryFormatter();

            // serialize the message to the stream as binary
            formatter.Serialize(serializedMessageStream, message);

            // get the length of the stream as an Int32
            int length = (int) serializedMessageStream.Length;

            // convert the length of the stream to an array of bytes
            byte[] lengthInBytes = BitConverter.GetBytes(length);

            // write the length of the stream to the socket
            // (4 bytes from index 0, which is 32 bits)
            transportStream.Write(lengthInBytes, 0, 4);

            // write the entire serialized message stream to the socket
            transportStream.Write(serializedMessageStream.ToArray(), 0, length);

            // the server must read 4 bytes, convert them to an Int32
            // and then read that many bytes from the socket
        }
    }
}

// see System.Runtime.Remoting.Messaging.IMessage for the framework interface
[Serializable]
class Message
{
    public readonly Dictionary<string, object> Properties =
        new Dictionary<string, object>(8);
}
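
The receiving side of that length-prefix scheme could be sketched as follows. FrameReader, ReadFully and ReadFrame are names invented for this illustration. The loop matters because Stream.Read may return fewer bytes than requested, and BitConverter uses the machine's byte order, so this assumes both ends share the same endianness.

```csharp
using System;
using System.IO;

// Hypothetical counterpart to a sender that writes a 4-byte length
// followed by the message bytes.
static class FrameReader
{
    // Reads exactly 'count' bytes, looping because a single Read call
    // may return only part of what was asked for.
    public static byte[] ReadFully(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int offset = 0;

        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException("Connection closed mid-message.");

            offset += read;
        }

        return buffer;
    }

    // Reads one frame: the Int32 length prefix, then that many bytes.
    public static byte[] ReadFrame(Stream stream)
    {
        int length = BitConverter.ToInt32(ReadFully(stream, 4), 0);
        if (length < 0)
            throw new InvalidDataException("Negative frame length.");

        return ReadFully(stream, length);
    }
}
```

The returned bytes can then be wrapped in a MemoryStream and handed to the same formatter the sender used. Because each frame carries its own length, a second message that arrives early simply stays in the stream until the next ReadFrame call.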


HTH
 
Dave Sexton said:
Hi Shak,

Inline:


Actually, I should have written "don't call Begin*" while on a ThreadPool
Thread since it is the Begin* methods that spawn a ThreadPool Thread, not
the End* methods. The reason you shouldn't call into an asynchronous
method from an asynchronous method where both threads will use the
ThreadPool is that it can cause dead-locks because there is a limit to the
number of ThreadPool Threads provided to your application (related to the
number of physical processors). However, for a simple RPC app, especially
if there will be a limited number of client connections, it is acceptable
to call BeginAsync from the AsyncCallback.

I think that's how I had read it anyway! I do understand the danger here,
but if the re-call to Begin* is the last thing done in the callback (ie the
callback has no more opportunity to block) then wouldn't it be ok, even in
the worst case? I think I'll have no more than a handful of clients anyway,
but I'll think about moving to a loop if I ever need to scale it up.
If you were to use a synchronous loop you could spawn your own Thread
using a ThreadStart object to prevent dead-locks, but there may be a
slight performance cost. I have not compared the two methods. A problem
that you must work out when using a synchronous loop is that one client
should not have to wait for another client's RPC to complete before it can
establish a connection, so some asynchronous code is still required.
Preferably, code not executing on a ThreadPool Thread.

Hmm. But if a socket has already been established (and has its own thread
on which it is reading), surely it wouldn't matter? Going back to my
Async model:

//somewhere else
o.BeginAsync(completeAsync, o);
....

void completeAsync(IAsyncResult ar)
{
    Class o = (Class)ar.AsyncState;
    o.EndAsync(ar);
    //do stuff for this async cycle
    //reestablish wait
    o.BeginAsync(completeAsync, o);
}

This is as opposed to a sync loop:

//somewhere else

StartAThread(o.Work) //(1)
....

void Work()
{
while (true)
{
o.Begin(); // blocks
//do stuff for this cycle.
}
}

Where (1) can be replaced by your thread-kicking-off code of choice.
Presumably for the TCPClient connecting case, "do stuff for this cycle"
means to establish a further thread to poll on read (either via async or
kicking off another thread manually). I don't think either introduces the
possibility of deadlock, although I'm now seeing the latter as being more
efficient since it doesn't create new threads. Do both these models fulfill
your condition that "one client should not have to wait for another client's
RPC to complete before it can establish a connection", though?

It seems that most Async calls (especially in streams and the like) seem to
be for a job that repeats. If this is the case, and a loop that blocks is
more efficient, what's the point of having Async calls? I mean it's not like
you'll only want to read or write to a stream once, right?
A few things here...

That's brilliant, and exactly as I imagined it to be. In short, you have to
be able to determine when what you're receiving ends, usually using some
kind of prefix or suffix. Cool!

Shak
 
Hi Shak,

Inline:
I think that's how I had read it anyway! I do understand the danger here,
but if the re-call to Begin* is the last thing done in the callback (ie
the callback has no more opportunity to block) then wouldn't it be ok,
even in the worst case? I think I'll have no more than a handful of
clients anyway, but I'll think about moving to a loop if I ever need to scale
it up.

I see your point that waiting for the call to complete before accepting the
next client should solve the problem of ThreadPool Thread usage and
dead-locking, however the recommendation exists because it may be out of
your control how many other blocks of code will be using a ThreadPool Thread
to call into your library and you can't say for sure that even the .NET
framework code isn't using ThreadPool Threads internally. In other words,
it's not safe to assume that the number of ThreadPool Threads being used by
your application can be managed simply by rearranging the order of async
calls in your code.
Hmm. But if a socket has already been established (and has its own thread
on which it is reading), surely it wouldn't matter? Going back to my
Async model:

The synchronous model is actually sync/async. Clients are accepted within a
synchronous loop and the code spawns a non-ThreadPool Thread to process the
request. There is no chance of dead-locking due to the limited number of
ThreadPool Threads, only in explicit synchronization code such as with the
use of Monitor.Enter or WaitHandles.
//somewhere else
o.BeginAsync(completeAsync, o);
...

void completeAsync(IAsyncResult ar)
{
    Class o = (Class)ar.AsyncState;
    o.EndAsync(ar);
    //do stuff for this async cycle
    //reestablish wait

The problem here is that all clients are queued even though your app should
be able to handle more than one request simultaneously. You should call
o.BeginAsync immediately after o.EndAsync to start listening for another
client request before processing the current request. A problem that would
arise in a scaled-up application may be that too many subsequent requests
from different clients might block the first request from ever completing if
o.BeginAsync steadily completes synchronously. Just FYI.
    o.BeginAsync(completeAsync, o);
}
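
Applied to accepting connections, the reordering described above (start the next wait right after the End* call, then process) might look like the sketch below. AsyncAcceptor and its handler delegate are hypothetical names used only for illustration, and the exception handling assumes that stopping the listener is the only way the loop ends.

```csharp
using System;
using System.Net.Sockets;

// Hypothetical wrapper around a TcpListener that keeps one accept pending.
sealed class AsyncAcceptor
{
    private readonly TcpListener listener;
    private readonly Action<TcpClient> handler;

    public AsyncAcceptor(TcpListener listener, Action<TcpClient> handler)
    {
        this.listener = listener;
        this.handler = handler;
    }

    public void Start()
    {
        listener.Start();
        listener.BeginAcceptTcpClient(OnAccept, null);
    }

    private void OnAccept(IAsyncResult ar)
    {
        TcpClient client;
        try
        {
            client = listener.EndAcceptTcpClient(ar);
        }
        catch (ObjectDisposedException) { return; } // listener was stopped
        catch (SocketException) { return; }         // accept failed or was aborted

        try
        {
            // Re-arm first, so the next client is not queued behind this request.
            listener.BeginAcceptTcpClient(OnAccept, null);
        }
        catch (InvalidOperationException)
        {
            // The listener was stopped in the meantime; finish this request anyway.
        }

        // Only now process the current connection, closing it afterwards.
        using (client)
        {
            handler(client);
        }
    }
}
```

The handler still runs on whatever thread delivered the callback, so a long-running handler ties that thread up for the duration of the request.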

This is as opposed to a sync loop:

//somewhere else

StartAThread(o.Work) //(1)
...

void Work()
{
while (true)
{
o.Begin(); // blocks
//do stuff for this cycle.
}
}

Where (1) can be replaced by your thread-kicking-off code of choice.
Presumably for the TCPClient connecting case, "do stuff for this cycle"
means to establish a further thread to poll on read (either via async or
kicking off another thread manually). I don't think either introduces the
possibility of deadlock, although I'm now seeing the latter as being more
efficient since it doesn't create new threads. Do both these models
fulfill your condition that "one client should not have to wait for
another client's RPC to complete before it can establish a connection",
though?

Well, your example is not exactly the complete story for the sync/async
client-acceptance model.

// Here's a quick example that I wouldn't use in a real app as-is. Try an
// OO approach instead.
// NOTE: this code has not been tested
using System.Net;
using System.Net.Sockets;
using System.Threading;

class Server : System.ComponentModel.Component
{
    // TcpListener has no parameterless constructor; the address and port
    // used here are placeholders
    private readonly TcpListener listener = new TcpListener(IPAddress.Any, 8000);
    private volatile bool running;

    public void Start()
    {
        running = true;
        listener.Start();

        // start listening for client connections
        Thread listeningThread = new Thread(new ThreadStart(ListenAsync));
        listeningThread.Start();
    }

    private void ListenAsync()
    {
        while (running)
        {
            // block until a client connects
            // TODO: handle exception if listener is disposed in a different
            // Thread; fail silently
            TcpClient client = listener.AcceptTcpClient();

            if (!running)
            {
                // while blocking, a different Thread changed the 'running'
                // state of the class but did not dispose of the listener
                DisconnectClient(client);
                break;
            }

            // As soon as the client is connected spawn another thread to
            // handle the request. This way, another connection will not be
            // blocked until the current request has been handled.
            // Even if you were to call listener.BeginAcceptTcpClient here the
            // current request would still have to complete before the next
            // connection would be accepted, like the "queue" I mentioned
            // above, since you wouldn't be listening for another client until
            // the end of the method.
            Thread requestHandler = new Thread(
                new ParameterizedThreadStart(HandleRequestInternal));
            requestHandler.Start(client);

            /*
            ASP.NET, I believe, uses a custom ThreadPool to do exactly what
            I've done here, which has either a manageable limit to the number
            of concurrent requests or no limit.
            Spawning a new Thread for each request might be too costly for
            some applications if they are not pooled.
            */
        }
    }

    private void HandleRequestInternal(object info)
    {
        using (TcpClient client = (TcpClient) info)
        {
            // TODO: prepare request

            try
            {
                HandleRequest(client);
            }
            finally
            {
                DisconnectClient(client);
            }
        }
    }

    private void HandleRequest(TcpClient client)
    {
        // TODO: handle request
    }

    private void DisconnectClient(TcpClient client)
    {
        // Close releases the client and its underlying NetworkStream
        client.Close();
    }
}
It seems that most Async calls (especially in streams and the like) seem
to be for a job that repeats. If this is the case, and a loop that blocks
is more efficient, what's the point of having Async calls? I mean it's not
like you'll only want to read or write to a stream once, right?

It may not be that a blocking loop is more efficient. Actually, as I've
mentioned, it might be less efficient unless properly coded. The reason for
using a loop is that it alleviates ThreadPool usage where it might be
problematic. The ThreadPool is very useful to queue up a task that must be
processed in the background (e.g. BackgroundWorker should be used in .NET
2.0) to free up a GUI Thread so that the end user has a better experience
with your application, and it can very efficiently handle multiple tasks
that start and stop frequently. However, it's not useful for servicing an
unlimited number of requests from remote clients or for code that blocks
during the lifetime of the application.
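
The distinction drawn above can be sketched in a few lines: short, bounded work goes to the ThreadPool, while a loop that blocks for the life of the application gets a dedicated thread. BackgroundTasks and both method names are invented for this example.

```csharp
using System;
using System.Threading;

// Hypothetical helpers contrasting the two ways of running background code.
static class BackgroundTasks
{
    // Short-lived work: queue it on the ThreadPool and return a handle
    // that is signaled when the work has finished.
    public static ManualResetEvent RunShortTask(Action work)
    {
        ManualResetEvent done = new ManualResetEvent(false);

        ThreadPool.QueueUserWorkItem(state =>
        {
            try { work(); }
            finally { done.Set(); }
        });

        return done;
    }

    // Long-lived blocking loop (e.g. an accept loop): give it its own
    // thread so it never occupies a ThreadPool thread.
    public static Thread RunLongLived(ThreadStart loop)
    {
        Thread thread = new Thread(loop);
        thread.IsBackground = true; // do not keep the process alive on exit
        thread.Start();
        return thread;
    }
}
```

An exception thrown by the work delegate on the ThreadPool thread is not caught here and would normally bring the process down, so real code would want its own error handling inside the delegate.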
That's brilliant, and exactly as I imagined it to be. In short, you have
to be able to determine when what you're receiving ends, usually using
some kind of prefix or suffix. Cool!

You got it. I don't know how many other ways there are for doing this but
I've had a lot of success with the length "prefix" approach.

HTH
 
Dave Sexton said:
The problem here is that all clients are queued even though your app
should be able to handle more than one request simultaneously. You
should call o.BeginAsync immediately after o.EndAsync to start listening
for another client request before processing the current request. A
problem that would arise in a scaled-up application may be that too many
subsequent requests from different clients might block the first request
from ever completing if o.BeginAsync steadily completes synchronously.
Just FYI.

.... Because callbacks run on threadpool threads, and so you can only have x
number of calls to BeginAsync that haven't yet completed their callbacks?
Well, your example is not exactly the complete story for the sync/async
client-acceptance model.

// Here's a quick example that I wouldn't use in a real app as-is. Try an
OO approach instead.

That seems to be the model mine has evolved to, so thanks.
It may not be that a blocking loop is more efficient. Actually, as I've
mentioned, it might be less efficient unless properly coded. The reason
for using a loop is that it alleviates ThreadPool usage where it might be
problematic. The ThreadPool is very useful to queue up a task that must
be processed in the background (e.g. BackgroundWorker should be used in
.NET 2.0) to free up a GUI Thread so that the end user has a better
experience with your application, and it can very efficiently handle
multiple tasks that start and stop frequently. However, it's not useful
for servicing an unlimited number of requests from remote clients or for
code that blocks during the lifetime of the application.

In other words, if you need to do it more than once and at arbitrary times,
think about moving an async call to its own thread.

Thanks for all the help...

Shak
 