Protocol over TCP

Hi,

I have been writing a communications protocol component in C# that will be
used in client and server applications. The sole purpose of the component is
to let applications RELIABLY send structures/objects across a network (LAN)
to other applications that also use the component. Objects are encapsulated
into a packet which has a header that tells the receiving end what type of
packet it is.
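
To make the header idea concrete, here is a minimal sketch. The real header layout is not shown in this post, so the layout below (one type byte followed by a 4-byte big-endian payload length) and the names PacketType and PacketHeader are assumptions for illustration only.

```csharp
using System;

// Hypothetical packet kinds; the post only mentions DATA and ACK.
public enum PacketType : byte { Data = 1, Ack = 2 }

public static class PacketHeader
{
    // Assumed layout: 1 byte type + 4 bytes big-endian payload length.
    public const int Size = 5;

    // Prepends the header to the payload and returns the full packet.
    public static byte[] Wrap(PacketType type, byte[] payload)
    {
        byte[] packet = new byte[Size + payload.Length];
        packet[0] = (byte)type;
        packet[1] = (byte)(payload.Length >> 24);
        packet[2] = (byte)(payload.Length >> 16);
        packet[3] = (byte)(payload.Length >> 8);
        packet[4] = (byte)payload.Length;
        Buffer.BlockCopy(payload, 0, packet, Size, payload.Length);
        return packet;
    }

    // Reads the type and payload length back from the first Size bytes.
    public static PacketType Parse(byte[] packet, out int payloadLength)
    {
        if (packet.Length < Size)
            throw new ArgumentException("Packet shorter than header.");
        payloadLength = (packet[1] << 24) | (packet[2] << 16)
                      | (packet[3] << 8) | packet[4];
        return (PacketType)packet[0];
    }
}
```

Carrying the payload length in the header is one common way to let the receiver know where a packet ends, since TCP itself delivers a byte stream without packet boundaries.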

The way the component has been implemented at this point is explained below:

- On sending, the component first serializes the object it wants to send into
a stream of bytes (a packet) and then sends it by calling the Socket object's
synchronous Send() method. Before returning, the send process needs to make
sure that the other end received the packet (this is where the reliability
part comes into the picture). To know that a packet was received correctly,
the send process waits 200ms (RTO - ReTransmissionTimeout) for an ACK packet
to come back (the reception of this ACK packet is explained below in the
receive section**). If it doesn't receive an ACK by then, it resends the same
packet. It continues to resend the packet every 200ms for 30 secs. If it
still hasn't received an ACK by then, it closes the connection to the remote
endpoint.
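
The send loop described above could look roughly like this. It is a sketch, not the component's actual code: ReliableSender, the Transmit delegate and SignalAck are hypothetical names, the transmit step is passed in so that no real Socket is needed, and a single event stands in for matching an ACK to a particular packet.

```csharp
using System;
using System.Threading;

// Sketch of a stop-and-wait send: transmit, wait up to one RTO for an ACK,
// retransmit, and give up after a fixed total time.
public class ReliableSender
{
    public delegate void Transmit(byte[] packet);

    private readonly AutoResetEvent ackReceived = new AutoResetEvent(false);
    private readonly int rtoMs;
    private readonly int giveUpMs;

    public ReliableSender(int rtoMs, int giveUpMs)
    {
        this.rtoMs = rtoMs;
        this.giveUpMs = giveUpMs;
    }

    // Called by the receive side when an ACK packet arrives.
    public void SignalAck() { ackReceived.Set(); }

    // Sends the packet, resending every rtoMs until an ACK is signalled.
    // Returns false if giveUpMs elapses first (the caller would disconnect).
    public bool Send(byte[] packet, Transmit transmit)
    {
        int waited = 0;
        while (waited < giveUpMs)
        {
            transmit(packet);
            if (ackReceived.WaitOne(rtoMs, false)) return true;
            waited += rtoMs;
        }
        return false;
    }
}
```

With the values from the post this would be constructed as new ReliableSender(200, 30000). A real implementation would also have to make sure that a late ACK for an earlier packet is not mistaken for the ACK of the current one.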

- On receiving, bytes are received asynchronously by calling the Socket
object's BeginReceive() and EndReceive() methods. The receiving process
waits until bytes become available on the receiving port. When bytes do
become available, they are read and deserialized back to the original
object. In order not to miss any further incoming bytes, the received object
is passed to another thread for processing, so the receiving process can
return, call BeginReceive() again and wait for more bytes to come in.
The newly spawned thread then processes the packet. Depending on the packet's
header, this thread handles the packet differently. Say we have 2 types of
packets - DATA and ACK. If the receiving end received a DATA packet, it sends
back an ACK to the sender to signify correct reception.** But if the packet
was an ACK, it sets a synchronization flag to let the sending process know
that an ACK was received for a particular packet.

This component is now being tested with a TestServerApplication and a
TestClientApplication. The server simply waits for clients to connect. When
the client connects, the server accepts it and waits. The system was set up
such that the client app sends a 2KB packet to the server and the server
basically echoes it back. When the client receives the echo, it also echoes
that back and so they play "packet tennis". This is simply to test the
component. The size of an ACK packet is 44 bytes. So this is what happens:

- client sends 2KB to server
- server receives and sends a 44-byte ACK, and then the echo 2KB packet to
the client
- client receives the ACK (which ends the first send), and then also
receives the echoed packet... client then sends an ACK to the server, as well
as the echo packet
- and the process repeats...

Every time the client app receives a packet, it prints out the contents on a
RichTextBox and then clears it for the next packet. So visually, the client's
RichTextBox blinks with the contents of received packets. When I have this in
the code, it works fine. However, if I comment out the section that prints
the packet contents (so the RichTextBox will always be blank), something
bizarre occurs:

For some reason, at certain points in time, the asynchronous Receive calls
(at the client end) receive a varying number of bytes, and the count always
seems to decrease by 44 bytes - the size of an ACK packet. So at first, the
component may consistently receive 2048 bytes, but then later on it receives
2004, then 1960, then 1916, and so on... The component obviously registers
these as incomplete packets, which causes errors. This behaviour only happens
when the code which prints out the packet contents on the client app is not
included. If it is included, everything works fine and the packet sizes being
received are all 2KB. What is going on here??

Please excuse the lengthy post. I needed to explain how my component worked
for easier analysis on your behalf. Thanks in advance.

Michael--J.
 
Michael--J said:
I have been writing a Communications protocol component in C# that will

Instead of implementing your own solution, you could use remoting.
- On receiving, bytes are received asynchronously by calling the Socket
object's BeginReceive() and EndReceive() methods. The receiving process

Avoid async-IO unless absolutely forced to do it. It is *very*
complicated to get right -- really!
Every time the client app receives a packet, it prints out the contents on a
RichTextBox and then clears it for the next packet. So visually, the client's
RichTextBox blinks with the contents of received packets. When I have this in
the code, it works fine. However, if I comment out the section that prints
the packet contents (so the RichTextBox will always be blank), something
bizarre occurs:

Timing influences the way multi-threaded and asynchronous programs work.
Try making a *very* small example program and do your tests on that,
so you can understand what's going on.

You can try inserting some Thread.Sleep(0) calls in order to provoke
timing errors by having the threads rescheduled.
packet sizes being received are all 2KB. What is going on here??

Could easily be a synchronization bug in your async-IO. It's *very* hard
to get async I/O right.

Are you allowed to read and write to the Socket from multiple threads at
once? I suspect not. Perhaps writing the ACK to the socket moves a
pointer in the Socket's buffer -- which would explain the behaviour you
are experiencing.
Please excuse the lengthy post. I needed to explain how my component worked
for easier analysis on your behalf. Thanks in advance.

I would guess that most of the content of the post is unrelated to your
problem.

BTW1: I don't really understand why you are using async IO for reading
the input. You could just as well use sync, with exactly the same result.

BTW2: Are you certain you will always receive 1 full packet per read of
the Socket?
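
The concern behind BTW2 is that a read on a TCP socket returns whatever bytes have arrived so far, which may be less than one packet. Below is a minimal sketch of a loop that keeps reading until a known number of bytes is in, assuming the expected size is known beforehand (for instance from a length field). StreamReading.ReadExactly is a hypothetical helper written for this example.

```csharp
using System;
using System.IO;

public static class StreamReading
{
    // Keeps calling Read until exactly count bytes have arrived.
    // A single Read on a network stream may return fewer bytes than requested.
    public static byte[] ReadExactly(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
                throw new EndOfStreamException("Connection closed mid-packet.");
            offset += read;
        }
        return buffer;
    }
}
```

The same loop works on a NetworkStream wrapped around the Socket, or can be adapted to call Socket.Receive directly.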
 
Hello Helge,

I appreciate the reply and you taking your time to read my post. Please read
inline:

Helge Jensen said:
Instead of implementing your own solution, you could use remoting.

Yes you are correct, I could have used remoting. I would really love to use
that technology because it would definitely take a huge burden off my
shoulders. Our component however, is being designed so that it can be used in
PC applications that run under the .NET Framework (Full) environment, as well
as embedded applications (that run on embedded devices with Windows CE 5.0)
that only run in the .NET Compact Framework environment. And since the
Compact Framework does not support .NET Remoting, I could not go down that
path.
Avoid async-IO unless absolutely forced to do it. It is *very*
complicated to get right -- really!

Yeah right?!! Is async-IO really that difficult? If I had only known that
earlier, I might have resorted to sync methods only. :(
Timing influences the way multi-threaded and asynchronous programs work.
Try making a *very* small example program and do your tests on that,
so you can understand what's going on.

You can try inserting some Thread.Sleep(0) calls in order to provoke
timing errors by having the threads rescheduled.

Yes you are right, timing plays a big part in this component. I noticed
that putting a Thread.Sleep(80) in place of the code that prints out packet
contents on the client app’s RichTextBox made the system run fine.
However, putting a Thread.Sleep(20) did not. If I change my implementation to
a synch-IO approach, will that help make things run better? Guaranteed? Or
will I still get timing issues?

The example programs I have written seemed small to me. They are basically
a server and client that play "packet tennis". I need to be able to stress
test the server with about 500 connected clients all playing packet tennis.
When you said a *very* small program, what type of program did you mean?
Could easily be a synchronization bug in your async-IO. It's *very* hard
to get async I/O right.

That is really a shame. Thanks for the heads up. I think another issue that
may cause problems is the resending. When I have about 100 clients connected
the server may be struggling to keep up with them, and each client resends
packets at an interval of 200ms (RTO). Maybe the high influx of packets
corrupts the streams or something?? I was thinking of increasing the RTO to
about 1 sec to see how that goes. I might give that a try.
Are you allowed to read and write to the Socket from multiple threads at
once? I suspect not. Perhaps writing the ACK to the socket moves a
pointer in the Socket's buffer -- which would explain the behaviour you
are experiencing.

Yes, I think you are allowed to read and write to the Socket from multiple
threads. I mean, that's exactly what I'm doing. The sending process is
totally separate from the receiving process but they both access the same
Socket. Maybe that observation could be the key... hmm... Maybe that is why I
am experiencing the behaviour I mentioned... you could be on the right track
Helge...
I would guess that most of the content of the post is unrelated to your
problem.

BTW1: I don't really understand why you are using async IO for reading
the input. you could just as well use sync, with exactly the same result.

The only reason why I used async IO was simply because it made things
"easier" and there was no blocking taking place. Using sync IO will give
exactly the same result? Including the timing issues? I hope not - I may give
it a try...
BTW2: Are you certain you will always receive 1 full packet per read of
the Socket?

No. As a matter of fact there are instances where I only receive a portion
of the packet at a time. I have actually taken this into account in my code
and tested it with dummy packets as well. The decreasing packet sizes I
described (which show up as incomplete packets) get processed in that
portion of the code - but the thing is I don't know whether their remaining
bytes get received or not.
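
For illustration, handling both partial packets and several packets arriving in one read could look something like this. The sketch assumes a hypothetical header of one type byte followed by a 4-byte big-endian payload length. PacketAssembler is an invented name, not the component's actual code.

```csharp
using System;
using System.Collections.Generic;

// Collects bytes from successive reads and hands back complete packets.
public class PacketAssembler
{
    private const int HeaderSize = 5; // assumed: 1 type byte + 4 length bytes
    private readonly List<byte> pending = new List<byte>();

    // Appends the bytes from one receive call and returns every packet
    // (header included) that is now complete. Leftover bytes stay buffered.
    public List<byte[]> Append(byte[] received, int count)
    {
        for (int i = 0; i < count; i++) pending.Add(received[i]);

        List<byte[]> complete = new List<byte[]>();
        while (pending.Count >= HeaderSize)
        {
            int payloadLength = (pending[1] << 24) | (pending[2] << 16)
                              | (pending[3] << 8) | pending[4];
            if (payloadLength < 0)
                throw new InvalidOperationException("Corrupt length field.");
            int total = HeaderSize + payloadLength;
            if (pending.Count < total) break; // wait for the rest

            complete.Add(pending.GetRange(0, total).ToArray());
            pending.RemoveRange(0, total);
        }
        return complete;
    }
}
```

With this kind of buffering, a 44-byte ACK that arrives in the same read as the start of a 2KB DATA packet is split off as its own packet, and the beginning of the DATA packet is kept until the remaining bytes arrive.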
 
Michael--J said:
that only run in the .NET Compact Framework environment. And since the
Compact Framework does not support .NET Remoting, I could not go down that
path.

That's understandable.
Yeah right?!! Is async-IO really that difficult? If I had only known that
earlier, I might have resorted to sync methods only. :(

With async-io there really isn't much of a "program-flow", which makes
it very hard to use the usual ways of understanding/arguing how a
program works.
Yes you are right, timing plays a big part in this component. I noticed
that putting a Thread.Sleep(80) in place of the code that prints out packet
contents on the client app’s RichTextBox made the system run fine.

The "Thread.Sleep(0)" is an old hack to try and provoke synchronization
errors by littering the code with thread-switches. Thread.Sleep(0) (in
win32 at least, not sure about .NET) yields the rest of the current
timeslice to another thread that is ready to run.

Maybe an instrumenter that puts sleep calls in between the original .NET
code would be nice to have for testing multi-threaded programs..... hmmmm...
However, putting a Thread.Sleep(20) did not. If I change my implementation to
a synch-IO approach, will that help make things run better? Guaranteed? Or
will I still get timing issues?

The problem will probably not go away, but you will have an improved
chance of finding out what's going on.

If writing to the textbox/console/... is too slow to have the program
exhibit incorrect behaviour you can make an array of logs, something
like this:

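public class CyclicLog {
    public object[] Logs;
    int current; // index of the slot the next entry overwrites
    readonly object sync = new object(); // private lock; avoids lock(this)
    public CyclicLog(int count) { this.Logs = new object[count]; }
    // Stores o, overwriting the oldest entry once the array is full.
    public void Log(object o) {
        lock (sync) {
            Logs[current] = o;
            current = (current + 1) % Logs.Length;
        }
    }
}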

So you can have a kind of program-flow trace while still getting the
incorrect behaviour.
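
As a rough illustration of how such a trace could be read back after the bug shows up, here is a variant that also returns the retained entries oldest first. CyclicTrace and Snapshot are hypothetical names; this is a sketch along the same lines, not a replacement for the class above.

```csharp
using System;

// Cyclic log variant that can also be read back in chronological order.
public class CyclicTrace
{
    private readonly object[] entries;
    private readonly object gate = new object();
    private int next;   // slot the next entry will be written to
    private int count;  // number of valid entries, at most entries.Length

    public CyclicTrace(int capacity) { entries = new object[capacity]; }

    // Stores o, overwriting the oldest entry once the buffer is full.
    public void Log(object o)
    {
        lock (gate)
        {
            entries[next] = o;
            next = (next + 1) % entries.Length;
            if (count < entries.Length) count++;
        }
    }

    // Returns a copy of the retained entries, oldest first.
    public object[] Snapshot()
    {
        lock (gate)
        {
            object[] result = new object[count];
            int start = (next - count + entries.Length) % entries.Length;
            for (int i = 0; i < count; i++)
                result[i] = entries[(start + i) % entries.Length];
            return result;
        }
    }
}
```

Logging something cheap, such as the byte count returned by each receive call, and dumping the snapshot once an incomplete packet is detected would show the sequence of sizes leading up to the failure without slowing the program down much.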
That is really a shame. Thanks for the heads up. I think another issue that
may cause problems is the resending. When I have about 100 clients connected
the server may be struggling to keep up with them, and each client resends
packets at an interval of 200ms (RTO). Maybe the high influx of packets
corrupts the streams or something?? I was thinking of increasing the RTO to
about 1 sec to see how that goes. I might give that a try.

It may help, but only indirectly -- by reducing the amount of concurrency.
The only reason why I used async IO was simply because it made things
"easier" and there was no blocking taking place. Using sync IO will give
exactly the same result? Including the timing issues? I hope not - I may give
it a try...

Instead of the pattern:

void ReadHandler(IAsyncResult r) {
    ...
    // Re-arm the read so the next bytes trigger the callback again.
    s.BeginRead(..., new AsyncCallback(ReadHandler), ...);
}
void Main() {
    s.BeginRead(..., new AsyncCallback(ReadHandler), ...);
}

you can do:

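void ReadHandler() {
    while (...) {
        int read = s.Read(...); // blocks until some bytes arrive
        ...
    }
}
void Main() {
    // The thread has to be started; constructing it alone does nothing.
    new Thread(new ThreadStart(ReadHandler)).Start();
}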

Don't expect it to have any influence on the behaviour (it's essentially
the same code, just by different means), but the code becomes a bit more
understandable, especially in the face of exceptions.

From the docs of {Stream,NetworkStream}.BeginRead:

"Multiple simultaneous asynchronous requests render the request
completion order uncertain."

So it seems you are allowed to do multiple reads.
No. As a matter of fact there are instances where I only receive a portion
of the packet at a time. I have actually taken this into account in my code

OK, so that pit isn't waiting for you, good ;)

As a last resort, simplify the program while keeping the invalid
behaviour... it's a time-consuming but effective technique.
 
Hello Helge,

So sorry for the late reply but I was away on holiday.

I am back now and I figured out my problem. The problem was caused by the
fact that I was processing incomplete packets and putting them in temporary
byte arrays. I have now simplified the component and basically set it to
ignore incomplete packets and wait for the packet to be resent and received
as a full packet.

Now I have come across another problem. I am now running the TestClient on
an embedded board loaded with Win CE 5.0. Obviously, there isn't much RAM or
CPU to work with. Running the TestServer on my PC and the TestClient on the
board the same way I did before, it works for a while and then for some
reason, on the embedded side, an OutOfMemoryException occurs. It occurs at a
random point in time when a structure is received from the network and is
passed to another thread for processing. The exception occurs when
Thread.Start is called. I then ran the app again with the RAM settings opened
(from Control Panel->Settings->System) and noticed that the Program Memory
was decreasing as it ran. However, when the exception occurs, there still
seem to be a few MBs left on the Program Memory side. I don't understand, why
did this happen? Thanks.
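
The thread does not establish why Thread.Start throws here. One thing that might be worth ruling out, purely as an assumption, is the cost of starting a new thread for every received structure. A sketch of the alternative, where one long-lived worker thread drains a queue, is shown below. PacketWorker and its Handler delegate are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// One long-lived worker thread processes received packets from a queue,
// instead of a new thread being started for each packet.
public class PacketWorker
{
    public delegate void Handler(byte[] packet);

    private readonly Queue<byte[]> queue = new Queue<byte[]>();
    private readonly object gate = new object();
    private readonly AutoResetEvent wake = new AutoResetEvent(false);
    private readonly Handler handler;
    private readonly Thread thread;
    private bool stopping;

    public PacketWorker(Handler handler)
    {
        this.handler = handler;
        thread = new Thread(new ThreadStart(Run));
        thread.IsBackground = true;
        thread.Start();
    }

    // Called from the receive path; returns without waiting for processing.
    public void Enqueue(byte[] packet)
    {
        lock (gate) { queue.Enqueue(packet); }
        wake.Set();
    }

    // Lets the worker finish the queued packets, then waits for it to exit.
    public void Stop()
    {
        lock (gate) { stopping = true; }
        wake.Set();
        thread.Join();
    }

    private void Run()
    {
        while (true)
        {
            byte[] packet = null;
            bool done;
            lock (gate)
            {
                if (queue.Count > 0) packet = queue.Dequeue();
                done = stopping;
            }
            if (packet != null) { handler(packet); continue; } // outside the lock
            if (done) return;   // stopping and queue drained
            wake.WaitOne();     // sleep until Enqueue or Stop signals
        }
    }
}
```

The receive callback would then call Enqueue instead of Thread.Start. Whether this changes the OutOfMemoryException on the board would have to be tested.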
