Intermittent sockets performance problem with multiple remote machines

  • Thread starter: David Doran

David Doran

Hi All,

I've been trying to find a reason for this for a while now and am completely
sick of it. I have a test app which uses System.Net.Sockets.Socket to
receive large amounts of data (2 MB total) from several worker processes
on remote machines. Most of the time the data is received quickly but about
one run in 10 it completes a lot slower. It seems like there's some kind of
collision / contention issue.

Example output from the test app:

Waiting for results... Got results in 79ms.
Waiting for results... Got results in 79ms.
Waiting for results... Got results in 62ms.
Waiting for results... Got results in 63ms.
Waiting for results... Got results in 47ms.
Waiting for results... Got results in 46ms.
Waiting for results... Got results in 469ms.   <-- ten times slower than the previous identical run!
Waiting for results... Got results in 47ms.
Waiting for results... Got results in 63ms.
Waiting for results... Got results in 79ms.
Waiting for results... Got results in 47ms.
Waiting for results... Got results in 47ms.
Waiting for results... Got results in 47ms.

It's a multithreaded app, each thread sends a work request to a process on a
remote machine and then waits for the results. The workers do nothing except
immediately send dummy data in response to the requests. When a run does go
slow (e.g. the 469ms run above), it's because one or two of the threads had
to wait much longer than the rest for all of their expected data to arrive.
The environment is .NET 1.1, Gigabit Ethernet through a 1Gb switch, Windows
2000 Advanced Server.
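
To make the "each thread sends a request and waits for all expected data" pattern concrete, here is a minimal sketch in Python rather than .NET. It is not the actual test app: the names (`recv_exact`, `dummy_worker`, `timed_request`, `run_once`), the 1-byte request and the 512 KB per-worker payload are all assumptions for illustration, and local socket pairs stand in for the remote machines. Recording the elapsed time per worker, instead of only the total, is one way to see which connection stalled on a slow run.

```python
import socket
import threading
import time

# Assumption: 2 MB total split evenly over 4 workers.
PAYLOAD_BYTES = 512 * 1024


def recv_exact(sock, nbytes):
    """Read exactly nbytes, looping because recv() may return less than asked."""
    chunks = []
    remaining = nbytes
    while remaining > 0:
        data = sock.recv(min(remaining, 65536))
        if not data:
            raise ConnectionError("peer closed before all data arrived")
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)


def dummy_worker(sock):
    """Stand-in for a remote worker: wait for a 1-byte request, reply with dummy data."""
    recv_exact(sock, 1)
    sock.sendall(b"x" * PAYLOAD_BYTES)
    sock.close()


def timed_request(sock, index, timings):
    """Send the request, wait for the full reply, record (bytes, elapsed ms)."""
    start = time.perf_counter()
    sock.sendall(b"?")
    data = recv_exact(sock, PAYLOAD_BYTES)
    timings[index] = (len(data), (time.perf_counter() - start) * 1000.0)
    sock.close()


def run_once(worker_count=4):
    """Run one round with worker_count simulated workers; return per-worker timings."""
    timings = [None] * worker_count
    threads = []
    for i in range(worker_count):
        master_end, worker_end = socket.socketpair()  # local stand-in for a TCP connection
        threads.append(threading.Thread(target=dummy_worker, args=(worker_end,)))
        threads.append(threading.Thread(target=timed_request, args=(master_end, i, timings)))
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return timings


if __name__ == "__main__":
    for i, (nbytes, ms) in enumerate(run_once()):
        print(f"worker {i}: {nbytes} bytes in {ms:.1f}ms")
```

Since everything here runs on one machine, it will not reproduce the intermittent slowdown. It only shows the shape of the measurement.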

Some observations:
- The above results were with 4 worker machines (all machines, workers and
master, are dual Xeon 2.4GHz, 1GB RAM) sending results to the main test app.
When I instead use 4 worker apps on a single remote machine the problem does
not occur. It's less frequent using 2 processes on each of 2 servers. This
supports the collision/contention hypothesis.
- The problem is less likely to occur if I call Socket.Send() with smaller
chunks. The results above used 64kB chunks. Using 8kB chunks it happens about
once every 50 to 100 runs. But presuming it's a collision problem, it will
get worse when I connect up many more worker servers, as I intend to do.
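
For clarity on what "smaller chunks" means in the second observation, here is a rough Python sketch of a worker handing its payload to the socket in fixed-size pieces. It is an illustration only, not the real worker code: `send_in_chunks` and `demo` are hypothetical names, Python's `sendall` stands in for `Socket.Send()`, and a local socket pair stands in for the network connection.

```python
import socket
import threading


def send_in_chunks(sock, payload, chunk_size):
    """Send payload in pieces of at most chunk_size bytes; return the number of send calls."""
    view = memoryview(payload)  # slicing a memoryview avoids copying the payload
    calls = 0
    for offset in range(0, len(view), chunk_size):
        sock.sendall(view[offset:offset + chunk_size])
        calls += 1
    return calls


def demo(chunk_size, total_bytes=512 * 1024):
    """Send total_bytes over a local socket pair; return (send calls, bytes received)."""
    sender, receiver = socket.socketpair()
    received = []

    def drain():
        # Read until the sender closes its end, counting the bytes seen.
        count = 0
        while True:
            data = receiver.recv(65536)
            if not data:
                break
            count += len(data)
        received.append(count)

    reader = threading.Thread(target=drain)
    reader.start()
    calls = send_in_chunks(sender, bytes(total_bytes), chunk_size)
    sender.close()  # signals end-of-stream to the reader
    reader.join()
    receiver.close()
    return calls, received[0]


if __name__ == "__main__":
    print(demo(64 * 1024))  # (8, 524288)
    print(demo(8 * 1024))   # (64, 524288)
```

The same number of bytes arrives either way. The only difference is how much data is passed to the stack per call, which is the variable changed between the 64kB and 8kB runs.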

Anybody have a suggestion what might be going on here? Could it be a
hardware problem? TCP stack in Windows? .NET implementation / wrapper of
sockets? I've heard rumours that Windows Server 2003 has optimised network
handling but have yet to try it. I don't think it's down to our particular
hardware, as similar results have been reported on a completely different
setup. I haven't had time to try a similar thing in e.g. unmanaged C++ to
check whether or not it's a .NET thing.

Thanks in advance for any advice offered.
Cheers,
Dave.
 
I've written lots of apps in .NET using the System.Net.Sockets.Socket class,
and I've never had an issue with the performance. In fact, in many cases
it's as fast as the apps I've written in unmanaged C++ as far as network
performance goes. If you could send me the socket code you've written, as well
as the part of the code that does the multi-threading, I can send you back
some suggestions for improving the speed.

P.S. It doesn't matter if it's VB or C#, as I'm very good in either language.

thanks,

Nate Thornton
(e-mail address removed)
 