Network Connection Problem


Steve McKewen

I am working on network-connected controllers, controlled from PCs.
The PCs open a network connection (TCP/IP socket connection) to a controller
and then carry on a conversation with the controller, monitoring inputs and
switching outputs. That part works fine.
Our software connects to a device and works fine. If the device turns off or
is disconnected, the connection failure is detected, the connection is
re-established and operation resumes.

The problem is this:
During testing with multiple controllers, we normally only have one
operational controller and the rest have failed connections. Only the
controller that we are currently testing is actually present in the test
environment.

In this situation the one working device won't connect.

We have changed the number of devices that are attempting to connect, and
fewer devices allows the one working device to connect; but why?
A failed connection takes about 13 seconds to fail, and doesn't cause much
communication or processing.
The network traffic is minimal.
When the connection succeeds, the connection works exactly as normal. There
doesn't seem to be any performance degradation - if it works it works fine.

On a fresh install of Windows (XP or Vista) the test system works fine with
24 failed devices and one working.
On my development systems, it won't work.
I even get the same effect if I run a simulator on the same PC that is
running the test so that I am connecting to 127.0.0.1 and have no network
connection to any other devices.

I have turned off all unnecessary protocols and applications.
I can't find any difference in the drivers.

Any help or suggestions appreciated.

Steve
 
It probably has nothing to do with "load".

Most likely it is a socket management design flaw in the Application running
the Devices or a flaw with the Devices themselves.

To give you a similar example: some "home-user" NAT boxes (I refuse to call
them "routers") that people buy for their home internet connection have
similar problems with VPN. If more than one user is trying to VPN outbound
to somewhere, only the first one succeeds and the others fail until the
first person shuts down their VPN. They end up operating one at a time on a
first-come, first-served basis.

Contact the vendor of the product; maybe they have some answers. If they
don't seem to understand the possibility of the problem I am describing,
then that could explain why the devices have the problem :-).

--
Phillip Windell
www.wandtv.com

The views expressed, are my own and not those of my employer, or Microsoft,
or anyone else associated with me, including my cats.
 
Thanks Phillip.

I thought it was a socket management flaw too, but I am developing the
Application.
We have simplified the socket function to the point that it doesn't actually
do anything except try to connect. If the connection attempt times out, it
tries again. If the connection attempt succeeds, a message is sent once per
second to confirm that the connection hasn't broken.
At the moment that is all the communication there is: a socket->Connect
followed by the once-per-second message.
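
A minimal Python sketch of the loop described above: try to connect, retry
when the attempt fails, then send a message once per second until the
connection breaks. The function names, the PING payload and the 13-second
default timeout are assumptions made for illustration; the real application
uses a socket->Connect call in another language, so this is not its actual
code.

```python
import socket
import time


def connect_with_retry(host, port, timeout=13.0, max_attempts=None, retry_delay=0.0):
    """Keep trying to connect; return a socket, or None if attempts run out.

    max_attempts=None means "retry forever", as the post describes.
    """
    attempt = 0
    while max_attempts is None or attempt < max_attempts:
        attempt += 1
        try:
            return socket.create_connection((host, port), timeout=timeout)
        except OSError:
            # Covers timeouts, refused connections and unreachable hosts.
            time.sleep(retry_delay)
    return None


def keep_alive(sock, message=b"PING\n", interval=1.0, count=None):
    """Send `message` every `interval` seconds; stop when sending fails.

    Returns the number of messages sent. count=None means "until broken".
    """
    sent = 0
    while count is None or sent < count:
        try:
            sock.sendall(message)
        except OSError:
            break  # connection has broken; the caller can reconnect
        sent += 1
        time.sleep(interval)
    return sent
```

A caller would typically wrap both functions in an outer loop: connect, run
keep_alive until it returns, close the socket, then connect again.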

The strange thing is that I have one network which shows this connection
problem, and I haven't been able to reproduce it on another network, or
solve it on this network.

All failed connections simply report that "the connection attempt failed
because the client did not respond in a timely manner". That is the
exception reported by the socket, and is normal for a connection attempt to
something that is not there.

What I am really asking is: what can interfere with the connection process?

Steve
 
Ok. Well, hmmm....

I guess you can use a network monitor (NetMon, Ethereal, etc.) and examine
the first one that succeeds and then monitor the subsequent one that fails.

There should be the normal initial connection port (Destination Port), but
then there should be a random Source Port. This source port needs to be
random and different for each connection. It sort of becomes the "unique
identifier" to distinguish one connection from another. You can track the
conversation via this Destination & Source Port combination. If you have
two connections trying to use the same Source Port it will fail, because the
Destination Port is already going to be the same and there won't be any way
to distinguish the connection.
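
The point about source ports can be checked with a small Python sketch,
assuming a throwaway listener on 127.0.0.1 (the helper name
open_two_connections is made up for this example). TCP tells connections
apart by the full combination of source address, source port, destination
address and destination port, and the local stack picks a different source
port for each outgoing connection to the same destination.

```python
import socket


def open_two_connections():
    """Open two client connections to one local listener.

    Returns a list of (local_address, remote_address) pairs, one per
    connection, as reported by the client sockets.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the stack pick a free port
    listener.listen(5)
    destination = listener.getsockname()

    first = socket.create_connection(destination)
    second = socket.create_connection(destination)
    pairs = [
        (first.getsockname(), first.getpeername()),
        (second.getsockname(), second.getpeername()),
    ]
    for sock in (first, second, listener):
        sock.close()
    return pairs


if __name__ == "__main__":
    for local, remote in open_two_connections():
        print("source", local, "-> destination", remote)
```

Both connections report the same destination address and port, while the
source ports differ.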

That is the best I can think of. I am already "stretching" myself. If
anyone else has any ideas they are welcome to jump in.

--
Phillip Windell
www.wandtv.com

 
Phillip said:
Ok. Well, hmmm....

I guess you can use a network monitor (NetMon, Ethereal, etc.) and examine
the first one that succeeds and then monitor the subsequent one that fails.

There should be the normal initial connection port (Destination Port), but
then there should be a random Source Port. This source port needs to be
random and different for each connection. It sort of becomes the "unique
identifier" to distinguish one connection from another. You can track the
conversation via this Destination & Source Port combination. If you have
two connections trying to use the same Source Port it will fail, because the
Destination Port is already going to be the same and there won't be any way
to distinguish the connection.

That is the best I can think of. I am already "stretching" myself. If
anyone else has any ideas they are welcome to jump in.

ports were introduced in the tcp/ip stack to distinguish
applications not sessions !

iirc sessions are usually handled or should be handled by
either the application protocol and/or the initial sequence numbers.
 
Hi goarilla.
ports were introduced in the tcp/ip stack to distinguish
applications not sessions !

Look at the HTTP example. A web server listens on one port and can serve
lots of clients on that one port, distinguishing the connections by their
source-destination pairs.
Some servers also listen on one port and then tell the client which
alternate port to connect to. This allows them to handle even more
connections.
Different protocols will do this in different ways, depending on the
intended data throughput and fanout.
When you connect to a web server on port 80, that doesn't stop you from
connecting to another web server on port 80, or from opening another
connection to the same server on port 80.
The web server must listen for connections on port 80 at its end, but will
accept any port at the client end. The connection is identified by the
source-destination pair.
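
Seen from the server side, this could be sketched in Python as follows. The
function names accept_sessions and demo are hypothetical and a loopback
listener stands in for the web server. Every accepted connection shares the
server's single listening port, and the sessions are told apart by the
client's address and port.

```python
import socket


def accept_sessions(listener, expected):
    """Accept `expected` clients on one listening socket.

    Returns a dict mapping each client's (ip, port) to its connection.
    """
    sessions = {}
    for _ in range(expected):
        conn, peer = listener.accept()  # peer is (client_ip, client_port)
        sessions[peer] = conn
    return sessions


def demo(clients=3):
    """Connect several clients to one port and report what the server sees."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(clients)
    server_port = listener.getsockname()[1]

    outbound = [
        socket.create_connection(("127.0.0.1", server_port))
        for _ in range(clients)
    ]
    sessions = accept_sessions(listener, clients)

    # Local port of every accepted socket: always the listening port.
    local_ports = {conn.getsockname()[1] for conn in sessions.values()}
    peers = sorted(sessions)

    for sock in [*outbound, *sessions.values(), listener]:
        sock.close()
    return server_port, local_ports, peers
```

Calling demo() returns the listening port, the set of local ports used by
the accepted sockets (a single value, the listening port) and one distinct
client address per session.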

In my case the server listens and connects on the same port as it will only
connect to less than 100 clients at a time, and typically only 1. I am
trying to connect to 15 servers, so each server will be on the same
destination port (seen from my end), but on a different source port at my
end (allocated automatically by my network stack).
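
The test situation, several simultaneous connection attempts where most of
the servers are absent, could be reproduced with a Python sketch along these
lines. The names probe and probe_all, the thread-per-endpoint approach and
the addresses in the usage example are assumptions for illustration only;
the real application is not written this way.

```python
import socket
from concurrent.futures import ThreadPoolExecutor


def probe(endpoint, timeout):
    """Try one connection; return (endpoint, local_port or None)."""
    try:
        with socket.create_connection(endpoint, timeout=timeout) as sock:
            # The stack chose this source port for us.
            return endpoint, sock.getsockname()[1]
    except OSError:
        # Timed out, refused or unreachable: treat as "device not there".
        return endpoint, None


def probe_all(endpoints, timeout=13.0):
    """Attempt all connections at the same time, one thread per endpoint."""
    workers = max(1, len(endpoints))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(lambda ep: probe(ep, timeout), endpoints))


if __name__ == "__main__":
    # Hypothetical addresses: 15 servers on the same destination port.
    servers = [("192.168.0.%d" % n, 5000) for n in range(10, 25)]
    for endpoint, local_port in probe_all(servers, timeout=3.0).items():
        print(endpoint, "connected from port %s" % local_port
              if local_port else "failed")
```

Running it on a machine that shows the problem and on one that doesn't, with
the same endpoint list, gives a quick way to compare them without the rest
of the application.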

I request a socket connection to an IP address and port number (EndPoint),
and if a connection is made, I get a socket handle. I can ask the socket
which local port I am on, but I don't actually care. If I am running a
server service at the same time, then the appropriate port will be in
"listen" mode, and therefore unavailable for allocation to a new connection.

Steve
 
Thanks Phillip

I have been using NetMon

When I make a connection I specify a destination IP address and port number
for the socket connection, and the network stack will allocate a source port
in the range 1000 to 5000 (from memory).
The source port is random, but the destination port is protocol specific,
when I am the client.
The connection is defined by the source-destination pair.

The port allocation is handled by the network stack.

I am wondering if the ARP resolution stuff can be confused somehow into not
resolving a MAC address when there are several outstanding requests, but I
can't see what would change the mechanism from one network to another.
I am using server addresses that are valid for my local network, but the
devices aren't there.

I am thinking that it must be something in the configuration of my PC, as I
can get this effect when I am running a server simulation on my PC, and
trying to connect to 14 other devices and my simulation at the same time. I
have no network hardware connected except a hub with nothing else connected
to it (otherwise I get a "network not connected").

Steve
 
goarilla said:
ports were introduced in the tcp/ip stack to distinguish
applications not sessions !

Only the Destination "listening port" identifies the Application.
Random Client Source Ports in combination with the Destination Port
identify Sessions.
iirc sessions are usually handled or should be handled by
either the application protocol and/or the initial sequence numbers.

No, sequence numbers are for re-assembling the data after the packets are
received, to make sure the packets are put back together in the correct
order.
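
The reassembly role could be illustrated with a simplified Python sketch.
The reassemble function is hypothetical, and it ignores sequence-number
wraparound, partially overlapping segments and everything else a real TCP
stack has to handle.

```python
def reassemble(segments, initial_seq):
    """Rebuild a byte stream from (sequence_number, payload) segments.

    Segments may arrive out of order or duplicated. Reassembly stops at
    the first gap, because the missing bytes have not arrived yet.
    """
    data = bytearray()
    expected = initial_seq
    for seq, payload in sorted(segments):
        if seq == expected:
            data += payload
            expected += len(payload)  # next byte we are waiting for
        elif seq < expected:
            continue  # duplicate of data already delivered
        else:
            break  # gap: a segment is still missing
    return bytes(data)


if __name__ == "__main__":
    arrived = [(105, b"world"), (100, b"hello"), (100, b"hello")]
    print(reassemble(arrived, initial_seq=100))  # prints b'helloworld'
```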

--
Phillip Windell
www.wandtv.com

 
I doubt MAC & ARP have anything to do with it.

I think the problem is at Layer 4 or higher, but I don't know any more than
what I have said to this point.


--
Phillip Windell
www.wandtv.com

 
Phillip said:
No, sequence numbers are for re-assembling the data after the packets are
received, to make sure the packets are put back together in the correct
order.

They serve both purposes, according to the Dutch Wikipedia:
http://nl.wikipedia.org/wiki/Transmission_Control_Protocol

Sequentienummer(4 bytes). Een getal dat door een partij bij het maken
van de verbinding vrijwel willekeurig gegenereerd wordt, waarna het door
die partij de rest van de sessie gebruikt wordt om aan te geven dat het
om diezelfde sessie gaat.

or, loosely translated:

Sequence number (4 bytes). A number that a party generates more or less at
random when the connection is set up, and that this party then uses for the
rest of the session to indicate that it concerns that same session.

And if it weren't for ISNs, connections would be super trivial to hijack,
and hijacks would even occur randomly from time to time (think multiple
clients behind a NAT).
 