Win2K SP4/XP Pro SP1 Clients/Terminal Svcs/Sessions drop

  • Thread starter: Jim

Jim

Two remote sites connect across their own DSL connection
to the MAIN location. The XP Pro Clients use what comes
with XP Pro to connect back via terminal services
to the Win2K server #1 for Authentication and to
Win2K server #2 for Application (accounting & business
software)

The Three locations are all different business names but
all are owned by the same people and the software allows
multiple companies within it.

PROBLEM: Users are complaining about being kicked off the
server. They say problem lasts about 5-10 minutes at a
time and occurs about three to four times a day.

SITUATION: All three DSL connections have been monitored
since the first report of trouble. All remain UP.
During the first phone call I had the customer's
networking vendor watch the DSL light on the DSL modem,
and it remained in SYNC with the DSLAM.

Further logs show traffic since 10/30/2003 to be
normal bandwidth, although on 10/30 the Main location
at first maxed out its outbound connection (toward
the Internet). That has not recurred since. Because
I suspected a worm busying out the connection, and possibly
that the 19-20 users might just be too many, we increased
the speed of the Main site. The outbound Internet usage
to this point is still normal and below half of the 10/30
level.

Steps taken:
The DS3 to the Memphis area is being monitored without any
evidence to suggest trouble thus far. The individual DSL
circuits are being monitored and all remain up.

Asked vendor to remove SP4 and go back to SP3 on both W2k
servers (one auth, one app). (No commitment to do this yet.)

Asked vendor about licensing (how many present vs how many
in use) (no reply yet)

I need a MS document that shows the actual release date of
SP4. If anyone knows where to find it, please email it or
a link to it to me. jdixon(at)communigroup(dot)com
I have searched google and the MS website and the only
references to it show 8/21/03 when the web page was last
updated.

This does not appear to be an ISP related issue but rather
a session related issue since the DSL circuits remain up.
I know that some people have had trouble with Terminal
Services and SP4.
 
Are *all* remote sites disconnected simultaneously? If yes, the
problem should be located near the main site. Could be anything in
the network (faulty switch, faulty power supply, network card,
cables, etc).
Are *all* users within a site disconnected simultaneously, or only
some at a time?
Are they disconnected when active, or when they are idle for a
couple of minutes?
If during idle time, it could be that there is a router between
your clients and your server, which sees no traffic and decides
that the session must have ended and throws it out. The solution
in that case is to enable KeepAlives. For the correct registry
key, check:

216783 - Unable to Completely Disconnect a Terminal Server
Connection
http://support.microsoft.com/?kbid=216783
 
Questions have been asked.
No reply yet from vendor.
Some questions answered INLINE BELOW <<

-----Original Message-----
Are *all* remote sites disconnected simultaneously?

If yes, the problem should be located near the main site.

Could be anything in the network (faulty switch, faulty
power supply, network card, cables, etc).

Are *all* users within a site disconnected simultaneously,
or only some at a time? << only some is the information I
have

Are they disconnected when active, or when they are idle
for a couple of minutes? << 5-10 minutes I am told

If during idle time, it could be that there is a router
between your clients and your server, which sees no
traffic and decides that the session must have ended and
throws it out. << There is a router between the client and
each DSL (WAN) Connection and a DSL router at the head-end
or main site. Linksys 8-port DSL router. The DSL modems
are all RFC1483 Transparent Bridges and all remain IN SYNC
with DSLAM during trouble according to vendor during the
first report to me on 10/30. While trouble occurred he
watched the light on the DSL modem and it remained solid green,
indicating a SYNC connection. My router has ATM-OAM
turned on for each subinterface for all three sites. EACH of
those is in turn logged to an HPOV server which has no
up/down transitions logged for these. The ATM Circuit was
checked thoroughly since yesterday with no evidence of any
trouble found.
I believe this to be a SESSION LAYER issue, not a Layer
1, 2, or 3 issue.

<< How can a router determine that a SESSION is idle and
remove it? The session should occur between the two end
points as I understand it and only the two end points
should have control of the session. To my knowledge basic
configuration of routers does not even view the session
level data, and they forward on based on layer three
and/or layer two where bridging is concerned based on the
routes and rules and access lists that are set. I am not
fully understanding what you said about how the router
could terminate the session once established. Is there
some reference that you could post here or email to me
that I may look it up?
Thanks!
 
UPDATE:
Vendor emailed me back to say:
as for the users and trouble times, so far when one can't
get on, no one can get on and the servers haven't shown the
first error yet. It simply sees that the user was on and
then disconnected.
 
Sorry, I should have used the word "connection" instead of
"session". Of course, the router has no idea what is going on
inside the session.
But rdp is a very low bandwidth protocol, and when users are idle,
there may not be any packets sent at all for a long period.
Many routers drop such connections.
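
As a rough illustration of how a NAT router can drop an idle
connection without ever looking inside the session, here is a
minimal Python sketch. The class name, the 300-second timeout and
the packet times are all hypothetical; real routers use their own
timers and table logic.

```python
class NatTable:
    """Toy model of a NAT/connection-tracking table with an idle timeout."""

    def __init__(self, idle_timeout):
        self.idle_timeout = idle_timeout  # seconds; hypothetical value
        self.last_seen = {}               # connection id -> time of last packet

    def packet(self, conn, now):
        """Return True if the packet is forwarded, False if the mapping expired."""
        last = self.last_seen.get(conn)
        if last is not None and now - last > self.idle_timeout:
            # The router forgot the mapping while the line was quiet,
            # so this packet is dropped and the endpoints see a broken
            # connection.
            del self.last_seen[conn]
            return False
        self.last_seen[conn] = now
        return True


def survives(packet_times, idle_timeout):
    """True if a connection sending packets at these times is never dropped."""
    table = NatTable(idle_timeout)
    return all(table.packet("rdp", t) for t in packet_times)


# User works, takes a 10-minute break, then returns (times in seconds).
idle_user = [0, 30, 60, 660]
# Same user, but a keepalive packet goes out every 60 seconds.
with_keepalive = list(range(0, 661, 60))

print(survives(idle_user, idle_timeout=300))       # False
print(survives(with_keepalive, idle_timeout=300))  # True
```

The router only tracks when it last saw a packet for the
connection. A keepalive interval shorter than the router's idle
timeout is what keeps the entry alive.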

See for instance:
http://www.smallnetbuilder.com/Sections-article18-page3.php

<quote>
Connection Controls
These include a number of different features, intended to give you
control over how long a connection is maintained when there is no
network activity and what is done if you are disconnected. Most
routers default to automatically connecting when Internet related
network activity is detected, but the Linksys routers put this
under the control of a "Connect on Demand" setting. "Maximum Idle
Time" settings control the time that the router waits to drop the
connection when there is no Internet related network activity. An
"Auto-Reconnect" feature automatically tries to restore the
connection when it's dropped.

Keep Alive
One of the very common problems with PPPoE connections is that the
connection is frequently dropped. Some BSPs do this intentionally,
much as a dialup ISP will drop your connection after a certain
period of inactivity, but others just don't have their PPPoE
servers set up properly. A "Keep Alive" feature will try to keep
the connection up by forcing a short burst of Internet activity
after a programmable period of time.
</quote>
I'm sure there are better sources of information on this topic.

This is actually quite a common problem, as you'll see when you
search the ts-related newsgroups for "dropped connection" or
"client disconnect" or similar search terms. Many people have
reported that enabling KeepAlives fixes the problem, by putting a
"heartbeat" on the connection. I would also check if the routers
have the latest firmware.
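
To show what a "heartbeat" means at the TCP level, here is a small
Python sketch that turns on the standard SO_KEEPALIVE socket
option. It is only an illustration: Terminal Services itself is
configured through the registry setting in the KB article, not
through code like this, and the function name is made up for the
example.

```python
import socket


def make_keepalive_socket():
    """Create a TCP socket with keepalive probes enabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_KEEPALIVE asks the operating system to send periodic probe
    # packets when the connection is otherwise idle.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


sock = make_keepalive_socket()
enabled = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
print("keepalive enabled:", bool(enabled))
sock.close()
```

Enabling the option is not always enough on its own. The probe
interval also has to be shorter than the idle timeout of the
router in between, and OS defaults are often much longer than
that.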

Just to be sure: I assume that you have checked the settings for
idle time-out limits in Connection Configuration?
 
UPDATE: Message when I purposely knocked down the
connection to get an error message with Customer on the
phone at her machine sitting at her main login screen.
********************************************************
The connection to the remote computer was broken.
This may have been caused by a network error. Please try
connecting to the remote computer again.

It was a Gray box with red warning "X"
**********************************************************

When the customer called she was having trouble. I pinged all
three locations and all responded. She was sitting at her
PC and was able to browse the Internet during the trouble.
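
A ping only shows that ICMP gets through; it does not show whether
a TCP connection to the terminal server can be opened. A minimal
Python sketch of such a check follows, assuming the server listens
on the default RDP port 3389. The function name is made up for
this example.

```python
import socket


def tcp_port_open(host, port=3389, timeout=5.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False
```

Calling tcp_port_open("192.0.2.10") from a remote site during an
outage (the address is a placeholder) would tell whether the
server port is reachable at that moment, independent of ping and
web browsing.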

error message from Terminal Services during the phone call
prior to knocking the DSL down temporarily at the remote
location the customer was calling from
***********************************************************
The client could not establish a connection to the remote
computer. Possible reasons are listed below.
1. remote connections terminated
2. maximum number of connections exceeded
3. a network error occurred.
**********************************************************
The problem must either be with SP4 at the main site on
the servers or with the equipment change at the time the
ISP changed.
-----Original Message-----
Are *all* remote sites disconnected simultaneously?

If yes, the problem should be located near the main site.

Could be anything in the network (faulty switch, faulty
power supply, network card, cables, etc).

Are *all* users within a site disconnected
simultaneously? << Yes, once the problem begins it happens for
EACH user.

 
Did you read my previous reply? I would still enable KeepAlives,
if only for the sake of testing and ruling out the most common
cause of disconnection problems.

That the client can use the local IE when the rdp session
disconnects only tells you that she didn't lose all connectivity.
 
UPDATE:
Visited client yesterday.
Users at the HOST or MAIN site, where the W2K servers (AUTH and
APP) are located, NEVER experience T/S troubles or get
disconnected. Learned from customer that apparently as
long as the client machine was actively using the APP
server the client remained connected, but if the user took a
break and returned, a disconnect was more likely to occur.

Changes made:
Linksys BEFSR81 Router changed to Cisco SOHO97
SESSION IDLE Time on Win2K server T/S setting changed from
disconnect every 30 minutes overriding client settings to
NEVER.

Router change did not appear to have any effect since
problem recurred.

SESSION IDLE timeout change may have provided some help
since problem did not recur for over 2 hours afterwards.
We are playing "WAIT AND SEE" at this point.

EXACT ERROR from one of the clients that disconnected.
TITLE OF ERROR WINDOW: Remote Desktop Disconnected.
<RED X>
The connection to the remote computer was broken. This
may have been caused by a network error. Please try
connecting to the remote computer again. <OK BUTTON><HELP
BUTTON>

Additional info:
Licensing errors were in the log file, but not enough (maybe
3-5 in the log) to account for the number of disconnections.
The disconnects happen every day, 7-8 times a day, the customer
says.

Additional Test:
T/S License 20
APP Software License 15
Number of users connected yesterday 15.
We then added two more users.
Result: It allowed the connects to the APP Software
successfully and did not disconnect anyone leading us to
believe this is NOT the cause and that a Licensing issue
is most likely NOT the cause at this point.
Verification was obtained from APP Software people that
this was normal behavior for their software and licensing.
 