Argghhh... 30 Minute Log-ins :-(


Dave Onex

Hi Folks!

I have an all Windows 2000 network comprised of 4 servers, two of which are
DCs.
The two servers that are not DCs are my mail server and my proxy server
(ISA).

The problem I'm having is this: logging on to the ISA machine is now taking
forever (30 minutes or so). The computer sits there saying, "Applying your
personal settings" until you get really, really mad!

I know from past experience that this usually means that the machine is
having problems contacting the DC during the log-on process and that's
usually caused by a DNS issue. Thing is, my DNS is correct all the way
through. In addition, my ISA server is set up correctly. This network has
been operational for years - literally.
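
One way to double-check the DNS side beyond plain name lookups is to query the SRV records that a domain member uses to locate a DC. The following is only a minimal Python sketch that builds the `nslookup` commands to run by hand; `domain.com` is the placeholder domain from this post, and the DNS server address `10.0.0.10` is a made-up value for illustration.

```python
def dc_locator_names(domain):
    """Return SRV record names commonly registered by domain controllers."""
    domain = domain.strip().rstrip(".")
    return [
        f"_ldap._tcp.dc._msdcs.{domain}",
        f"_kerberos._tcp.dc._msdcs.{domain}",
        f"_ldap._tcp.{domain}",
    ]


def nslookup_commands(domain, dns_server):
    """Build one 'nslookup -type=SRV' command per locator record."""
    return [
        ["nslookup", "-type=SRV", name, dns_server]
        for name in dc_locator_names(domain)
    ]


if __name__ == "__main__":
    # 10.0.0.10 is a hypothetical DNS server address.
    for cmd in nslookup_commands("domain.com", "10.0.0.10"):
        print(" ".join(cmd))
```

If each query returns both DCs, the locator records are at least present on that DNS server. That says nothing about whether larger exchanges with the DC succeed.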

So what changed? I added another NIC to the ISA machine so that I could team
the two internal NICs. The team is set up correctly and has the proper IP
addresses. It should work, just as it did before. The ISA machine can ping
and resolve all the machines on the internal network including the 2 domain
controllers. The event viewer on the ISA machine has these two errors
listed:

First;

Event ID 1000
Windows cannot establish a connection to domain.com with (0).

Then;

Event ID 1000
Windows cannot query for the list of Group Policy objects. A message that
describes the reason for this was previously logged by this policy engine.
(that's the one above)

That's it. Those are the only two errors that the machine will cough up. I
can ping the domain controllers, I can do reverse lookups to the domain
controllers. I can access
\Backup\SYSVOL\domain.com\Policies\{really-long-guid}\GPT.ini and read it. I
can also browse the network and see the shares on other computers but I
can't access the data in any of them - and I used to be able to. I am logged
on as the administrator and have full rights to all that stuff.

I tried changing the binding order on the proxy so that the internal NIC
team is first. I tried re-creating the machine's account in Active Directory
by resetting it and then re-joining it to the domain - no difference.

I don't really understand what the issue is. I tried removing ISA altogether
and also removed the new NIC and put it all back the way it was and still
got the 30 minute log-in experience :-) Something is up with respect to that
machine and the domain controller but what could it be? It's almost as if
that domain controller refuses to deal with the ISA server for some
reason....

Best & Thanks!
Dave
 
Hello Dave,

Please post an unedited ipconfig /all from all DC/DNS servers and the problem
machine so we can check DNS settings.

Best regards

Meinolf Weber
Disclaimer: This posting is provided "AS IS" with no warranties, and confers
no rights.
** Please do NOT email, only reply to Newsgroups
** HELP us help YOU!!! http://www.blakjak.demon.co.uk/mul_crss.htm
 
Did you check the LAT in ISA to make sure the internal subnets are local and
not remote?

Ace
 
Hi Ace - really good to hear from you :-)

Yes! That was my first thought - that ISA was sending the requests out the
wrong network card and trying to reach the DC's by using the external NIC.
To that end, after I created the NIC team I thought that maybe ISA didn't
'understand' so I re-ran the local network wizard and removed and re-added
the new logical adapter.
No dice. I then un-installed ISA altogether only to find the same thing - 30
minute log-on times.
I then re-installed ISA and loaded in my most recent backup - same thing :-(

The only other member server (my mail server) also does the same thing. I
made no changes to it whatsoever - it also happened after I added the extra
NIC in ISA.
My thinking on that front is that it's happening to that machine because it
uses ISA as its default gateway.

There are two XP workstations - both of these can log on and log off the
domain with no issues. So it seems to be localized to only Win2K domain
members. All machines can ping and look up the addresses of the domain
controllers.
I think the problem must be localized to the ISA machine but I can't figure
it out. I even took the extra NIC out of the ISA machine only to find the
same thing. Un-installing ISA results in the same thing.

What the heck can it be?

Best & Thanks!
Dave (pulling the hair out of my head)
 
BTW, the mail server is reporting almost the exact same errors except in
this case it looks like it tried to contact the second domain
controller...without success.

Could not open LDAP session to directory 'second.domain.controller' using
local service credentials. Cannot access Connection Agreement configuration
information. Make sure the server 'second.domain.controller' is running
Windows cannot establish a connection to my.domain.com with (0).
Windows cannot query for the list of Group Policy objects. A message that
describes the reason for this was previously logged by this policy engine.
(the previous line)

Is it possible that the whole NIC issue is a red herring of some sort? Is it
possible something got pooched when I re-started all the machines? Something
to do with Active Directory that only affects the two Win2K domain members?

I'm certain DNS is correct - nothing was really changed. ISA rules are all
in place and it's run for about 3 years without an issue.
 
I just checked the first domain controller and found an error message there
that might help...
The session setup from the computer PROXY failed to authenticate. The name
of the account referenced in the security database is PROXY$. The following
error occurred:
Access is denied.
 
Wow, you've been busy today. Are you in the US? If so, Happy T-Day.

Are you using the firewall client?

I think this would be best posted in the ISA group for better help. I know
ISA, but I do not use it day to day, and I think the folks that use it on a
daily basis may be better help. I cross-posted it to microsoft.public.isa
with this response.

And stop pulling your hair out. :-)

Ace
 
Hi Ace;

I really don't think it's an ISA issue. I just re-built the ISA server from
scratch and the log-in problem occurred well before ISA was installed. I did
a clean O/S install and then went to Service Pack 4 and then joined the
domain. As soon as I joined the domain - blammo - the long log-in times
started happening. So, I was able to eliminate ISA from the loop right off
the bat.

I have been pulling my hair out. It took a long time to re-build the ISA
machine (many, many long log-ins occurred after each re-start!).

At this point I don't know what it is. It's a weird thing but it's true. One
thing I just found out is that one of my secondary DNS servers was not able
to pull a copy of a zone from the primary. The event viewer for the
secondary DNS server complains that the primary did not send the zone and
the logs on the primary report that it did send the zone.

There's something weird going on here....
 
After many hours of messing around, staring at DNS entries across 4 servers,
and doing a ground-up re-build of the firewall - I finally figured out what
had happened.
Are you ready? .................

The switch packed it in. More specifically, certain limited aspects of the
switch packed it in....

I'm sitting here at my wits' end. Everything worked perfectly before I added
the extra NIC to the firewall. I configured multi-link trunks on the
appropriate switch ports. Somehow or another, the switch either failed at
that moment or the software that runs the switch got corrupted.

That's why weird things were happening. For instance, the zone transfer from
the primary DNS to the secondary. The transfer would begin and then the
secondary would report that the transfer failed telling me to go look for
clues at the primary. Looking at the primary showed that the transfer
succeeded. So where did the data go? Into the ether, I guess.

In that case the primary DC did do the transfer. It was the secondary that
didn't get all the data because of... the switch. I turned on logging on the
secondary DNS server and it showed that the transfer was taking place but
that it had not completed. After waiting for some time it would then cough
up the error message.

I happened to have another identical switch here so I tried a last ditch
effort and changed it out. Bingo - the speed increase across the network was
instant. Everything is responding instantly again. Log-ons are instant.

So, the reason the domain controllers had zero issues must have been that
the ports on the switch they were connected to were OK. The reason the Proxy
and the Mail server were having issues was because there was some form of
corruption in the switch for those ports. Data flowing through those
SPECIFIC ports was being lost or corrupted, causing all sorts of problems
with log-ins and with DNS transfers between servers.

Go figure.

I knew my DNS was perfect - the network was instant prior to installing the
extra NIC. Same with ISA - it's been in place for about 4 years now and it's
rock-solid. Thing is, because I couldn't figure out what was causing this
weird behavior I went looking all over the place only to find out the switch
was corrupted.

The network performance was so poor it was as if the entire network was
infected with a virus. It was slow, 'jerky' and annoying. In retrospect it
was probably packets being shed intermittently in the switch itself.

The interesting thing is that small zone transfers would work. It was the
larger zone (my Active Directory zone) that would not transfer. So it's
almost as if small stuff would get through and larger transfers wouldn't.
This meant that pings worked perfectly, nslookups worked perfectly but any
sizable transfers (such as probably occur when logging on) would fail.
That's why I could read that little group policy file from each of the
affected computers - it was small enough.

I don't know how a switch works (inside) but I know this much - the one I
had in place selectively failed on specific ports affecting those two
servers and the type of failure meant small traffic got through and large
traffic would not. That's why all my ICMP diagnostic traffic succeeded and
that's why I was pulling out my hair - there was no apparent reason for the
problems I was having to occur. If you can ping all the machines and do
forward and reverse lookups to them - it should work!
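
A side note on why ping looked healthy: the default Windows ping payload is only 32 bytes, so a default ping never puts a full-size frame through the switch. Below is a rough Python sketch of a payload-size sweep, assuming a Windows machine where the `ping` options `-n` (count), `-l` (payload size) and `-f` (don't fragment) are available. The host name `dc1` is hypothetical, and whether larger pings would actually have failed on this particular switch is unknown.

```python
import subprocess

# Payload sizes in bytes. 32 is the Windows ping default; 1472 is the largest
# ICMP payload that fits a 1500-byte Ethernet MTU (1500 - 20 IP - 8 ICMP).
SIZES = [32, 512, 1024, 1472]


def ping_command(host, size, count=4):
    """Build a Windows ping command with a fixed payload size and DF set."""
    return ["ping", "-n", str(count), "-l", str(size), "-f", host]


def sweep(host, sizes=SIZES, run=subprocess.run):
    """Ping host once per size; map each size to True if ping exited with 0.

    The exit code is only an approximation of success, since ping may
    return 0 for some error replies. Read the captured output to be sure.
    """
    results = {}
    for size in sizes:
        completed = run(ping_command(host, size), capture_output=True, text=True)
        results[size] = completed.returncode == 0
    return results


if __name__ == "__main__":
    # "dc1" is a hypothetical host name; substitute a real DC.
    for size, ok in sweep("dc1").items():
        print(f"{size:>5} bytes: {'ok' if ok else 'FAILED'}")
```

If the small sizes pass and the large ones fail or lose packets, that points at something size-dependent in the path (switch, trunk, NIC team) rather than at DNS or the domain itself.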

Hahaha - anyway, I just thought I would let you know what it ended up being
in the end. That's what I get for trying to _increase_ network performance
by adding another NIC to the proxy!

Best & Thanks!
Dave
 
Wow. And I've seen this before with switches and teaming, but I just didn't
think of it. Some switches, by default, will do that when you connect the two
NICs to the same switch, even if teamed. The switch just can't handle it
without being reconfigured to allow it, or simply thrown out. :-)

What brand name and models are the switches?

Glad you figured it out!

Cheers!

Ace
 
Yeah, that's the weird thing. The switch actually supports up to 6
multi-link trunks and will even do them across different switches. So it was
well within the feature set of the switch. The odd thing is that it didn't
just discard packets - it only discarded certain traffic and only on certain
ports. That's why ping-tests and nslookups all worked but things like a zone
transfer or a log-on wouldn't pass through (properly).

It would have been way better if it dropped all packets on the affected
ports instead of doing a 'soft-fail'.

I think somehow the switch software got corrupted. Anyway, it's done now.
The switch is an older Nortel/Bay Networks 420-24. It's getting time to
change it out in favor of a gig Ethernet unit.

Thanks for your help through all these different issues - it's been great to
have someone else in the picture (other than just myself!)

Best;
Dave
 
Looks like you did well by yourself! :-)

I like the Cisco Catalysts. Nice switches, no problems.

Cheers!

Ace
 