PC Pete
First, I apologise for the enormous length of this posting. But I want to
include ALL available information to avoid the "have you tried rebooting?" or
"Fix your DNS" kind of responses!
I've also reviewed multiple non-MS troubleshooting sites, as well as
checking here for similar issues.
Problem description:
I added a new gateway server, and reconfigured the old gateway server with a
new network card to run as a simple networked PC. The new gateway ran ICS
successfully for a few minutes. Then, without any intervention or
configuration change, and without any warning or log data, the ICS connection
was suddenly unable to perform any DNS lookups using the (known working) ISP
DNS servers.
From my internal network, I can ping any IP address, and I can access any
resource using IP addresses, but not using either a fully or partially
qualified domain name.
Original Configuration:
The original configuration consisted of an internal server (Beowulf) running
Windows XP Professional x64 Edition, build 3790 SP2 plus all hotfixes, ICS,
and connection to an internal private network with a wireless AP (DLink
DWL9000+), and a number of machines and devices with fixed IP addresses.
It has 3 inbuilt network cards, only 2 of which were enabled. Both enabled
adapters were Broadcom Gigabit ethernet controllers.
The first adapter (let's call it BC-A) was connected to my cable modem
directly, and was configured for DHCP addressing, and ICS was configured and
enabled and working.
The second adapter (let's call it BC-B) was connected to the internal
network, using the standard private address range of
192.168.0.1/255.255.255.0.
The internal network clients were:
1) Jove, (XP Pro 32 SP2 + all hotfixes) Laptop, manual IP 192.168.0.10,
using Wifi via the access point below.
2) Security module, manual IP 192.168.0.6
3) D-Link wireless AP DWL9000+, configured as 192.168.0.100, serving the
laptop.
After the problems described, I have disconnected all internal network
devices, so the internal server is the ONLY client connected to the gateway
PC.
Initially, the internal server's BC-A adapter had ICS enabled, and both the
server itself and the notebook connected reliably and quickly to the internet
with no errors or lookup problems (apart from a couple of stupid problems I
caused myself). They have been working fine for 3+ years in this
configuration.
The Workgroup name for all devices is set to my own internal name (it's
valid, all connected devices that require netbios access are set to the same
workgroup, and it has worked in its current state for about 4 years with few
or no changes needed).
What changed:
I was given an IBM e-Series 206 server with an Intel PRO/1000 CT Network
connection.
I added a new D-Link DFE-528TX PCI adapter to the new PC to allow it to act
as an ICS gateway, and decommissioned the original server's network adapters
and configured a new adapter in that old server to allow it to act as a
simple ICS client (along with the laptop via the wifi AP).
Configuration requirements:
I was intending to configure the IBM system to act as a firewall/gateway,
with plans to move the 4TB RAID5 storage (currently internal to the existing
server) to it instead, thereby offloading firewall, routing, and storage from
the heavily used internal system.
Since most (99.9% or more) network traffic would be internal (192.168.0.x
traffic), I configured the D-Link adapter (let's call it DL-A) as the new
internet connection, and configured it appropriately (DHCP assignments only).
This connection worked, the new IBM server fired up and seemed to browse and
so on without any errors or problems.
On the original (internal) server, due to resource and cabling limitations,
I disabled the two Broadcom adapters, and enabled the Intel adapter (let's
call it Intel-B), and manually assigned the unused IP address of
192.168.0.223 for safety's sake.
On the new IBM, I then ran the ICS wizard, and it correctly set up the Dlink
adapter (using the ISP's DHCP service) as the ICS gateway, and the Intel
adapter (let's call it Intel-A) as the internal network adapter. The Intel
adapter after the ICS wizard ran was set to static IP, 192.168.0.1, netmask
255.255.255.0, and nothing in the gateway or DNS server IP address fields.
I could connect to the internet via the IBM server, I could ping the
internal network devices, and I could access the original server's shares
normally using the netbios names.
On the internal server, I rebooted and ran the ICS wizard, which originally
complained that the 192.168.x.x address range was in use by another device.
So I opened the device manager, enabled hidden devices, and uninstalled both
Broadcom "phantom" adapters, then re-ran the ICS wizard, which completed fine.
After rebooting that machine, for about 7 minutes, I had perfect access from
Beowulf through Lugh acting as the ICS gateway, so I left both systems on.
An hour later, I suddenly found that I could not receive emails, and Beowulf
seemed to be unable to access the internet. There are NO logs or other errors
on either server's event lists, apart from the "service started normally"
kind of things. Certainly there is nothing about DHCP, ICS, NAT, or any other
networking problem.
After a few minutes, I discovered that I CAN actually access any internet
resource as long as I don't use DNS. So if I try to connect Beowulf to
http://www.microsoft.com, I get an "Internet Explorer cannot display the
webpage". But if I connect to http://207.46.19.254, I can see the webpage
just fine. I can't browse, because as soon as I follow a link that uses an
FQDN, I get no results. If I change the FQDN to the full IP, and append the
path, I can (for example) look at any MS Technet or support webpage, just not
navigate away from the page.
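For anyone who wants to reproduce the symptom without a browser, here is a minimal Python sketch of the same two-step test: connect by raw IP, then ask the resolver for the name. The function name `classify` and its return strings are my own invention for illustration, and the script assumes Python 3 is available on the client, which is not part of my setup as described.

```python
import socket


def classify(hostname, ip, port=80, timeout=5.0,
             resolve=socket.gethostbyname,
             connect=socket.create_connection):
    """Separate 'DNS is broken' from 'the link is broken'.

    `resolve` and `connect` are injectable so the logic can be
    exercised without touching the network.
    """
    # Step 1: reach the site by raw IP, bypassing DNS entirely.
    try:
        connect((ip, port), timeout).close()
        ip_ok = True
    except OSError:
        ip_ok = False

    # Step 2: ask the system resolver for the name.
    try:
        resolve(hostname)
        dns_ok = True
    except OSError:  # socket.gaierror is a subclass of OSError
        dns_ok = False

    if ip_ok and dns_ok:
        return "ok"
    if ip_ok:
        return "dns-only failure"
    if dns_ok:
        return "dns ok, ip unreachable"
    return "no connectivity"
```

Calling `classify("www.microsoft.com", "207.46.19.254")` on Beowulf should report "dns-only failure" if the behaviour matches what I see in the browser.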
I can share and access resources between the two computers, and pinging and
sharing works perfectly, internally and externally. Internally, I can use the
computer names instead of IP addresses, and they resolve fine. I just can't
access resources outside the ICS gateway using DNS lookups.
Major Confusion, Thanks, Microsoft!
After downloading and following the MS ICS install and troubleshooting
guides, I noticed a couple of intensely confusing and contradictory bits of
information.
First, the Configure ICS in Windows XP (KB306126) document clearly states
that on the ICS gateway computer, the internal adapter will be configured to
a static IP address of 192.168.0.1, netmask 255.255.255.0.
But the troubleshooting article (KB308006) clearly states that the
INTERNAL LAN adapter must be configured as DHCP (Obtain an IP Address
Automatically). It is not: the ICS wizard does not configure the adapter
as DHCP at all. Instead, it configures it as static, with NO gateway or DNS
IP addresses.
Second, the KB308006 document states that if the internal LAN adapter on the
ICS gateway machine is configured as static, I must disable ICS, configure
the internal adapter to obtain an IP address automatically, then re-enable
ICS. I tried that, and it plain vanilla doesn't work - the internal adapter
takes forever to get an IP address, and when it does, if I look at the
adapter's information, it says it's now set to static!
Checking the NSW.log on both machines shows the exact sequence of steps I
took, and confirms that there is nothing strange or weird about the
configuration.
So. Now I have an ICS gateway that mysteriously (and without ANY KIND of
intervention) enabled ICS for a while, then suddenly changed its
configuration and now can't be re-enabled, with no hint as to why the problem
occurred, and therefore, I don't know how to fix it.
I have removed and re-added both networking devices on the gateway server,
without change in symptoms.
I have removed and re-added the internal server's adapter and rebooted,
without change in symptom.
I have manually configured the internal networked devices to use static,
unique, private IP addresses in the 192.168.0.x range, and they can each see,
ping, and access shares and resources fine - they just can't access the
internet through the gateway PC using DNS lookups.
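The "static, unique, same subnet" claim is easy to double-check mechanically. This is a small Python sketch using the standard `ipaddress` module; the helper name `check_hosts` is hypothetical, and the host table would need to be filled in by hand from each machine's settings.

```python
import ipaddress

# The ICS private range, written with the netmask as it appears above.
LAN = ipaddress.ip_network("192.168.0.0/255.255.255.0")


def check_hosts(hosts, network=LAN):
    """Return a list of problems: addresses off-subnet or used twice.

    `hosts` maps a machine name to its dotted-quad address string.
    """
    problems = []
    seen = {}  # address -> first machine that claimed it
    for name, addr in hosts.items():
        ip = ipaddress.ip_address(addr)
        if ip not in network:
            problems.append(f"{name}: {addr} is outside {network}")
        if ip in seen:
            problems.append(f"{name}: {addr} already used by {seen[ip]}")
        else:
            seen[ip] = name
    return problems
```

An empty list back from `check_hosts` means no conflicts and nothing off-subnet, which is what I expect for the addresses listed in this post.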
The gateway PC can see the internet and can connect to internal devices and
shares fine, using local names or IP addresses.
I have manually configured the gateway PC's internal adapter to fixed IP
addresses, with and without specifying 192.168.0.1 as the default gateway,
with and without setting the fixed IP as the default DNS server IPs, and with
and without setting the actual real DNS server IPs. Nothing seems to work.
However: if I don't specify a default gateway in the address box on the
internal server, the "page not found" problem immediately pops up when I try
to access any DNS lookup. If I manually configure it to the gateway IP
address, it takes much longer, but still fails. If I manually configure the
gateway AND use the ISP's DNS server addresses, it still fails, but takes
much longer.
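To tell whether the gateway's DNS proxy or the path to the ISP's DNS servers is at fault, each server can be queried directly over UDP port 53, bypassing the Windows resolver. Below is a rough Python sketch of that idea. It hand-builds a standard A-record query as laid out in RFC 1035; the names `build_query`, `ask` and `QUERY_ID` are made up for illustration, and it only checks that a reply with at least one answer comes back, not what the answer is.

```python
import socket
import struct

QUERY_ID = 0x1234  # arbitrary transaction id, echoed back by the server


def build_query(hostname, query_id=QUERY_ID):
    """Build a minimal DNS query packet for an A record."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE 1 = A record, QCLASS 1 = IN.
    return header + qname + struct.pack("!HH", 1, 1)


def ask(server, hostname, timeout=3.0):
    """Return True if `server` replies with at least one answer record."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(hostname), (server, 53))
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable errors
            return False
    if len(reply) < 12:
        return False
    reply_id, _flags, _qd, answers, _ns, _ar = struct.unpack(
        "!HHHHHH", reply[:12])
    return reply_id == QUERY_ID and answers > 0
```

Run from Beowulf, `ask("192.168.0.1", "www.microsoft.com")` exercises the ICS DNS proxy, while `ask("198.142.0.51", "www.microsoft.com")` and `ask("203.2.75.132", "www.microsoft.com")` go to the ISP's servers through NAT. If only the first one fails, the proxy on the gateway is the suspect; if all three fail, outbound UDP 53 is probably being dropped somewhere.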
I have also run netsh winsock reset on the internal and gateway machines and
rebooted.
I have run the ICS wizard multiple times, always with the same results.
I have disabled all firewalls, and that made no difference, so the firewall
solution(s) I am using DO NOT seem to affect the symptom in any way. So I'm
fairly sure it's not a firewall issue.
For the sake of completeness, the firewall on the gateway PC is Online
Armor, but I have also tried it with the default windows firewall initially,
and that has the same problem.
I can connect to the gateway PC using RDC just fine.
System information:
Here are the results of running ipconfig /all on the gateway :
Windows IP Configuration
Host Name . . . . . . . . . . . . : lugh
Primary Dns Suffix . . . . . . . :
Node Type . . . . . . . . . . . . : Mixed
IP Routing Enabled. . . . . . . . : Yes
WINS Proxy Enabled. . . . . . . . : No
DNS Suffix Search List. . . . . . : vic.optushome.com.au
Ethernet adapter Optus WAN:
Connection-specific DNS Suffix . : vic.optushome.com.au
Description . . . . . . . . . . . : D-Link DFE-528TX PCI Adapter
Physical Address. . . . . . . . . : 00-1C-F0-6E-B1-EC
Dhcp Enabled. . . . . . . . . . . : Yes
Autoconfiguration Enabled . . . . : Yes
IP Address. . . . . . . . . . . . : 122.107.183.56
Subnet Mask . . . . . . . . . . . : 255.255.240.0
Default Gateway . . . . . . . . . : 122.107.176.1
DHCP Server . . . . . . . . . . . : 211.31.132.78
DNS Servers . . . . . . . . . . . : 198.142.0.51
                                    203.2.75.132
Lease Obtained. . . . . . . . . . : Sunday, 17 August 2008 2:03:53 PM
Lease Expires . . . . . . . . . . : Monday, 18 August 2008 2:03:53 PM
Ethernet adapter Churinga LAN:
Connection-specific DNS Suffix . :
Description . . . . . . . . . . . : Intel(R) PRO/1000 CT Network Connection
Physical Address. . . . . . . . . : 00-11-25-57-E4-5B
Dhcp Enabled. . . . . . . . . . . : No
IP Address. . . . . . . . . . . . : 192.168.0.1
Subnet Mask . . . . . . . . . . . : 255.255.255.0
Default Gateway . . . . . . . . . :
And on the internal server:
Windows IP Configuration
Host Name . . . . . . . . . . . . : beowulf
Primary Dns Suffix . . . . . . . :
Node Type . . . . . . . . . . . . : Mixed
IP Routing Enabled. . . . . . . . : No
WINS Proxy Enabled. . . . . . . . : No
DNS Suffix Search List. . . . . . : mshome.net
Ethernet adapter Local Area Connection 2:
Connection-specific DNS Suffix . : mshome.net
Description . . . . . . . . . . . : Intel(R) PRO/100 S Server Adapter
Physical Address. . . . . . . . . : 00-E0-81-46-78-3C
DHCP Enabled. . . . . . . . . . . : Yes
Autoconfiguration Enabled . . . . : Yes
IP Address. . . . . . . . . . . . : 192.168.0.192
Subnet Mask . . . . . . . . . . . : 255.255.255.0
Default Gateway . . . . . . . . . : 192.168.0.1
DHCP Server . . . . . . . . . . . : 192.168.0.1
DNS Servers . . . . . . . . . . . : 192.168.0.1
Lease Obtained. . . . . . . . . . : Sunday, 17 August 2008 3:20:52 PM
Lease Expires . . . . . . . . . . : Sunday, 24 August 2008 3:20:52 PM
You'll note that the current Default gateway and DNS servers have been
manually configured to the gateway's static IP address, but they have also
been blank, and configured for the ISP's DNS both with and without the
default gateway. The symptoms do not change!
The internal IP addresses have NO conflicts (all devices are currently
disabled). However, if I attempt to connect to Jove using RDP, it works fine.
Jove also cannot access any internet FQDN.
I have no idea why the problem occurred after it worked for quite a while,
with nothing at all whatsoever changing on any of the devices mentioned. I
didn't even touch the screen or keyboard or mouse - it worked, then it didn't.
If anyone can help to figure out what's gone wrong, I would very much
appreciate any help.
I note with interest that this is one of the most-asked MS networking
questions on the planet (right after "I can't get my Wifi adapter to work
under Vista". Nuff said). There are countless shotgun solutions, including
deleting registry keys, clean reinstalls, and so on - but they are not
solutions at all, they are guesses and workarounds.
The MS documentation is confusing and unhelpful. So if anyone with some
experience can suggest a troubleshooting plan of attack that doesn't require
registry hive deletion, oxy-acetylene welding, or a 3+ day reinstall of
hideously complex OSes and applications on the internal server, I'd be most
grateful.
FWIW, I'm posting this using IE7 on the gateway machine, using remote
desktop on the internal system - so it's almost certainly not a simple
connectivity problem. I hope this helps to filter out the "can you ping
localhost" replies!