Wandering DNS entry

  • Thread starter: Christopher A. Newell

Christopher A. Newell

I posted on this a couple of weeks ago and then the problem "appeared" to
clear up for a while.

This appeared to be a very sporadic problem, but as I look more closely it
seems to be more prevalent than I had imagined.

I have a medium-small but moderately complex network configured in 7 logical
segments, each operating on its own IP subnet. In three of the segments,
dynamically addressed PCs are transiently losing their DNS entries: the
multiple local DNS servers are being replaced by 168.95.1.1, an operating DNS
server in Taiwan. (In fact, the only service answering on about half of the
168.95.1.x subnet is DNS.) The loss of the correct DNS entries disrupts the
client's network connectivity until the configuration is restored. (All
Internet access for user PCs is through a proxy server, and our firewall
prevents any client address from communicating with the Internet in any
other way, so the affected PC gets no response at all.) "ipconfig /renew"
seems to correct the problem, as does restarting the PC.

As a temporary workaround, I have assigned the outside IP to one of my
internal DNS servers and routed all requests for that IP to the correct LAN
address. This is preserving my users' connectivity, but it is also eliminating
the calls for help that used to notify me of the problem.

After implementing the temporary solution, I have been monitoring detailed
traffic on the DNS server, only to find that inquiries using the off-site IP
are almost constant. It seems like there is one PC, occasionally two, using
that IP for DNS (and SMB and a few other protocols) just about all the time,
although the issue seems to move from computer to computer at no
identifiable interval. Apparently, either some of the users are
experiencing problems and just re-starting or the DNS error is not lasting
long enough to cause them to actually see the connectivity loss.
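The kind of tally described above can be sketched in a few lines of Python. This is only an illustration, not the tool actually used here: it assumes the sniffer's capture has already been exported as (timestamp, source IP, destination IP, destination port) tuples, which is a hypothetical layout, and the client addresses in the sample are made up.

```python
FOREIGN_DNS = "168.95.1.1"  # the off-site resolver named in this thread


def summarize_foreign_dns(records, foreign_ip=FOREIGN_DNS):
    """Group DNS queries sent to foreign_ip by client.

    records: iterable of (timestamp, src_ip, dst_ip, dst_port) tuples
    (a hypothetical export layout, not any particular sniffer's format).
    Returns {src_ip: {"first": t, "last": t, "count": n}}.
    """
    summary = {}
    for ts, src, dst, port in records:
        if dst != foreign_ip or port != 53:
            continue  # keep only DNS traffic aimed at the foreign server
        entry = summary.setdefault(src, {"first": ts, "last": ts, "count": 0})
        entry["first"] = min(entry["first"], ts)
        entry["last"] = max(entry["last"], ts)
        entry["count"] += 1
    return summary


# Made-up sample rows; the 10.1.x.x clients are placeholders.
SAMPLE_RECORDS = [
    (0, "10.1.2.34", "168.95.1.1", 53),
    (5, "10.1.2.34", "168.95.1.1", 53),
    (7, "10.1.3.20", "10.1.0.10", 53),
    (9, "10.1.4.77", "168.95.1.1", 53),
    (12, "10.1.2.34", "168.95.1.1", 445),
]

if __name__ == "__main__":
    for client, info in sorted(summarize_foreign_dns(SAMPLE_RECORDS).items()):
        print(client, info)
```

Keeping the first and last timestamps per client makes it easier to see the issue moving from one computer to the next.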

These PCs are in three different network segments, broken up at Layer 3,
configured by three different DHCP servers (although all are in the same AD
forest). Before I identified the problem as being present in three different
segments, I tried stopping the known DHCP server and then trying to obtain
address information: no rogue DHCP was apparent. We are using 128-bit WEP on
a small number of wireless APs, but I have ruled out a customer notebook
running an ICS configuration.

I have run thorough spyware and AV scans of some of the affected PCs with
no notable results (CA-ITM and Spybot S&D). Statically addressed PCs are not
affected, and one IP subnet that is dynamically addressed but operates in an
independent AD domain also seems to be OK.

Has anybody else ever seen anything remotely like this?

Any ideas what I can look at to figure out where a changing DNS IP could be
getting injected into the system, across routers?

I think that I would have gotten an incorrect IP configuration if I had a
hardware-based DHCP server on the LAN (like a SOHO router), but it may bear
noting that a search on that IP reveals it to be one of the most commonly
referenced publicly accessible DNS servers. The IP appears in many pieces
of hardware documentation (again, like SOHO gateways).
 
Chris, a couple of questions:
7 subnets: are there any routers connecting these subnets?
How many DHCP servers are on the network?
How many DNS servers? Secondary and primary?

I will get to the Internet access!!!
 
Some are. Most are "Power Users" on their PCs.

It is just after close of business so most of the systems are off-line right
now, but I don't believe that there is actually a correlation between these
issues. If anything, with one exception, I think that most of the PCs where
I am seeing the foreign DNS entry are being used by local
non-Administrators when the problem is occurring.
 
Ok Chris!!!!
Routers involved: check for DHCP relay agents.
DNS servers in different locations should sync regularly.
Host (A) record checking should be done by the DNS server.
Secure dynamic updates only work on XP machines.
Check the events on your DHCP server!!!
Check the events on your DNS server.
Check the events on AD..... it's havoc when your DNS doesn't work properly,
because AD is fully dependent on your DNS.... replication, just to mention
one thing.
My opinion: this is a DHCP issue, because DHCP is responsible for the DNS
distribution.... RELAY AGENTS ARE VERY IMPORTANT.
IS THIS ON SERVER 2003?

SQLDAWG
PTA RSA 2010 soccer/wcup
 
The 7 subnets are physically separated by routers.

Two are totally static configurations. There are 5 DHCP servers, one
physically located on each subnet. Of the four (sorry, missed one) subnets
that are experiencing this, one is a core, and the other three are branched
in a distributed star. The server that is primary for the users in each of
the three branch networks runs DHCP, has a network connection to the core,
and provides the routing. The DHCP is bound only to the NIC on the remote
side of the "distributed star". (The 5th DHCP server is also an IP router to
the core, but it is a controller for a trusted domain.)

I am going to have to confirm, but I do not believe that any relay agents
are in operation.

There are three DNS servers running. One provides external lookup and
carries the primary site for our externally addressable sites; all three
resolve our inside *.local DNS entries. I don't think that this is actually
a DNS problem, except to the extent that when a client PC changes the DNS
server entries to the "foreign" server, the client cannot resolve internal
names. (And since they are blocked from direct outside access, they can't
contact the outside server to resolve public names either. They just lose
all connectivity for any application that is DNS name dependent.)
 
I'm going to have to try this. We are off-hours now and I am not seeing any
traffic to the foreign IP. Whatever device(s) are involved or causing the
issue are logged out/powered off.
 
Christopher,

I read your posting. May it be correctly restated as:

Some, but not all, client machines that are DHCP clients
are losing their configured DNS servers, with these always
being replaced by 168.95.1.1. Further, only the DHCP clients
in three of the network segments that are part of one AD forest
are affected (i.e. DHCP clients in other segments and/or forests
are not affected in this way). There are no rogue DHCP servers
on the network segments.

Your statement that renewing the DHCP lease reestablishes
correct DNS server IPs lets us know that you are using DHCP
scope delivered nameserver IPs. Your statement that restarting
the machines also reestablishes them indicates that there are no
GPO delivered incorrect DNS server IPs.

Since only an account with admin authority can set the DNS
servers in the TCP/IP config, we know this must be happening
due to something running with system/admin context on the
machines where this happens.
So, you need to find that admin/system process on or remotely
accessing those machines. This is not happening willy-nilly.

I am leaning toward stealthed malware.

Have you probed the 168.95.1.1 DNS server to see if it is
hosting a mock zone (or zones) through which your client machines
might access what they take to be trusted hosts? (i.e. is this part
of a man-in-the-middle effort?)
 
The only thing that is actually incorrect (my error in the original post) is
that there are 4 LAN segments affected. One is essentially my "core" which
includes our Internet and two other private WAN connections, as well as
servers that are equally utilized among our departments. The other 6
segments are departmentally organized and users are grouped with server
resources that they use most frequently.

Of the three unaffected segments, one is DHCP but is part of a trusted
domain in a separate AD forest, one is statically addressed and is in a child
domain, and one is statically addressed and validates in an external domain
over a WAN connection. The general topology is distributed-star with each
branch LAN segment being routed through one of their servers to the core
segment to reach the Internet, WANs, and (occasionally) other branch LANs.

In the three branch LAN segments, the DHCP server is on the same system as
the routing function, bound to the NIC serving the branch LAN. (If it were
propagating to the core, I would have gotten a configuration with the core's
DHCP server stopped.)

Running a sniffer on my core router's traffic and filtering on the foreign
DNS IP, I am only seeing traffic from one or two clients at any one time,
but even though no one client seems to be affected for a long period I am
now seeing traffic from some host almost constantly during business hours.

I have probed the foreign DNS on several common domains (microsoft.com,
google.com, etc.) and do not see any inconsistencies with known accurate
responses, but this has not been an exhaustive check. I will take a closer
look at the DNS queries being directed to that host during the day Friday
and look more closely at that.
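For the comparison against known accurate responses, a small helper along these lines might help. It is a sketch under assumptions: how the answers are collected (nslookup against each server, a script, and so on) is left out, and the hostnames and addresses in the sample are placeholders from documentation ranges, not real results. A difference is only a lead to investigate, since large public sites legitimately return different addresses depending on who asks.

```python
def compare_answers(trusted, suspect):
    """Report names whose answers differ between two resolvers.

    trusted / suspect: {hostname: iterable of IP strings}, filled in by
    whatever lookup method is at hand.
    Returns {hostname: (only_in_trusted, only_in_suspect)} for each name
    where the two answer sets are not identical.
    """
    diffs = {}
    for name in sorted(set(trusted) | set(suspect)):
        t = set(trusted.get(name, ()))
        s = set(suspect.get(name, ()))
        if t != s:
            diffs[name] = (sorted(t - s), sorted(s - t))
    return diffs


# Placeholder data: names and addresses are illustrative only.
TRUSTED = {
    "intranet.example.local": ["192.0.2.10"],
    "www.example.com": ["198.51.100.7"],
}
SUSPECT = {
    "intranet.example.local": ["203.0.113.66"],
    "www.example.com": ["198.51.100.7"],
}

if __name__ == "__main__":
    for name, (missing, extra) in compare_answers(TRUSTED, SUSPECT).items():
        print(name, "trusted only:", missing, "suspect only:", extra)
```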

Although we appear to be well scanned internally, I tend to agree with the
malware assessment. What I cannot determine yet is if it is running
directly on the affected machines or if it is something that is being
injected externally. The fact that this is crossing Layer 3 boundaries
leads me to suspect client, but the migratory nature (with only a small
number of machines affected at any one time) leaves a suspicion of a single
infected host affecting the other clients.
 
Keep in mind that many clients may have an incorrect DNS server IP set,
but do not need to do DNS resolutions for extended periods.
I would probe the DNS for your zones, those of your business
partners, etc. The spread could be intentional from a single
machine using an account with admin access to the others, or
could be a common hijackware that has spread by common
vectors. Again, something has to run as admin or system on
the machines where the change happens, so perhaps you could
install a watcher to profile processes that come/go in system
or an admin context.
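One simple piece of such a watcher, sketched in Python, is a check of the DNS servers reported by "ipconfig /all" against the servers the scope is supposed to hand out. This is an assumption-laden illustration: it expects English-language ipconfig output in the usual "DNS Servers . . . : x.x.x.x" layout, and the 10.1.x.x addresses are made-up placeholders for the internal servers.

```python
import ipaddress


def _is_ip(text):
    """True if text is a bare IPv4/IPv6 address."""
    try:
        ipaddress.ip_address(text)
        return True
    except ValueError:
        return False


def dns_servers_from_ipconfig(output):
    """Collect DNS server addresses from `ipconfig /all` text."""
    servers = []
    in_dns_block = False
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("DNS Servers"):
            in_dns_block = True
            value = stripped.partition(":")[2].strip()
            if _is_ip(value):
                servers.append(value)
        elif in_dns_block and _is_ip(stripped):
            servers.append(stripped)  # continuation line: address only
        else:
            in_dns_block = False
    return servers


def unexpected_dns(output, allowed):
    """Return configured DNS servers that are not in the allowed set."""
    return [s for s in dns_servers_from_ipconfig(output) if s not in allowed]


# Illustrative output fragment; addresses are placeholders.
SAMPLE_OUTPUT = """
Ethernet adapter Local Area Connection:

   DHCP Enabled. . . . . . . . . . . : Yes
   IP Address. . . . . . . . . . . . : 10.1.2.34
   DNS Servers . . . . . . . . . . . : 168.95.1.1
                                       10.1.0.11
   Default Gateway . . . . . . . . . : 10.1.2.1
"""
ALLOWED = {"10.1.0.10", "10.1.0.11"}

if __name__ == "__main__":
    print(unexpected_dns(SAMPLE_OUTPUT, ALLOWED))
```

On a client, the text could be obtained with subprocess.run(["ipconfig", "/all"], capture_output=True, text=True) and the check repeated on a timer, logging the running process list whenever an unexpected server shows up.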

Roger
 
Christopher,
The hypothesis is that you have malware on your clients. As the users have
local admin or power user rights this would have been easy to introduce. We
also have to assume that your AV does not detect it. If you google for
"trojan change dns" you will find several references.
I think what you need to do is:
- run several AV and spyware scanners to detect it
- try the non-admin test
- try to catch it "red-handed" with a changed registry value
- remove all users from local admin and power user groups (and automate the
things they need those rights for)
- find out why your AV has not detected it, and switch to one that does.
The real problem is that as your users have admin rights, and if you can
prove the hypothesis that the machines have been compromised, then you have
no way to know the extent of the damage and to be safe you would need to
rebuild your network. The mitigating circumstance is that you say all access
is through the proxy.
On balance, you probably need to rebuild all the PCs in turn and migrate
your users onto a new non-admin config. The most important thing to do is
assess whether there is any chance your servers or admin desktops have also
been compromised.
Anthony,
http://www.airdesk.co.uk
 
Christopher A. Newell said:
The only thing that is actually incorrect (my error in the original
post) is that there are 4 LAN segments affected. One is essentially
my "core" which includes our Internet and two other private WAN
connections, as well as servers that are equally utilized among our
departments. The other 6 segments are departmentally organized and
users are grouped with server resources that they use most frequently.
<snipped>

The last time I saw something like this with similar symptoms, I found a
Linksys wireless router someone had brought in causing it. It was handing out
the DNS addresses that had been configured on its WAN interface while it was
at the person's home. When they brought it in without my knowing about it,
DHCP was still enabled. It wound up conflicting with the customer's corp
scope and options.

Something else to think about and look for.

--
Regards,
Ace

This posting is provided "AS-IS" with no warranties or guarantees and
confers no rights.

Ace Fekay, MCSE 2003 & 2000, MCSA 2003 & 2000, MCSE+I, MCT,
MVP Microsoft MVP - Directory Services
Microsoft Certified Trainer

Infinite Diversities in Infinite Combinations

 
OK. Here's what it turned out to be. . . . A wireless access point (NOT
ROUTER). The only explanation I can see is that DHCP was changed to on by
default in a firmware update. This still leaves me with a bunch of
questions:
1. Why did only the DNS address get changed? (The DNS is not user/admin
configurable on the device, although the address range, subnet, and gateway
are.) I would have expected to get the full configuration from that
device, not a full config from one device and then DNS only from another.
2. Why didn't this device give me a complete (albeit useless in my
network) configuration when I stopped the official DHCP server? When I
tried this, I got the default public config after receiving an error message
because no DHCP server was found.
3. How did this effect carry over to three other dynamically addressed
subnets which were separated by routers? (Or why only three of the four?
Although the fourth operates as a trusted domain in a separate AD forest.)

What I finally had to do was actually go out to the desktop of what appeared
to be the machine which was switching DNS IPs the quickest with a sniffer
and a hub (unmanaged switches) and capture all of the traffic until the
config actually changed on me. Then I was able to see the offending DHCP
packet and extract the source addresses to pinpoint the device.
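For anyone repeating this kind of hunt, the fields of interest inside a captured DHCP reply can be pulled out with a short parser. The sketch below follows the BOOTP/DHCP layout from RFC 2131 and RFC 2132 (236-byte fixed header, magic cookie, then options) and reads the offered address, the DNS servers (option 6) and the server identifier (option 54). It works on the UDP payload only, so the sender's MAC and IP still have to come from the Ethernet and IP headers in the sniffer. The sample packet and its 192.168.1.x addresses are fabricated for illustration.

```python
import socket

MAGIC_COOKIE = b"\x63\x82\x53\x63"
OPT_PAD, OPT_DNS, OPT_MSG_TYPE, OPT_SERVER_ID, OPT_END = 0, 6, 53, 54, 255


def parse_dhcp_reply(payload):
    """Extract a few fields from the UDP payload of a DHCP message."""
    if len(payload) < 240 or payload[236:240] != MAGIC_COOKIE:
        raise ValueError("not a DHCP message")
    info = {
        "yiaddr": socket.inet_ntoa(payload[16:20]),  # address offered to client
        "siaddr": socket.inet_ntoa(payload[20:24]),  # next-server field
        "dns": [],
        "server_id": None,
        "msg_type": None,
    }
    i = 240
    while i < len(payload):
        code = payload[i]
        if code == OPT_END:
            break
        if code == OPT_PAD:
            i += 1  # pad option has no length byte
            continue
        if i + 1 >= len(payload):
            break  # truncated option header
        length = payload[i + 1]
        data = payload[i + 2:i + 2 + length]
        if code == OPT_DNS:
            usable = len(data) - len(data) % 4
            info["dns"] = [socket.inet_ntoa(data[j:j + 4])
                           for j in range(0, usable, 4)]
        elif code == OPT_SERVER_ID and len(data) == 4:
            info["server_id"] = socket.inet_ntoa(data)
        elif code == OPT_MSG_TYPE and len(data) == 1:
            info["msg_type"] = data[0]
        i += 2 + length
    return info


def sample_ack():
    """Build a fabricated DHCPACK payload for demonstration."""
    header = bytearray(236)
    header[0] = 2  # op = BOOTREPLY
    header[16:20] = socket.inet_aton("192.168.1.50")
    options = (bytes([OPT_MSG_TYPE, 1, 5])  # 5 = DHCPACK
               + bytes([OPT_SERVER_ID, 4]) + socket.inet_aton("192.168.1.250")
               + bytes([OPT_DNS, 4]) + socket.inet_aton("168.95.1.1")
               + bytes([OPT_END]))
    return bytes(header) + MAGIC_COOKIE + options


if __name__ == "__main__":
    print(parse_dhcp_reply(sample_ack()))
```

A reply whose server identifier is not one of the known DHCP servers, or whose option 6 carries an unexpected resolver, points straight at the offending device.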
 
Christopher A. Newell said:
<snipped>

As for #1 and 2, I've seen just the DNS address get changed, especially if
the scope the wireless device is giving out is the same. I can't answer
#3 in your scenario. Are you using IP helpers or DHCP relay agents?

Just one note: I do not believe a true access point (AP) has the ability to
provide DHCP, judging from the ones that I've used, from Cisco 1231s to
Linksys APs. They bridge the wireless segment and the wired segment. A router
will do that, though, and I've seen routers do just what you've described.
If APs now offer DHCP services, that's a cool little feature, but then I
would imagine it would be on a different segment and routing traffic.


Ace
 
I suppose it could be a "router" in disguise. Now that I think about it, I
seem to recall some layer 3 features kicking around the config. It is a
MiLAN unit that is packaged and sold as an AP. One Ethernet/POE port, one
RF output (I have seen some Buffalo APs with 4-port switches embedded), WEP,
WPA, RADIUS authentication support. Everything runs logically on a single
LAN segment, but it appears to be possible to do "routing on a stick" (a term
I have grabbed from Cisco's explanation for doing layer 3 and 4 translations
over a single interface).

I have a handful of them deployed (including one at home where I do use the
DHCP). The IP block, mask and GW IP are user configurable. The DNS IP
assigned is not. There is just no way to set it from the UI.
 
Christopher A. Newell said:
<snipped>

Interesting. I've never used a MiLAN unit. Can you disable DHCP on it? I
tried looking for a MiLAN product guide, but I'm not sure what model you have:
http://www.milan.com/TransitionNetworks/MiLAN/Default.aspx

Do your docs mention how to disable DHCP?


Ace
 