Solved DNS puzzles: Load "hosts" into DNS cache & Forward to second namespace (also RBL)

  • Thread starter: Herb Martin

Herb Martin

[This message presents two solutions and a progress report
-- and is not a request for help, but all suggestions to improve
on these are appreciated. The biggest improvement for me
would be to have these work through Win2000+ DNS. <sigh>]

Looking into the issue of forwarding to a second (disjoint)
namespace caused me to spend some time with BIND and
returned my attention to a problem I had been wanting to
solve (and which has been requested on this group by
others before): how to pre-load the "cache" with records to
prevent access to "ad display" (or other undesirable) sites?

Unfortunately, I cannot do either with Win2000 (and I believe
not with Win2003) DNS, so I was forced to use BIND
(which is running on Win2000 just fine.)

Three issues I am pursuing:
1) Loading a large "blacklist" into DNS through the cache: SOLVED
2) Using a forwarder to check a separate (disjoint) namespace
while still using internal servers to check the internal namespace
even when the forwarder cannot resolve the query: SOLVED
3) Building an RBL (real-time blacklist) for email that dynamically
checks, synthesizes composite records, and caches from multiple RBL
servers: proof of concept in Perl; needs to be added to the DNS
server.

The "hosts" file issue came up because there are various
large "hosts" file that resolve their records to 127.0.0.1
(or another essentially wrong or invalid address) to prevent
loading of advertisement graphics. Turns out that a large
but reasonable list can block a high percentage of ads that
derive from the same subset of the Internet.

Unfortunately, MS DNS won't let me add records other
than (true) "root hints" to the cache -- it seems to just ignore
them.

Loading such hosts files on EVERY PC in even a small network
is both a nuisance, and it causes a (temporary) performance problem
every time the file is loaded or EVEN ONE record is updated -- the
PC spends up to an hour at near 100% utilization churning through
the entire hosts file (again.) [We are talking 750K hosts files with
17,000 records or so.]

Putting it on the DNS server (at the NAT/forwarding position)
solves this, and it loads without undue stress on the server in
under 10 seconds. Full reloads take less than a minute because
they require stopping/re-starting BIND -- but I am looking at
reducing that too.

The hard part was figuring out how to make BIND 9 do
persistent caching -- it's not in the manual, but reading
the source code indicated that giving a NAME to the
cache file, e.g., as an 'options' or 'view' setting of
cache-file "cache-file.dns";
...would save and re-load the cache on the next start. Works
great.

(Two caveats: prevent the cache from being overwritten at
shutdown (the Read-Only attribute solved this), and THEN deal
with the "cache date" you need in the file, since it never gets
updated and eventually -- a really long time with a 32-bit
integer -- it might expire. Setting the TTL on the records to
2147483647 should handle this.)
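
For reference, the named.conf piece is tiny -- something like this (the
file name is arbitrary, and since cache-file is undocumented, treat it as
a sketch rather than gospel):

    options {
        // undocumented: save the cache to this file at shutdown and
        // re-load it at the next start
        cache-file "cache-file.dns";
    };

...and the pre-loaded entries themselves are just master-file style
records pointing the unwanted names at loopback, e.g. (ad.server.example
is a made-up name; the TTL is the 32-bit maximum mentioned above):

    ad.server.example.    2147483647    IN    A    127.0.0.1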

Goal #2:
Arrange for a completely separate (DISJOINT) namespace
to have its DNS servers forward (to the Internet) for public
namespace entries, but STILL do internal recursion for
private names using a private root namespace.

Result: Success

Problem: For private names the forwarder returns NXDOMAIN
and the internal DNS servers stop searching -- so trick the
forwarder into REFUSING the requests (or, if desperate, giving
Server Failure for internal names) so that the internal DNS will
KEEP LOOKING -- no answer at all would be worse, because we
would then have to wait internally for the timeout to expire.

Method: Create "stub" zones for the internal zones on the forwarder
but also use an ACCESS list (ACL) to deny on those zones
so they never really get searched. (A view works best but isn't
absolutely necessary.) It also works with a 'Master' but 'stub'
is closer to the concept.

Note: "stub" zone is a technical term in BIND and although I am
using the "stub" zones to perform this function it can be done with
other types -- ideally there would just be a "refuse" or "constant"
Zone type...
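
For the curious, the forwarder-side configuration amounts to a stub zone
plus an ACL that denies everyone -- roughly along these lines (the zone
name, master address, and file name are placeholders, so this is a sketch
of the idea rather than my exact config):

    acl "nobody" { none; };

    zone "corp.internal" {
        type stub;
        masters { 192.0.2.53; };      // placeholder: an internal master
        file "stub.corp.internal";
        // every query for this zone is denied, so the forwarder answers
        // REFUSED rather than NXDOMAIN and the internal servers keep looking
        allow-query { "nobody"; };
    };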

Two improvements would be nice (I'll have to change the BIND
source for these):
1) Never "fail" but always refuse
2) A new Zone type "Refuse" where only the zone name is
needed and the extra "stub" cruft can be skipped.

It's not worth changing the code JUST for these but I have another goal
or two:

Goal #3: A multiplexor RBL (real-time blackhole list) with scoring;
when a "blackhole spam server" test is requested, do a lookup to
a "group" of RBLs and use a factor (e.g., 0.7 or 0.5) with a threshold
(e.g., 1.0) to determine whether an RBL record should be "synthesized".

Purpose: Allows checking more than one RBL for (weighted)
concurrence before rejecting email from that source, and treating
'aggressive' RBLs differently than 'conservative' RBLs, e.g., 3
aggressive server reports might equal one conservative RBL report.
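
To give a sense of the scoring, here is a stripped-down version of the
Perl proof of concept -- the RBL names and weights are invented for
illustration, and Net::DNS is assumed to be installed:

    use strict;
    use warnings;
    use Net::DNS;

    # example weights: a conservative list counts for more than an aggressive one
    my %rbl = (
        'conservative.rbl.example' => 1.0,
        'aggressive1.rbl.example'  => 0.4,
        'aggressive2.rbl.example'  => 0.4,
    );
    my $threshold = 1.0;

    sub spam_score {
        my ($ip) = @_;                                  # e.g. "192.0.2.25"
        my $rev  = join '.', reverse split /\./, $ip;   # reverse the octets
        my $res  = Net::DNS::Resolver->new;
        my $score = 0;
        for my $list (keys %rbl) {
            # a listed address returns an A record (usually 127.0.0.x)
            $score += $rbl{$list} if $res->query("$rev.$list", 'A');
        }
        return $score;
    }

    # synthesize the composite RBL answer only when the weighted score
    # reaches the threshold
    print "listed\n" if spam_score('192.0.2.25') >= $threshold;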

RBLs have been the single most effective tool I have found against
spam -- don't get excited, they aren't complete, but I did knock out
75-90% of MY SPAM by checking the 2 RBLs my email server
allows. If I can push that towards 95%, the remaining 5% can be
dealt with more easily by Bayesian and keyword filters.

Status: I have a working Perl DNS server as proof of concept, but it
needs to be converted to BIND 9 source and compiled in.

[I haven't modified BIND yet, but I have it compiling under VC in
VS.Net 2003. BIND 9 is a good-sized program, and I have yet to find
any significant "programmers' notes" other than header files and
comments.]

Thanks to anyone who helped, who tried to help, or is just interested
in reading this report.
 
Windows Server 2003 does support forwarding to a disjoint namespace through
the use of stub zones.

Windows 2000 Server does not allow this.

I'm not familiar enough with DNS to comment on the rest of your post.

Cheers

Oli


 
Windows Server 2003 does support forwarding to a disjoint namespace through
the use of stub zones.
Windows 2000 Server does not allow this.

Yes, but merely having Stub zones is insufficient,
see below....
I'm not familiar enough with DNS to comment on the rest of your post.

Deep in there somewhere I mention that "Stub" is a technical
term in BIND (now also in Win2003) but this has only a
peripheral bearing on the behavior discussed here -- I happen
to be using stub zones, but I also succeeded with a Master and
suspect other types would work.

The key was the ACL (access control) and ideally there would
be a new Zone type with minimal behavior: REFUSE, or CONSTANT
zone. There are other uses for "Constant" zones so I am leaning towards
implementing it that way.

RBL zones are my other interest -- so this would make two new types,
RBL and REFUSE/Constant Zones. Constant zones might also be
used to efficiently generate predictable answers so there are some
other ideas here.

Essentially, all these are "Synthetic" zones so maybe using one new
type and configuring the behavior is best.

RBL -- synthesizes a blackhole entry from other RBL servers
Constant -- always returns the same (synthetic, no DB) answer
Generate -- same idea, except variable synthesis
Refuse -- (like) any of the above, but designed to refuse gracefully
even if the record is not available.
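
None of that exists in BIND today; if I get around to implementing it,
the config syntax might end up looking something like this (purely
hypothetical, invented names and all):

    zone "spam.rbl.internal" {
        type rbl;            // hypothetical: score-and-synthesize from other RBLs
        servers { "conservative.rbl.example" 1.0; "aggressive1.rbl.example" 0.4; };
        threshold 1.0;
    };

    zone "blocked.example" {
        type constant;       // hypothetical: always the same answer, no database
        answer 127.0.0.1;
    };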

[Oh, well, I am having "Fun With DNS" ]
 
Ummmm. OK. (Scurries away and orders a copy of DNS for Dummies.)

Out of my depth on this one. :-)

Cheers

Oli


 
Ummmm. OK. (Scurries away and orders a copy of DNS for Dummies.)
Out of my depth on this one. :-)

Sorry, I didn't mean to discourage you.
Hey, a week ago it was probably too deep for
me.

The technical details may not interest you
but the results you can achieve are quite
dramatic:

1) Reductions in Advertisements cluttering up
pages when browsing the web

2) Reduction in bandwidth used by your users'
web requests and by spam

3) Reductions in size of worthless stuff populating
your Proxy server, your email server, your client
browser caches, and your client mailboxes
(All this stuff takes up disk space or drives valuable
info out of the caches or mail files)

4) Improved responsiveness (faster browsing)

5) Extreme reduction in Spam at the server
with few false positives and very little processing
power since it can reject before the content is
even accepted at the server

6) Ability to avoid dangerous or undesirable sites
with almost no effort and no additional client
software

7) Some reduction in exposure to dangerous viruses,
trojans, and attacks (this is NOT a true security measure,
but it's equivalent to putting a low fence around your
property in that it reduces those "just cutting through your
yard.")
 
Sounds cool. I'll have a look through your post in more depth later in the
week.

Cheers

Oli
 
Isn't there a way to script and force multiple queries to a forwarder, say a
server you set up with all those zones created, which would then populate the
cache at boot or service startup?

Nope -- and if there were it would generate 75,000 (plus
glue stuff) queries.

The idea is not to resolve these addresses CORRECTLY but
to HIDE them from your users (and myself) by giving back
a known "wrong answer" (typically 127.0.0.1, but I am
experimenting with a couple of variations on this theme.)

I also don't want all this junk in every client's cache on
speculation -- it might take hours to resolve this mess --
on a moderately slow (600 MHz) machine the processing
of this as a local HOSTS file takes nearly an hour.

Most clients will never visit the majority of these locations
on any PARTICULAR boot cycle. It's similar to the
theory of having the world-wide Internet DNS database
be "distributed hierarchically" around the world until needed
by a particular client.
 
No, not saying in the client cache, on the server. Once a server resolves
it, it'll be cached. Create a script to resolve them all at DNS server boot,
it will cache them in the server cache, then have an entry to clear the
resolver cache on the server.

"Resolves it" from where? These records are synthetic;
they are just a collection that various people keep of
"undesirable sites or graphics advertisement storage"
so there is nowhere to resolve from.

It also means that we can tune our own list and aren't
dependent on whoever collects and maintains
these lists. It's possible to add our own or remove sites we
DO wish to visit.

There's an analogy here: Snort (or other IDS) signatures
you download from their site rather than creating them ALL
yourself -- you can still add or subtract though.
Just a WAG, but hey, I tried.

BIND works -- and we could use your idea to get them over
to Win2000 but that is "too little too late" if we have to use
BIND to do it anyway.
 
I admit that I probably have not read your posting thoroughly. But the gist
of it is that you are trying to preload certain records into your DNS cache
so that you can essentially blacklist them and tell your DNS server to
direct any request for any blacklisted record to a blackhole.

You understand correctly.
IF I am correct in the above assumption, please don't take offense if I say
that you are probably attempting to change your car's tires with a barber's
hair clipper. Wrong tool for the wrong job.

Perhaps, perhaps not. No harm in thinking through
other ideas, but I also wanted several other things
that ISA or proxies don't address but are related here
through DNS.
I do a similar thing currently with ISA, and ISA does it beautifully without
any patchwork. Any "good" Proxy server should be able to do this. If I have
understood you correctly, I don't believe DNS (at least the current MS DNS
implementation) is the correct technical solution for your ideas. Maybe this
concept will be incorporated into an RFC somewhere, but right now .....

No, you are correct MS DNS can't do it. BIND can, and
without all the complication of ISA (but ISA adds much other
functionality so anyone who is going to run it anyway should
consider ISA for this task as well.)

Can ISA handle 75,000 domain entries easily? Frequent updates?
(I don't know because I haven't tried it in ISA). Easy to automate
the updates and reload without dropping connections/mappings
on a production server?

Another issue is that for ISA to handle this, we must first resolve
those DNS addresses, decide it's not local and send it to ISA,
then ISA must refuse.

On the other hand, with the DNS blackhole pointing back to the SAME
machine, a lot of these messages never happen and other
file requests never cross the wire.
If I have not understood you correctly, I apologize.
my $0.02

No apology necessary -- my first goal was to solve the
multiple distinct (or disjoint) namespace issue of the forwarder
returning NXDOMAIN -- while doing this with BIND (MS DNS
won't do that either) I also tried the blackhole cache list idea,
and it works so easily it's hard to imagine anything being much
easier.

In fact, it was so easy I was actually thinking of COMBINING
it with ISA for another effect (that isn't well thought out or tested
yet) that occurred to me.

ISA is also a purchased product and while I have a license, my
investigation was also motivated by a question that appears in the
group from time to time about doing this with HOSTS files
and such. All my tools here were free, and the source code
means I can extend them or modify the built-in behavior.

Hosts files definitely seem a 'bad idea' <tm>

If ISA works better, I may return to that, but ISA and I are
a bit "out of sorts" with each other over some things it did
to two of my gateway machines. <grin>
 
HM> How to pre-load the "cache" with records to
HM> prevent access to "ad display" (or other undesirable) sites?

This is using chocolate-covered bananas to integrate European currencies
again. One doesn't populate the cache with Microsoft's DNS server in order to
make blacklisted domain names totally unreachable. One populates the
_database_. Simply creating a dummy zone for each of the blacklisted domain
names should suffice for rendering them unreachable. This would seem to be
possible with DNSCMD and a modicum of command scripting.
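
Untested, but something along these lines ought to do it (blacklist.txt
is a hypothetical one-domain-per-line file; driven here from Perl for
brevity, though a plain command script would work just as well):

    # create a tiny "blackhole" primary zone for each blacklisted domain
    open my $fh, '<', 'blacklist.txt' or die "blacklist.txt: $!";
    while (my $domain = <$fh>) {
        chomp $domain;
        next unless $domain;
        system 'dnscmd', '/ZoneAdd',   $domain, '/Primary', '/file', "$domain.dns";
        system 'dnscmd', '/RecordAdd', $domain, '@', 'A', '127.0.0.1';   # the zone root
        system 'dnscmd', '/RecordAdd', $domain, '*', 'A', '127.0.0.1';   # everything under it
    }
    close $fh;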

However: It should be noted that DNS service is a very heavy-handed solution
to this problem; and one that causes a lot of collateral damage, since it
affects _all_ services provided from the named machines (such as SMTP, FTP,
POP3, and NNTP), not just content HTTP service. If one wants to block
advertising Web sites, a better approach would be to employ whatever
blacklisting or domain name redirection facilities are provided by one's proxy
HTTP server.
 
While we are still on the point, can I have a copy of these tools -- as soon
as you work out the kinks? :)

You may certainly have a copy -- right now it is just
free BIND 9.xx from ISC.org with a certain set of
zone config entries.
I don't know how this is a limitation of ISA, though.

No, it's not a limitation "of ISA" but of "using ISA" -- the client
won't discover the material is unavailable (blocked) until it
ASKS ISA for it. With a local (127.0.0.1) block address,
no request travels on the net.
Absolutely no harm in creativity. And, if it works well, who knows? You may
get MS to pony up some of those Billions of dollars they have in the vault
in compensation for your originality :)

Microsoft has been good to me so far -- why not.
I haven't reached 75K YET, but that's because I don't have a list of xxx
sites that huge :)
Yes. I script mine.

Well, with a script you download "hosts" files from various
collection sites and re-process them to the script format, or
have the script do it. (I use Perl now to convert from "hosts"
to a "DNS cache" format.)
No. Remember that ISA is not just a Proxy server, it's a firewall too.

Yes, I am aware of that, which prompted the question. With BIND
the worst that happens (not necessarily though) is the DNS forwarder
server goes offline for less than a minute. No existing connections
are affected.
 
This is using chocolate-covered bananas to integrate European currencies
again. One doesn't populate the cache with Microsoft's DNS server in order to
make blacklisted domain names totally unreachable. One populates the
_database_. Simply creating a dummy zone for each of the blacklisted domain
names should suffice for rendering them unreachable. This would seem to be
possible with DNSCMD and a modicum of command scripting.

Who would want 10,000 new zones when they can merely load the
cache?

No script file necessary to "parse and create" all these (unnecessary)
zones. They are just individual cache entries.

However: It should be noted that DNS service is a very heavy-handed solution
to this problem; and one that causes a lot of collateral damage, since it
affects _all_ services provided from the named machines (such as SMTP, FTP,
POP3, and NNTP), not just content HTTP service.

That is a true statement but not a disadvantage -- it means that we can
also knock down some spam by accident, perhaps.

I have no intention of letting my users download with FTP, NNTP, etc.
from advertisement and XXX sites.

If one wants to block
advertising Web sites, a better approach would be to employ whatever
blacklisting or domain name redirection facilities are provided by one's proxy
HTTP server.

Some of us don't control that Proxy -- but we do control our DNS.
We can do more with the DNS (as you have pointed out.)
This may be (much) more efficient than the Proxy solution.

It's functionally equivalent to and MUCH better than the "hosts" file
solution that many people are using.

There is also the issue of other related benefits like allowing the
DNS forwarder to handle the Internet while solving the problem
of internal servers (which use that forwarder) still doing actual
recursion through a separate (disjoint) namespace.

It's trivial to prevent the NXDOMAIN message -- returning REFUSED
instead -- so that internal servers can actually recurse their own private
root namespace AND use a forwarder for everything public, without
having a long wait for a timeout either.
 