Using DNS for fault tolerance?

MostlyH2O

Hi Folks,

I wonder if someone could help me to understand a few basic facts about
primary and secondary DNS servers and how they work...

I have 2 web servers at 2 different physical locations - each with its own
static IP and each running its own DNS and web server. I would like to use
one of these servers as a backup for the web site on the other server, so I
registered both IPs as DNS servers with Network Solutions. When I
registered my domain name, I entered one DNS server as the primary and the
other as the secondary. Individually, each web server and DNS works fine.

Here's where I get confused....

If, for example, a tree falls on my internet connection cable at my primary
site, I would like the secondary site to take over the duties of both the
DNS and the web site at its location. When I set up my backup DNS server
with the domain entry as a "Standard Secondary" entry, it populates itself
with the info from my primary DNS at the main location. But this does not
provide the redundancy I need because it just points the request right back
to the IP of the primary.

I am unclear exactly how the DNS refreshes itself on the internet, but when
I set my second (backup) location's DNS to a "Standard Primary" and have it
point to its own IP, it seems that all requests automatically go to
the secondary location. However, I only want requests to go to the secondary
if the primary is down.

I would greatly appreciate it if someone could explain how it would be
possible to create a truly redundant web site using DNS, where the
secondary DNS and web server take over only in the event that the primary
is down.

If there is a reasonably easy tutorial on the subject, I would be happy to
read it as well. I have read a few primers on DNS, and have used it in its
basic form, but the information I am seeking has eluded me; most of the info
I find is related to internal network redundancy and DHCP and seems more
complicated than my simple need.

Thanks very much,
Jack Coletti
 
There is a world of difference between "redundant DNS"
and "DNS for redundancy." They are unrelated, and
you seem as if you might be confusing the two concepts.

Listing multiple DNS servers for your zone is a
supported and recommended way to ensure that --
if a DNS server itself fails -- clients are still able to
resolve names in your zone from another DNS server.
This is "redundant DNS" and is part of the intrinsic design
of DNS.

"DNS for redundancy" or perhaps better "failover DNS"
is in the area of -- well -- 'hinky'. DNS was never designed to
accommodate this function, and it doesn't wear it especially well.
DNS expects all servers for a zone to reflect more or less
the same information (though of course the fact that you can work
it otherwise opens the door to some interesting possibilities).

The core problem is that DNS has no concept of down sites
or any way to even "know" if your web site is up or down. Indeed,
it doesn't know anything about web sites or ports or applications
at all. So, any effort to do this strictly within DNS is doomed.

If you want to use DNS to handle some measure of
failover, then it follows you need to provide some
extra-DNS mechanism to watch your web sites
(or whatever application you are using) and adjust your
DNS server(s) accordingly.

A few months back I posted a simple batch (vb?)
script on this forum that used pings and dnscmd
to adjust DNS entries according to responses received.
If you search back you should be able to find it easily
enough (I don't have it handy or I'd repost it).

It isn't at all difficult to put this together yourself; you just need
a command or program to "test" availability and return a status,
and then use dnscmd to add or delete host entries in your DNS
servers based on the results of that test. You can then run
that with the Windows scheduler periodically.
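
As a rough sketch of that test-then-adjust step, here is the decision logic
on its own, written in Python so it can be tried without touching a DNS
server. The function name plan_records and the addresses are made up for
illustration. It only decides which A records to add or delete; the
availability test and the dnscmd calls are left out. Leaving DNS alone when
every test fails is an assumption of this sketch, not something DNS requires.

```python
def plan_records(site_up):
    """Decide which web server IPs to publish, given availability results.

    site_up maps each web server IP to True (test passed) or False.
    Returns (to_add, to_delete) as sorted lists of IPs.
    """
    up = sorted(ip for ip, ok in site_up.items() if ok)
    down = sorted(ip for ip, ok in site_up.items() if not ok)
    if not up:
        # Assumption of this sketch: if every test fails, the fault may be
        # in the monitor's own link, so leave DNS untouched rather than
        # delete every record.
        return [], []
    return up, down


if __name__ == "__main__":
    # Hypothetical addresses; the first site fails its test.
    to_add, to_delete = plan_records({"20.20.20.40": False,
                                      "21.21.21.41": True})
    print("add:", to_add, "delete:", to_delete)
```

The two lists would then be fed to dnscmd /RecordAdd and /RecordDelete,
one call per address.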

You also need to set the host TTLs to a small enough
value so that clients are forced to refresh within approximately
your watching interval.
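
To put a number on that, here is a back-of-envelope estimate in Python.
It assumes that resolvers honor the TTL and that the DNS update itself is
instantaneous; neither is guaranteed in practice. The function name is made
up for illustration, and the 30-second values are only examples.

```python
def worst_case_failover_seconds(check_interval, ttl):
    """Rough upper bound on how long a client may keep a dead address.

    A failure just after a check goes unnoticed for up to check_interval
    seconds, and a client that cached the record just before the DNS
    update keeps it for up to ttl seconds more.
    """
    return check_interval + ttl


if __name__ == "__main__":
    # Example values only: check every 30 seconds, TTL of 30 seconds.
    print(worst_case_failover_seconds(30, 30))  # prints 60
```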

I want to stress that I don't think this is an especially good idea,
but I recognize that people sometimes need a modest
"poor-man's" solution to a $10,000 multiport application router.

Steve Duff, MCSE
Ergodic Systems, Inc.
 
Hi Steve,

Thanks very much for setting me straight on the issue. I did find a copy of
your batch file on Google. And...well, I hate to be a bother, BUT, you are
the first person to really explain this thing well. Could you do me the
favor of a bit more of a play-by-play with the rem statements? I understand
the general structure with the calls and subroutines, but some of the syntax
eludes me - especially in the ping statements. I know it should be obvious,
but could you explain the variable assignments in the set statements at the
beginning?

And when it's done, I save it as a *.bat file, right? Does it go in any
particular location? And when I do the Scheduling wizard, will it see the
batch file - or do I need to select an application to run it?

Perhaps you could just recommend a good primer on running VB Scripts in a
Windows environment? I really like your solution to this problem and I'd
like to learn more.

Thank you for your help and patience :-)
Jack ...

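REM Replace SETs below per your environment
REM gw1/gw2: addresses pinged to test each site; svr1/svr2: web server IPs
set gw1=20.20.20.20
set gw2=21.21.21.21
set svr1=20.20.20.40
set svr2=21.21.21.41
REM No spaces around "=", or they become part of the variable name and value
set myserver=mydnsserver
set mydomain=mydomain.com

:loop
REM Send one ping; find sets errorlevel 1 when no "Reply" line appears
ping -n 1 %gw1% | find /i "Reply" > nul
if errorlevel 1 goto down1
ping -n 1 %gw2% | find /i "Reply" > nul
if errorlevel 1 goto down2
REM Both sites answered: publish both A records
call :addrec %svr1%
call :addrec %svr2%
goto continue
:down1
REM Site 1 is down: publish only site 2
echo %gw1% appears to be down
call :addrec %svr2%
call :delrec %svr1%
goto continue
:down2
REM Site 2 is down: publish only site 1
echo %gw2% appears to be down
call :addrec %svr1%
call :delrec %svr2%
goto continue
:addrec
REM Add an A record for www with a 30-second TTL; %1 is the IP passed by call
dnscmd %myserver% /RecordAdd %mydomain% www 30 A %1 > nul
goto :EOF
:delrec
REM Delete the A record for www; /f skips the confirmation prompt
dnscmd %myserver% /RecordDelete %mydomain% www A %1 /f > nul
goto :EOF
:continue
REM (sleep command below is freeware DOS program)
sleep /p 00:00:30
goto loop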
 
Oh, who the #$!* knows? I probably had
alcohol in me at the time.

Looks like gw1 and gw2 are the IP addresses
of the gateways you are pinging to test
availability. svr1 and svr2 are the corresponding IP
addresses of the web servers themselves. I don't quite
understand why I separated the two; it seems
like they could be the same IP sets if you want.

myserver and mydomain should be self-explanatory.

If you have multiple DNS servers then you would need
to beef up the logic in addrec/delrec to update all
servers (secondary zone transfers are much too
infrequent to be of help with this.)
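
A minimal sketch of what "update all servers" might look like, in Python:
it builds one dnscmd command line per DNS server, using the same
/RecordAdd and /RecordDelete forms as the batch file. The function name
and the server names dns1 and dns2 are hypothetical. It only builds the
argument lists; running them (for example with subprocess.run on a Windows
machine that has dnscmd) is left to the caller.

```python
def dnscmd_commands(servers, zone, host, ip, action, ttl=30):
    """Return one dnscmd argument list per DNS server.

    action is "add" or "delete"; ttl is used only when adding.
    """
    commands = []
    for server in servers:
        if action == "add":
            commands.append(["dnscmd", server, "/RecordAdd",
                             zone, host, str(ttl), "A", ip])
        elif action == "delete":
            commands.append(["dnscmd", server, "/RecordDelete",
                             zone, host, "A", ip, "/f"])
        else:
            raise ValueError("action must be 'add' or 'delete'")
    return commands


if __name__ == "__main__":
    # Hypothetical server names; the zone and IP are placeholders.
    for cmd in dnscmd_commands(["dns1", "dns2"], "mydomain.com",
                               "www", "20.20.20.40", "delete"):
        print(" ".join(cmd))
```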

The bat file can be situated anywhere; simply schedule
it with the Windows "scheduled tasks" to run at bootup
or put it in the startup group, keeping in mind it has to run as
an administrator. Also keep in mind that the Windows scheduler
has an annoying habit of sometimes forgetting how to run a task.

And, as always -- especially with anything
I've written -- the risk is entirely yours :-)

Steve Duff, MCSE
Ergodic Systems, Inc.
 