Delay replication site


Clayton Sutton

Hey everyone,

I once heard of something like a delay replication site: a site (maybe at
your DR location) to which your main AD DCs replicate ONLY twice a week.
That way, if someone deleted an OU in AD, you could do an authoritative restore
from the delay site. Anyone know what I'm talking about? Can anyone point
me to a step-by-step doc to walk me through setting one up?

--

TIA,


Clayton


P.S.: I wrote an iTunes podcast tutorial and just want to publicize it.
You can find it at: http://www.nikoli.net/itunepod

*******************
 
You're thinking of a "lag site". There's a lot of religion surrounding lag
sites - some people/companies love them, others think they're a horrible
idea.

The following article provides an overview of the topic, and a Google of the
words 'active directory lag site' will give you any number of other links
and references:
http://searchwinit.techtarget.com/tip/0,289483,sid1_gci1086805,00.html

HTH


--
-----------------------
Laura E. Hunter
Microsoft MVP - Windows Server Networking
Author: _Active Directory Consultant's Field Guide_
(http://tinyurl.com/7f8ll)
Author: _Active Directory Cookbook, Second Edition_
(http://tinyurl.com/z7svl)
 
It seems to me that if the lag site only updated twice a week, it would
be better served by just doing "system-state" tape backups on a rotation. A
tape drive and a few tapes are a whole lot cheaper and easier to manage than a
lag site.

So I guess my religious belief would be that it is a bad idea, or at least a
very expensive and needlessly complex idea.

--
Phillip Windell [MCP, MVP, CCNA]
www.wandtv.com

The views expressed (as annoying as they are, and as stupid as they sound), are
my own and not those of my employer, or Microsoft, or anyone else associated
with me, including my cats.
 
Clayton Sutton said:
Hey everyone,

I once heard of something like a delay replication site: a site (maybe at
your DR location) to which your main AD DCs replicate ONLY twice a week.
That way, if someone deleted an OU in AD, you could do an authoritative restore
from the delay site. Anyone know what I'm talking about? Can anyone
point me to a step-by-step doc to walk me through setting one up?

There isn't really anything special about such a site, as sites go.

Just set up a site with its own site link(s) that use only certain
times of the week as the schedule, and adjust the replication interval so
there is only time for one replication per open window.

You can have a site updated every day, twice a night, twice a
week, or pretty much anything, as long as you replicate at least
once per week.
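
The scheduling advice above can be sketched roughly as follows. This is only a model of a site link schedule, not an AD API: it assumes a grid of 168 one-hour slots per week, and it assumes replication fires once when a window opens and then once per interval. The `Window` class, the function names, and the Tuesday/Friday 02:00 windows are all made up for illustration.

```python
from dataclasses import dataclass

HOURS_PER_WEEK = 7 * 24
DAYS = ["Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"]


@dataclass(frozen=True)
class Window:
    """One open replication window (hypothetical helper, not an AD object)."""
    day: str
    start_hour: int
    length_hours: int


def weekly_grid(windows):
    """Return 168 booleans, one per hour of the week, True where the link is open."""
    grid = [False] * HOURS_PER_WEEK
    for w in windows:
        first = DAYS.index(w.day) * 24 + w.start_hour
        for hour in range(first, first + w.length_hours):
            grid[hour % HOURS_PER_WEEK] = True
    return grid


def cycles_per_window(window, interval_minutes):
    """Replication attempts in one window.

    Assumption: one attempt fires when the window opens, then one more
    every `interval_minutes` while the window is still open.
    """
    window_minutes = window.length_hours * 60
    return 1 + (window_minutes - 1) // interval_minutes


def longest_closed_gap_hours(grid):
    """Longest run of closed hours, wrapping around the end of the week."""
    if not any(grid):
        return HOURS_PER_WEEK
    start = grid.index(True)
    rotated = grid[start:] + grid[:start]  # begin on an open hour so wrap-around is handled
    longest = run = 0
    for is_open in rotated:
        run = 0 if is_open else run + 1
        longest = max(longest, run)
    return longest


# Two one-hour windows per week, as in the original question.
windows = [Window("Tue", 2, 1), Window("Fri", 2, 1)]
grid = weekly_grid(windows)

print(sum(grid), "open hours per week")
print(cycles_per_window(windows[0], 180), "cycle(s) per window at a 180-minute interval")
print(cycles_per_window(windows[0], 15), "cycle(s) per window at a 15-minute interval")
print(longest_closed_gap_hours(grid), "hours is the longest stretch without replication")
```

Under this model, an interval longer than the window leaves room for a single replication per window, which is the effect described above, and any schedule with at least one open hour keeps the longest gap under a week.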
 
Opinions vary about them. Most large orgs (>100k users) that I have seen
use them. They like them because recovery of objects is much faster than
restoring multi-gigabyte DBs from tape, especially if the process is to store
the tapes offsite. These companies still do offsite tape stores as well,
but don't need to recall/restore the tapes for simple user or group
recoveries.

If you are already managing a larger infrastructure, a lag site is just
another site and the management overhead is really not much. The cost of DCs is
pretty low in those environments as well, versus the cost of tape
recovery. These numbers will vary for different corporations, as will
the requirements (SLAs/SLOs) for recovery. Each company needs to
determine if it makes sense for them. Some places will feel lag sites are a
great savings, some places will feel they are a great waste, and some places
will use them for specific circumstances (say during migrations or
periods of mass updates like Exchange upgrades, etc).

Personally, I dislike the idea of ever doing auth restores. My
feeling is to give very, very few people the ability to delete things
that you would possibly ever need to auth restore in the first place. I
ran a Fortune 5 AD for many years with some 250,000+ users and never did
an auth restore, never planned on it. In the 7 years since I first set up AD
there, they still haven't done a single auth restore. Everything was
handled by provisioning or by the 4 DAs. If something was deleted, it
was meant to be deleted.

That being said, I have designed or helped with the implementation of lag
sites in several large companies. Usually a lag DC is one of the few DCs
outside of the test lab that I will allow to be virtual. I commonly do
three virtual DCs per domain, fully configured to not be used by clients
(not in WINS, and DNS records are properly blocked as per the specific
KBs). The DCs are all scheduled to replicate once per week on different
schedules (say M, W, F), though I have done 5 DCs per domain and set
schedules to M-F. Alternatively, have the DCs start up and shut down one day a
week so replication absolutely can't be forced (there are other
unsupported hacks to enforce this as well while keeping the DCs up). The nice
thing about using virtuals here, besides the cost, is that you can
quickly and easily pick up the virtual machines and drag them to a
segregated test lab to do things such as schema tests, etc. Or you can quickly
recover your entire forest in the event of a complete disaster by
spinning them up on even workstation-class hardware. Anyone who has done
even a single dissimilar-hardware AD recovery under pressure, or even in
a DR test, can appreciate the simplicity of just loading a virtualization
product, firing up the DC, and having it work right off.
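
To illustrate why staggering the lag DCs (say M, W, F) helps, here is a minimal sketch of picking which lag DC to recover from after an accidental deletion. The DC names, the 02:00 replication hour, and both functions are assumptions made up for this example. The idea is to use the DC whose last scheduled replication happened before the deletion, and among those, the most recent one.

```python
from datetime import datetime, timedelta

# Hypothetical lag DCs and the weekday each one replicates (Monday = 0).
LAG_DCS = {"LAGDC1": 0, "LAGDC2": 2, "LAGDC3": 4}
REPLICATION_HOUR = 2  # assumed 02:00 local time on the scheduled day


def last_replication(weekday, now):
    """Most recent scheduled replication at or before `now` for a DC that syncs on `weekday`."""
    candidate = now.replace(hour=REPLICATION_HOUR, minute=0, second=0, microsecond=0)
    candidate -= timedelta(days=(now.weekday() - weekday) % 7)
    if candidate > now:
        # Today is the scheduled day but the window has not opened yet.
        candidate -= timedelta(days=7)
    return candidate


def best_restore_source(deleted_at, now):
    """Return the lag DC with the freshest copy that still predates the deletion, or None."""
    usable = {}
    for name, weekday in LAG_DCS.items():
        synced = last_replication(weekday, now)
        if synced < deleted_at:
            usable[name] = synced
    if not usable:
        return None
    return max(usable, key=usable.get)


# An OU is deleted Wednesday afternoon and noticed Thursday morning.
now = datetime(2024, 1, 11, 10, 0)         # a Thursday
deleted_at = datetime(2024, 1, 10, 15, 0)  # the Wednesday before
print(best_restore_source(deleted_at, now))
```

With a single lag DC on a weekly schedule, the copy you recover from could be up to a week stale. With three DCs on M, W, F, the usable copy in this model is at most a few days old.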

joe


--
Joe Richards Microsoft MVP Windows Server Directory Services
Author of O'Reilly Active Directory Third Edition
www.joeware.net


---O'Reilly Active Directory Third Edition now available---

http://www.joeware.net/win/ad3e.htm
 
I like the VM idea a lot.
I'm a little confused about how you would restore a particular item from these DCs,
though. In fact, I don't see how you would restore anything from them to the
"real" system without things getting very messy and confusing. It seems it
would be easy enough to throw out the real system and replace it from the VMs,
but I don't see how you would do a partial restore.

--
Phillip Windell [MCP, MVP, CCNA]
www.wandtv.com

 
It seems it would be easy enough to throw out the real system and replace it
from the VMs, but I don't see how you would do a partial restore.

There are a couple of ideas with the lag VM. Should you need to perform an
auth restore (from a single attribute to an entire subtree or NC), you
simply boot into DSRM and mark the necessary objects as authoritative (or
use your online recovery tool). If you have a major data problem or an
erroneous schema change, then you can use the lag DC as your forest recovery
node. You disable replication and turn all other DCs off. You then clean up
the NTDS and NTFRS metadata, enable replication, make the DC a GC and
promote new machines into the domain again.
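
As a rough illustration of the "mark the necessary objects as authoritative" step, the sketch below only builds and prints an `ntdsutil` command line for an operator to run on the lag DC while it is booted into DSRM. It does not execute anything. The helper name and the example DN are made up. The `activate instance ntds` step is an assumption that applies to Windows Server 2008 and later and is omitted for Windows Server 2003. DNs containing spaces or quotes are rejected here purely to sidestep quoting, which is a simplification.

```python
def auth_restore_command(subtree_dn, activate_instance=True):
    """Build (but do not run) an ntdsutil command that marks a subtree authoritative."""
    if not subtree_dn or any(ch in subtree_dn for ch in ' "'):
        # Simplification: quoting rules for such DNs are not handled in this sketch.
        raise ValueError("this sketch only handles DNs without spaces or quotes")
    steps = []
    if activate_instance:
        # Assumed to be required on Windows Server 2008 and later only.
        steps.append("activate instance ntds")
    steps.append("authoritative restore")
    steps.append(f"restore subtree {subtree_dn}")
    quoted = " ".join(f'"{step}"' for step in steps)
    return f"ntdsutil {quoted} quit quit"


# Example DN is hypothetical.
print(auth_restore_command("OU=Sales,DC=example,DC=com"))
print(auth_restore_command("OU=Sales,DC=example,DC=com", activate_instance=False))
```

On a lag DC the deleted subtree should still be present because the deletion has not replicated in yet, so marking it authoritative and then letting the DC replicate is what pushes the objects back out. Verify the exact syntax against the ntdsutil documentation for your OS version before relying on it.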

We don't use lag sites. Instead we use Quest's Recovery Manager to perform
disk-based backups, which can be used for any kind of restoration and can be
done online. I keep a month's worth of full backups and daily diffs on a
per-week basis. I also archive a full once per month for up to twelve
months for a forest recovery.

This eliminates the need for tape, which is slow and troublesome. The backups
sit on the SAN, which is replicated between two data centres.

I also don't recover a failed DC. That DC is rebuilt and a metadata cleanup
process is initiated. The rebuild is a full OS rebuild and DCPROMO from
IFM. It is 100% automated using Open View Radia. This is great and real easy,
although it won't work for places where you don't have dedicated DCs.
Getting the IFM there is fun, as the DIT is big, but there's a secure
process in place for that.

I also agree with Joe on the auth restore, in that it shouldn't need to be
done. However, I cater for it anyway: although there's a metaverse and
provisioning system, there are still humans running around with the ability to
delete subtrees when performing administrative tasks, although these should
be few and clever enough not to do this.
 
Thanks for all the input guys, but does anyone have any links to
step-by-step docs?

--

TIA,


Clayton


 
Paul said:
There are a couple of ideas with the lag VM. Should you need to
perform an auth restore (from a single attribute to an entire subtree
or NC), you simply boot into DSRM and mark the necessary objects as
authoritative (or use your online recovery tool). If you have a
major data problem or an erroneous schema change, then you can use the
lag DC as your forest recovery node. You disable replication and
turn all other DCs off. You then clean up the NTDS and NTFRS
metadata, enable replication, make the DC a GC and promote new
machines into the domain again.

We don't use lag sites. Instead we use Quest's Recovery Manager to
perform disk-based backups, which can be used for any kind of
restoration and can be done online. I keep a month's worth of full
backups and daily diffs on a per-week basis. I also archive a full
once per month for up to twelve months for a forest recovery.

This eliminates the need for tape, which is slow and troublesome. The
backups sit on the SAN, which is replicated between two data centres.

I also don't recover a failed DC. That DC is rebuilt and a metadata
cleanup process is initiated. The rebuild is a full OS rebuild and
DCPROMO from IFM. It is 100% automated using Open View Radia. This
is great and real easy, although it won't work for places where you don't
have dedicated DCs. Getting the IFM there is fun, as the DIT is big,
but there's a secure process in place for that.

I also agree with Joe on the auth restore, in that it shouldn't need
to be done. However, I cater for it anyway: although there's a
metaverse and provisioning system, there are still humans running around
with the ability to delete subtrees when performing administrative
tasks, although these should be few and clever enough not to do this.

I'm curious about the consistent "at least once per week" (lag site)
replication. Is this just a precaution so that it doesn't run over 60 days, or is
there a best practice or other specific reason for this "not to exceed" limit?
 
You cannot schedule replication to be beyond one week. Well, you can
schedule it, but at the one-week mark the DC will queue up replication
anyway. The only way you can exceed a week is by turning the DC off or
using some configuration tricks.


--
Joe Richards Microsoft MVP Windows Server Directory Services
Author of O'Reilly Active Directory Third Edition
www.joeware.net


 
Joe said:
You cannot schedule replication to be beyond one week. Well, you can
schedule it, but at the one-week mark the DC will queue up replication
anyway. The only way you can exceed a week is by turning the DC off or
using some configuration tricks.

Ah, yes. Thanks Joe.
 