<inline>
What would be your suggestion to have a fault tolerant Active
Directory?
Improve the fault-tolerance of the Domain Controllers?
1) A second DC located in another building
This only addresses "Availability" -- not FT.
2) System State Backups and Recovery Disks updated daily
This strategy addresses "Disaster Recovery", but not FT.
Increase the quality of the equipment to the point that it can withstand the
vast majority of "faults" that occur. Fault tolerance is achieved by
installing equipment that either survives or fails-over during "faults".
That includes, but is not limited to, redundant disk systems and subsystems,
motherboards, NICs, power supplies, and UPSs.
Does a system state backup on the FSMO holder capture the entire AD
database, in case the only two DCs die?
Any system state on any DC will suffice for Disaster Recovery - provided it
is a relatively recent backup. What you are addressing with backups is your
DR plan - and the decisions there have to do with how much your AD
environment changes over time as opposed to how resistant to failure your
environment is. IOW - a simple risk analysis will do ...
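As a rough sketch of that risk analysis (Python, purely illustrative): the
function names and numbers are hypothetical, and the 180-day tombstone
lifetime is only an assumed value, so check what your own forest uses. It
tests whether a system state backup is still recent enough to restore, and
estimates how many directory changes you would have to redo afterwards.

```python
from datetime import date

# Assumed value for illustration only; the real tombstone lifetime is a
# forest-wide setting and differs between environments.
ASSUMED_TOMBSTONE_LIFETIME_DAYS = 180


def backup_age_days(backup_date: date, today: date) -> int:
    """Age of the system state backup in whole days."""
    return (today - backup_date).days


def backup_is_usable(backup_date: date, today: date,
                     tombstone_days: int = ASSUMED_TOMBSTONE_LIFETIME_DAYS) -> bool:
    """A backup older than the tombstone lifetime should not be restored."""
    return 0 <= backup_age_days(backup_date, today) < tombstone_days


def changes_to_redo(backup_date: date, today: date, changes_per_day: int) -> int:
    """Rough count of directory changes made since the backup was taken."""
    return backup_age_days(backup_date, today) * changes_per_day
```

The more your directory changes per day, the shorter the backup interval you
can live with; a static environment can tolerate an older backup.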
A more pertinent question is, why do you believe both DCs will "die"
simultaneously? Wouldn't that indicate to you that you might be addressing
the wrong problem?
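To put a number on it, here is a back-of-the-envelope sketch in Python. The
function name and the failure probability are made up for illustration, and
it assumes the DCs fail independently of each other, which is exactly the
assumption you should question.

```python
def prob_all_fail(p_single: float, n_dcs: int) -> float:
    """Probability that all n DCs are down in the same window.

    Assumes each DC fails independently with probability p_single.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_dcs < 1:
        raise ValueError("n_dcs must be at least 1")
    return p_single ** n_dcs


# Hypothetical figure: each DC has a 1% chance of being down in a given window.
both_down = prob_all_fail(0.01, 2)
```

With those made-up numbers, two independent DCs are both down roughly one
time in ten thousand. If your real-world odds look much worse than that, the
failures aren't independent (shared power, shared room, shared admin
mistakes), and that common cause is the problem to fix.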
What if Active Directory gets corrupt and replicates to the only other
DC, how would you recover?
This "hypothetical" scenario has been asked so many times .... It can
actually be rephrased as follows:
"What if the Directory gets so corrupt that it's unusable, but not so
corrupt that it can't still replicate?"
If you analyze that statement, the most likely "corruption" that would fall
into this category wouldn't be some kind of software failure, but would
most likely be user (AKA Admin) induced. That being the case, I'd
investigate means to ensure people with admin rights don't do stupid or bad
things rather than worry about the very remote possibility that AD wouldn't
do its job ... which it does rather nicely.
I know it doesn't happen much, but if asked, how would you do it?
If it *did* happen, there is a process known as "Forest Recovery" you would
likely want to perform.
(
http://www.microsoft.com/downloads/...79-C99B-4DF9-823C-933FEBA08CFE&displaylang=en)
It's not a simple thing to do (I've practiced it in a lab for a 17 domain
forest), and it should be part of the regular training plan for the
AD/Exchange folks.
-ds