1000 Site Limit

Warren Oldroyd

We've been designing an AD for a customer who has
approximately 2000 branches. Each will require a site
and a DC (due to slow, unreliable links). I've come
across the various issues regarding KCC, DNS
registrations, group replication, etc. that are
problematic in these scenarios, most of which appear to
have been fixed in W2K3 or have documented configuration
changes.

We are looking at implementing Windows 2003, possibly
with the KCC on due to the improvements in scalability.

However, I've just been told (by an MS consultant) that
there is a recommended limit of 1000 domain controllers
per domain which applies to both Windows 2000 and (he
checked) 2003. This apparently is due to limitations
with the File Replication Service (FRS).

The only reference I can find to this is in KB 272567,
which mentions a limit at the bottom, but which says that
it was fixed in W2K SP2.

We don't particularly want to split this domain if we can
avoid it - one of the drivers for us moving to W2K3 was
the improvements in scalability for branch office
designs.

Does anyone have any further info on this or can point me
at where it is documented?

Thanks

Warren
 
Hi Warren,

It is important to understand that the replication topology used to
replicate SYSVOL is different from the topology generated for other
DFS replicas. To replicate SYSVOL, FRS uses the same replication topology as
Active Directory. In contrast, to replicate a DFS link, FRS uses a full-mesh
topology. This makes replication of SYSVOL far more scalable. For
example, say you had a DFS link with replicas on 20 servers. That
configuration would create 380 connections (20 x 19). However, if you instead
consider SYSVOL replication of 20 servers in the same site (the worst-case
scenario), you would only have about 80 connections (assuming both of these
examples use one-way connections). The number of connections drops even more
when you are talking about inter-site replication of SYSVOL. As a result,
the limitations that apply to FRS replication of DFS don't necessarily
apply, or don't apply to the same degree, to FRS replication of SYSVOL.
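
The full-mesh figure above is simple arithmetic, and a rough sketch makes it easy to try other server counts. The function names here are made up for illustration. The ring function is only an assumed lower bound for an intra-site topology (two inbound connections per server); it does not model any extra shortcut connections, which is presumably why the estimate in the post is higher than the ring figure.

```python
def full_mesh_connections(servers: int) -> int:
    """One-way connections when every server replicates from every other server."""
    return servers * (servers - 1)


def ring_connections(servers: int) -> int:
    """One-way connections in a plain bidirectional ring.

    Assumption: each server has two inbound connections (one per neighbour).
    This is a lower bound only; extra shortcut connections are not modelled.
    """
    if servers < 3:
        # With fewer than three servers a ring is the same as a full mesh.
        return full_mesh_connections(servers)
    return 2 * servers


if __name__ == "__main__":
    for n in (20, 100, 2000):
        print(n, full_mesh_connections(n), ring_connections(n))
```

For 20 servers this gives 380 full-mesh connections against 40 for the bare ring, and the gap widens quickly as the server count grows.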

Now, that's not to say that over 1,000 domain controllers in a single domain
can't be done - it can be (and has been). However, you need to be extremely
thorough with your Active Directory design - there are a lot of numbers that
need to be crunched. Just because you can have so many domain controllers
does not necessarily mean it will work with your infrastructure. This would
be an extremely large domain and that's a lot of data to replicate to all
the branch offices. *Typically* when domains get this large it's a good idea
to use a few "smaller" domains based on geography (NAWest, NACentral, NAEast
for example) rather than one super large domain. By using multiple domains
you can limit the amount of data that each branch office must replicate and store.
Having said that, you know more about your environment than I do. Without
hard numbers it is difficult for me to make recommendations.

Mike

------------------------------------------------------------------
Mike Aubert
MCSE, MCSD, MCDBA
(e-mail address removed)

Note: the "news2" in my email address is temporary and may be changed in the
future; remove it to email me at my permanent address.
This posting is provided "AS IS" with no warranties, and confers no rights.
 
Warren said:
Does anyone have any further info on this or can point me
at where it is documented?

One of the Knowledge Base articles you cite (272567) says:

• In domains with more than 1,000 replication connections, FRS stops
working because it reaches the default LDAP query limit. A workaround is
to increase this limit, but the long-term solution is to use paged results.


This seems pretty clear to me -- it's not an inherent problem with FRS,
but an unfortunate interaction with Active Directory's effort to make
sure that hosts that issue queries don't get swamped by masses of
results at once.

I would hope that the fix is in Windows 2003, but the workaround of
raising the query limit will exist if required.
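
To make the difference between a capped query and paged results concrete, here is a minimal sketch. It uses a hypothetical in-memory `FakeDirectory` class, not a real LDAP library, and it simply truncates an unpaged query at the cap, which is a simplification of how a real server reports hitting its limit. The 1,000 cap is the default described in this thread; the page size of 500 and the 2,500 objects are arbitrary.

```python
from typing import List, Optional, Tuple

MAX_PAGE_SIZE = 1000  # default cap on returned objects, as described in the thread


class FakeDirectory:
    """Hypothetical in-memory stand-in for a directory server (not a real LDAP API)."""

    def __init__(self, objects: List[str]) -> None:
        self._objects = objects

    def search(self) -> List[str]:
        # Unpaged query: the caller sees at most MAX_PAGE_SIZE objects.
        return self._objects[:MAX_PAGE_SIZE]

    def search_paged(self, page_size: int, cookie: int = 0) -> Tuple[List[str], Optional[int]]:
        # Paged query: return one page plus a cookie marking where to resume.
        size = min(page_size, MAX_PAGE_SIZE)
        page = self._objects[cookie:cookie + size]
        next_cookie = cookie + size
        # A cookie of None tells the caller there is nothing left to fetch.
        return page, (next_cookie if next_cookie < len(self._objects) else None)


def fetch_all(directory: FakeDirectory, page_size: int = 500) -> List[str]:
    """Collect every object by requesting pages until the cookie runs out."""
    results: List[str] = []
    cookie: Optional[int] = 0
    while cookie is not None:
        page, cookie = directory.search_paged(page_size, cookie)
        results.extend(page)
    return results


if __name__ == "__main__":
    directory = FakeDirectory([f"connection-{i}" for i in range(2500)])
    print(len(directory.search()))    # capped result
    print(len(fetch_all(directory)))  # complete result via paging
```

The point of the sketch is only that a client which pages never needs the cap raised, whereas a client issuing a single unpaged query silently works with an incomplete list.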
 
Adam Wood said:
I would hope that the fix is in Windows 2003, but the workaround of
raising the query limit will exist if required.

Thanks for the reply. The issue is that the same KB article also states that
the limit was fixed in SP2, yet we have been told that the issue of FRS not
scaling past 1000 sites is still present in W2K3...?
 
Mike Aubert said:
*Typically* when domains get this large it's a good idea to use a few
"smaller" domains based on geography (NAWest, NACentral, NAEast for
example) rather than one super large domain.

We're only interested in SYSVOL replication not DFS - we won't be using
multiple replicas.

I take your point about data transfers and this is something that we are
considering.
 
I've seen SYSVOL replication scale to over 1,200 domain controllers. With a
properly designed Active Directory structure you should be able to hit 2,000
domain controllers.

Let me explain the 1,000 replication connections in the article you linked
to. When you submit a query to Active Directory using LDAP, there is a
maximum number of objects that will be returned - 1,000 by default.

Now, for FRS replication of DFS this is a problem. For each replica, an
object exists for every other replica (full mesh, at least in Windows
2000; you have much finer control of the DFS replication topology in
Windows Server 2003). When a server attempts to pull its list of DFS replica
partners, it can't get all the objects if there are more than 1,000
connections.

SYSVOL replication does not work in the same way. Rather than storing the
replication topology, FRS stores a reference to the domain controller's NTDS
Settings object. FRS then uses the connection objects listed under the NTDS
Settings object for the DC, which are far fewer than a full mesh would create.
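
As a back-of-the-envelope check on the point above, the sketch below compares how many partner objects a single server would need to read under each model against the 1,000-object default. Everything here is an illustrative assumption rather than a description of how FRS is implemented: the function names are made up, and the hub figure assumes every branch DC replicates with exactly one hub DC and that branches are spread evenly across hubs.

```python
import math

QUERY_LIMIT = 1000  # default maximum number of objects returned, per the thread


def full_mesh_partners(replicas: int) -> int:
    """Partner objects one replica must read when every replica partners with every other."""
    return max(replicas - 1, 0)


def hub_inbound_connections(branches: int, hubs: int) -> int:
    """Connection objects on the busiest hub DC.

    Assumption: each branch DC has one connection to a single hub, and
    branches are divided as evenly as possible between hubs.
    """
    return math.ceil(branches / hubs)


def exceeds_limit(object_count: int, limit: int = QUERY_LIMIT) -> bool:
    """True when a single unpaged query could not return every object."""
    return object_count > limit


if __name__ == "__main__":
    branches = 2000
    print("full mesh:", exceeds_limit(full_mesh_partners(branches)))
    for hubs in (1, 2, 4):
        count = hub_inbound_connections(branches, hubs)
        print(f"{hubs} hub(s): {count} connections, over limit: {exceeds_limit(count)}")
```

Under these assumptions a branch DC never comes near the limit, and a hub DC only does so if a single hub serves more than 1,000 branches.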



Even if the 1,000-object limit were an issue, it can easily be changed using
Ntdsutil. I also know many improvements have been made to FRS in Windows
Server 2003; I'm just not sure whether paging support was one of them. Maybe
someone from Microsoft would be so kind as to find this information for us!

Just let me know if you want to go into more detail. Sounds like an exciting
project!

------------------------------------------------------------------
Mike Aubert
MCSE, MCSD, MCDBA
(e-mail address removed)

 
Thanks for the info and your thoughts. You're right in that the project is
certainly, er, 'challenging' :-)
 