FRS just stops..


Slimline

I'm using DFS and FRS to replicate a directory between two Server 2003 AD
servers. Everything seems fine but after a number of hours the replication
just stops. Even though I'm running GigE the replication speed is very slow
and I'm replicating 38 GB.

It appears that DNS is just dying on the target server but I can't
figure out why. Rebooting the target server starts the replication process
once again. I get some errors that seem strange:

on the source server:
System: Warning: None of the IP addresses (192.168.2.2) of this domain
controller map to the configured site "Lab". ....
It goes on to state that this may be a temporary problem, but until FRS
stops everything seems fine. I can see and manage both AD servers on the
network and so on.

Application: Error: Windows cannot query for the list of Group Policy
objects...
Application: Error: Windows cannot access the file gpt.ini for GPO
CN={31B2....}
I get a lot of these errors but I have not tweaked the group policies (to my
knowledge).

on the target server:
DNS Server: Warning: The zone mydomain.net was previously loaded from the
directory partition MicrosoftDNS but another copy of the zone has been found
in directory partition DomainDNSZones.mydomain.net. The DNS server will
ignore this new copy of the zone.
Does anyone know what this means? I couldn't find any references to these
errors on the MS sites.

Scott
 
The DNS server warning can most likely be ignored. It indicates that
you have an additional copy of your AD-integrated DNS zone in an application
partition within your AD. It is expected behavior for those DNS
registrations to be there on W2K03 DCs which are hosting the application
partition.

I would suggest running NETDIAG /V (from the support tools) and seeing what
information that reveals. This could be a slight network configuration
problem on the affected DC(s) or an intermittent network connectivity
problem between the DCs. There are other possibilities, but NETDIAG is
the best place to start.
 
I've been able to determine that some of my errors appear to be
occurring when FRS is holding off sysvol and AD replication. Since I'm
replicating so much data (38 GB) and it's taking about 16-20 hours, this
seems to cause problems with replication of sysvol and AD (?). I've noticed
that I get a message in the event viewer when FRS is completed for the
directory. The message states that FRS is no longer preventing AD2 from
being a domain controller.

I've adjusted AD and sysvol replication schedules such that they only occur
during a period of a few hours. Then I've set up my FRS replication schedule
to occur during the longer period when sysvol and AD replication is stopped.
Does this sound like I'm on the right track?

One clue was that I was getting hundreds of messages stating that the admin
user on AD1 had successfully logged on to AD2, and on AD1 I would get
warnings stating that 24 (or some other number) attempts to log in to AD1
have been made from a server that is not part of the domain.
It appears that FRS holds off a lot of other activity and this may be my
problem. Is it a bad idea to use the AD servers as file servers as well?

Scott
 
From a security standpoint, it is generally not recommended to use a domain
controller as a file server.

I had a question though to make sure I understand your configuration. Are
you using the SYSVOL (or any of its subdirectories) for domain users to
access shares?
 
I'm not explicitly using SYSVOL for user shares. I assume you mean that
SYSVOL is my system install volume (C drive)? My shares are all on my N
drive which is a separate controller and drive (RAID set). However, by
default, FRS uses C:\staging... for staging the replication. I tried using
n:\staging for my stage directory but didn't notice anything different as far
as reliability goes - both are extremely slow.

I've also noticed that every 3 minutes or so, I'm getting messages (3
messages, once every 10 seconds or so) in the security tab of event viewer
stating that a login attempt from the opposite AD server was successful.

Scott
 
By default, the staging directory is c:\winnt\sysvol\staging\domain. This
can be changed during promotion, or after the fact.

A 38 GB SYSVOL replica is good-sized, but the big factors here are the
speed of the hardware and the quality of the network connectivity.

The reason I asked whether there are user shares in the SYSVOL is that each
change to a file is logged within the SYSVOL and queued up (with a great
deal of caveats and extra logic there) to be sent to other replicas so that
all machines are up to date. The more changes there are to those files, the
more replication there is, and hence the slower the service may get in
processing additional changes (inbound from other replicas or outbound to
them). Hosting a user file share in your SYSVOL would mean a good number of
changes.

There is an excellent tool called Health_chk.cmd that is included in the
Server 2003 Support Tools. If there is any concern about problems with your
file replication, that is the tool to use to look in detail. Among other
things, it will gather the change logs (inlog and outlog), and a scan of the
file replication service debug logs for errors (errscan.txt).

Also, what is the 38gig of data comprised of?
 
Tim,
Can you clarify a couple of points:

When you say SYSVOL are you referring to the install directory for 2003 (my
C: drive)?
My C: drive is only 20GB. My data drive is 400 GB. The directory that I'm
replicating is on my data drive and that directory consists of 38 GB of
basic user file data - nothing special.

When I set up replication of the domain root target for this directory, one
of the dialog boxes shows the staging directory to be C:\frs-staging on each
of the 2 systems on which there are root targets (AD1 and AD2). I've
increased the size of the staging area in order to avoid the "staging area
is full" error but the systems have not become more reliable.

When copying this 38 GB of data from one machine to another (just
dragging/dropping it around in explorer) I get great throughput (~40MB/s).
My entire network is GigE and very clean and simple. However, replicating 38
GB is taking somewhere in the 36 hour range. It appears that the data is
copied from AD1 data drive to AD1 C drive to AD2 C drive and then to AD2
data drive. I've tried moving the staging area to my data drives (in the FRS
dialogs that appear when it's set up) but performance was still very slow.
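
To put those numbers side by side, here is a rough back-of-the-envelope
calculation in Python. It takes the 38 GB, ~40 MB/s and ~36 hour figures
above at face value and assumes 1 GB = 1024 MB; the function names are made
up for illustration only.

```python
def transfer_minutes(size_gb: float, rate_mb_per_s: float) -> float:
    """Minutes needed to move size_gb at a sustained rate (assumes 1 GB = 1024 MB)."""
    return size_gb * 1024 / rate_mb_per_s / 60


def effective_rate_mb_per_s(size_gb: float, hours: float) -> float:
    """Average MB/s implied by moving size_gb in the given number of hours."""
    return size_gb * 1024 / (hours * 3600)


if __name__ == "__main__":
    # Plain Explorer copy at ~40 MB/s
    print(f"Explorer copy:   ~{transfer_minutes(38, 40):.0f} minutes")
    # FRS replication taking ~36 hours
    print(f"FRS replication: ~{effective_rate_mb_per_s(38, 36):.2f} MB/s")
```

Under those assumptions the straight copy takes roughly 16 minutes, while
36 hours of replication works out to an average of about 0.3 MB/s.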

At this point I'm wiping the systems clean and re-installing 2003 on each
one from scratch. I've added a second GigE NIC to each system so that I can
dedicate a direct link between systems just for FRS replication. Can you
offer any words of advice on setting up the systems this time around to
avoid some of these issues?
Scott
 
The SYSVOL is the specified area on a domain controller that contains the
default domain group policies (and any that have been added) and scripts.
This directory and its subdirectories are file system junction points and
changes made to the files in them are replicated to all other domain
controllers in the domain (if the change is needed). This replication is
done by the File Replication Service, and the replica set information
(essentially, where the other machines are and who talks to whom) is kept
in the Active Directory. The replica set is the group of domain controllers
that replicate the SYSVOL contents amongst each other. File replication is
separate from Active Directory replication.

The default directory for the SYSVOL (meaning policies and scripts) is
%systemroot%\SYSVOL\SYSVOL, where %systemroot% is the volume and directory
where you installed Windows.

Clients connect to the SYSVOL to locate and download group policies and
scripts to apply at user logon and machine boot, using FQDN-based paths like
"\\childdomain.domain.local\sysvol\childdomain.domain.local\Policies\{31B2F340-016D-11D2-945F-00C04FB984F9}\Machine\Microsoft\Windows NT\SecEdit\GptTmpl.inf".
If you're ever curious to watch this happen, you can enable USERENV logging
on a workstation as a user logs on.
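
As a small illustration of how such a path is put together, here is a
minimal Python sketch. The function name is hypothetical, the domain is the
example one from above, and the GUID used is the well-known Default Domain
Policy GUID; substitute your own values.

```python
def gpt_tmpl_path(domain_fqdn: str, gpo_guid: str) -> str:
    """Build the UNC path to a GPO's GptTmpl.inf on the domain SYSVOL share.

    Illustrative helper only; it just joins path components and does not
    check that the domain or the GPO actually exists.
    """
    guid = "{" + gpo_guid.strip("{}") + "}"  # accept the GUID with or without braces
    parts = [
        "", "",                # two empty parts produce the leading \\ of a UNC path
        domain_fqdn, "sysvol", domain_fqdn,
        "Policies", guid,
        "Machine", "Microsoft", "Windows NT", "SecEdit", "GptTmpl.inf",
    ]
    return "\\".join(parts)


if __name__ == "__main__":
    print(gpt_tmpl_path("childdomain.domain.local",
                        "31B2F340-016D-11D2-945F-00C04FB984F9"))
```

Note that the domain name appears twice: once as the server part of the
path and once as the folder under the sysvol share.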

It is also possible to use the File Replication Service to replicate
Distributed File System (DFS) replicas on domain-based DFS hierarchies.
Replicated DFS is a way to have common (and, if set up that way,
site-specific) DFS shares where users can place their data and have it kept
up to date on all replicas, so a user can connect to any of them to access
the same up-to-date data.

It now sounds like that is what you may be doing, and that you are doing the
initial sourcing of the new replica from the source server. The initial
sourcing will simply take a while with a large amount of data. The length of
time you indicated sounds about right, generally speaking. After sourcing
is done and the replicated DFS link replicas are in production, normal daily
changes (unless they are many gigs at a time) should be very smooth.

Let us know if I've misunderstood what you have and are doing, or if you
have other concerns or questions.
 