Replication errors

  • Thread starter: joecarter83

joecarter83

One of our three domain controllers (let's call it 'DC3') suffered an
unrecoverable hard disk failure a week ago. Having manually demoted
DC3 from the domain, we rebuilt the server and successfully promoted it
to a DC.

However, we soon noticed some strange behaviour in Active
Directory. DC3 would only talk to DC2, but not to DC1, which is the
PDC emulator, RID master, etc. The Directory Service event log on DC1
was also full of warning messages (1014, 1084, 1130). A restart of DC1
seemed to clear the problem, but it has since come back.

My feeling is that during the manual demotion of DC3, something wasn't
removed properly and now the replication cycle is effectively looking
for something that is not there.

Any help would be much appreciated.

Thanks, Joe.
 
Did you use the same name? (I have generally had very good results doing
that, but others have claimed different outcomes.)

If you used a different name, have you done the NTDSUtil "Metadata cleanup"
of the old DC? (You will have to do so eventually, even if this doesn't fix
your problem.)

Who is the GC? (With a single domain forest, or other small forests, you
should generally make every DC a GC.)

Can all of your DCs pass (no FAIL or WARN) a complete (/c) DCDiag?

What does repadmin say about replication?
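To make those last two checks concrete, here is a rough Python sketch that only builds the command lines to run against each DC. The names DC1, DC2 and DC3 are the placeholders used in this thread, and the helper name is made up for illustration. It assumes the Support Tools versions of dcdiag and repadmin are installed; older repadmin builds spell the switch /showreps rather than /showrepl.

```python
def diagnostic_commands(dc_names):
    """Build (but do not run) the dcdiag and repadmin command lines per DC."""
    commands = []
    for dc in dc_names:
        # dcdiag /s:<server> targets one DC; /c asks for the comprehensive test set.
        commands.append(["dcdiag", f"/s:{dc}", "/c"])
        # repadmin /showrepl <DC> lists that DC's inbound replication partners
        # and the result of the last attempt (older tools: /showreps).
        commands.append(["repadmin", "/showrepl", dc])
    return commands


if __name__ == "__main__":
    # Placeholder DC names taken from the thread.
    for cmd in diagnostic_commands(["DC1", "DC2", "DC3"]):
        print(" ".join(cmd))
```

Run each printed line from a command prompt on a machine with the tools installed and compare the output per DC.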
 
Hi Herb! Thanks for the response.


The DC has the same name as before.

I did do metadata cleanup though I recall it brought up one or two
errors along the way.

DC1 is the only GC server.

DCDiag is not good. All three fail at least something, though DC2 and
DC3 seem to bring up fewer fails. They all report that no GC server
can be found though!
 
Hi Herb! Thanks for the response.


The DC has the same name as before.

I did do metadata cleanup though I recall it brought up one or two
errors along the way.

Those errors might be the genesis or the evidence of your current
problems. I have never received any errors from NTDSUtil
"Metadata cleanup" (once you get to the point where it is actually
performed, it is nearly foolproof).

DC1 is the only GC server.

DCDiag is not good. All three fail at least something, though DC2 and
DC3 seem to bring up fewer fails. They all report that no GC server
can be found though!

Likely DNS errors.

Describe your DNS setup and how the NIC->IP properties on ALL machines
are set for using DNS (just post the UNEDITED text of "ipconfig /all").

All internal DNS clients (and that especially includes the DCs) must use
strictly the internal DNS servers, which are the ones that can resolve
the records for the DCs etc.
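For reference, "the records for the DCs" are mainly SRV records. A minimal sketch, assuming a made-up domain name (example.local) and a made-up internal DNS server address (192.0.2.112), that lists the usual DC locator names and the nslookup query for each; the function names are illustrative only:

```python
def dc_locator_srv_names(domain, forest_root=None):
    """Return the main SRV record names that DC location relies on."""
    forest_root = forest_root or domain  # single-domain forest: same name
    return [
        f"_ldap._tcp.dc._msdcs.{domain}",   # any DC in the domain
        f"_ldap._tcp.pdc._msdcs.{domain}",  # the PDC emulator
        f"_kerberos._tcp.{domain}",         # a KDC for the domain
        f"_gc._tcp.{forest_root}",          # a Global Catalog in the forest
    ]


def nslookup_commands(domain, dns_server):
    """Build nslookup SRV queries aimed at one specific DNS server."""
    return [
        ["nslookup", "-type=SRV", name, dns_server]
        for name in dc_locator_srv_names(domain)
    ]


if __name__ == "__main__":
    # example.local and 192.0.2.112 are placeholders, not values from this thread.
    for cmd in nslookup_commands("example.local", "192.0.2.112"):
        print(" ".join(cmd))
```

If the _gc record does not resolve from the DNS server the DCs actually use, a "no GC server can be found" failure in DCDiag would be the expected symptom.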

Almost all AD problems are really DNS problems, provided basic IP
connectivity works (routable and not filtered by a firewall). A few are
due to time being out of sync. (Many of those come from an incorrect
TIME ZONE, with the clock then set "visually" to look correct while it is
actually an hour or more off.)
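The time zone trap is easier to see with numbers. A small sketch, assuming a machine whose zone is configured as UTC-5 when it should be UTC-6 and whose clock was then set by hand to show the right wall-clock time; the offsets and the date are invented, and the 5-minute figure is the default Kerberos clock skew tolerance:

```python
from datetime import datetime, timedelta, timezone

# Default maximum clock skew that Kerberos tolerates.
MAX_SKEW = timedelta(minutes=5)


def utc_skew(wall_clock, configured_offset_hours, true_offset_hours):
    """How far off the machine's idea of UTC is when the displayed local time
    looks right but the configured time zone is wrong."""
    configured = timezone(timedelta(hours=configured_offset_hours))
    actual = timezone(timedelta(hours=true_offset_hours))
    # What the machine believes UTC is, given its (wrong) zone setting.
    believed_utc = wall_clock.replace(tzinfo=configured).astimezone(timezone.utc)
    # What UTC really is for that wall-clock reading.
    real_utc = wall_clock.replace(tzinfo=actual).astimezone(timezone.utc)
    return abs(believed_utc - real_utc)


if __name__ == "__main__":
    skew = utc_skew(datetime(2024, 1, 15, 12, 0), -5, -6)
    print("UTC skew:", skew, "| exceeds Kerberos tolerance:", skew > MAX_SKEW)
```

The clock on the taskbar looks fine, yet the machine is a full hour away from the other DCs in UTC terms, which is what authentication and replication compare.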
 
I'd rather not post too much information about the domain (and
organisation) on a public forum.


The DCs use the following IPs and DNS servers:


DC1:

IP: xxx.xxx.220.112
DNS: xxx.xxx.220.112


DC2:

IP: xxx.xxx.220.114
DNS: xxx.xxx.220.112


DC3:

IP: xxx.xxx.220.113
DNS: xxx.xxx.220.112


As far as I know, the clients use DC1 and an external DNS (from a
parent organisation) as their DNS, though I don't see why this should
matter.

Furthermore, I noticed that in Sites & Services, under the NTDS
Settings of each server, DC2 and DC3 can check the replication
topology, but DC1 fails the check.
 
I'd rather not post too much information about the domain (and
organisation) on a public forum.

Changing the domain name isn't so bad, but I have no faith in
chopped-up ipconfig output. It usually hides the critical piece
of info that the poster assumes is irrelevant.

The DC's use the following IP's and DNS's:
DC1:
IP: xxx.xxx.220.112
DNS: xxx.xxx.220.112
DC2:
IP: xxx.xxx.220.114
DNS: xxx.xxx.220.112
DC3:
IP: xxx.xxx.220.113
DNS: xxx.xxx.220.112


All using STRICTLY the first DNS server, assuming you didn't cut a
SECOND DNS server out of any or all of those. (My point above
is that I can't see that without the next line of the output.)

Run "NetDIAG /fix" or RESTART the "NetLogon" service on each DC
and see if this helps. (I would also make all of the DCs GCs.)

Eventually you will want to make all three of these guys (AD-integrated)
DNS servers, but that is NOT the current cause of your problem as long
as .112 is alive and well.

As far as I know, the clients use DC1 and an external DNS (from a
parent organisation) as their DNS,

You cannot do this UNLESS the "parent DNS" can also find the local
DNS (.112), and in that case there is probably no reason for clients to
use the parent directly.

Using different DNS server sets that return different results is NEVER RELIABLE.

though I don't see why this should
matter.

It will for that client, probably. And if the DCs do this, then it will
for them TOO. Clients "latch onto" the wrong DNS and fail to resolve the
local domain names they need. This behavior is unpredictable and may even
SEEM to work, so some people wrongly believe it can work.
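One way to audit this is to collect the DNS server list from each machine's "ipconfig /all" and flag anything that points outside the internal set. A minimal sketch; the host names and 10.0.220.x addresses are invented stand-ins (the thread masks the real ones), and the function name is hypothetical:

```python
def find_mixed_dns_clients(client_dns, internal_servers):
    """Return {host: [non-internal DNS servers]} for every machine that lists
    at least one DNS server outside the internal set."""
    internal = set(internal_servers)
    offenders = {}
    for host, servers in client_dns.items():
        outside = [s for s in servers if s not in internal]
        if outside:
            offenders[host] = outside
    return offenders


if __name__ == "__main__":
    # Invented example data: three DCs using only the internal server, and one
    # workstation that also lists a parent-organisation DNS server.
    config = {
        "DC1": ["10.0.220.112"],
        "DC2": ["10.0.220.112"],
        "DC3": ["10.0.220.112"],
        "PC-01": ["10.0.220.112", "198.51.100.53"],
    }
    print(find_mixed_dns_clients(config, ["10.0.220.112"]))
```

Any machine that shows up in the result can end up asking a server that knows nothing about the local domain.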
 
I've run Netdiag, which came back fine. Dcdiag said that the directory
service could not be contacted for each DC.

I noticed that in Replication Monitor, DC2 is replicating successfully
with DC1 and DC3, and DC3 with DC1 and DC2, but DC1 is not replicating
with either DC2 or DC3.

Also, the system state backup for DC1 is failing because of
Active Directory service errors. DC2, however, reports no such errors.

Apologies if this sounds confusing. I'm still learning the ropes,
particularly when it comes to DNS.
 