PDC emulator lock up

Ted Wood

We have a W2K PDC emulator that will lock up at random
intervals. The problem will follow the PDCe role to other
machines i.e. if we move the PDCe FSMO role to another DC,
that machine will also lock up. This being the case, we've
discounted a hardware problem on the original machine.

A perfmon trace and a network analysis do not indicate any
particular problem just prior to the lock up; the server just
goes away and has to be powered off and on to clear the problem.
Since the machine locks up, we are not able to do any debugging.

We've got HIS2000/MSDE installed on the machine running the
PDCe role, but that's the only thing that varies from a plain
vanilla DC build.

We've had this problem since May 1st and it may correspond
to the application of some patches at the end of April.

Has anyone seen this particular problem before?
 
Take a perfmon trace again and monitor the LSASS threads for any problems.
Run a "net files" (if you can) when it locks up and determine whether there
are large numbers of SAMR pipes being opened. Also, any 2019s/2020s in
the app logs? Possibly a memory leak?


Chris Malone
 
The server's locked up tight when this happens; the keyboard
and com ports don't respond, so we can't even do a debug.

There's no trace of 2019s or 2020s in the app eventlog.
We've just migrated 10,000 NT4 SP6 workstations over to XP,
so our samr pipe problem has mostly gone away, but I'll try to take
a sample of open files every couple of minutes or so via a
scheduled job to see if the number trends up just before the
lockup.

I'll try the perfmon again and look at the LSASS threads.
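
For the scheduled open-files sample mentioned above, something along these lines might do. This is only a rough sketch in Python, assuming an interpreter is available on the DC (a batch file could do the same job). The function names, the log file name, and the assumption that every open-file row printed by "net files" begins with a numeric ID are mine and should be checked against real output first.

```python
import re
import subprocess
from datetime import datetime

# Assumption: each open-file row printed by "net files" starts with a numeric
# ID, while header, separator and status lines do not.
ROW = re.compile(r"^\s*\d+\s+\S")


def count_open_files(net_files_output):
    """Return (total open entries, entries whose row mentions samr)."""
    rows = [line for line in net_files_output.splitlines() if ROW.match(line)]
    samr = [line for line in rows if "samr" in line.lower()]
    return len(rows), len(samr)


def sample(log_path="openfiles.log"):
    """Take one sample and append a timestamped line to log_path.

    Meant to be called once per scheduled run (every couple of minutes).
    """
    result = subprocess.run(["net", "files"], capture_output=True, text=True)
    total, samr = count_open_files(result.stdout)
    with open(log_path, "a") as log:
        log.write(f"{datetime.now().isoformat()} total={total} samr={samr}\n")
```

Each run adds one line to the log, so the last few entries before a lockup show whether the counts were climbing.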
 
There are a couple of steps I would take to help identify the problems you are
facing. One is to enable debug logging for the netlogon service and review
the log.

http://support.microsoft.com/?id=109626

The second is to enable crash on CTRL+SCROLL LOCK so that you can capture a
dump when the server does go unresponsive.

http://support.microsoft.com/kb/244139/EN-US/

Below is 244139, which describes how to enable crash on CTRL+SCROLL LOCK. Be
sure the machines are set for a full memory dump and have a paging file of
sufficient size on the boot drive.

Windows feature allows a Memory.dmp file to be generated with the keyboard
(244139)

--------------------------------------------------------------------------------


The information in this article applies to:

Microsoft Windows Server 2003, Datacenter Edition
Microsoft Windows Server 2003, Enterprise Edition
Microsoft Windows Server 2003, Standard Edition
Microsoft Windows Server 2003, Web Edition
Microsoft Windows XP Professional
Microsoft Windows 2000 Server
Microsoft Windows 2000 Advanced Server
Microsoft Windows 2000 Professional
Microsoft Windows 2000 Datacenter Server
Microsoft Windows Small Business Server 2003, Premium Edition
Microsoft Windows Small Business Server 2003, Standard Edition

--------------------------------------------------------------------------------

This article used to be called: Q244139

Important This article contains information about modifying the registry.
Before you modify the registry, make sure to back it up and make sure that
you understand how to restore the registry if a problem occurs. For
information about how to back up, restore, and edit the registry, click the
following article number to view the article in the Microsoft Knowledge Base:

256986 Description of the Microsoft Windows Registry

Summary
Windows includes a feature that you can use to cause the system to stop
responding and to generate a Memory.dmp file (if configured to do so). The
"Stop" screen that is generated contains the following parameters:

*** STOP: 0x000000E2 (0x00000000,0x00000000,0x00000000,0x00000000)
The end-user manually generated the crashdump.

More Information

Warning If you use Registry Editor incorrectly, you may cause serious
problems that may require you to reinstall your operating system. Microsoft
cannot guarantee that you can solve problems that result from using Registry
Editor incorrectly. Use Registry Editor at your own risk.

This feature is disabled by default. To enable this feature, you must edit
the registry as indicated below and restart the computer. After you restart
the computer, you can generate a Memory.dmp file by holding down the right
CTRL key and pressing the SCROLL LOCK key twice.

Note Make sure that you use the CTRL key on the right side of the SPACEBAR.

Please note that the steps below will not work on Legacy Free computers--for
example, those that use a USB keyboard. This key combination must be received
and processed by i8042prt.sys (the driver for Standard 101/102-Key keyboard
or the Microsoft Natural PS/2 keyboard) for your computer to stop responding.
Similarly this will not work in a Virtual PC session, as the VM Additions
replaces this driver with Vpc_8042.sys (the driver for the VM Additions PC/AT
Enhanced PS/2 keyboard [101/102-Key]). For those computers, you must attach a
debugger.

1. Start Registry Editor (Regedt32.exe).
2. Locate the following key in the registry:
   HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters
3. On the Edit menu, click Add Value, and then add the following registry
   value:
   Value Name: CrashOnCtrlScroll
   Data Type: REG_DWORD
   Value: 1
4. Quit Registry Editor.

Note You must restart your computer for these changes to take effect.
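
If the same change has to go onto several DCs, the registry edit described above could be scripted instead of done by hand. The sketch below is not part of the article: it is a Python helper (the function and file names are made up) that writes a .reg file containing the key and value listed above, which can then be reviewed and imported with Registry Editor. It assumes the "Windows Registry Editor Version 5.00" file format, which Windows 2000 and later accept.

```python
# Key and value name as given in the article steps above.
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\i8042prt\Parameters"


def crash_on_ctrl_scroll_reg(enabled=True):
    """Return the text of a .reg file that sets CrashOnCtrlScroll to 1 or 0."""
    value = 1 if enabled else 0
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{KEY}]",
        f'"CrashOnCtrlScroll"=dword:{value:08x}',
        "",
    ]
    return "\r\n".join(lines)


def write_reg_file(path="crashonctrlscroll.reg", enabled=True):
    """Write the .reg text as UTF-16 with a byte order mark."""
    # newline="" keeps the explicit \r\n line endings unchanged.
    with open(path, "w", encoding="utf-16", newline="") as reg_file:
        reg_file.write(crash_on_ctrl_scroll_reg(enabled))
```

The restart noted above is still required after the file is imported.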

How to Select Memory Dump Options

There are three types of memory dumps that can be generated. Choose the
appropriate one before manually triggering the dump.

1. Right-click My Computer, and then click Properties.
2. Click the Advanced tab, and then click the Startup and Recovery button.
3. Click Write Debugging Information, and then click to select either
   Complete Memory Dump, Kernel Memory Dump, or Small Memory Dump.

For additional information about memory dump options for Windows 2000, click
the following article number to view the article in the Microsoft Knowledge
Base:

254649 Windows memory dump file options overview

Note If your server has a feature such as the Automatic System Restart (ASR)
feature that is found in some Compaq computers, disable it. It can interrupt
the dump process. On Compaq computers, you can disable ASR by modifying the
basic input/output system (BIOS) settings.

Note Complete memory dumps may not be available on computers with 2 or more
gigabytes (GB) of RAM. Put the /MAXMEM=2000 parameter in the Boot.ini file
to limit the amount of memory that Windows 2000 can access.

The third-party products that this article discusses are manufactured by
companies that are independent of Microsoft. Microsoft makes no warranty,
implied or otherwise, regarding the performance or reliability of these
products.



Best regards,


John Powell
 
I'll give the netlogon debug a try. We've already been through the "crash
on ctrl+scroll" with MS Premier. The machine is TOTALLY unresponsive in
that the keyboard and com ports (used for remote debug) are
non-responsive. And yes, we have done a hardware diagnostic. The problem
follows the PDC emulator role, so we're thinking that a specific hardware
problem with the original machine is not the issue.
 
Ted,

Mirror the switch port that the PDCe is plugged into (or plug it and
another machine into a hub) and take a network sniff from the adjacent
machine when the PDCe locks up. You can leave Network Monitor running
for a good period of time with a large buffer size, so you can capture
the essential 'window' when the problem occurs.

Chris Malone
 
Chris,

We've tried that. The only thing we see is that the machine
just stops communicating on the network.

There's nothing obvious in the netmon capture to suggest
any sort of network initiated problem.

Ted
 