Daave
I'll start at the end.
Last night I left my PC on. I was downloading the installation file for
Ultimate Boot CD for Windows. The download apparently had completed. And
then at some later time, the PC seemed to freeze. Since I had woken up
in the wee hours of the morning, I went to the PC (at about 4:15 AM) and
I noticed that the screen was frozen: not only could I not move the
cursor with my mouse, but the time display in the system tray was frozen
at 3:40 AM.
I assume something significant happened at 3:40 AM.
Everything *looked* okay. The lights and fans were working. The little
light next to where the LAN cable plugs in showed activity. Nothing was
hot. I noted the temperatures in Abit's uGuru utility, and they were
fine.
I decided to run the Windows Memory Diagnostic (just the Standard
tests). In the first pass, there were failures. I think this is
significant. (The significance will become more apparent when I discuss
the history below.) But I'm not sure if this points to memory errors or
possibly something else (e.g., something wrong with the motherboard).
FWIW, there was nothing significant in the Event Viewer.
I also ran the SeaTools hard drive diagnostic, which was negative.
Just for yucks, I disconnected the printer and external hard drive (both
USB devices). I rebooted and ran WMD again. After nearly 2 1/2 hours and
three passes in Extended mode, there were no errors!
So far, here are my ideas as to what is causing this issue:
1. Since this behavior has *not* happened with both USB devices unplugged
(not yet, anyway), there might be an anomaly associated with one of
them (probably the external hard drive) that is responsible. I somehow
doubt this is the case, but I figured I'd throw it out there, anyway.
2. My wife's personal care attendant was moving stuff around the other
day, and I believe the PC (which is on the floor) was moved slightly
while it was running. Whatever happened as a result might be the cause
as these hardware problems occurred after this incident.
3. BIOS settings are not quite right or there is something wrong with
the CMOS chip or perhaps motherboard battery (which I haven't replaced
just yet). Here are the specs of my PC:
- Abit IX38 Quad GT motherboard
- Intel Core 2 Duo E8400 CPU
- Crucial Ballistix PC8500 DDR2 RAM (2 x 1GB)
- 250 GB Seagate ES.2 SATA2 hard drive (this is the one that has Windows
XP, SP2)
- 500 GB Seagate 7200.11 SATA2 hard drive (for storage)
- Lite-On 20x DL DVD +/- RW drive
- A-922 case
- 600 watt Tiger power supply
- MSI 512MB HD3870 OC PCI-E graphics card
Regarding this last item, when I originally built the system this past
April, the connection going into the card was not 100%. I still haven't
gotten a newer monitor; I am using my old Samsung CRT monitor which
requires an adapter. Once the connection was more solid, all was well
(for over 4 months). Just to be on the safe side, I checked this
connection again. I also checked that the card was seated properly.
Here are some of the BIOS settings:
Frequency: 3230 MHz
CPU Operating Speed: User Define
Multiplier Factor: 9.5
Estimated New CPU clock: 3230 MHz
This last item is in gray and has an X next to it. It seems as if this
is overclocked. If so, I wonder how this happened as I never defined
these settings. Might this be the issue?
Other gray lines with Xs:
Target CPU core voltage: 1.2250 V
DDR2 Voltage: Auto
CPU VTT Voltage: 1.10 V
MCH 1.25V Voltage: 1.25 V
ICH 1.05V Voltage: 1.05 V
ICHIO 1.5 Voltage: 1.50 V
DDR2 Reference Voltage: Default
CPU GTLREF 0 & 2
CPU GTLREF 1 & 3
Nothing seems out of the ordinary here.
Also, CPU fan speed (under Fan Speed Monitoring): 1680-1740 RPM.
In Integrated Peripherals, On Chip SATA device:
SATA mode: IDE (other choices are RAID and AHCI)
Speaking of SATA, when I installed XP, there was the option to press F6
if I needed to install SCSI or RAID drivers. I never did this because I
do not have a RAID setup. And the installation went fine. Should I have
installed SATA drivers from the get-go? Or did I do it correctly?
Also, in the options section of the BIOS is an option to load fail-safe
defaults. I haven't done this yet. Is it useful to try?
Oddly, another time I went into the uGuru utility, the values for the
CPU had changed! They were:
CPU Operating Speed: 3000 (333)
Multiplier Factor: 9
Estimated New CPU clock: 3060 MHz
So, I'm not really sure what's going on here.
Finally, since I mention ACPI below, the BIOS value for ACPI Suspend
Type is S3 (Suspend-To-RAM). The other choice is S1 (Power On - Suspend).
4. There is still a loose connection I have overlooked.
5. There might be something wrong with the motherboard.
6. There might be a problem with the power supply. Although I do have a
multimeter, I haven't used it yet. But that might be one of the next
steps.
7. A power surge. Although I have a power strip that is supposed to
protect against power surges, I do not have a UPS. This might be a
stretch, but I recently received a notice from my apartment complex's
office stating that power will be shut off this coming Tuesday because
they "are installing new electrical service to this building." I'm not
sure if this is relevant or not, but I figured it's worth a mention.
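For anyone who wants to double-check the clock figures under idea 3, here is a minimal Python sketch of the arithmetic. It assumes the usual relationship (CPU clock = multiplier x base clock) and assumes the E8400's stock setting is 9 x 333 MHz = 3000 MHz; the function names are just mine for illustration. It only shows what the displayed numbers imply, not what the board is really running at.

```python
# Sanity check of the BIOS clock figures quoted under idea 3.
# Assumption: CPU clock = multiplier x base (FSB) clock.
# Assumption: E8400 stock setting is 9 x 333 MHz, i.e. about 3000 MHz.

STOCK_CLOCK_MHZ = 3000.0


def implied_base_clock_mhz(cpu_clock_mhz: float, multiplier: float) -> float:
    """Base clock that the displayed CPU clock and multiplier imply."""
    return cpu_clock_mhz / multiplier


def overclock_percent(cpu_clock_mhz: float, stock_clock_mhz: float = STOCK_CLOCK_MHZ) -> float:
    """How far the displayed clock sits above the assumed stock clock."""
    return (cpu_clock_mhz / stock_clock_mhz - 1.0) * 100.0


# (label, displayed CPU clock in MHz, displayed multiplier)
readings = [
    ("first look", 3230.0, 9.5),
    ("second look", 3060.0, 9.0),
]

for label, clock, mult in readings:
    base = implied_base_clock_mhz(clock, mult)
    over = overclock_percent(clock)
    print(f"{label}: base clock {base:.1f} MHz, {over:.1f}% above assumed stock")
```

Both readings work out to the same implied base clock, slightly above the 333 MHz shown in the second set of values, so the two screens differ mainly in the multiplier.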
Now, the rest of the story:
The original shenanigans started last weekend. At first I thought
something somehow got corrupted in Windows (perhaps due to some odd
problems with updating AVG's definitions). There was a frozen screen
similar to what I mentioned above. The reset button on my PC wouldn't
work, so I pressed the power button. I waited 30 seconds and turned on
the PC again. I got this:
Windows could not start because the following file is missing or
corrupt:
\WINDOWS\SYSTEM32\CONFIG\SYSTEM
After Googling this error, I tried the method outlined here:
"How to recover from a corrupted registry that prevents Windows XP from
starting"
http://support.microsoft.com/kb/307545
That seemed to do the trick.
But the same problem came back and I later got a reboot loop. If I
remember correctly, I finally realized I needed to disable the reboot on
failure feature. I was then greeted with this BSOD while trying to boot
into Windows:
STOP: 0x0000007E (0xC0000005, 0xF73C1D7C, 0xF78D2038, 0xF78D1D34)
acpi.sys - Address F73C1D73 base at F73AE000, DateStamp 480252b1
Recommendations were to:
1. Check for adequate disk space (no problem... 80% free)
2. Check drivers
3. Check for a change to the video adapter
4. Check for a BIOS update (mine is the latest)
5. Disable BIOS memory options such as caching or shadowing (I figured
I'd hold off on this one since I wasn't too familiar with it).
At this point, since I have a BART PE disk, I figured I'd boot off of
that. The first attempt started off fine but resulted in the following
BSOD when I was using the A43 File management utility:
*** STOP: 0x00000024 (0x001902FE, 0xF78F6980, 0xF78667C0, 0xF709D1E0)
*** ntfs.sys - Address F709D1E0 base at F706E000, Datestamp 41107eea
Recommendations were to:
1. Check hard drive configuration
2. Check for any updated drivers
3. Run CHKDSK /F to check for hard drive corruption
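In case it helps anyone reading the two driver lines above (acpi.sys and ntfs.sys), here is a small Python sketch that works out how far into each driver the reported address falls and roughly when each driver file was built. It assumes the DateStamp field is a count of seconds since 1 January 1970 UTC; the helper names are hypothetical and only for illustration.

```python
from datetime import datetime, timezone


def fault_offset(address_hex: str, base_hex: str) -> int:
    """Offset of the reported address from the driver's load base."""
    return int(address_hex, 16) - int(base_hex, 16)


def datestamp_utc(stamp_hex: str) -> datetime:
    """Interpret a DateStamp as seconds since 1970-01-01 UTC (assumption)."""
    return datetime.fromtimestamp(int(stamp_hex, 16), tz=timezone.utc)


# (driver, Address, base, DateStamp) exactly as shown on the two blue screens
drivers = [
    ("acpi.sys", "F73C1D73", "F73AE000", "480252b1"),
    ("ntfs.sys", "F709D1E0", "F706E000", "41107eea"),
]

for name, address, base, stamp in drivers:
    offset = fault_offset(address, base)
    built = datestamp_utc(stamp)
    print(f"{name}: offset 0x{offset:X} from base, built {built:%Y-%m-%d} (UTC)")
```

The offsets are mostly useful for comparing against other crashes: the same offset every time would point at one code path, while scattered offsets across different drivers look more like random corruption.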
I later attempted a second boot into BART PE. Then I was greeted with:
"BIOS in this system is not fully ACPI compliant ... turn off ACPI mode
during text mode setup."
There was also this stop error:
0x000000A5 (0x00000011, 0x00000000, 0x00000000)
It was at this point I concluded the problem is not in Windows (not the
original problem at any rate) and is not from the hard drive (by the
way, later in the week, I ran SeaTools and the drive seemed to be
healthy).
By the way, I Googled the above message and learned that pressing F7 at
the moment the F6 option is offered is the way to "silently disable
ACPI."
I then attempted to use chkdsk in both the Recovery Console and BART.
FWIW, there was an orphaned file:
avgsched.log in index $I30 of file 24276
I noted this just on the off-chance that this is relevant. (Recall I
seemed to be having problems with AVG earlier.)
Anyway, once more, I couldn't boot into Windows. Rather than repeat the
method from the Microsoft page mentioned above, I decided to run a repair
install because I wasn't 100% sure this was not a software issue.
The problems seemed to finally go away. Oddly, since I hadn't reverted to
IE6 before the repair install (I had been running IE7), the repair
install of course changed IE back to IE6. Interestingly, the desktop
shortcut icon had only four choices:
1. Create shortcut (yes, this was the first choice, and double-clicking
the icon created another shortcut icon on the desktop!)
2. Delete
3. Rename
4. Properties
The above is minor in the overall scheme of things, but I figured I'd
include it.
Well, I'm sure you guessed it. The BSODs returned. Here's one:
Error Signature
BCCode: 1000008e BCP1:C0000005
BCP2:8063327C BCP3:AA7D58B0
BCP4:00000000 OSver:5_1_2600
SP:2_0 Product:768_1
There was also another one with a dump:
IRQL_NOT_LESS_OR_EQUAL
STOP: 0x0000000A (0x0000000F, 0x00000002, 0x00000000, 0x80505809)
Beginning dump of physical memory.
Dumping physical memory to disk :42
C:\DOCUME~1\DAVETH~1\LOCALS~1\Temp\WERaecl.dir00\Mini090908-02.dmp
sysdata.xml
At this point, whenever I would try to boot into Windows (normal or Safe
Mode), all I would get was a black screen and moveable cursor. (Actually
in Safe Mode, it wasn't *totally* black; there was the usual indication
of Safe Mode in the corners.)
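To keep the various stop codes straight, here is a short Python lookup of the ones quoted so far. The symbolic names are the standard Windows bug check names as I understand them; the table and the `describe` helper are only an illustrative sketch, not anything taken from a Microsoft tool.

```python
# Stop codes seen in this post, mapped to their symbolic bug check names.
BUGCHECK_NAMES = {
    0x0000000A: "IRQL_NOT_LESS_OR_EQUAL",
    0x00000024: "NTFS_FILE_SYSTEM",
    0x0000007E: "SYSTEM_THREAD_EXCEPTION_NOT_HANDLED",
    0x000000A5: "ACPI_BIOS_ERROR",
    0x1000008E: "KERNEL_MODE_EXCEPTION_NOT_HANDLED_M",
}

# Exception code that appears as the first parameter of two of the crashes.
STATUS_ACCESS_VIOLATION = 0xC0000005


def describe(code: int, first_param: int) -> str:
    """Return a one-line description of a stop code and its first parameter."""
    name = BUGCHECK_NAMES.get(code, "UNKNOWN")
    line = f"0x{code:08X} {name}"
    if first_param == STATUS_ACCESS_VIOLATION:
        line += " (access violation)"
    return line


# (stop code, first parameter) in the order the crashes occurred
crashes = [
    (0x0000007E, 0xC0000005),
    (0x00000024, 0x001902FE),
    (0x000000A5, 0x00000011),
    (0x1000008E, 0xC0000005),
    (0x0000000A, 0x0000000F),
]

for code, first_param in crashes:
    print(describe(code, first_param))
```

Five different stop codes touching ACPI, NTFS, and the kernel itself would fit with a hardware cause, although on its own the list does not prove one.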
I decided to run Memtest86+ because I had a feeling this was all due to
memory problems. It ran for over seven hours (22 passes!) and no errors
were found. Then I ran the Windows Memory Diagnostic. Although no errors
(in over 40 passes) were found initially when I ran it in standard mode,
thousands were later found in Extended mode (after nearly three passes).
Sure enough, these errors were associated with the 1GB stick in slot 1.
It should be noted that the final tally was 7,871 errors for RAM stick
#1. Unfortunately, there were four errors associated with RAM stick #2.
Zero would have been much more comforting! Anyway, I removed (what I
assumed was) the faulty module and placed the other 1GB stick (which was
in slot 3) into slot 1.
Removing the "bad" RAM seemed to fix the problem. "Seemed" is the
operative word.
This takes us back to yesterday, when I figured I'd restore an image of
my hard drive I had made a few weeks ago back when my system was stable.
This wouldn't be a problem, since my data backups were current. I used
Acronis True Image. I have the ATI plugin on my BART PE disk, which
worked quite nicely (only 15 minutes to restore the image). This was
done last night. It seemed successful. However, after I went to bed, the
problem mentioned above occurred at 3:40 AM (the freeze).
I apologize for such a long post. I just wanted to make sure I didn't
leave out any useful clues.
If you have gotten this far, dear reader, I thank you from the bottom of
my heart!