Freeze-ups. Is power supply the only remaining possibility?

  • Thread starter: Not Here
In alt.comp.hardware.pc-homebuilt Not Here said:
So how can I be sure I have the right amount? I totally cleaned the CPU
and heatsink. The heatsink is not very smooth; it has some "pits" or
"grooves". I put a very thin layer on it, smoothed all off except what
filled the depressions, then a thin (maybe .5 mm) even layer on the
CPU. When I took it apart today there wasn't any excess that had been
squeezed out when I clamped down the heatsink.
Time to replace the heatsink/fan combo.
About $20 US in most computer stores. Likely you can get a far better
all-copper replacement for the same price.

A. There should be *no* pits or grooves in it at all.
B. A very thin layer of Arctic Silver *should* be enough.
However, if you put on a little too much, no problem.
C. Even so, a tiny amount *should* squeeze out around the edges.
If not, something is wrong. Either:
1. The heatsink isn't mounted properly
2. You really didn't get enough on.
3. It isn't tight to the CPU.
4. Etc.
 
Well, I resolved the pit and groove issue. There was a thermal pad on
the heat sink. I'd never seen one before. So I removed that and have
smooth surfaces thoroughly cleaned with 99% isopropanol. I tried a
thin layer fully covering just the CPU, a thin layer on both CPU and
heatsink, a really thin layer just on CPU and finally a rice grain dab
on centre of CPU only. When I've removed the heat sink there is
evidence of full contact.

Every time the CPU is at 69-72 C immediately on boot to Windows. In
BIOS setup it sits about 40 C.

Graham
 
In alt.comp.hardware.pc-homebuilt Not Here said:
Well, I resolved the pit and groove issue. There was a thermal pad on
the heat sink. I'd never seen one before. So I removed that and have
smooth surfaces thoroughly cleaned with 99% isopropanol. I tried a
thin layer fully covering just the CPU, a thin layer on both CPU and
heatsink, a really thin layer just on CPU and finally a rice grain dab
on centre of CPU only. When I've removed the heat sink there is
evidence of full contact.

Every time the CPU is at 69-72 C immediately on boot to Windows. In
BIOS setup it sits about 40 C.
OK ... So now, once you are in Windows and seeing the 70-degree value,
try a crash-reboot: hit the reset switch and go immediately to the
BIOS temperature check.

If the BIOS still says around 40C-50C, then your problem is with the
Windows software measuring the temp. If it then still measures high or
close to the same high 60C-70C temperature, then it's the
CPU/heatsink/fan/join problem.

If it's the Windows software reporting a high value, then go to the
motherboard manufacturer's website and get a replacement for the
software. (That might be a good idea anyway.) If they both measure the
same high temp or nearly-so, then replace the heatsink/fan combo.

Divide and conquer.
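
A rough sketch of that check in Python, for anyone who wants the logic spelled out. The function name and the 60C cut-off are illustrative assumptions of mine, not values from any vendor tool; the BIOS reading is assumed to be taken right after the reset.

```python
# Sketch of the divide-and-conquer check described above.
# HIGH_C is an assumed cut-off, not a figure from any vendor tool.
HIGH_C = 60.0


def diagnose(windows_temp_c: float, bios_temp_c: float) -> str:
    """Classify the fault from a Windows reading and the BIOS reading
    taken immediately after hitting reset."""
    if bios_temp_c >= HIGH_C:
        # BIOS agrees the CPU is hot: suspect heatsink, fan, or compound.
        return "cooling"
    if windows_temp_c >= HIGH_C:
        # Only the Windows tool reads high: suspect the monitoring side.
        return "reporting"
    return "ok"


if __name__ == "__main__":
    print(diagnose(70, 40))  # Windows high, BIOS normal
    print(diagnose(70, 68))  # both high
```

With 70C in Windows and 40C in the BIOS this lands on "reporting"; with both near 70C it lands on "cooling".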

I was under the impression till now, that you were using the same
program to measure temperature all the time; and were watching it drift
up.
 
I'll try that, but last night I did have Windows running for hours,
with intensive activity with no problem and the same software reading
32 C.

When I'm showing 70 C I get frequent freeze-ups.

Graham
 
In alt.comp.hardware.pc-homebuilt Not Here said:
I'll try that, but last night I did have Windows running for hours,
with intensive activity with no problem and the same software reading
32 C.

When I'm showing 70 C I get frequent freeze-ups.
Well then, that pretty much clears the software.
I'm afraid it's either the heatsink/fan/compound problem or the CPU
itself. I *really* doubt it's the CPU; though that's possible. A crack
or other defect *could* cause such problems.

So ... Time for you to go down and buy a new fan/heatsink combo.
There are too many possibilities there to make it worth going on
easter-egging the problem. Replace both, clean the CPU off of all
present goo, put new Arctic Silver on, make certain the heatsink is
properly clipped into place, and I'd bet 10 to 1 that will fix your
problem. If not ... Well, I'd probably have to physically examine your
motherboard, CPU, and heatsink to figure out the problem.

Still, three things you *might* look for:
A. Is the heatsink *clean*? (No dust in the fins.)
B. Are any case fans, especially exhaust fans near the CPU, running?
A front-panel fan should blow *into* the case.
A back-panel fan (near the CPU) should blow *out* of the case.
C. Is the fan on the Power-supply running; and is it blowing air
*out* of the case, not in?
(I once had a *horrible* problem with my kid's computer where the
PSU was blowing hot air right onto the CPU.)

But I'd *still* replace the heatsink/fan combo.
 
Not said:
I'll try that, but last night I did have Windows running for hours,
with intensive activity with no problem and the same software reading
32 C.

When I'm showing 70 C I get frequent freeze-ups.

Graham

So are you saying, it is 32C when sitting idle in Windows, and
then zooms up to 70C if you use a program that loads the CPU ?
Like Prime95, Orthos, CPUburn or the like ?

What kind of CPU cooler are you using ? The retail one that
comes with the CPU ? A third party one ?

You've already mentioned that the thermal paste looks like it
is being spread by the applied pressure of the heatsink to the
processor. Is the heatsink secure on the socket ? Do the clamps
or fasteners hold it securely in place ? Based on the progress
in the thread so far, it sounds like you've thoroughly
examined the mounting issues. In which case, I'd look
for a different cooler, if nothing else is working out.

If both the idle and loaded temps were always high, that
might be software. But if the idle reads 32C, and the loaded
shoots to 70C, there are two observations to be made. One would
be, that the thing was capable of reading a low temp, and the
32C shows you that. And stopping at 70C is also significant,
because that is the temperature where an Intel processor would
start to throttle itself (drop internal clocking rate). So all
the symptoms are consistent with a correct temp reading, but
a heatsink with poor overall performance. Like the fan isn't
spinning :-) Or maybe only one corner of the heatsink is
touching the CPU.

To observe throttling, you can try RMClock. Looks like there
is a new version (I was using 225).

http://cpu.rightmark.org/download.shtml
http://cpu.rightmark.org/download/rmclock_230_bin_upd1.exe

There is a screenshot here, of RMClock and the monitor tab.
If the "throttle" (purple) is lower than the "clock" (red), then
the processor is probably throttling, trying to keep the temps
below 70C. I'm not really crazy about their graphing skills,
but this is one of the few programs that might detect
throttling when it happens.

http://www.notebookforums.com/attachment.php?attachmentid=8955&stc=1&d=1143767059

Paul
 
So are you saying, it is 32C when sitting idle in Windows, and
then zooms up to 70C if you use a program that loads the CPU ?
Like Prime95, Orthos, CPUburn or the like ?

No.
I only recently started monitoring temperatures. I left the computer
running Orthos overnight the night before last. In the morning the
system was frozen. I installed Speedfan and it was reading 70 C. At
the time I didn't know that was high, but when I educated myself I
pulled the HSF and found the blackened remains of a thermal pad. That
is, there was hardened black stuff on the silver colored pad. In my
ignorance I greased the pad and CPU, booted to Windows with a steady
temp of 30-32. That was last night. Booting this morning and ever
since I get 30-40 in BIOS and 70 in Windows. It doesn't gradually
climb there, it is the first reading I get.
What kind of CPU cooler are you using ? The retail one that
comes with the CPU ? A third party one ?
Standard Intel branded. The fan consistently runs about 2800 rpm.

The throttling makes sense.

Graham
 
Not said:
No.
I only recently started monitoring temperatures. I left the computer
running Orthos overnight the night before last. In the morning the
system was frozen. I installed Speedfan and it was reading 70 C. At
the time I didn't know that was high, but when I educated myself I
pulled the HSF and found the blackened remains of a thermal pad. That
is, there was hardened black stuff on the silver colored pad. In my
ignorance I greased the pad and CPU, booted to Windows with a steady
temp of 30-32. That was last night. Booting this morning and ever
since I get 30-40 in BIOS and 70 in Windows. It doesn't gradually
climb there, it is the first reading I get.

Standard Intel branded. The fan consistently runs about 2800 rpm.


The throttling makes sense.

Graham

OK. First the good news. Intel used to use a black material for a
thermal interface. It looks like lamp black or carbon. That is
not ordinary paste that has gone bad. What you are looking at,
is the original thermal interface solution from Intel.

(I had trouble finding anything even close. It is like this one,
only the gasket would be missing, the pad would have larger
dimensions, and be flat black in color and texture.)

http://www.pcper.com/images/reviews/232/heatsink_bottom.jpg

You are supposed to remove all the material, not just the black
carbon-like material. You want the base of the heatsink to be
available to you. If the heatsink is pure aluminum, you want to
see aluminum before you apply the new paste. If the heatsink
is aluminum with a copper plug (circle of copper) inserted in the
aluminum, then you want to be seeing bare copper before applying
the paste.

The idea is, the solid metals are the best conductors of heat. The
Intel foil material, once deformed or scratched, is not nearly
as good.

Lucky me, I still have my original S478 processor retail heatsink,
still in the plastic carrier :-) I never used it, because I bought a
third party heatsink instead. When I look at it, there is a pad
which is 1.75" by 1.5". It consists of the black material, on
top of a foil carrier. Now, if I was going to use this cooler,
I would remove the foil plus the black material. I think that would
leave me with solid aluminum under that. Then I'd apply the thermal
paste, install on the processor, and test it.

Some pastes take a couple days to settle and give their best
performance. But even so, there should not be a close to 40C swing
in temperature. That means you still don't have good contact or
a good thermal path for cooling to take place.

There is only one kind of paste I wouldn't use. Radio Shack used
to sell a zinc paste in silicone oil, and the paste is white in color.
The carrier runs out from where you put it, and I was never happy with
that stuff (used to use it on audio power transistors).

Typical paste products would use suspended Boron Nitride particles
as the main material. Arctic Silver Ceramique might be an example.
Arctic Silver AS5, adds a bit of silver to that, for a theoretically
better heat transfer. But many pastes are within a few degrees C of
one another in performance - the most important thing, is that the
paste not be "pumped out" in short order, as then you'd have to
reapply the paste.

Thanks to throttling, your processor was never in danger. And if
the heatsink falls off a modern Intel processor, the THERMTRIP
signal is supposed to turn off the computer.

HTH,
Paul
 
So I just got the system up again, Speedfan showing 69 C, did a reset,
entered BIOS setup and got 34 gradually rising to 39. Hard to believe
it could cool that much in less than a minute.

If I'm not overheating then it's back to what else is causing my
freezes.

This all started when I got this machine from a friend who didn't want
to fix it (now I know why). At first it would run for long periods
from CD but would freeze doing anything demanding, like installing XP
or booting to Windows. With Paul's advice and the visual clue of 2
obviously blown capacitors next to the CPU, I replaced the caps and
everything was fine for a few days. Then the freezes started again and
maybe I got off on a tangent with this (bogus?) overheating thing.

My naive hope is that maybe the third cap beside the 2 I replaced is
also going but not yet blown. Should I replace that as a wild attempt?

Graham
 
Addendum to my post just above

There is definitely something wrong with the temp reporting here.
I just rebooted after sitting in BIOS setup for 10 minutes at about
40C and got readings in Windows of 16??? to 22 C, motherboard at 30-32.
Now the room I'm in is about 18.

Graham
 
In alt.comp.hardware.pc-homebuilt Not Here said:
There is definitely something wrong with the temp reporting here.
I just rebooted after sitting in BIOS setup for 10 minutes at about
40C and got readings in Windows of 16??? to 22 C, motherboard at 30-32.
Now the room I'm in is about 18.

Did you try getting more up-to-date software from the motherboard
maker's website yet?

The measurement of about 40C and remaining constant there, sounds right.
The other ....
 
I haven't been able to find specific Asrock M266A software, but
Motherboard Monitor 5.3.7.0 has a choice of that board and its
readings match what I'm getting from Speedfan.

By the way, when the readings dropped to 16-22C the computer slowed to
a crawl, I mean !really! slow. I rebooted to see higher reported temps
and performance was normal.

I have also inserted a digital thermometer probe between the heat sink
fins. MBM reports 68C and the probe shows 70F.

Graham
 
Not said:
I haven't been able to find specific Asrock M266A software, but
Motherboard Monitor 5.3.7.0 has a choice of that board and its
readings match what I'm getting from Speedfan.

By the way, when the readings dropped to 16-22C the computer slowed to
a crawl, I mean !really! slow. I rebooted to see higher reported temps
and performance was normal.

I have also inserted a digital thermometer probe between the heat sink
fins. MBM reports 68C and the probe shows 70F.

Graham

With regard to the capacitor question, it would be proper to replace all
the caps that are in the same cluster. They all work in parallel with one
another, like this.

--------+-----+-----+-----+-----+
        |     |     |     |     |
       ---   ---   ---   ---   ---
       ---   ---   ---   ---   ---
        |     |     |     |     |
--------+-----+-----+-----+-----+

If two were bad, then the other three carried the ripple current during
that time. You should see correlated failure, as when one fails, the others
have to work harder. Any AC ripple voltage across the cap, should result
in a ripple current flowing through the capacitor. Electrolytics have an
ampere rating for that ripple current. The reason there are so many caps in
a cluster, is to share that current and stay below the max. So the designer
of the circuit, felt that five in my example, would be enough to safely
share the current.

When finding a couple damaged in the above circuit, I would replace all five.
Because in a week or two, all five would be failed anyway.
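
To put rough numbers on the sharing argument, here is a small Python sketch. The 5 A total ripple and the 1.5 A per-capacitor rating are made-up figures for illustration only; the sketch also assumes identical caps that split the current evenly, and that a failed cap carries none of it.

```python
# Illustrative only: TOTAL_RIPPLE_A and RATED_RIPPLE_A are hypothetical
# values, not taken from any datasheet or from this motherboard.
TOTAL_RIPPLE_A = 5.0   # assumed total ripple current into the cluster
RATED_RIPPLE_A = 1.5   # assumed ripple rating of one capacitor


def ripple_per_cap(total_ripple_a: float, healthy_caps: int) -> float:
    """Ripple current each working cap carries, assuming an even split."""
    if healthy_caps <= 0:
        raise ValueError("need at least one working capacitor")
    return total_ripple_a / healthy_caps


if __name__ == "__main__":
    for healthy in (5, 4, 3):
        per_cap = ripple_per_cap(TOTAL_RIPPLE_A, healthy)
        status = "over rating" if per_cap > RATED_RIPPLE_A else "within rating"
        print(f"{healthy} healthy caps: {per_cap:.2f} A each ({status})")
```

With those assumed figures, five or four working caps stay within the rating, but three are pushed past it, which is the correlated failure described above.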

For a sample switching regulator datasheet, try one like this:

http://www.intersil.com/data/fn/fn4567.pdf

Page 12 has a complete circuit. "Cin" in the upper left, consists of five
1000uF capacitors. "Cout" in the lower right, consists of nine 1000uF capacitors.
Treat the state of those two sets separately. If some of the five were failed,
replace all of them. If some of the nine were failed, replace all
of them. Failure can either be due to capacitor construction (premature
failure) or due to the actual stress of the circuit. If the caps are
failing just for the hell of it, then maybe you'd consider doing everything
in sight (recap whole board - $50 on the net).

I guess the message is, just replacing two of them is probably not the
best solution.

Also, read that datasheet. While it doesn't answer all your questions, it
should at least show that there is more to it, than just selecting enough
"total capacitance".

Paul
 
I replaced the third cap in the cluster. On first reboot the system
froze after about 2 minutes. Second boot was OK for 2 hours or so
before I shut it down. Temp still reading 69-70C.

I then tried booting to BIOS with the fan unplugged and my temperature
probe in the fins of the heatsink. Over the 2 minute test the BIOS
reported temp gradually rising from 36 to 44C. My heatsink probe rose
from 76F to 92. After shutdown the probe rose to 97F. This seems to be
a crude indication that the heat sink is working.
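
Since the probe reads Fahrenheit and the BIOS reads Celsius, a quick conversion helps when comparing the two. A minimal Python sketch (the function name is just illustrative):

```python
def f_to_c(temp_f: float) -> float:
    """Convert a Fahrenheit reading to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


if __name__ == "__main__":
    # Probe readings from the fan-unplugged test above.
    readings_f = (
        ("probe at start", 76),
        ("probe after 2 minutes", 92),
        ("probe after shutdown", 97),
    )
    for label, temp_f in readings_f:
        print(f"{label}: {temp_f} F = {f_to_c(temp_f):.1f} C")
```

So the fins went from roughly 24 C to 33 C while the BIOS reported 36 to 44 C, and peaked at about 36 C after shutdown.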

Can anyone think of another way to confirm that I have a cooling
problem?

Graham
 
PS

I installed RightMark CPU Clock Utility and see no signs of throttling
yet, that is Core Clock and throttle are both at 2395.

Graham
 
In alt.comp.hardware.pc-homebuilt Not Here said:
Can anyone think of another way to confirm that I have a cooling
problem?

Yeah: Grab a can of "circuit cooler" and spray the thing; CPU, heatsink,
and all.

However, I *still* think you should just go ahead and replace the
fan/heatsink combo, just for General Porpoises. If not, then just for
the Halibut.

I know that sounds a bit fishy; but I still say try it.
The amount of time saved alone ....
 
I then tried booting to BIOS with the fan unplugged and my temperature
probe in the fins of the heatsink. Over the 2 minute test the BIOS
reported temp gradually rising from 36 to 44C. My heatsink probe rose
from 76F to 92. After shutdown the probe rose to 97F. This seems to be
a crude indication that the heat sink is working.

Can anyone think of another way to confirm that I have a cooling
problem?

Step back a minute. I read a thermometer. Its digital display
reports a temperature too high. So I replaced my eyes? That is what
was happening when replacing temperature reporting software. Software
does not measure temperature. Anyone with hardware knowledge would
have known that. Software only reads a register (a display) and
reports that number on your screen. If software says 70 degrees C,
then that is what the hardware is reporting - assuming software is for
the temperature measuring hardware.

A heatsink properly machined is applied directly to the CPU for
perfectly good cooling. Nothing between them. However not all
heatsinks are properly machined (and that does not mean perfectly flat
either). Therefore we use thermal compound to fix poor machining. Use
so little compound that it does not spread out to the edges and covers
only the center half of the CPU, to fill in gaps where the heatsink does
not directly touch the CPU. (Almost all heat gets transferred to the
heatsink from the CPU center.) A better heatsink means heatsink compound
does even less. BTW, compounds from any responsible heatsink
manufacturer are just as good as Arctic Silver - at much less money.
Don't fall for hype.

Heatsinks are rated in 'degree C per watt'. A lower number means a
better heatsink. If your CPU is 70 watts and if the heatsink is rated
at 0.2 degree C per watt, then temperature difference between CPU and
air must be 14 degrees C.
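
That arithmetic as a short Python sketch. The 70 watt and 0.2 degree C per watt figures are the ones used above; the 21 degree C air temperature is an assumed example value (air inside a case is usually warmer than the room), and the function names are illustrative.

```python
def temp_rise_c(power_w: float, theta_c_per_w: float) -> float:
    """Temperature difference between CPU and air for a given heatsink rating."""
    return power_w * theta_c_per_w


def cpu_temp_c(air_temp_c: float, power_w: float, theta_c_per_w: float) -> float:
    """Estimated CPU temperature: air temperature plus the rise across the heatsink."""
    return air_temp_c + temp_rise_c(power_w, theta_c_per_w)


if __name__ == "__main__":
    rise = temp_rise_c(70.0, 0.2)
    print(f"rise above air: {rise:.1f} C")
    # 21 C air is an assumed example value, roughly a 70 F room.
    print(f"estimated CPU temp: {cpu_temp_c(21.0, 70.0, 0.2):.1f} C")
```

On those figures the estimate is 14 C above air, or about 35 C with 21 C air, nowhere near 70 C.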

Only after doing these calculations do we confirm numbers with
testing.

Meanwhile, it's an Intel processor. Heat does not cause crashing.
Heat only results in a slower computer. Computer crashes? Then you
are ignoring important facts.

What can cause temperature readings to be in error and computer to
crash? Temperature reading error is completely hardware dependent -
not software dependent. What is common to both symptoms? Voltages.
So what are those voltages especially when CPU is accessing all
peripherals simultaneously (multitasking) - when PSU is under maximum
load? Not whether voltages are OK. What are the numbers?

When room temperature increases from a normal 70 degree F to an also
normal 100 degree F, then CPU should still be perfectly happy and
unthrottled at below 70 degree C. There is no reason for a processor
to be at 70 degree C in a 70 degree F room even with no Arctic
Silver. Your external measurements confirm the CPU appears to be at
a perfectly happy temperature. So what causes hardware to report a
wrong number to software? What are other symptoms? IOW first collect
all facts before fixing something. It currently sounds like you are
curing a ghost of the common cold.
 
The voltages I'm seeing are quite consistent:
VCore 1.60
+3.28
+4.84
+11.73
-11.79
-4.91

During all operations including high stress (running Orthos, playing
an .avi movie, etc., simultaneously) I've never seen any throttling in
Rightmark CPU Clock Utility.

However, I just tried running with the heatsink fan unplugged and the
reported temperature was the usual 71C on boot into Windows, climbing
to 76 and only when I tried playing a movie did I see throttling down
to 58%. Even assuming the temperature readings are correct, this
doesn't seem to explain the freezes.

I also swapped out the RAM to eliminate that possibility. I was running
with only 128 MB, with 32 reserved for video, but it ran OK. I did have
2 spontaneous reboots, and after 50 minutes of fairly heavy use (Orthos
running 100% CPU and other apps open) I had a freeze.

Any other suggestions as to where to look?

Graham
 
The voltages I'm seeing are quite consistent:
VCore 1.60
+3.28
+4.84
+11.73
-11.79
-4.91

Before moving on, this is why one should always post those
numbers. Numbers must exceed 3.23, 4.87, and 11.7. Were these
numbers when everything was being accessed simultaneously? Were you
downloading from the net, while doing a virus search of the hard
drive, while playing a movie so that video card was doing complex
graphics, sound card was working constantly, and CD-Rom was reading
data, while typing on the keyboard and reading a floppy disk? If yes,
your numbers are suspiciously marginal. If no, then you have a power
supply problem.
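
As a minimal sketch, the comparison can be done mechanically. The
minimums are the figures quoted in this post (the poster's numbers, not
copied from a spec sheet), the readings are the ones reported above, and
mapping them to the +3.3, +5, and +12 rails is an assumption. The
function name is hypothetical.

```python
# Minimums quoted in this post; each reading "must exceed" its minimum.
MINIMUM_VOLTS = {"+3.3": 3.23, "+5": 4.87, "+12": 11.7}

# Readings reported earlier in the thread (rail labels are assumed).
reported = {"+3.3": 3.28, "+5": 4.84, "+12": 11.73}


def rails_not_exceeding_minimum(readings, minimums):
    """Return the rails whose reading does not exceed the quoted minimum."""
    return {rail: volts for rail, volts in readings.items()
            if volts <= minimums[rail]}


low = rails_not_exceeding_minimum(reported, MINIMUM_VOLTS)
for rail, volts in reported.items():
    status = "LOW" if rail in low else "ok"
    print(f"{rail}: {volts:.2f} V (minimum {MINIMUM_VOLTS[rail]:.2f} V) {status}")
```

With the reported numbers only the +5 reading falls short, and the other
two clear their minimums by 0.05 V or less.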

Appreciate what the spec numbers are, then appreciate what is output
from a weak or failing power supply, and then appreciate how a
multimeter works. Whereas +5 volts must always exceed 4.5 volts; a
power supply with excessive ripple can oscillate between 4.45 and 4.85
volts. Then your meter would report 4.85 volts. Add some load (all
those listed devices) and the voltage drops even lower. Your numbers,
if on a lightly loaded power supply, suggest a defective power supply.

Again, a power supply that boots a computer just fine can still be
100% defective. Therefore when a power supply is replaced, those
numbers are taken again to confirm the new supply is also working
properly. Those who never learned these electronic concepts would see
a computer boot; then assume the defective supply is good.

If there is excessive ripple voltage, then your temperature numbers may also
vary significantly. When the computer gets warmer, that marginal
condition created by a 100% defective power supply could result in
intermittent crashes. IOW that power supply is like the foundation of
a house. Everything inside the house does unacceptable and weird
things if the foundation is failing. Look at your computer's
foundation. Those numbers are marginal or defective.

Heat does not cause crashing as too many others would assume. Heat
is a diagnostic tool used to locate defects. Does your machine work
just fine in a 100 degree F room? If not, then locate hardware that
is 100% defective. Heat is a tool to find defective hardware; an
especially powerful tool to locate intermittent failures.

Meanwhile, I don't see voltages from the purple wire, green wire, or
gray wire (power supply to motherboard). You did use a multimeter?
If using the BIOS, you still need a multimeter to calibrate those BIOS
numbers. It is called a motherboard monitor. Its function is to
detect changes. Until you calibrate each voltage, then its accuracy
is questionable. How to get those important numbers was posted
previously in "When your computer dies without warning....." starting
6 Feb 2007 in the newsgroup alt.windows-xp at:
http://tinyurl.com/yvf9vh

In your case, significant numbers are voltages on any one of red,
orange, yellow, and purple wires when computer is drawing maximum load
- when multitasking to all peripherals simultaneously.
 

I do not currently have access to a multimeter, but I did briefly, and
noted these values under very light load.

Or-3.32 Or 3.32 Red 4.94 Red 4.94 Gray 4.67 Prpl 4.97 Yellow 12.5
Or -12 Blu -12 Grn .03 White -5.24 Red 4.94 Red 4.94

Those 2 lines represent the 2 sides of the connector, ignoring the
blacks which were all 0.

Graham
 