Intermittent shutdown

Mike Easter

How do I troubleshoot an intermittent shutdown problem?

The computer shuts down abruptly, usually very infrequently, occasionally
frequently.

A year ago the problem was rarely observed, but at that time it was almost
always when an operating system initiated a restart. Instead of a
restart, the computer shut down and could not be turned back on with the
case front power switch without turning off the rear switch for a few
seconds. Then the problem would not be observed again for many weeks or
months.

More recently the problem is seen under a variety of conditions, sometimes
during the POST, sometimes at the time of the OS startup sound. Today it
happened about 8 times in a row at various stages between POST and OS
startup. Currently it has been running without interruption for hours.

Won't all of the multimeter readings be expected to be normal during its
normal operation? Is it foolish to just 'arbitrarily' replace the PS?
 
Mike Easter said:
How do I troubleshoot an intermittent shutdown problem?

The computer shuts down abruptly, usually very infrequently, occasionally
frequently.

A year ago the problem was rarely observed, but at that time it was almost
always when an operating system initiated a restart. Instead of a
restart, the computer shut down and could not be turned back on with the
case front power switch without turning off the rear switch for a few
seconds. Then the problem would not be observed again for many weeks or
months.

More recently the problem is seen under a variety of conditions, sometimes
during the POST, sometimes at the time of the OS startup sound. Today it
happened about 8 times in a row at various stages between POST and OS
startup. Currently it has been running without interruption for hours.

Won't all of the multimeter readings be expected to be normal during its
normal operation?

Yes, a multimeter will tell you nothing other than the voltage readings AT
THAT SECOND, which is not when the problem is.
Is it foolish to just 'arbitrarily' replace the PS?

Not when you have such obvious symptoms of a power supply failure. -Dave
 
I spent months on the same issue. Swapped out all parts and eventually got
down to mobo or CPU. Removed the fan/CPU from the mobo. AMD chip, loaded with
silver thermal paste that had hardened rock solid where it squished off the
chip interface. Scraped it off with a screwdriver, ouch. Sandpapered the CPU
to get the pink stuff off the interface. Ouch again. Re-assembled with new
thermal paste. Nada. Almost discarded it, but thought I'd try one last thing.
Pulled the CPU, sprayed all 7,000 CPU pins with contact cleaner spray, and
sprayed all the little holes in the CPU housing, too. Re-assembled. It's been
working fine ever since and hasn't shut down yet. Been running about 2 months
perfectly, and it used to shut down every day. Sometimes repeatedly.

Al
 
Dave said:
"Mike Easter"
Not when you have such obvious symptoms of a power supply failure.

I don't understand enough about the ATX PS circuitry to understand how
many other 'misfiring' conditions that are not the PS itself which could
cause the same kind of condition of the 'off' condition of the PS in which
the case front switch does not function to restart the computer.

That is, if some other hypothetical situation were to cause a shutdown,
say cpu overheating or whatever; would that (type of) shutdown be unable
to be restarted (using the front case powerswitch) without using the rear
poweroff switch in an off cycle lasting some number of seconds, and not
just instantaneously cycled?

I guess put another way, what kinds of shutdowns cause this state of a PS?
The state of not being able to be turned on with the front case switch?
Could such a condition be caused by something in the mobo?

In that state, when the computer has shut down, and the front case
powerswitch won't turn it back on, what kinds of multimeter readings might
be useful?
 
Mike Easter said:
How do I troubleshoot an intermittent shutdown problem?

A voltage regulator nuked that problem here.
The computer shuts down abruptly, usually very infrequently,
occasionally frequently.

A year ago the problem was rarely observed, but at that time it
was almost always when an operating system initiated a restart.
Instead of a restart, the computer shut down and could not be
turned back on with the case front power switch without turning
off the rear switch for a few seconds. Then the problem would not
be observed again for many weeks or months.

More recently the problem is seen under a variety of conditions,
sometimes during the POST, sometimes at the time of the OS startup
sound. Today it happened about 8 times in a row at various stages
between POST and OS startup. Currently it has been running
without interruption for hours.

Won't all of the multimeter readings be expected to be normal
during its normal operation? Is it foolish to just 'arbitrarily'
replace the PS?

Using a multimeter to troubleshoot a PC is not required or even
recommended by any responsible technical help person. A user should
definitely not be poking around inside of his (or her) computer when
it is plugged in and turned on. You risk damage doing so, and you
might never know how the damage happened. Replacing the power supply
is a recommended course of action. That is what spare parts are for.
Another possibility is bad house current... An inexpensive voltage
regulator can help troubleshoot and solve that problem.
 
Mike Easter said:
In that state, when the computer has shut down, and the front case
powerswitch won't turn it back on, what kinds of multimeter
readings might be useful?

Try using your head instead of using a multimeter. I have never run
across a situation where a multimeter would be useful when
troubleshooting a personal computer problem. Maybe if you have a lot
of time on your hands, no money, for some strange reason you have no
spare parts, and you are willing to take the risk of creating
problems you do not already have. Most spare parts around here are
difficult to sell; I cannot imagine why anyone would have trouble
maintaining at least one set of spare PC parts.

People who recommend using a multimeter are probably unskilled
techies who are trying to look skilled. It is inefficient and likely
to cause more problems than it solves.
 
John said:
"Mike Easter"

Try using your head instead of using a multimeter.

I could use my head better if I could understand the condition/state that
causes a PS to not be able to be turned on with the front case
powerswitch -- which condition/state can be restored to normalcy by using
the rear powerswitch (or disconnecting the AC plug) -- which powerdown
condition must last several seconds, not an instant.

What PS (or other) condition becomes 'reset' by such a power source
interruption of sufficient seconds duration?

I'm not trying to incite a war with multimeter fans such as westom, I'm
just trying to understand how come I'm observing the condition of "the
front powerswitch won't work just now unless you disconnect the mains for
a few seconds first" -- which condition immediately follows the shutdown
I'm trying to diagnose.
 
Mike said:
I could use my head better if I could understand the condition/state that
causes a PS to not be able to be turned on with the front case
powerswitch -- which condition/state can be restored to normalcy by using
the rear powerswitch (or disconnecting the AC plug) -- which powerdown
condition must last several seconds, not an instant.

What PS (or other) condition becomes 'reset' by such a power source
interruption of sufficient seconds duration?

I'm not trying to incite a war with multimeter fans such as westom, I'm
just trying to understand how come I'm observing the condition of "the
front powerswitch won't work just now unless you disconnect the mains for
a few seconds first" -- which condition immediately follows the shutdown
I'm trying to diagnose.

There is a power supply spec here. Check out how many times the word
"latch" or "latch-off" occurs in the spec. Latch-off is used in
situations where there might be some danger if the supply continued
to run. But fault detection circuits are never perfect, and they
can trip when a supply becomes "weak" or compromised on output.
An example might be a heatsink which no longer cools a hot device
inside the supply; that makes it easier for the supply to overheat
and latch off.

http://www.silverstonetek.com/downloads/v2st65zfspecs.pdf

Even the motherboard itself can have latch-off protection for
the Vcore regulator. That means pressing the reset button on
the front of the case may not cause a motherboard to reboot
successfully. But once power is removed from the motherboard
entirely, the motherboard may recover.
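
To make the latch-off behaviour concrete, here is a toy Python model. It is not taken from the spec linked above: the class name, the method names, and the `reset_hold_s` value of 3 seconds are all hypothetical. It only illustrates the pattern being described, where the front switch is ignored while a protection latch is set, and the latch clears only after AC has been removed for long enough.

```python
from dataclasses import dataclass


@dataclass
class LatchingSupply:
    """Toy model of a supply whose protection latch clears only on AC removal."""

    reset_hold_s: float = 3.0  # hypothetical: seconds AC must stay off to clear the latch
    ac_present: bool = True
    latched: bool = False
    running: bool = False

    def press_front_switch(self) -> bool:
        """Front switch requests power-on; a latched supply ignores the request."""
        if self.ac_present and not self.latched:
            self.running = True
        return self.running

    def trip_protection(self) -> None:
        """A protection circuit fires: outputs drop and the latch is set."""
        self.running = False
        self.latched = True

    def cycle_rear_switch(self, off_seconds: float) -> None:
        """Remove AC for off_seconds, then restore it."""
        self.running = False
        if off_seconds >= self.reset_hold_s:
            self.latched = False
        self.ac_present = True


psu = LatchingSupply()
psu.press_front_switch()         # normal start
psu.trip_protection()            # abrupt shutdown
print(psu.press_front_switch())  # False: front switch alone does nothing
psu.cycle_rear_switch(0.2)       # too brief, latch survives
print(psu.press_front_switch())  # False
psu.cycle_rear_switch(5.0)       # long enough, latch clears
print(psu.press_front_switch())  # True
```

The same shape would apply to a latched Vcore regulator on the motherboard; only the thing holding the latch differs.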

Swapping the power supply is a good place to start. The thing
is, you could disconnect the supply and try and test it separately.
You'd need to build a load box to apply some kind of load to it.
And then you could verify whether it still responds to PS_ON# or not,
without cycling the main power. You could spend $50 to $100 on parts
to make a power supply test load. (It really depends on whether you
have a surplus electronics store nearby, with old load resistors you
can buy for cheap.) The same money could buy you a replacement supply
to use as a swap test.
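
For sizing load resistors for such a box, Ohm's law gives the numbers. The sketch below is only an illustration: the function name and the test currents per rail are hypothetical, so pick currents that suit the supply under test. In practice you would choose resistors rated comfortably above the computed wattage, since they get hot.

```python
def load_resistor(rail_volts: float, load_amps: float) -> tuple[float, float]:
    """Return (ohms, watts) for a resistor drawing load_amps from a rail."""
    ohms = rail_volts / load_amps   # R = V / I
    watts = rail_volts * load_amps  # P = V * I
    return ohms, watts


# Hypothetical test currents per rail: (nominal volts, desired amps).
loads = {"+3.3V": (3.3, 2.0), "+5V": (5.0, 2.0), "+12V": (12.0, 3.0)}

for name, (volts, amps) in loads.items():
    ohms, watts = load_resistor(volts, amps)
    print(f"{name}: {ohms:.2f} ohm, dissipating {watts:.1f} W")
```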

Paul
 
Ian said:
"Mike Easter
It sounds like the PSU overcurrent sensing is tripping. Once it
trips, the only way to reset it is by cutting the AC supply to the
PSU, by unplugging it, or using the rear AC switch. The overcurrent
tripping could be due to a PSU, motherboard, internal peripheral
(drive), or a wiring/connector fault.

I'm now reading that "Most computer power supplies have short circuit
protection, overpower (overload) protection, overvoltage protection,
undervoltage protection, overcurrent protection, and over temperature
protection."

... along with the potential for the logic or circuitry to 'think' such
protection should be employed in the form of a shutdown which also
requires reset.

I might have an overabundance of protection here.
 
Paul said:
Mike Easter wrote:
But fault detection circuits are never perfect, and they
can trip when a supply becomes "weak" or compromised on output.

I can grok that.
Even the motherboard itself can have latch-off protection for
the Vcore regulator. That means pressing the reset button on
the front of the case may not cause a motherboard to reboot
successfully. But once power is removed from the motherboard
entirely, the motherboard may recover.

Hmm. There's that mobo issue again.
Swapping the power supply is a good place to start.

I understand that concept, and I have a spare PS which I bought on sale a
while back just for such eventuality. But the PS even on sale is 'worth'
more than the mobo is currently. I can't see part swapping here when the
mobo is one of the parts that might need to/ should/ be swapped.
The thing
is, you could disconnect the supply, and try and test it separately.

The thing I don't like about troubleshooting the supply is that the
problem is so intermittent and runs perfectly for hours and days and weeks
and months at a time without any malfunction.
The same money could buy you a replacement supply,
to use as a swap test.

On top of everything else, the whole computer isn't worth the trouble to
troubleshoot it. As long as I'm replacing pieces and parts, I might as
well replace the mobo, cpu, ram, and hdd, because the mobo/cpu are way low
end; and if the mobo gets replaced even with a cheap but more modern one,
then it needs modern ram and a modern hdd.

OTOH, it is currently (as I speak/write) running just fine, so maybe I
shouldn't replace anything until it completely dies. It isn't a primary
or 'solo' computer. It is running 'beside' another computer on a KVM
hookup so that I can flip from one OS to another, and it gets to have
experimental OSes, such as linux distros and some MS 'derivatives' that I
won't speak about :-)
 
Mike Easter said:
I could use my head better if I could understand the
condition/state that causes a PS to not be able to be turned on
with the front case powerswitch -- which condition/state can be
restored to normalcy by using the rear powerswitch (or
disconnecting the AC plug) -- which powerdown condition must last
several seconds, not an instant.

Even if using a multimeter were okay... If you think you are going
to find an intermittent failure with a multimeter, you are kidding
yourself.
What PS (or other) condition becomes 'reset' by such a power
source interruption of sufficient seconds duration?

A faulty one. If you suspect something is wrong with the power
supply, remove it from your system and have someone who knows what
they are doing troubleshoot it.
I'm not trying to incite a war with multimeter fans such as
westom,

Who cares?
I'm just trying to understand how come I'm observing the condition
of "the front powerswitch won't work just now unless you
disconnect the mains for a few seconds first" -- which condition
immediately follows the shutdown I'm trying to diagnose.

If you want to experiment with electronics, do it with something
other than a personal computer.

Whatever you might be able to determine with a multimeter, you can
just as easily determine by the operation of your computer. If it
does not work the way you want it to work, replace the part. You
cannot repair components inside of a personal computer, so what is
the point?

People who promote the multimeter troubleshooting garbage are
probably not quite grown-up old people who resent the advent of
"blackbox troubleshooting". Here on USENET, since they are not
responsible for what they say, they can promote obsolete, useless,
and hazardous troubleshooting methods.
 
John said:
"Mike Easter"

A faulty one.

A faulty condition, a faulty PS, a faulty mobo or some other kind of
faulty? It seems to me to be a faulty condition undiagnosed. Except when
it isn't a condition at all, which is almost all the time.
If you suspect something is wrong with the power
supply, remove it from your system and have someone who knows what
they are doing troubleshoot it.

Personally I would rather replace the mobo than the PS, but it seems that
replacing one part and then another should be done based on something
besides feelings and whether or not you have a spare one lying around.

It happens that I have a spare PS (and there's another one on sale today
for $20), but I don't happen to have a spare mobo. And I emphasize again
that the system in question is still/currently running just fine, and
might do so for as many more months consecutively as it has in the past.

Or not.
 
I'm not trying to incite a war with multimeter fans such as westom, I'm
just trying to understand how come I'm observing the condition of "the
front powerswitch won't work just now unless you disconnect the mains for
a few seconds first" -- which condition immediately follows the shutdown
I'm trying to diagnose.

Your original question. Power supply voltages can be completely
wrong and a system will still boot and operate. A slight variation of
the already defective voltage means intermittent power failures. Only
a multimeter reports on a majority of inputs to the power supply
controller; in minutes and without doubt. No other suggestion posted
can do that. A computer powered by a defective supply: only way to
identify that defect or a defect in some other part of the power
supply ‘system’ is a multimeter (or the $thousands in equipment that
we use).

Again, inputs include the power supply switch, a power supply
handshake, the essential power for that power controller, CPU, and
some enabled BIOS functions. Those are your inputs that define an
answer to your question.
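
As a rough illustration of turning meter numbers into a pass/fail list, here is a small Python sketch. It assumes the commonly published ATX tolerances (plus or minus 5% on the positive rails and +5VSB, plus or minus 10% on -12V); check the spec for your own supply. The function name and the sample readings are hypothetical, not measurements from this thread.

```python
# Nominal volts and allowed fractional deviation per rail (assumed ATX values).
ATX_RAILS = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
    "+5VSB": (5.0, 0.05),
}


def out_of_tolerance(readings: dict[str, float]) -> list[str]:
    """Return the rails whose measured voltage falls outside tolerance."""
    bad = []
    for rail, measured in readings.items():
        nominal, tol = ATX_RAILS[rail]
        if abs(measured - nominal) > abs(nominal) * tol:
            bad.append(rail)
    return bad


# Hypothetical readings taken with the system running under load.
sample = {"+3.3V": 3.31, "+5V": 4.70, "+12V": 11.9, "+5VSB": 5.02}
print(out_of_tolerance(sample))  # ['+5V']
```

A reading taken while the fault is absent can still pass this check, so the numbers matter most when captured under load or in the failed state.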

Myths include cleaning pins. If a clean pin creates a cure, the
problem still remains. Pin cleaning only cures symptoms.

Is overcurrent tripping? That also would be immediately obvious in
30 seconds from meter numbers. Just another in a long list of facts
that 30 seconds and a meter will report.

That latch-off or protective lockout function (which symptoms
suggest is being tripped) would also be identified from meter
numbers. That function is tripped by the same above defined
controller inputs.

A faulty or intermittent connection - not just identified by the
meter. But the meter can quickly identify which wire connection is
intermittent.

Not possible to use your head without facts and knowledge. But many
have a problem. They have long declared themselves 'expert'. The
meter shows how little they really knew and how much extraneous work
was unnecessary labor.

Notice the many who did not even know of a power supply
controller. Many who never learned basic electric concepts just
'know' the meter is not informative because they cannot use a meter
and never learned the information provided in those numbers. Either
get numbers and immediately have a useful answer. Or you just keep
speculating.

For example, how many others defined where power on is controlled?
Paul discussed the latch-off function that others did not even know.
How many listed the significant inputs to that controller? Notice how
many 'it could be this' speculations are all answered with the meter
in 30 seconds. Meter is why a useful reply can be obtained quickly.
The only reason I post:- when the popular answer is also the junk
science one and does not provide the OP with useful answers. Get the
meter. Get numbers. Learn how each suspect gets identified or
eliminated logically, definitively (the first time), quickly, and
without any war - in minutes; not hours.

You have a list of significant inputs. Your answer begins with
those suspects - which is why the meter is used only by those who
actually know this stuff and by others who want to learn.
 
Mike Easter said:
I don't understand enough about the ATX PS circuitry to understand how
many other 'misfiring' conditions that are not the PS itself which could
cause the same kind of condition of the 'off' condition of the PS in which
the case front switch does not function to restart the computer.

That is, if some other hypothetical situation were to cause a shutdown,
say cpu overheating or whatever; would that (type of) shutdown be unable
to be restarted (using the front case powerswitch) without using the rear
poweroff switch in an off cycle lasting some number of seconds, and not
just instantaneously cycled?

I guess put another way, what kinds of shutdowns cause this state of a PS?

Some protection circuit (under-voltage, over-voltage, etc.) has been tripped
inside the power supply itself. That is why you have to physically throw
the power switch ON the power supply. Sometimes, that will reset the
protection circuit, depending on the design of the protection
circuit. -Dave
 
Mike Easter said:
I'm now reading that "Most computer power supplies have short circuit
protection, overpower (overload) protection, overvoltage protection,
undervoltage protection, overcurrent protection, and over temperature
protection."

H'mmm... dunno about "most". Good ones, costing north of 50 quid, will
have most or all of the above features.

Yer average PSU off the mom'n'pop shop shelf (I don't count the
appalling cheapy PSUs thrown in with cases)..

s/c protection (on outputs), yes;
overload protection (on outputs) yes;
overvoltage protection (on outputs) yes;
undervoltage protection (on outputs) maybe;
overcurrent protection (on outputs) yes;
overtemperature protection no. Unless you count pyrotechnics as
protection.
 