Mobo vs PS problem


Mike Easter

I'm trying to conceive of the mechanism behind this hardware problem.
Even if I decide not to repair it, I would like to have a better
understanding of what is going on. I've brought this problem up here
before in a different form.

Here is the hardware: an ECS 741GX-M2 mobo, which is AMD socket A, with
an AMD Geode NX-1750. The board has everything integrated (video, sound,
& ethernet) via the SiS chipsets 741GX north & 964L south. It has a 250W
PS. Currently only one of the two RAM slots is populated, with 1G of
DDR RAM.

Here is the hardware problem, which occurs infrequently and under very
limited software circumstances, but is the same with various operating
systems such as XP or Mint or Mandriva and various browsers such as
Chrome or Firefox. While reading a webpage which extends below the
bottom of the window, I typically drag the vertical slider with the
mouse rather than paging; that is, I slide the window slider so as to
expose the lower part of the page which goes beyond the window's edge.

Occasionally, about once a week or so, while performing that operation,
the machine abruptly shuts down. Instantly.

During this shutdown, the following conditions can be observed. The
front power light is off. The two yellow and green ethernet lights at
the rear remain lighted. Most significantly, the front powerswitch does
not work to power back up. The relationship between the mobo and the PS
is 'dead'.

In order to restore functionality, it is necessary to turn off the PS's
rear switch. Whatever gets reset by doing that doesn't reset
'instantaneously'. It requires a few seconds, say 5-8. The other thing
which takes a few seconds is for those little lights beside the ethernet
port to go off. Once those seconds have elapsed and the ethernet
lights are off, I can turn the rear PS switch back on; then the front
panel power switch works again, the machine powers up, and everything
is operational for a considerable time, maybe more than a week. Maybe
only a few days.

I routinely view webpages in the manner described, and almost 100% of
the time this viewing method works. But occasionally I get the shutdown
described. Presently I get the shutdown under no other software
conditions that I have found.

However, if I put 2 1G ram sticks into the slots, I have a bigtime
problem with this type of shutdown under all kinds of other conditions.

Since replacing the PS is a popular troubleshooting method for some
power-related mysteries, I'm sure that replacing the PS would be
informative. The result might be that a new PS causes everything
to work properly with 1 or 2 1G sticks of RAM, and the whole problem is
relatively undiagnosed but solved. The result might also be that with a
new PS the (mobo) failure occurs just as before, in which case the next
experiment would be to replace the mobo.

However, rather than replace anything for information, I would like to
have a better understanding of how moving the window slider changes the
demands on the power. I would also like to have a better understanding
of what it means that the power light on the front panel goes off but
the ethernet lights stay on.

I would also like to have a better understanding of what kind of
relationship between the mobo and the PS can cause the two of them to
need to be reset because of some kind of power fault protection, where
the reset requires shutting off the rear switch for some seconds. And
how come it takes some seconds for the ethernet lights to go off when
the rear switch is shut off?
 
The motherboard and power supply have protection features.

1) Processor THERMTRIP. Modern processors have an overtemperature detector,
which can be used by motherboard logic. Your S462 motherboard would need
an 8 pin DIP connected to the processor thermal diode to give the
equivalent overtemperature protection. (That is what my old Nforce2
motherboard uses.) Earlier S462 motherboards relied on BIOS-mediated
control of overtemperature detection, which wasn't always reliable.
Newer designs are "all-in-hardware", to ensure they trigger properly.

2) Vcore regulator overcurrent detection. The regulator may "latch off" if
it detects a problem. The Power Good signal from the circuit is one
signal the regulator sends to the rest of the motherboard logic, indicating
how happy it is.

3) PSU overvoltage/overcurrent protection (internal to the PSU). This
may cause latch-off, and require cycling the power in order to
clear the supervisor's latched-off state.

Circuits powered by +5VSB can be used to "remember" the fault, and
prevent instantly repeating it. Requiring the user to turn off the
power supply, to get the computer to work again, means a human
will be present when power is applied again.
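
A toy model may make the latch-off idea concrete. This is only a sketch
of the behavior described above, not of any real supervisor chip: the
class and method names are made up for illustration, and it ignores the
few seconds the standby rail needs to drain after the rear switch is
turned off.

```python
class LatchedSupply:
    """Hypothetical model of a supply whose protection latches off."""

    def __init__(self):
        self.standby = True    # +5VSB present (rear switch on)
        self.latched = False   # fault remembered by standby-powered logic
        self.running = False   # main rails up

    def press_front_button(self):
        # The front button only requests power; a latched fault blocks it.
        if self.standby and not self.latched:
            self.running = True
        return self.running

    def fault(self):
        # Protection trips: main rails drop, and the fault is remembered.
        # Standby power is untouched, so the fault memory survives.
        self.running = False
        self.latched = True

    def rear_switch(self, on):
        # Removing standby power is the only thing that clears the latch.
        self.standby = on
        if not on:
            self.running = False
            self.latched = False
```

In this model, after fault() the front button does nothing until
rear_switch(False) followed by rear_switch(True). Standby power stays up
throughout, which would be consistent with lights that remain lit on a
machine that otherwise looks dead.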

As to how some of these features would be wired up, it is up to the
motherboard designer to decide what to do with them. The PS_ON# logic
for example, is already pretty extensive, including connections to
the Southbridge and Super I/O chip. Additional jelly-bean logic may
be used to incorporate safety features into the chain of logic, for
things that the major chips haven't taken into account.

If you want an interesting test to try, use the Prime95 Torture Test from
mersenne.org/freesoft. In your particular case, you'd want to run
"small FFTs", so that the Torture Test runs within CPU cache and
makes the CPU get as hot as possible. Can the processor and Vcore
run the Torture Test for at least ten minutes? Does anything overheat?
Are your symptoms reproduced? Did the computer shut off?

If the computer shut off, run the test again, and use Speedfan to
monitor temperatures as best you can. Do you see any temperatures
spiking as soon as Prime95 starts the Torture Test?
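
If you write the readings down, a few lines of script can pick out a
spike for you. This is a minimal sketch that assumes the readings are
available as (seconds, degrees C) pairs; that layout, the function name
and the 2 degrees per second threshold are all assumptions for
illustration, not Speedfan's actual log format or a recommended limit.

```python
def find_spikes(samples, max_rise_per_s=2.0):
    """Return the (time, temp) samples reached by an unusually fast rise.

    samples: list of (seconds, degrees_c) pairs in time order
             (a hypothetical format, see the note above).
    """
    spikes = []
    for (t0, c0), (t1, c1) in zip(samples, samples[1:]):
        # Rate of rise between two consecutive readings.
        if t1 > t0 and (c1 - c0) / (t1 - t0) > max_rise_per_s:
            spikes.append((t1, c1))
    return spikes


readings = [(0, 40.0), (1, 41.0), (2, 50.0), (3, 51.0)]
print(find_spikes(readings))
```

With the sample readings above, only the jump from 41 to 50 degrees
between the 1 s and 2 s readings is flagged.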

Looking at a picture of your motherboard, it looks like Vcore may be
powered from +5V. Some regulators have UVLO (under voltage lock out),
and if the main power supply connector isn't making good connections
(low impedance), that can be enough to trigger protection on Vcore.
If the main power connector was loose, the pins can burn and contact
surfaces degrade. Pull the main connector and examine it for
damage.

If each motherboard documented how all this stuff worked, and what
symptoms to expect, it would then be possible to make a fault tree
for your consideration. But as it is, with no documentation, we
can only speculate what the side effect would be from some of these
things. They add safety features, but don't go out of their way
to tell you what they are, or how they're tied into PS_ON#.

Paul
 
Mike said:
mobo ECS 741GX-M2 ... socket A
integrated everything, video, sound, & ethernet via the SiS chipsets
741GX north & 964L south. It has a 250W PS.
While reading a webpage which extends below beyond
the screen's window,
Occasionally, about once a week or so, while performing that operation,
the machine abruptly shuts down. Instantly.

During this shutdown, the following conditions can be observed. The
front power light is off. The two yellow and green ethernet lights at
the rear remain lighted. Most significantly, the front powerswitch does
not work to power back up. The relationship between the mobo and the PS
is 'dead'.

In order to restore functionality, it is necessary to turn off the PS's
rear switch. During this type of shutdown, whatever gets reset doesn't
happen 'instantaneously'. It requires a few seconds, say 5-8. The
other thing which requires a few seconds is those little lights beside
the ethernet port. Once those seconds have elapsed and the ethernet
lights are off, I can turn the rear PS switch back on, then the front
panel power switch works again and the machine powers up and everything
is operational for a considerable time, maybe more than a week.

Brevity, Mike, brevity. :)

The fact that your computer suddenly turns off occasionally AND you
have to flip its rear power switch to restore operation indicates the
power supply is at fault and its overcurrent/overpower protection
circuitry has triggered. Turning off the PSU resets that circuitry.
And if your PSU is the original one for your Socket A motherboard,
it's probably at least 5 years old and has some worn-out electrolytic
capacitors inside it (www.BadCaps.net has tons of information about
this). Worn caps can really reduce the power capacity of a PSU.
 
larry said:
Brevity, Mike, brevity. :)

I'll try. :-)
If you want an interesting test to try, use Prime95 Torture Test from
mersenne.org/freesoft . In your particular case, you'd want to run
"small FFTs", so that the Torture Test runs within CPU cache, and
makes the CPU get as hot as possible. Can the processor and Vcore
run the Torture Test for at least ten minutes ? Does anything overheat ?
Are your symptoms reproduced ? Did the computer shut off ?

No. No. and No.

I don't see how these explanations help me understand...

-1- why this symptom is only reproducible with one specific type of
scrolling in a browser window regardless of OS or browser
-2- why part of the mobo still has power to light the ethernet lights
after general failure protection shutdown
 
Oops. My brevity answered the stress test questions ambiguously.
If you want an interesting test to try, use Prime95 Torture Test from
mersenne.org/freesoft . In your particular case, you'd want to run
"small FFTs", so that the Torture Test runs within CPU cache, and
makes the CPU get as hot as possible. Can the processor and Vcore
run the Torture Test for at least ten minutes ?

Yes. test completed.
Does anything overheat ?
No.

Are your symptoms reproduced ?
No.

Did the computer shut off ?

No.

Nothing even sped up. In the past, but not recently (perhaps not since
I added more RAM), I had observed 'something' that sounded like a fan
speeding up or becoming louder under certain conditions, say while
booting up from a Linux live CD. It would become louder for a
little while, some tens of seconds but not a minute, before resolving.
However, this transient increased loudness was not associated with any
kind of shutdown or problem at all.
 
Mike said:
I'm trying to conceive of the mechanism behind this hardware problem
Here is the hardware: mobo ECS 741GX-M2 which is a AMD socket A using a
AMD Geode NX-1750 which board has integrated everything, video, sound, &
ethernet via the SiS chipsets 741GX north & 964L south.

What are the chances of some kind of 'mismatch' (involving something
that is triggered between the CPU and the supporting chipsets) between
the mobo and the CPU? Even tho' the CPU came with the mobo, the socket
A mobo is described as supporting Athlon XP, Sempron, or Athlon, but it
doesn't actually say anything about the Geode, NX or otherwise.

I'm reading this about the geode nx

Features:

* 7th generation core (based on Mobile Athlon XP-M).
* Power management: AMD PowerNow!, ACPI 1.0b and ACPI 2.0.
* 128 KB L1 cache.
* 256 KB L2 cache with hardware data prefetch
* 133 MHz Front Side Bus (FSB)
* 3DNow!, MMX and SSE instruction sets
* 0.13 µm (130 nm) fabrication process
* Pin compatibility between all NX family processors.
* OS support: Linux, Windows CE, MS Windows XP.
* Compatible with Socket A motherboards

And then I'm reading in the wiki about the athlon xpm

http://en.wikipedia.org/wiki/Athlon#Mobile_Athlon_XP

Also, I'm confused about some BIOS settings regarding AGP. There is no
AGP card installed, but from what I read in hardware reports from
lshw or SIW, it seems that the SiS chipset is trying to tell me that it
is acting like AGP for the VGA.

If I am supposed to make some intelligent choices about how the AGP
aperture size is configured, then maybe I would also need to make an
intelligent choice about other AGP-related issues.
 
Mike said:
I'll try. :-)


No. No. and No.

I don't see how these explanations help me understand...

-1- why this symptom is only reproducible with one specific type of
scrolling in a browser window regardless of OS or browser
-2- why part of the mobo still has power to light the ethernet lights
after general failure protection shutdown

In terms of things to look at, if the motherboard had an LED that
monitored +5VSB, I might watch that during a scroll-related failure.
+5VSB should never blink or glitch. Asus motherboards generally all
have an LED fitted to +5VSB, which is a convenient warning that the
motherboard still has power. Other brands aren't likely to provide that.
(My Asrock doesn't have that LED, which is an annoyance.) Some oscillating
power supply faults end up causing the Asus green LED to blink one or
two times a second. Connecting a multimeter to +5VSB isn't likely
to catch transient behavior as well as the LED can. Naturally,
if the transient is in the nanosecond region, you can't see it. But
power supplies tend to misbehave in the millisecond region.

If the +5VSB disappears for even a fraction of a second, that can
deprive the chip driving PS_ON# of power, and PS_ON# goes to the
off state. So you may find that, say, shorting out +5VSB causes
the supply to switch off.

But your latch-off symptoms are significant. As Larry points out,
a power supply can latch off if there is a failure. So you could
try another power supply and repeat your test. The motherboard
Vcore also has latched fault capability, but the motherboard designer
would have to make an extra effort to cause that to require
turning off all the power. So the simplest explanation is the
power supply did it. It could be a problem with the transient
response of the supply - some unique combination of current
waveforms between the supply and the motherboard, when the
scrolling thing happens.

If the computer was just switching off, that would be much harder
to isolate, since a crafty person could write code to trigger
turning off the computer. But in that case, pressing the power button
on the front of the computer would cause it to come on again. It is
the fact that your machine requires switching off power on the
back which is significant. The power supply can latch up easily.
The motherboard designer can do it with a bit more effort. If I
didn't know which to swap out first, I'd have to try the power
supply. Even if there is no good explanation as to why it should
happen.
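
The reasoning above boils down to a very small decision table. The
sketch below is just that reasoning written out, with made-up names; it
is not a real diagnostic procedure, and the ordering reflects "simplest
explanation first" rather than any documented behavior of this board.

```python
def first_suspect(shuts_off, front_button_revives, needs_rear_switch):
    """Return the first thing to swap or examine, per the reasoning above.

    All three arguments are observations (True/False) made by the user.
    """
    if not shuts_off:
        return "no fault observed"
    if needs_rear_switch:
        # A latched fault: the supply does this easily, the motherboard
        # only with extra design effort, so try the supply first.
        return "power supply"
    if front_button_revives:
        # A plain switch-off could even be triggered by software.
        return "software or motherboard logic"
    return "undetermined"


print(first_suspect(shuts_off=True,
                    front_button_revives=False,
                    needs_rear_switch=True))
```

For the symptoms reported in this thread (shuts off, front button dead,
rear switch needed), this prints "power supply".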

To do the necessary debugging, it helps to have a $35000 four channel
storage scope on your bench. You can trace down all sorts of craziness
with one of those. I worked many years with one of those sitting next to
me. But if you don't have one of those, then swapping a power supply
for $50, is the next best option :-)

Paul
 