Intermittent/variable hardware problem.


Mike Easter

ECS EliteGroup GeForce 7050M-M mobo^1 AMIBIOS. AMD64 X2. 3G ram.

Normal operation typically from cold start. CPU and system temps OK
while observing readout from the BIOS over minutes. Can boot into fully
operational XP or linux or Hiren's diagnostic for a while. Attempts to
use ram checking utilities have failed because during the testing vid
becomes erratic and the DOS program freezes.

Sometimes 'spontaneous' reboots occur. The failing condition results in
the next higher level of failure; reboot gives no vid for the BIOS.
There is no card, the vid is from the mobo. In this disturbed hardware
condition, the case power switch does not work properly. Reset works,
but it is not possible to turn off the machine's power light status with
the case powerswitch regardless of how long it is held down or the
pattern of hold 4+ sec release or quicker release.

How, and in what order, should I proceed with troubleshooting? Replace
the PS, replace the CMOS battery, remove ram, remove the hdd(s), other?

I don't know what the observable symptoms reflect/suggest.
 
The Southbridge and SuperI/O feed into control over PS_ON#.
So some issue with either is possible. It might be possible
to program a register in there, to entirely ignore the power
switch. Normally, you would not be able to gate off RESET,
and a push on RESET should restore everything except SATA
drive sanity. There is no effective external means, to tell
a SATA drive to reset itself. (Whereas IDE drives have a
separate RESET wire, which hammers them and makes them sane
again.)

The integrated video could be failing intermittently. While
the system is apparently running properly, push down *gently*
on the motherboard PCB and see if it reboots. That might be a
sign of a cracked solder ball, like on the main chipset chip.

The chipset has its own DC power conversion. So there's a
1.5V supply or so on the motherboard. If that is out of spec,
it can cause failures.

And virtually all of these feed into "replace motherboard"
as a solution, once you get tired of recording the details
of individual crashes.

I'm facing something like that here now. I've had several "events"
on the machine. Testing the CMOS battery reads 2.9V, which is
not low enough to cause a problem (below 2.4V would be a problem).
I'm still in the mode of recording failure information, in an
effort to figure it out. For example, I had an IDE drive
connected to a separate JMicron chip "disappear", and subsequent
diagnostic testing showed no problems at all with the drive
(clean SMART, good bench). The only common element I can
think of for my problem set, is a motherboard chipset
powering issue (which means, "replace motherboard").
Simple-minded multimeter tests of the PSU voltages
show nothing really abnormal.
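
For anyone writing down multimeter readings, here is a rough sketch of
what "nothing really abnormal" can mean in numbers. It assumes the usual
ATX tolerances (5% on the positive rails, 10% on -12V) and reuses the
2.4V CMOS battery figure from this post. The function names and the
example readings are made up for illustration, not taken from any tool.

```python
# Nominal ATX rail voltages and tolerance as a fraction of nominal.
# Assumed figures: 5% on positive rails, 10% on the -12V rail.
ATX_RAILS = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
    "+5VSB": (5.0, 0.05),
}

# Below this, the CMOS battery is considered a problem (figure from the post).
CMOS_MIN_VOLTS = 2.4


def check_rail(name, measured):
    """Return True if a measured voltage is within tolerance for that rail."""
    nominal, tol = ATX_RAILS[name]
    low = abs(nominal) * (1 - tol)
    high = abs(nominal) * (1 + tol)
    same_sign = measured * nominal > 0
    return same_sign and low <= abs(measured) <= high


def check_readings(readings):
    """Map each rail name in a dict of readings to a pass/fail flag."""
    return {name: check_rail(name, volts) for name, volts in readings.items()}


def cmos_battery_ok(volts):
    """Return True if the CMOS battery voltage is at or above the threshold."""
    return volts >= CMOS_MIN_VOLTS


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    print(check_readings({"+3.3V": 3.31, "+5V": 5.08, "+12V": 11.9}))
    print(cmos_battery_ok(2.9))
```

A pass here only says the average DC level is in range under whatever
load the machine had at that moment. It says nothing about ripple or
about brief dips a multimeter is too slow to catch.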

If you have fixed speed cooling fans on the computer, I've
had a couple cases here, where a "warbling" in the fan speed,
signaled imminent failure of the PSU. Of course, this relies
on your ability to note the "normal" amount of wander in
fan speed, from the "abnormal" amount. The human ear is pretty
sensitive to that wander of the fan speed.

I've always wondered how many obscure PSU failure modes
there are. Such as the PSU I had, where it was injecting AC noise
onto the mains, and causing my ADSL modem to "tip over". Like,
could a PSU develop abnormally high "ripple" on the output, with
no other symptoms apparent ? I have no means here to measure
or observe such a failure, if it was happening. And that would
not affect fan speed either (since the ripple is at a high
frequency). The most likely thing to suffer from excessive ripple,
would be some of the chipset voltage regulators (which use
MOSFETs and op amps, as a cheap solution). If all of the
regulation on a motherboard was done with switchers, it
might make a difference to how sensitive the motherboard
is to incoming power quality. My current motherboard uses
a switcher to run the DIMMs for example, whereas my old
P4 system used linear regulators for the DIMMs, with
relatively low headroom (so the regulator can stay cool).
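
To illustrate why a multimeter would miss that kind of fault, here is a
small sketch. It assumes you had sampled rail voltages from a scope or
fast ADC (which I do not), and it assumes the commonly quoted ATX ripple
limits of 120 mV peak-to-peak on +12V and 50 mV on +5V and +3.3V. The
synthetic signal and the function names are hypothetical.

```python
import math

# Assumed maximum peak-to-peak ripple per rail, in volts.
RIPPLE_LIMIT = {"+12V": 0.120, "+5V": 0.050, "+3.3V": 0.050}


def ripple_report(rail, samples):
    """Summarize sampled rail voltage: mean level and peak-to-peak ripple."""
    mean = sum(samples) / len(samples)
    peak_to_peak = max(samples) - min(samples)
    return {
        "mean": mean,
        "peak_to_peak": peak_to_peak,
        "within_limit": peak_to_peak <= RIPPLE_LIMIT[rail],
    }


def synthetic_rail(dc, ripple_p2p, n=1000, cycles=10):
    """Hypothetical test signal: a DC level plus sine ripple over whole cycles."""
    amplitude = ripple_p2p / 2
    return [
        dc + amplitude * math.sin(2 * math.pi * cycles * i / n)
        for i in range(n)
    ]


if __name__ == "__main__":
    # A +12V rail with 300 mV of ripple still averages 12.0V.
    report = ripple_report("+12V", synthetic_rail(12.0, 0.300))
    print(report)
```

The mean is what a DC multimeter reading approximates, so the rail looks
healthy on the meter while the peak-to-peak figure is well over the
limit.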

Paul
 
I'd try another PS, without paying for one, but if you need
to buy a PS, NewEgg and TigerDirect often feature a 430W
Corsair for $20, after rebate, and it's really good.

Don't trust a power supply tester that uses only indicator
lights. Testers with an LCD digital readout are a lot better
but cost about as much as that 430W Corsair PS and a lot more
than a digital multimeter that's more accurate and versatile.

Replacing the CMOS battery is cheap, but while a weak battery
will affect some motherboards, especially when the AC has been
turned off, in my limited experience ECS motherboards are not
among them, except in regards to time and date.

Memory lasts forever, unless zapped by excessive voltage (surge,
static, faulty PS). Most of my motherboard failures have been
due to bad capacitors in the voltage regulator next to the DIMM
slots, all of those in ECS motherboards with OST brand capacitors.
This photo of your model ECS motherboard shows lots of OST caps.
Coincidence? Caps can go bad even without bulging or leaking on
top, and the experts at BadCaps.net say this often happens with
OSTs.

http://www.ixbt.com/mainboard/ecs/geforce7050m-m/board.jpg

If you're good at soldering electronics, you may want to try
changing any bulging caps, but if none bulge, I'd start with
the OST caps around the DIMM slots, near the blue IDE PATA
connector and white floppy disk connector, especially the
latter. If that helps, then replace the other larger OST
caps, like those between the yellow USB headers and the
large heatsinked chip (north bridge) and to the right of
the orange PCI-E 16x slot. Even the largest
OSTs in the top right area, next to the serial and video ports,
may need to be replaced. Notice there are also green caps
stamped with a "K" on top. Those are Sanyo/Sun, model WG, a
very good brand that rarely fails. Other good brands are
Nichicon and Panasonic. You can't use general purpose caps but
need some rated for very low ESR. DigiKey can be cheap for
small orders because of their low shipping charge.
 
Open the case and check your cooling fans and capacitors.

If all is OK there, there are two good possibilities:

1) bad ram

2) bad on-board video

Try known good ram and/or install a video card.

A bad CMOS battery is unlikely, and there is only a small chance the
problem is due to a bad PSU.
 