How reliable is Memtest86+?

  • Thread starter: Nil

Nil

I'm trying to troubleshoot a sick Windows PC (it reboots randomly or
bluescreens with DRIVER_IRQL_NOT_LESS_OR_EQUAL errors). I may have more
questions to ask about that later, but for now: I've run Memtest86+
twice, for about 4 passes of its default tests and about 2.5 hours
each time. No errors were reported. Can I confidently conclude
that the RAM is OK?
 
Paul said here one time that prime95 was more stressful but less
diagnostic than memtest86.

http://bit.ly/eqq3d1
 
One person found no errors with MemTest86+, but when he tried Gold
Memory 6.92 he found one bad location, and it took another 9 hours to
detect that error again. IOW your 10 hours of testing may not be
enough. Gold Memory rated well in RealWorldTech.com's evaluation
from several years ago, beating MemTest86 and bettered only by the
expensive PhD RST products. BTW, despite MemTest86+ being based on
MemTest86, I've found the two give different results. The same goes
for Gold Memory v. 5.07 and v. 6.92.

Unless you're using modules whose RAM chips are clearly marked with
the actual chip manufacturer's (Samsung/SEC, Micron/Inotera/Nanya,
Hynix, ProMOS, PowerChip) logo or part number, be very suspicious of
the quality, especially if the voltage rating is higher than normal
(over 1.5V for DDR3, over 1.8V for DDR2).
 
Nil said:
Mike Easter

I'd like to try it, but it's a Windows application, and I can't get
Windows to stay up for more than a few minutes at a time.

The latest Hiren's has prime95.

http://blog.sisq.info/en/2011/02/10/latest-hirens-bootcd-13-1/ Latest
Hiren’s BootCD 13.1 - New apps: Prime95

http://www.hiren.info/pages/bootcd Testing Tools - Prime95 25.11 This
will detect for errors in CPU or RAM within a matter of minutes if an
overclock is not stable, you can run Torture Test (burn-in) overnight to
ensure long-term stability of the hardware.

Be careful where you get it and be nice about copyright issues :-)
 
Nil said:
I'd like to try it, but it's a Windows application, and I can't get
Windows to stay up for more than a few minutes at a time.

Prime95 is available for Linux as well as Windows. Boot
a Linux LiveCD, go to mersenne.org/freesoft and get your
Linux test software there.

http://www.mersenne.org/freesoft/

Before Prime95 had a multi-threaded version, you could
run multiple single-threaded copies of Prime95 by keeping
each of them in a separate folder in the Linux home directory.
That's how I used to thrash a machine with the
single-threaded version. Now that multi-threaded versions
exist, that job should be easier to do with a single
copy running. (The multi-threaded version lets you
run at least one thread per core.)

Using that technique, I ran four copies of Prime95 in
Linux, and could consistently get failures in a particular
"quadrant" of the memory space. But that doesn't necessarily
mean I can easily map what I'm seeing, to a particular
DIMM. So while it was fun to do, it didn't make locating
the culprit any easier.

One benefit of using multiple separate copies would be
the ability to test machines with very large memory. I
don't know what upper limit Prime95 has for its memory
footprint. It may have been around 1600MB on the Windows
version.

An even more stressful test is Prime95 running at the
same time as a 3D application. I used to use 3DMark2001
for that, but there are other options. That creates more
stress than Prime95 alone. You start Prime95 running, then
start 3DMark. I've had a computer crash in 3DMark
by doing that, with the audio stuck in a loop, so it
does add an extra bit of stress.

Memtest86+ is mainly useful for stuck-at faults. It
detects dynamic faults too, but if the dynamic faults
only occur under extreme stress, you'll never find them.
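
To make the stuck-at versus dynamic distinction concrete, here is a
minimal Python sketch of a pattern test run against a simulated memory.
It is a toy model, not how Memtest86+ is actually implemented;
FaultyMemory, pattern_test, and the fault parameters are hypothetical
names invented for this illustration.

```python
import random


class FaultyMemory:
    """Toy byte-addressable memory with injectable faults (hypothetical model)."""

    def __init__(self, size, stuck_at=None, flaky=None, seed=0):
        self.cells = bytearray(size)
        # stuck_at: {address: bit}; that bit always reads back as 0.
        self.stuck_at = stuck_at or {}
        # flaky: {address: probability} that a read returns a flipped low bit.
        self.flaky = flaky or {}
        self.rng = random.Random(seed)

    def write(self, addr, value):
        self.cells[addr] = value & 0xFF

    def read(self, addr):
        value = self.cells[addr]
        if addr in self.stuck_at:
            value &= ~(1 << self.stuck_at[addr]) & 0xFF
        if addr in self.flaky and self.rng.random() < self.flaky[addr]:
            value ^= 0x01
        return value


def pattern_test(mem, size, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Write each pattern to every cell, read it back, collect mismatching addresses."""
    bad = set()
    for pattern in patterns:
        for addr in range(size):
            mem.write(addr, pattern)
        for addr in range(size):
            if mem.read(addr) != pattern:
                bad.add(addr)
    return bad


SIZE = 1024
stuck = FaultyMemory(SIZE, stuck_at={300: 3})   # permanent fault
flaky = FaultyMemory(SIZE, flaky={700: 0.001})  # rare, intermittent fault

print("stuck-at, one pass:", sorted(pattern_test(stuck, SIZE)))
print("flaky, one pass:   ", sorted(pattern_test(flaky, SIZE)))
```

The stuck bit is caught as soon as a pattern tries to set it, while the
marginal cell can survive many passes unnoticed, which is the sense in
which a simple pattern test is mainly useful for stuck-at faults.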

There are better things than Prime95, but I've stopped
tracking "burn in" options. There is some Intel tool
for example, that the enthusiasts like. But I haven't
investigated replacements for Prime95, in quite a while.

Paul
 
Try updating your video card's drivers. IMHO, I don't think it's RAM related.
 
Paul said:
Using that technique, I ran four copies of Prime95 in Linux, and
could consistently get failures in a particular "quadrant" of
the memory space. But that doesn't necessarily mean I can easily
map what I'm seeing, to a particular DIMM. So while it was fun
to do, it didn't make locating the culprit any easier.

That is some serious techieness.
 
Try updating your video card's drivers. IMHO, I don't think it's RAM
related.

I'm beginning to think the same thing. I've run repeated memory tests
and they don't find anything wrong. But something triggers the
crash, and I can't seem to find it. While Windows is running, it feels
like certain network or display activity triggers it, but it's so
inconsistent that I can't figure out where to put the blame.
 
4 passes at 2.5 hours is nowhere near enough to be considered thorough.

I once lazily allowed myself to believe that 24 hours of testing was
enough, but the instability persisted, so I let it go for 48 hours, and
I *did* catch an error. Those are the hardest to troubleshoot since it
takes so long to test for success, but you gotta do what you gotta do.

I'd say 3 straight days would be good enough to feel "confident" about
the results.
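
As a rough way to reason about how long is long enough, the sketch
below assumes an intermittent fault produces a detectable error at a
constant average rate (a Poisson model). That is an assumption, not
something measured here, and the 48-hour mean is borrowed from the
anecdote above purely for illustration; detection_probability and
hours_for_confidence are hypothetical helpers.

```python
import math


def detection_probability(hours, mean_hours_between_errors):
    """Chance of at least one error in `hours` under a constant-rate (Poisson) model."""
    return 1.0 - math.exp(-hours / mean_hours_between_errors)


def hours_for_confidence(confidence, mean_hours_between_errors):
    """Test time needed to reach the given detection probability under the same model."""
    return -mean_hours_between_errors * math.log(1.0 - confidence)


MEAN = 48.0  # assumed: one detectable error per 48 hours on average
for hours in (2.5, 10, 24, 48, 72):
    chance = detection_probability(hours, MEAN)
    print(f"{hours:>5} h -> {chance:.0%} chance of catching it")
print(f"95% confidence needs about {hours_for_confidence(0.95, MEAN):.0f} h")
```

Under these assumptions a 72-hour run catches such a fault only about
78% of the time, so a clean run lowers the odds of bad RAM rather than
proving it good.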

BTW, ignore those who try to convince you that memtest is only for
memory; it's great for finding *lots* of various problems. Hell, I once
used it to verify that a faulty *case fan* was the cause of my
instability.
 
4 passes at 2.5 hours is nowhere near enough to be considered
thorough.

Probably not, but I've been trying to get this thing back on its feet
quickly. I didn't realize it would be such a problem.

Anyway, I don't think it's the RAM. I've tried running it with just one
of the two memory sticks, then the other. The computer crashes with either
separately and with both. I suppose it's possible that both of them
could have failed at the same time, after 5 years of good service, but
I doubt it. I've also tried two different power supplies, two different
video cards (the motherboard's built-in, and a separate AGP card) and
checked the integrity of the hard disk multiple times. The computer
crashes no matter what combination of my available hardware I install.
I'm thinking it must be a failure in the motherboard, which is
something I don't know how to troubleshoot, and I probably couldn't fix
even if I did.

BTW, ignore those who try to convince you that memtest is only for
memory; it's great for finding *lots* of various problems. Hell, I
once used it to verify that a faulty *case fan* was the cause of
my instability.

How did you do that? By swapping fans in and out and watching the tests
fail with one of them?
 
From reading the various threads about your PC's possible memory
problem I wonder if the problem might be the memory in your video card.

Try another video card if its memory cannot be replaced. If you are
presently using just the motherboard's built-in video, add an external
card, turn off the motherboard one, and see if that helps any.

I'm not sure which memory test programs actually check the memory
resident in an external video card, but I'm sure others here know.
 
Have you tried any testing under a Linux LiveCD? I confirmed a hardware
problem on my oldest system by comparison testing Windows and Linux and
finding both crashed the same way.

*******

One thing that's hard to do in Linux is video card stress testing.
I don't have a good test case there, yet. (It's not something I work
on full time, but I've put time into it in the past, without being
happy with the level of expertise needed to get it going. Spending
a week to set up one test case is a non-starter :-( )

I just found this. Haven't tested it yet.

http://www.phoronix-test-suite.com/?k=downloads

but for that to be effective, you might want a driver from your
video card company. On my Nvidia setup, I found the Nvidia driver
was about 15x faster than the "out-of-the-box" driver on the LiveCD.
But I had to install the OS to a spare hard drive, to do that.
I don't think it's possible to install the Nvidia driver package
while running a LiveCD (because it'll probably ask for a reboot,
and you'd need a "persistent" home directory for the changes to
survive the reboot - persistent storage allows things updated
under / to be stored for the next session).

If you don't use a "real" driver, it's possible there won't be
enough video card stress for a good test. I've tried SpecViewPerf
under Linux, and I got the impression, from the crappy way it
was running, virtually all the rendering was being done in
software. (Which is good if you're testing the CPU, but bad
if you want to heat up the video card.)

This Mepis distro claims to be using Nvidia 260 out of the box.
Nvidia has dropped support for two older generations of hardware,
so there are actually three driver releases of importance.
Something like 96? and 177? are used for older hardware, while 260
would be for more modern hardware (6200 or better?). Because the
older releases are no longer maintained, eventually they
won't install in a modern OS.

http://distrowatch.com/index.php?distribution=mepis&month=all&year=all

So the ingredients are there, but I bet if I start now, a week from
now I won't be very happy with the results.

Paul
 
From reading the various threads about your PC's possible memory
problem I wonder if the problem might be the memory in your video
card.

Try another video card if its memory can not be replaced, if
presently using just a motherboard built in video then add an
external card and turn off the motherboard one and see if that
helps any.

I've tried that, running the MB builtin video and also a separate AGP
adapter - it crashes with both. I did just find an old PCI video
adapter, and I'll try that out to see if it makes a difference. I'm
doubtful.
 
I've tried that, running the MB builtin video and also a separate AGP
adapter - it crashes with both. I did just find an old PCI video
adapter, and I'll try that out to see if it makes a difference. I'm
doubtful.

I agree, two different video cards with the problem not changing does
tend to rule out the video memory.
 
Mon, 4 Apr 2011 22:29:54 +0000 (UTC): written by ShadowTek
BTW, ignore those who try to convince you that memtest is only for
memory; it's great for finding *lots* of various problems. Hell, I once
used it to verify that a faulty *case fan* was the cause of my
instability.

I used it to determine that one of the two SO-DIMM sockets on a Jetway
NC81-LF motherboard was the issue and not the RAM itself.

As such, be sure to individually test the RAM in various sockets.
Basically, if one socket says good, move that same stick to a different
socket and test to see if it is good.

The thoroughness will be up to you and of course, the more thorough the
test, the more time required.
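
The stick-by-socket procedure above can be sketched as a small
bookkeeping script. It is only an illustration: the stick and socket
labels and the results are made up, and diagnose is a hypothetical
helper that assumes each (stick, socket) combination was tested once
with a clear pass or fail.

```python
from itertools import product


def diagnose(results):
    """Given {(stick, socket): passed}, name parts that fail in every combination tested."""
    sticks = sorted({stick for stick, _ in results})
    sockets = sorted({socket for _, socket in results})
    bad_sticks = [
        s for s in sticks
        if not any(ok for (stick, _), ok in results.items() if stick == s)
    ]
    bad_sockets = [
        k for k in sockets
        if not any(ok for (_, socket), ok in results.items() if socket == k)
    ]
    return bad_sticks, bad_sockets


sticks = ["stick A", "stick B"]
sockets = ["socket 1", "socket 2"]

# Test plan: every stick, alone, in every socket.
plan = list(product(sticks, sockets))
for stick, socket in plan:
    print(f"run the memory test with {stick} alone in {socket}")

# Invented example outcome: everything tested in socket 2 fails.
results = {
    ("stick A", "socket 1"): True,
    ("stick A", "socket 2"): False,
    ("stick B", "socket 1"): True,
    ("stick B", "socket 2"): False,
}
print(diagnose(results))
```

With these invented results both sticks pass in socket 1 and fail in
socket 2, so the socket is the suspect rather than either stick.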
 
Mon, 04 Apr 2011 20:13:38 -0400: written by Nil
Anyway, I don't think it's the RAM. I've tried running it with just one
of the two memory sticks, then the other. The computer crashes with either
separately and with both. I suppose it's possible that both of them
could have failed at the same time, after 5 years of good service, but
I doubt it. I've also tried two different power supplies, two different
video cards (the motherboard's built-in, and a separate AGP card) and
checked the integrity of the hard disk multiple times. The computer
crashes no matter what combination of my available hardware I install.
I'm thinking it must be a failure in the motherboard, which is
something I don't know how to troubleshoot, and I probably couldn't fix
even if I did.

Have you tried a second CPU? I know I had to buy a second one to rule
that out.
 
It's best to have at least two of every type of thing: PCI, PCIe, memory,
motherboard, etc. Once your collection of junk gets to that point, testing
hardware gets a lot easier since you can swap out pretty much everything.

How did you do that? By swapping fans in and out and watching the tests
fail with one of them?

Memtest did give plenty of errors with the fan in, but I got lucky in
detecting the culprit since the guilty fan was in the side-panel, which had to be
disconnected while I had the case open. Eventually, I realized that the
connection of the fan was the only difference between failing and
successful runs.
 
One thing that's hard to do in Linux is video card stress testing.
I don't have a good test case there, yet. (It's not something I work
on full time, but I've put time into it in the past, without being
happy with the level of expertise needed to get it going. Spending
a week to set up one test case is a non-starter :-( )

I just found this. Haven't tested it yet.

http://www.phoronix-test-suite.com/?k=downloads

I just install Nexuiz and walk through all the time-demos. They always
get my cards nice and warm, so I figure that's a decent stress test.
 
Memtest did give plenty of errors with the fan in, but I got lucky in
detecting the culprit since the guilty fan was in the side-panel, which had to be
disconnected while I had the case open. Eventually, I realized that the
connection of the fan was the only difference between failing and
successful runs.

Sounds like the ages-ago NIC I encountered that would take out the
network if you tightened down the mounting screw. (The case was
warped. Once I figured it out I left the screw with simply a bit of
tension on it and it worked fine.)
 