M2N-E and Kingston DDR2 RAM 667 - benchmark low?! Please help!

  • Thread starter: David D
David said:
Had to take some time away from it, but now I am back at it. With
Prime95, I always get an Error #7 - Server has run out of Exponents to
Assign. Is that an internet thing or a CPU problem?

You must have signed up with mersenne.org. The purpose of that tool
is similar to SETI: they want you to leave your computer running while
Prime95 searches for Mersenne prime numbers. The search time for
large numbers is extensive.

Computer enthusiasts use Prime95 for the "Torture Test" or the
benchmark. I've never tried to "pull any exponents" from Mersenne
and I'm not signed up to work on their problem in the background.
So just stick with the "Torture Test" option, and don't attempt
to register yourself. That would be my advice. The "Torture Test"
was intended to be an integrity test, to help users decide whether
their computer was stable enough to work on the Mersenne prime search.

Paul
 
David said:
Hey Paul,

The torture test is greyed out on my Prime95 (hence I cannot use it) -
any ideas?

In the install directory, you'll find "prime.ini" and it has
some text strings with settings.

The "UNDOC.TXT" file lists some options and should be in the
same directory.

*******
If you still are not making any progress getting it configured,
there is a derivative work here. "Orthos" is used by the enthusiast
community as well, but I haven't tried it. The second link is
to the Orthos version of Stress Prime. If you try it, tell me
how it works out.

http://sp2004.fre3.com/
http://sp2004.fre3.com/beta/beta2.htm (orthos edition - dual core capable)

Paul
 
I am trying Orthos right now. I am not computer programming
literate enough to figure out the Prime95 torture test and why it is
greyed out. I will report on the Orthos test after letting it run for a
few hours.
 
Well, I ran Orthos for 3 hours and no errors. I don't get it. Maybe
it is time to reformat. Does Orthos have a memory test too? I notice
it is just for processors.
 
David said:
Well, I ran Orthos for 3 hours and no errors. I don't get it. Maybe
it is time to reformat. Does Orthos have a memory test too? I notice
it is just for processors.

Orthos and Prime95 are stability/integrity tests. If they run without
error, it means your computer is not making any *computing* errors. It
doesn't tell you if the performance level is normal or not. Since
both programs allocate a block of memory from the available system memory,
they cannot test *all* the system memory. Memtest86+ can test all of
system memory (because no OS is present). So those are tradeoffs of the
two test environments. Memtest86+ is not the best stability test, but
covers all the memory. Prime95/Orthos tests a fraction of system
memory, but gives the CPU a good thrashing. (There is also an application
called S&M, which is supposed to heat a CPU more than Prime95, and maybe
I'll be able to find that later.)

Your problem is one of performance.

We reviewed your numbers, and it seems everything is set up for normal
operation. But via benchmarking (SuperPI), the conclusion is your
actual measured performance is not right.

Possible reasons:

1) The displayed numbers for clocks/multipliers/timing, do not
represent what is being used by the hardware. Some hardware
could be broken (like a defective CPU). The registers might
say a certain PLL multiplier is being used, when it is not.
This one is hard to prove/disprove, since the theory is that
the registers are no longer an accurate indicator of what the
hardware has been commanded to do.

2) Since memory is slow, it could be a disabled cache. There is L1
and L2 on the processor. Some processors even have an L3 (but
not yours). Cache is disabled by default, when a CPU starts,
and is enabled later in the BIOS sequence.

3) The processor could be thermally throttling. Processors can
have thermal overheat protection, where the first step is to
reduce the computing rate. That is one reason we tell people
to use a cooler good enough to stay under 70C. If
the temps go over 70C, the processing rate will drop, and the
purpose of having an expensive processor is negated.

I cannot help with (1). That would only be detectable via a benchmark,
and after all other theories had been exhausted.

For (2), many sites have used Cachemem 2.65 as a testing tool. It
is apparently a command line utility, and could well have been written
years ago. There is a download on Simtel, but I haven't the guts to
try it. I'm not crazy about how the file is named (".zip.exe").

http://cdn.simtel.net/pub/simtelnet/msdos/sysinfo/dm/download-cachem26.zip.exe

An alternative is that the CPUZ program comes with a separate executable.
(At least my 1.33 version included it, and 1.4 has the separate program
as well.) If you look in the CPUZ directory, where CPUZ.exe is
kept, there is a "latency.exe". When you double-click latency.exe,
a command window will open. The program will time memory accesses
over blocks of various sizes. At the end, it will indicate
how many levels of cache it detected, based on detecting a jump
in access latency at a certain memory block size. For example,
my Northwood is detected as two levels, L1 is 8KB, L2 is 512KB.
And when I run the Intel Processor Identification Utility, that
is exactly what my processor has got.

If you want to make a copy of the screen output of "latency.exe",
you will need to open a command window. CD (change directory, a DOS
command) to the install directory. Then, once you are in the
CPUZ directory, you would type:

Latency.exe > out.txt

Wait a few seconds, then press carriage return. The program
should exit when a carriage return is seen. Then, the "out.txt"
file should contain what would normally be printed on the
screen by the "latency.exe" program.

This is the output of latency.exe for my OC 3.2GHz Northwood, x14, DDR460.

*******
Cache latency computation, ver 1.0
www.cpuid.com

Computing ...


stride        4     8    16    32    64   128   256   512
size (Kb)
        1     2     2     2     2     2     2     2     2
        2     2     2     2     2     2     2     2     2
        4     2     2     2     2     2     2     2     2
        8     2     3     3     5     4     3     4     4
       16     3     3    13    22    24    34    34    36
       32     3     3    13    16    38    27    37    38
       64     3     3     4    17    39    39    39    39
      128     3     4    13    17    39    39    38    39
      256     3     3     5    24    40    40    39    42
      512     3     5    20    25    41    42    45    48
     1024     3    11    19    37    75   270   230   237
     2048     3    11    20    36    71   264   264   239
     4096     3    11    21    38    69   216   221   274
     8192     3    12    22    40    75   230   224   239
    16384     4    12    22    40    76   225   220   274
    32768     4    12    22    38    69   223   264   241

2 cache levels detected
Level 1 size = 8Kb latency = 2 cycles
Level 2 size = 512Kb latency = 38 cycles
*******
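
As a rough illustration of how the cache levels can be read off a table like this, here is a minimal Python sketch. It takes the stride-512 column from the output above and reports the block sizes after which the latency jumps sharply. The 2.5x jump threshold and the function name are assumptions made for the example; the actual method used by latency.exe is not known here.

```python
# Stride-512 column of the latency.exe table above: (block size in KB, cycles).
NORTHWOOD_STRIDE_512 = [
    (1, 2), (2, 2), (4, 2), (8, 4), (16, 36), (32, 38), (64, 39), (128, 39),
    (256, 42), (512, 48), (1024, 237), (2048, 239), (4096, 274), (8192, 239),
    (16384, 274), (32768, 241),
]


def detect_cache_levels(rows, jump_ratio=2.5):
    """Return the block sizes (KB) after which latency jumps sharply.

    jump_ratio is an assumed threshold, not a value taken from latency.exe.
    """
    levels = []
    # Compare each row with the next larger block size.
    for (size, cycles), (_next_size, next_cycles) in zip(rows, rows[1:]):
        if next_cycles >= jump_ratio * cycles:
            levels.append(size)
    return levels


print(detect_cache_levels(NORTHWOOD_STRIDE_512))  # prints [8, 512]
```

The two sizes it reports, 8 KB and 512 KB, match the "Level 1" and "Level 2" sizes in the summary lines of the output.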

For (3), there have been a couple of efforts to detect throttling
in processors. Panopsys has a download, but AFAIK it is not
in active development. While you can download this, there is a
better one.

Throttlewatch, looking for CPU throttling due to overheat
http://www.panopsys.com/downloads/ThrottleWatch_2_0_1.zip

A better choice might be RMclock. At least RMclock is getting
updated. The "Monitor" tab gives a similar display to Throttlewatch.

http://cpu.rightmark.org/download.shtml
http://cpu.rightmark.org/download/rmclock_225_bin.exe

Try running Orthos at the same time as RMclock is watching your
system. What do you see ? Any frequencies taking a dip ?
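
If you would rather check numbers than watch a graph, the idea of "a dip" can be sketched in a few lines of Python. This is only an illustration: the sample readings, the 2000 MHz nominal clock, and the 5% tolerance are all made up for the example, and nothing here reads data from RMclock itself.

```python
def find_dips(samples_mhz, nominal_mhz, tolerance=0.05):
    """Return (index, mhz) for readings more than `tolerance` below nominal."""
    floor = nominal_mhz * (1.0 - tolerance)
    return [(i, mhz) for i, mhz in enumerate(samples_mhz) if mhz < floor]


# Hypothetical core clock readings in MHz, noted down while Orthos is running.
samples = [2000, 2000, 1000, 2000]
print(find_dips(samples, nominal_mhz=2000))  # prints [(2, 1000)]
```

An empty list would mean no reading fell below the threshold, which is what you would hope to see on a system that is not throttling.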

Paul
 
If you want to make a copy of the screen output of "latency.exe",
you will need to open a command window. CD (change directory, a DOS
command) to the install directory. Then, once you are in the
CPUZ directory, you would type:

Latency.exe > out.txt

Here is my output:
Cache latency computation, ver 1.0
www.cpuid.com

Computing ...


stride        4     8    16    32    64   128   256   512
size (Kb)
        1     3     3     3     3     3     3     3     3
        2     3     3     3     3     3     3     3     3
        4     3     3     3     3     3     3     3     3
        8     3     3     3     3     3     3     3     3
       16     3     3     3     3     3     3     3     3
       32     3     3     3     3     3     3     3     3
       64     3     3     3     3     3     3     3     3
      128     4     6     8    16    17    15    12    17
      256     4     6     8    16    17    12    13    13
      512     4     6     8    16    17    12    13    13
     1024     4     8    12    27    49   108   109   114
     2048     4     8    15    27    49   108   109   114
     4096     4     8    14    27    50   109   111   117
     8192     4     8    14    27    51   110   111   118
    16384     4     8    14    27    50   111   113   117
    32768     4     8    14    27    51   116   112   117

2 cache levels detected
Level 1 size = 64Kb latency = 3 cycles
Level 2 size = 512Kb latency = 13 cycles

Can you see differences that mean something? To me it is just
numbers... ;)

Try running Orthos at the same time as RMclock is watching your
system. What do you see ? Any frequencies taking a dip ?

Paul

I ran RMclock at the same time as Orthos and nothing seems out of the
ordinary. The frequencies stayed at the top of the graph, the CPU
stayed at the top of the graph.
Any ideas?
 
David said:
Can you see differences that mean something? To me it is just
numbers... ;)


I ran RMclock at the same time as Orthos and nothing seems out of the
ordinary. The frequencies stayed at the top of the graph, the CPU
stayed at the top of the graph.
Any ideas?

The results show two levels of cache being evident. In other
words, it proves by measurement (rather than checking the settings
in a register somewhere) that two levels of cache are
present and working. The results are in units of clock cycles,
and in order to compare the numbers between two different
processors, the inverse of the clock frequency (clock cycle
period in nanoseconds) would need to be applied.
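
To make that conversion concrete, here is a small Python sketch that turns the reported L2 latencies into nanoseconds. The 3200 MHz figure is the overclocked Northwood mentioned earlier in the thread; the 2000 MHz figure for the X2 3800+ is an assumption (its stock clock), so substitute whatever the machine is really running at.

```python
def cycles_to_ns(cycles, clock_mhz):
    """Convert a latency in clock cycles to nanoseconds."""
    period_ns = 1000.0 / clock_mhz  # one cycle lasts 1000/MHz nanoseconds
    return cycles * period_ns


# L2 latencies reported by latency.exe earlier in the thread.
print(cycles_to_ns(38, 3200))  # Northwood at 3.2 GHz: prints 11.875
print(cycles_to_ns(13, 2000))  # X2 3800+, assuming 2.0 GHz: prints 6.5
```

So although the cycle counts differ by roughly a factor of three, the difference in actual time is smaller, because the two processors run at different clock speeds.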

To make further use of the info, we would ideally need some other
X2 3800+ AM2 user to run the same benchmark. Apparently
an X2 3800+ S939 would also be acceptable, since the L1 and L2
latency are supposed to be roughly the same.

If the SuperPI results had only been off a little, I'd be suggesting
to you right now, that you check that the memory is installed
in dual channel mode. But I think one page I checked today,
puts your SuperPI results around the level of a Duron 700.
Either the result is due to a memory effect, or a core speed effect.

http://www.ocforums.com/showthread.php?&t=386495&page=0

While I want to cook up another theory to try, I keep looking
at your CPUZ screen results, and nothing there seems out of order.
So maybe the running conditions don't match what is shown here.
That would imply a broken CPU, or some setting which
is invisible, but critical to performance. It is tempting to
suggest the "Duron 700" result, means the processor is in the
low power state, while running SuperPI. I was hoping some
display in RMclock, could uncover something out of the
ordinary, while benchmarks are running. (Try SuperPI again,
while RMclock display is up on the screen.)

http://www.bionicbuddha.com/cpu1.jpg

If you want yet another benchmark to play with, Sandra Lite
is a free download. If you've paid for Sandra already, then you
aren't supposed to install this, as it could blow away the license
for the other version. At least that is the warning that appears
on the download page.

http://www.sisoftware.net/index.html?dir=dload&location=sware_dl_3264&langx=en&a=
http://www.majorgeeks.com/download92.html "SiSoftware Sandra XI.SP2 1.1.35"

To measure unbuffered memory performance in Sandra, you have
to uncheck some of the boxes in the Options panel. This is the
part that bothers me about Sandra, as the appropriate settings
should just have had a button labeled "unbuffered", so you don't
have to look up how to do it the standard way. If you do this
correctly, you'll get a memory bandwidth consistent with the
results seen in the memtest86+ bandwidth display on the left
of the screen. Unbuffered is supposed to be more representative
of when you'd see issues, than the buffered version with all
the boxes checked.

http://www.anandtech.com/showdoc.aspx?i=1839&p=2

You can keep plugging away at this, and perhaps you'll discover
one benchmark, for which you can find another user's results, where
the difference is immediately evident.

As well as benchmarks, Sandra also has a "Hardware" tab, and if you
click, say, the processor tab, there are a series of warnings
at the bottom of the screen. This is where Sandra measures some
stuff, and finds deviations from the norm. For example, Sandra
knows my processor is a 2.8GHz Northwood, and Sandra can see I've
overclocked via the FSB. It has even given me a warning that
my "thermal resistance is too high". How it determined that is
a real mystery, and it is crap, since my temps under load never
go above about 45C or so. So sometimes the "Tips" are correct,
and sometimes they are merely amusing.

I think you've already been here, and I don't see much in the
way of feedback.

http://vip.asus.com/forum/topic.aspx?board_id=1&model=M2N-E&SLanguage=en-us

Paul
 
Last night when running Sony Vegas, I also had freeze-up problems
(another issue now, what's next huh?). It had to do with the Samsung
codec. Ran fine on my other computer, but the AMD kept freezing me
out. So, I am also at a point where a fresh format and install might
be necessary before running tests again, just to be sure it is not a
Windows problem. I hate reinstalling though because if I am at the
same point as before after all the hassles of putting back the
software, it will be for naught (except peace of mind).
I will keep you posted.
 
Ok, running SuperPI and RMclock revealed CPU throttle at 2009 for both
CPUs, 81%-99% on CPU usage and voltage was 10.0 and 1.3 respectively.
SuperPI took 43 seconds at 1 million decimal places.
With SiSoft Sandra, I went for the benchmark of memory and cache and
unchecked everything except (in options panel) :
1) Logical chipset memory bank
2) Processor Cache.

Here are the results :
Combined index 4532
Speed Factor (I am guessing the big problem) 3.1

Ok, so I followed the link also for SiSoft and unchecked the buffering
options that were listed (although the version seems to be older). Here
is what I got :
Ram Bandwidth : 3787
Ram Float FPU : 3835

I did also run Memtest86+ on boot up a while back and it found no
errors. I am going to try tonight to hit some forums and see if
anyone has the same setup as I do and see what is going on. What do
you think of the Speed Factor being so low?

As always, thanks
Paul you are a scholar.
 
Well, first of all, benchmarks are only useful, if you can find
someone else's benchmark results to compare to.

Are you saying you got 43 seconds as a result for SuperPI of
1 million digits, on your Athlon64 X2 machine ? 43 seconds sounds
about right. Whereas your result with several minutes run time would
not be correct (the results that are closer to the performance of
a Duron 700).

What has changed to give you the 43 second result ?

Can you reproduce the 43 second result ? Can you reboot and do
it again without a problem ?

Have you tried setting the affinity of the SuperPI program, just
before doing the Calculate() step ? I'd want to repeat the test
first on Core 0 and then on Core 1 and see if the results are
the same. Maybe one Core is bad and one is good.
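
For reference, processor affinity is a bitmask in which bit 0 stands for Core 0, bit 1 for Core 1, and so on. The Python sketch below works out the mask and builds a command line using the Windows "start /affinity" option, which takes the mask in hexadecimal. That option is not available in every Windows version (setting affinity from Task Manager does the same job), and the executable name is a hypothetical placeholder.

```python
def affinity_mask(cores):
    """Build an affinity bitmask: bit n is set when core n is allowed."""
    mask = 0
    for core in cores:
        mask |= 1 << core
    return mask


def start_command(exe, cores):
    """Build a 'start /affinity' command line (the mask is given in hex)."""
    return "start /affinity {:X} {}".format(affinity_mask(cores), exe)


# "super_pi.exe" is a placeholder name, not necessarily the real file name.
print(start_command("super_pi.exe", [0]))  # prints start /affinity 1 super_pi.exe
print(start_command("super_pi.exe", [1]))  # prints start /affinity 2 super_pi.exe
```

Running the benchmark once with each mask gives one result per core, which is the comparison being asked for here.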

Do your other tests run at a normal speed now ? If a task uses both
cores at the same time, you should get a better speedup than you would
see with SuperPI. SuperPI only tests one core at a time, whereas it
is possible your NLE program uses both cores.

Does the fact the RMclock is present and running, make a difference
to the SuperPI results ? Does SuperPI slow down, if RMclock is not
running ?

For Sisoftware Sandra, you can compare the results using the menu
items in the chart. But I don't see an easy way to determine whether
an Athlon64 X2 in the list, is socket 939 or socket AM2.

Paul
 
Well, first of all, benchmarks are only useful, if you can find
someone else's benchmark results to compare to.

Finding someone with the same machine isn't easy. Trying the forums,
but no one is responding. Frustrating.

Are you saying you got 43 seconds as a result for SuperPI of
1 million digits, on your Athlon64 X2 machine ? 43 seconds sounds
about right. Whereas your result with several minutes run time would
not be correct (the results that are closer to the performance of
a Duron 700).

What has changed to give you the 43 second result ?

I don't know what changed. I reverted back to the v5 BIOS, but I think
that was before I ran the test. Crazy, I don't know what I did
differently, exactly.

Can you reproduce the 43 second result ? Can you reboot and do
it again without a problem ?

I can reproduce about 44 seconds now, easily. Just reproduced at 42
seconds.

Have you tried setting the affinity of the SuperPI program, just
before doing the Calculate() step ? I'd want to repeat the test
first on Core 0 and then on Core 1 and see if the results are
the same. Maybe one Core is bad and one is good.

Ok, trying that now. I turned on Prime95, set the core to 0 and got
43 seconds with SuperPI,
then I set the core to 1 with Prime95 and ran SuperPI again, and
got 42 seconds. I am guessing this is the way you want me to do it?

Do your other tests run at a normal speed now ? If a task uses both
cores at the same time, you should get a better speedup than you would
see with SuperPI. SuperPI only tests one core at a time, whereas it
is possible your NLE program uses both cores.

Does the fact the RMclock is present and running, make a difference
to the SuperPI results ? Does SuperPI slow down, if RMclock is not
running ?

No, it does not make a difference.

For Sisoftware Sandra, you can compare the results using the menu
items in the chart. But I don't see an easy way to determine whether
an Athlon64 X2 in the list, is socket 939 or socket AM2.

Paul

Thanks Paul, I agree there isn't an easy way to verify this. I tried
out RMclock while rendering with the NLE - it shoots right to the top,
100% of the CPU is being used. Is there a way to tell if the memory
is being used as well?
 
David said:
Finding someone with the same machine isn't easy. Trying the forums,
but no one is responding. Frustrating.

I don't know what changed. I reverted back to the v5 BIOS, but I think
that was before I ran the test. Crazy, I don't know what I did
differently, exactly.

I can reproduce about 44 seconds now, easily. Just reproduced at 42
seconds.

Ok, trying that now. I turned on Prime95, set the core to 0 and got
43 seconds with SuperPI,
then I set the core to 1 with Prime95 and ran SuperPI again, and
got 42 seconds. I am guessing this is the way you want me to do it?

No, it does not make a difference.

Thanks Paul, I agree there isn't an easy way to verify this. I tried
out RMclock while rendering with the NLE - it shoots right to the top,
100% of the CPU is being used. Is there a way to tell if the memory
is being used as well?

If you are getting 43 seconds for SuperPI, I suspect that is the
right ballpark for the results. Certainly a lot better than your
other result. And yes, I wanted you to try the affinity test, to
make sure both cores were responding the same way (to rule out
one core being broken, while the other one was fine).

If you can manage to find another user with similar hardware, and
both of you are using the same version of SuperPI, then you can
fine tune your view of your machine. If they get 42 seconds and
you get 42 seconds, then that is as good as it gets.

Paul
 