Mobos with 875P chipset slower with 4 ram sticks?


Joey

Surprise! Less memory might speed up Photoshop and other programs.
See the article at
http://firingsquad.com/hardware/building_gaming_opteron_2003_Part2/page14.asp

They tested several platforms including a Pentium on a board with the
875P (Canterwood) chipset. They compared 2 memory sticks of 512MB
(1GB) to 4 memory sticks of 512MB (2GB). The memory was DDR400.
Their tests showed the 1GB system was faster, sometimes by a
significant margin.

At least one reason for the decreased performance is that the 875P
increases memory latency if you use more than 2 sticks of memory.

In a forum on the same site the article authors acknowledge that some
of the slowdown with 4 memory sticks might be specific to the
motherboard in their system. They were not using an ASUS board.

This article is of great interest to me since I'm planning to build a
system for running Photoshop based on the ASUS P4C800-E Deluxe
motherboard, which uses the 875P chipset.

Any comments on the above article? Some googling failed to find much
discussion based on the article's findings.

Does anyone else out there have any data on Photoshop performance on an
875P system with 2 sticks of RAM compared to 4 sticks?

If there really is a significant Photoshop performance hit when
running 4 sticks of RAM, then the best memory configuration for
Photoshop might be 2 sticks of 1024MB (2GB). This would be a
matched pair running in dual channel mode.

Joey
 
One of the problems with any review article is that you don't know
how competent the reviewers are at configuring the system.

I have yet to see an article that compares all aspects of memory
performance on 875/865. Here are the issues that affect performance:

1) Clock rate. The single most important factor in processor
performance and memory performance. Clock rate is more important
than a low CAS for example.
2) Command rate. When the number of loads goes up on a memory
bus, command rate has to be switched from 1T to 2T. Basically,
when command rate is set to 2T, an address is placed on the bus
for two cycles, but the strobe for the address is only active
during the second cycle. This improves setup time under heavy
address loading. Two double sided modules on a single bus is 32
chips, and generally requires 2T timing. Both the Athlon64 and
the Pentium systems will face this issue. In a way, this is like
a CAS penalty.
3) CAS number. There is a slight improvement with CAS2 over CAS3.
4) Intel has PAT, which is a shortening of the memory path when
the FSB is synchronous with the transfer rate from the memory
subsystem. There are a whole bunch of conditions that seem to
influence whether PAT exists or not. There is no single register
that I can find on a 875/865 Northbridge to identify whether
PAT is in action or not (unless it is an undocumented feature).
It is possible the effects of Command Rate are considered a
part of PAT, because I cannot find a BIOS setting for Command
Rate on the P4C800-E. This article is just the tip of the
iceberg:
http://www.anandtech.com/mb/showdoc.aspx?i=1851&p=5
5) The interleave factor. More banks is better in Dynamic mode.
4x512MB double sided is better than 2x1024 double sided, as
the former config has twice as many memory banks as the latter
configuration.
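
The loading and bank arithmetic in points 2 and 5 can be sketched roughly as follows. This is only an illustration: it assumes every module is double sided with 8 chips per side (x8 parts, non-ECC), which matches the 32-chip figure above, and the class and constant names are made up for the example.

```python
from dataclasses import dataclass

CHIPS_PER_SIDE = 8   # assumption: x8 chips on a 64-bit non-ECC module
BANKS_PER_CHIP = 4   # DDR SDRAM chips have 4 internal banks


@dataclass
class ChannelConfig:
    label: str
    modules_per_channel: int
    sides_per_module: int

    def chips_per_channel(self) -> int:
        # every chip on the channel loads the shared address/command lines
        return self.modules_per_channel * self.sides_per_module * CHIPS_PER_SIDE

    def banks_per_channel(self) -> int:
        # each side is selected separately and brings its own 4 banks
        return self.modules_per_channel * self.sides_per_module * BANKS_PER_CHIP


configs = [
    ChannelConfig("2 x 1024MB double sided", modules_per_channel=1, sides_per_module=2),
    ChannelConfig("4 x 512MB double sided", modules_per_channel=2, sides_per_module=2),
]

for cfg in configs:
    print(f"{cfg.label}: {cfg.chips_per_channel()} chips, "
          f"{cfg.banks_per_channel()} banks per channel")
```

Under these assumptions the four-stick config doubles both numbers per channel: more banks to interleave across, but also more load on the address bus. That is the tradeoff between points 2 and 5.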

It is not possible to write a closed form expression for performance.
How the memory can be driven depends on its past history. For
example, if a memory page is already open, the next access to it
can be faster. If memory access is random, there can be a lot of
page opening and closing, which degrades performance. Thus, the
workload (memory access pattern and frequency of access) plays
a part in what the performance will be. Obviously, a memory
benchmark is an optimistic case, as the pattern is bursting and
open pages are fully utilized. Real programs don't do that.
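
A toy model makes the open page effect concrete. The sketch below simulates a single bank that keeps its last page open and charges a different cost for a page hit and a page miss. The timings (2-3-3) and the page size are assumed values picked for illustration, not figures for any particular module or chipset, and a real controller juggles many banks at once.

```python
import random

CL, T_RCD, T_RP = 2, 3, 3   # assumed timings in memory clocks, illustration only
ROW_SIZE = 1024             # assumed number of column addresses per page


def average_cycles(addresses):
    """Average cost per access for one bank that leaves its last page open."""
    open_row = None
    total = 0
    for addr in addresses:
        row = addr // ROW_SIZE
        if row == open_row:
            total += CL                  # page hit: column access only
        elif open_row is None:
            total += T_RCD + CL          # bank idle: activate, then read
        else:
            total += T_RP + T_RCD + CL   # page miss: precharge, activate, read
        open_row = row
    return total / len(addresses)


rng = random.Random(0)                   # fixed seed so runs are repeatable
count = 10_000
sequential = list(range(count))
scattered = [rng.randrange(1_000_000) for _ in range(count)]

print(f"sequential: {average_cycles(sequential):.2f} clocks per access")
print(f"scattered:  {average_cycles(scattered):.2f} clocks per access")
```

The sequential pattern stays close to the bare CAS cost, while the scattered one pays for precharge and activate on nearly every access. That is why a bursting memory benchmark looks better than a real program.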

On the issue of clock rate, overclocking is a slight winner over
tight CAS memory. A CAS3 DDR500 memory config with 1:1 CPU:Mem
ratio is faster than CAS2 DDR400 memory config with 5:4 CPU:Mem.
But, when running four sticks, the math becomes more cloudy,
because you don't know how far above DDR440 you would get with
4 double sided 512MB sticks.
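
The raw numbers behind that comparison can be worked out in a few lines. The sketch only covers the CAS delay in nanoseconds and the peak burst rate, assuming dual channel with 8 bytes per transfer per channel. It ignores the FSB ratio, PAT and command rate, so it does not prove which config wins in practice. The function names are made up for the example.

```python
def cas_delay_ns(ddr_rating, cas_cycles):
    """CAS delay in nanoseconds for a given DDR rating (e.g. 400 for DDR400)."""
    clock_mhz = ddr_rating / 2            # DDR moves data twice per clock
    return cas_cycles * 1000 / clock_mhz


def peak_mb_per_s(ddr_rating, channels=2, bytes_per_transfer=8):
    """Theoretical peak transfer rate; real throughput is lower."""
    return ddr_rating * bytes_per_transfer * channels


for label, rating, cas in [("CAS3 DDR500", 500, 3), ("CAS2 DDR400", 400, 2)]:
    print(f"{label}: CAS delay {cas_delay_ns(rating, cas):.1f} ns, "
          f"peak {peak_mb_per_s(rating)} MB/s")
```

On paper, CAS3 at DDR500 gives up 2 ns of CAS delay (12 ns against 10 ns) but gains 25% in peak transfer rate (8000 MB/s against 6400 MB/s).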

On command rate, you can experiment. For example, some people
get away with changing 2T to 1T on the Athlon64 when using two
double sided sticks. I don't know anything about 875/865 and
how important the command rate setting is, or whether the user
can change it. I cannot find a register in the datasheet for
the 875 that controls command rate!

For PAT, it is really a crapshoot. Get a copy of CTIAW, or use
CTIAW as a search term on abxzone.com/forums/search.php . The
only condition that _might_ give you PAT, is a single stick of
CAS2 memory per channel, FSB800, DDR400, no overclock. (So the
Corsair TwinX mentioned below would fit the bill.)

Speaking in general terms, I like your plan of buying 2x1024.
These modules use 64Mx8 chips and are pretty expensive. Visit
this page and look at the specs for TWINX2048-3200C2

http://corsairmicro.com/corsair/xms.html
http://corsairmicro.com/corsair/products/specs/twinx2048-3200c2.pdf

If you want the stability of a stock system, buy your fast
processor, run it at FSB800, DDR400 (1:1 ratio), slap in
the 3200C2 1GB modules, run CTIAW, and there is the odd chance
you'll even get CTIAW to claim PAT is enabled.

If you are an overclocker, I cannot guarantee that someone won't
beat you with another memory config. If a module comes along
tomorrow, with lower capacitive loading, and DDR500+ performance,
four of those might beat you.

Here is some overclocker chatter to consider, posted very
recently.

http://www.abxzone.com/forums/showthread.php?t=80311

Intel offers this document, but with the results of the
article you cite, their advice is now in doubt.

ftp://download.intel.com/design/chipsets/applnots/25273001.pdf

Whatever you decide, post back any significant results of
your testing :-)

HTH,
Paul
 
Thanks Paul. I have absorbed a number of your other posts.

The thing I like about this report is that they took a stock PC, only
changed memory (2 v. 4 sticks), and then measured the time it took
real world apps to do things. Yes, the striking difference in time
they measured could be due to:
A. 2 v. 4 sticks of RAM
B. Motherboard
C. BIOS
D. The list goes on........

But the performance difference they measured is so significant that I
am amazed the online community has not poked at this, with various
folks reporting their results with 2 v. 4 sticks to see if 4 sticks
really is the most important factor in this performance penalty.

Sorry but even though I'm most curious on this issue, if I go with
2x1024 I will not be able to post any results myself comparing 2 v. 4
sticks of RAM since I will have only the 2 sticks.

Joey
 
Can't you think of a single benchmark you could run?
Like Sandra buffered and unbuffered? Maybe some kind
of standard Photoshop benchmark? Anything that someone
else can reproduce is a start. The thing is, if no one
posts data on the configs they've got, there is no place
to start. Even posting the bandwidth number printed
on the screen while running Memtest86+ from memtest.org
is better than nothing (the test program is free).
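
Even a crude script that anyone can rerun would do as a starting point. The sketch below times bulk copies of a buffer much larger than the CPU cache and reports the best run. It is not a substitute for Sandra or Memtest86+, and its figures are not comparable with theirs. It is only useful for comparing two memory configs on the same machine with the same script. The function name and default sizes are arbitrary choices.

```python
import time


def copy_bandwidth_mb_s(size_mb=64, repeats=5):
    """Best observed rate, in MB of buffer copied per second."""
    size = size_mb * 1024 * 1024
    src = bytearray(size)
    dst = bytearray(size)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst[:] = src                      # one bulk copy of the whole buffer
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    best = max(best, 1e-9)                # guard against a zero timer reading
    return size_mb / best


if __name__ == "__main__":
    print(f"copy rate: {copy_bandwidth_mb_s():.0f} MB/s (best of 5)")
```

Run it a few times with nothing else running, then post the number along with the stick count, timings and BIOS version.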

Paul
 
Your message caught my eye because I recently built a system for Photoshop
work. I don't know if the following will help, but here's some information:

My system:
ASUS P4P800-E Deluxe (BIOS 1002)
Pentium 4, 3.0C
2GB Mushkin PC3200 Blue RAM (4 x 512MB DIMMs)
Matrox G450 video card for primary monitor
Matrox Millennium PCI video card for second monitor
2 x WD 120GB HD (1 SATA; 1 EIDE); DVD-RW/CD-RW drive
Windows XP Home SP2

Memtest86+ v.1.26 shows:
Pentium 4 (0.13) 2998 Mhz
L1 Cache 8K 24575MB/s
L2 Cache 512K 20966MB/s
Memory 2047M 2390MB/s
Chipset i848/i865
FSB 1999
RAM 199Mhz/DDR398
CAS 2-3-4-7 (set at the suggestion of Mushkin tech support when I had
Memtest errors on test #10)
Dual channel (128 bits)
PAT and ECC disabled

Prime95 Benchmark:
Intel(R) Pentium(R) 4 CPU 3.00GHz
CPU speed: 2998.51 MHz
CPU features: RDTSC, CMOV, PREFETCH, MMX, SSE, SSE2
L1 cache size: 8 KB
L2 cache size: 512 KB
L1 cache line size: 64 bytes
L2 cache line size: 128 bytes
TLBS: 64
Prime95 version 23.8, RdtscTiming=1
Best time for 384K FFT length: 11.933 ms.
Best time for 448K FFT length: 14.192 ms.
Best time for 512K FFT length: 16.233 ms.
Best time for 640K FFT length: 19.414 ms.
Best time for 768K FFT length: 23.734 ms.
Best time for 896K FFT length: 28.035 ms.
Best time for 1024K FFT length: 31.318 ms.
Best time for 1280K FFT length: 40.974 ms.
Best time for 1536K FFT length: 50.564 ms.
Best time for 1792K FFT length: 59.961 ms.
Best time for 2048K FFT length: 67.714 ms.
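
For context, the Memtest86+ memory figure above can be set against the theoretical peak for dual channel DDR398. The sketch assumes 8 bytes per transfer per channel and takes the 2390MB/s reading at face value. Memtest86+ does not measure peak burst rate, so the ratio is only a rough yardstick for comparing configs, not a measure of efficiency.

```python
def peak_mb_per_s(ddr_rating, channels=2, bytes_per_transfer=8):
    """Theoretical peak transfer rate for a DDR rating such as 398."""
    return ddr_rating * bytes_per_transfer * channels


reported = 2390                 # MB/s, Memtest86+ memory figure from this post
peak = peak_mb_per_s(398)       # DDR398, dual channel (128 bits)
fraction = reported / peak

print(f"peak {peak} MB/s, reported {reported} MB/s, ratio {fraction:.1%}")
```

The same ratio from a 2-stick system running the same Memtest86+ version would be a useful data point to set against the article's findings.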

John
 
 