A7V880 and so called dual channel memory access

  • Thread starter: David Pipe

David Pipe

I have a Sempron 2600+ plugged into an A7V880. It's a 333MHz FSB
system, rather than a 400MHz one, and I have two Kingston 256MB DIMMs--in
the blue slots, just as suggested. When it boots, the BIOS does
tell me it is in Dual Channel mode. In the BIOS you can turn dual
channel memory access on or off. What is interesting is that, according
to the Sandra benchmark, the memory bandwidth numbers REMAIN NEARLY
IDENTICAL whether dual channel memory is enabled or disabled: Int =
2261MB/s, Float = 2100MB/s.

Also, if you take a look at Kingston's site there is a PDF whitepaper
on Intel's dual channel memory information, and one graphic is a chart
that outlines the different bandwidths of different types of memory in
GB/sec. DDR333 peaks at 5.4GB/sec.

No real world results seem to measure up to the theoretical limit,
including all the Intel CPU/chipset combos, and I'm just curious why.
I'm most curious as to why there is no difference in performance on
the board I have when dual channel memory is off--it's one of the
reasons I bought the board.

Dave in Colorado
 
Pretty cool, eh :-) Makes you wonder why they put dual channel on
the Athlon motherboards.

The deal is all in the numbers. Both the Athlon and the P4 have a
64 bit data bus. The Athlon is a DDR bus, and the P4 is a QDR
(quad pumped) bus. If the Athlon is clocked at 200MHz, there are
400 megatransfers per second, or 3200MB/sec. That bandwidth is
obviously fully matched by a single DDR DIMM running at PC3200,
clocked at 200MHz for a DDR400 transfer rate.

The P4 is quad pumped, and with a 200MHz clock, has 800 megatransfers
per second. With the same 64 bit bus width, that gives a 6.4GB/sec
transfer rate. That is fully met by two PC3200 DIMMs running in dual
channel (Uber DIMM) mode.

So, the P4 can actually hoover in the data from a dual channel
configuration, while the Athlon cannot. Of course, if you have some
slow memory, like two sticks of DDR266 memory, then two of those in
dual channel configuration will be faster than one stick at DDR266
on the Athlon. But, as the sticks get closer to matching the
processor FSB and running synced with the processor, there is less
and less reason for dual channel.
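The arithmetic above can be sketched in a few lines. This is only a rough model, assuming the usable bandwidth is simply the smaller of the FSB peak and the combined memory peak; the function names are made up for illustration, and real-world numbers will come in lower than these peaks.

```python
def peak_mb_per_s(clock_mhz: float, transfers_per_clock: int, bus_bytes: int = 8) -> float:
    """Theoretical peak in MB/s: clock x transfers per clock x bus width in bytes."""
    return clock_mhz * transfers_per_clock * bus_bytes


def usable_mb_per_s(fsb_mb: float, dimm_mb: float, channels: int) -> float:
    """Assumption: the CPU sees the slower of the FSB and the memory channels."""
    return min(fsb_mb, dimm_mb * channels)


athlon_fsb = peak_mb_per_s(200, 2)  # DDR bus, 64 bit: 3200 MB/s
p4_fsb = peak_mb_per_s(200, 4)      # quad pumped bus, 64 bit: 6400 MB/s
pc3200 = peak_mb_per_s(200, 2)      # one PC3200 DIMM: 3200 MB/s

for name, fsb in (("Athlon", athlon_fsb), ("P4", p4_fsb)):
    for channels in (1, 2):
        print(name, channels, "channel(s):", usable_mb_per_s(fsb, pc3200, channels), "MB/s")
```

In this model the second channel changes nothing for the Athlon, because the FSB is already saturated by one DIMM, while the P4 figure doubles. Plugging in DDR266 sticks (133MHz) against a 200MHz Athlon FSB shows the case where dual channel does help.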

Dual channel on an Athlon is good for allowing simultaneous AGP
texture transfer, or for allowing PCI DMA transfers to happen at
the same time as the processor is accessing the memory. It is also
helpful if the Northbridge has integrated graphics, and uses main
memory for frame buffer and texture memory. And, dual channel memory
also allows more sticks to be installed on a motherboard, before
the buses become overloaded and need to have the clock rate reduced.

With video cards having decent sized video memory now, AGP texture
transfer is not a frequent kind of bus cycle. And PCI DMA, at about
100MB/sec, is not going to make a dent in a multi GB/sec interface.
Only a board with integrated Northbridge graphics is going to be
a winner with dual channel.

About the best display of a difference you might see, is to run
memtest86 from memtest.org. It has a bandwidth display in the upper
left hand corner of the screen. I think you'll find a difference
between the single and dual channel configs there. For most normal
uses, you'll see virtually no application difference between the
two test cases. (I.e., even if there were a 5% difference in bandwidth,
the application difference would be about 1.5%.)

As for the theoretical limit versus the practical limit: the command
bus is SDR and the data bus is DDR. From one command to another,
there are internal limits in the memory (the memory timing numbers)
that prevent commands from coming back to back. All of the time spent
setting up the memory for a transfer is time not spent
transferring data, and so those "dead" cycles represent inefficiency.
One of the biggest efficiency killers is "command rate" or for the
Nforce2 folks, "Command per clock (CPC)" mode. If a memory channel
is heavily loaded (two or more double sided DIMMs being typical),
the address bus begins to fail on setup time. The solution, which
in many cases is not a configurable BIOS option, is to run in
2T mode (AKA "CPC off" mode). What happens is, the address bus is
driven for two clock cycles, but the info on the bus is only
strobed on the second cycle. This gives 1+ cycles of setup time
for the info on the address bus, and solves the loading problem.
But it also adds a whole wasted cycle every time a command is sent.
On an Athlon64 system, setting command rate at 2T will chop
1000MB/sec off the Sandra benchmark, a 20% or so hit (I'm going
from memory here, and don't want to trace down a reference for this
- try looking on Abxzone for more info).
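The effect of "dead" cycles can be illustrated with a toy model. The assumptions here are loud ones: each burst moves one 64-byte line (8 transfers on a DDR bus, so 4 clocks of data), every burst pays its dead cycles in full, and nothing is overlapped. Real memory controllers pipeline commands, so treat the output as an illustration of the principle rather than a prediction for any chipset.

```python
BURST_CYCLES = 4  # 8 transfers on a DDR bus = 4 clock cycles of actual data


def bus_efficiency(data_cycles: int, dead_cycles: int) -> float:
    """Fraction of clock cycles that actually carry data (toy model, no overlap)."""
    return data_cycles / (data_cycles + dead_cycles)


# Each extra dead cycle per burst (for example the one added by 2T command
# rate) lowers the fraction of the theoretical peak that can be reached.
for dead in (0, 1, 2, 4):
    print(dead, "dead cycle(s):", round(bus_efficiency(BURST_CYCLES, dead), 2))
```

The point of the sketch is only that efficiency falls quickly when the data phase is short: a single extra cycle is a large fraction of a 4-cycle burst.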

If you want another puzzler to play with, I tried setting CAS to
2.5 or 3 on my A7N8X-E board, and got as close to identical
bandwidth numbers as you could ask for. If you have the option to
try that on your board, I'm curious whether your chipset does the
same thing or not. My suspicion is the Nforce2 chipset may not
actually support fractional data transfer cycles, and maybe it
actually only runs at CAS2 or CAS3, but not CAS2.5. I wouldn't
expect all chipsets that support Athlon to do that, so there is
another experiment for you to try. (I think I did that experiment
in dual channel mode, and maybe it behaves differently in single
channel mode. As I've put the board away for the time being, it
may be a few days before I can try that again.)

Paul
 
Thanks, Paul, for a great explanation!

Another question/observation if I may. As I understand it, at any
given moment, the CPU (depending on its size and the job at hand)
actually has in its L1 or L2 cache approximately 90% of what it needs
to do its job. I don't recall how many times or where I've learned
that. Perhaps it isn't true. I don't know. If it's true, doubling
RAM speed might only make a difference to the remaining 10%, and
would at best halve that--a gain of about 5%.
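A rough way to put numbers on that reasoning is Amdahl's law. The sketch below assumes, purely for illustration, that the "10% not in cache" translates into 10% of run time spent waiting on RAM; that mapping is an assumption, not something measured in this thread.

```python
def overall_speedup(memory_fraction: float, memory_speedup: float) -> float:
    """Amdahl's law: only the memory-bound fraction of run time gets faster."""
    return 1.0 / ((1.0 - memory_fraction) + memory_fraction / memory_speedup)


# Assumed: 10% of run time is memory-bound, and RAM becomes twice as fast.
print(round(overall_speedup(0.10, 2.0), 3))  # 1.053
```

So under that assumption, doubling memory speed buys roughly a 5% overall gain, which is in line with the "half of 10%" estimate.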

What I do know is this: at work I've got both a P4 2000 (400 MHz FSB)
and an Athlon 2000+ on a 133MHz FSB. With equal RAM, OS and
some of the fastest hard disks as of a year ago--the WD WD800JB 8MB
cache model--I found very similar performance while compiling/building
a large Visual Studio .NET 2003 project (I followed identical OS and
software procedures). They both took between 12 and 13 minutes to
build the project, and CPU use hovered at around 95 percent during the
builds on both of them. I don't even remember specifically which was
faster at the moment--I just remember thinking that the difference
really was not very significant, and that's with the P4 also having
the benefit of 2X the cache.

My point is that in this particular real world application that is
extremely CPU and RAM intensive, high tech high speed RAM and high FSB
chipset essentially meant nothing, and the AMD CPU was half the price.

I feel like I'm missing something here.

Dave in Colorado
 
I was looking for some reviews of the A7V880 and I like your
conversation.
I have an A7V266 and I have the opportunity to swap the motherboard for
an A7V880 for a little money. I have a lot of programs installed on my
computer and I would hate to reinstall them. I work in graphic and
web design. I need your opinion.

Do I go for it?

TKS
 