Dual-core Opty benchmarks at TechReport

  • Thread starter: Rob Stow

Rob Stow

Lots of non-server benchmarks here:
http://techreport.com/reviews/2005q2/opteron-x75/index.x?pg=1

They put the dual-core 2.2 GHz Opty 175 and 275 through a bunch
of non-server tests. I expect AnandTech and XBitLabs to come up
with some server numbers very soon.

What impresses me is that a 2.2 GHz Opty 175 outperforms a pair
of 2.2 GHz Opty 248's in *everything*. Sometimes the margin is
small, but in other cases it is in the neighbourhood of 10%.
And, of course, it also beats out a single Opty 148.

The power consumption numbers are extremely impressive. For total
system power under load, an Opty 175 uses about 13 W less than
the Opty 148 and about 126 W less than a pair of Opty 248's.
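
A quick sanity check on what those two figures imply, as a minimal sketch: only the differences quoted above are used, no absolute wattages are assumed, and the function and constant names are made up for illustration.

```python
# Differences in total system power under load, as quoted above (watts).
# Only differences are given in the post, so absolute wattages stay unknown.
OPTY175_BELOW_OPTY148_W = 13    # Opty 175 system draws about 13 W less than the Opty 148 system
OPTY175_BELOW_DUAL_248_W = 126  # ... and about 126 W less than the dual Opty 248 system


def dual_248_above_single_148(d_148: float, d_dual_248: float) -> float:
    """Implied gap between the dual Opty 248 system and the single Opty 148 system.

    If P175 = P148 - d_148 and P175 = P2x248 - d_dual_248,
    then P2x248 - P148 = d_dual_248 - d_148.
    """
    return d_dual_248 - d_148


gap = dual_248_above_single_148(OPTY175_BELOW_OPTY148_W, OPTY175_BELOW_DUAL_248_W)
print(f"Dual Opty 248 system draws about {gap} W more than the single Opty 148 system")
```

In other words, going by those numbers, adding the second Opty 248 socket costs roughly 113 W under load, while the second core in the Opty 175 comes with no increase at all over the Opty 148 system.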
 
Rob said:
Lots of non-server benchmarks here:
http://techreport.com/reviews/2005q2/opteron-x75/index.x?pg=1

They put the dual-core 2.2 GHz Opty 175 and 275 through a bunch of
non-server tests. I expect AnandTech and XBitLabs to come up with some
server numbers very soon.

What impresses me is that a 2.2 GHz Opty 175 outperforms a pair of 2.2
GHz Opty 248's in *everything*. Sometimes the margin is small, but in
other cases it is in the neighbourhood of 10%.
And, of course, it also beats out a single Opty 148.

What is the explanation for that? The ability of both cores to access
the same first-level cache? Faster communication between processors?

I presume the architecture of the individual cores did not
change much in the multi-core implementation?

Regards,
Evgenij
 
What is the explanation for that? The ability of both cores to access
the same first-level cache? Faster communication between processors?

Access the same L1? Don't the dual-core Opterons even have separate L2s?

Perhaps the gain is because the dual-core Opterons are UMA, so they
don't have to access any memory across the HT links.
I presume the architecture of the individual cores did not
change much in the multi-core implementation?

One would suppose this, but there may be tweaks thrown in. Who knows,
and AMD isn't likely to reveal any details.
 
And AnandTech has some up already.
http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2397
They do complain a tad about having to use a Tyan S2895
"workstation" motherboard instead of a genuine server motherboard.

A 2P Opty 275 system does very well against 4-way 3.3 GHz and
4-way 3.6 GHz Xeon boxes in the AnandTech "preview". AMD has a
huge lead until Intel is ready with dual-core Xeons.
What is the explanation for that? The ability of both cores to access
the same first-level cache?

Both cores have their own L1 and L2 caches, but share the same
memory controller. The memory controller is supposedly better in
the dual core chips. In some tests there is also the possibility
of a helping hand from SSE3 in the dual core chips.
Faster communication between processors?

Yes. There is an on-die crossbar between the two cores. This
allows them to do things like cache-snooping without having to
leave the die or even to go through the memory controller.

Across a 1 GHz HT link, the L2 cache snooping latency between a
pair of Opty 148's is about 23 ns, IIRC. I wonder what the
latency is between the two cores in an Opty 175?
I presume the architecture of the individual cores did not
change much in the multi-core implementation?

Look at the schematics at the TechReport and AnandTech pages:
they are nothing new and have been available at the AMD web site
for several years.

Schematically, the dual core chips can be thought of as having
two cores, a crossbar, and a memory controller. (The AnandTech
review makes it a little more complicated than that.)

  Core One    Core Two
        Crossbar
    Memory Controller


However, the original AMD64 chips were designed for ease of
transition to future dual-core versions: they apparently already
had everything in place, waiting for "Core Two" to be tacked on:

  Core One
        Crossbar
    Memory Controller


The above, of course, is a gross oversimplification.
 
Keith said:
Access the same L1? Don't the dual-core Opterons even have separate L2s?

Separate L1 and L2. Allegedly improved memory controller shared
by both cores. And no need to use HT links for interprocessor
communications: cache snooping, etc., is done by an on-die crossbar.
Perhaps the gain is because the dual-core Opterons are UMA, so they
don't have to access any memory across the HT links.




One would suppose this, but there may be tweaks thrown in. Who knows,
and AMD isn't likely to reveal any details.

The E4 stepping of the single core AMD64 chips has an improved
memory controller over previous steppings and that memory
controller is supposed to be in all of the dual core chips.
Ditto for SSE3.
 
Access the same L1? Don't the dual-core Opterons even have separate L2s?

They do indeed have separate L2 caches (and certainly separate L1 as
well).
Perhaps the gain is because the dual-core Opterons are UMA, so they
don't have to access any memory across the HT links.

I think you're hitting on the main improvement right there.
One would suppose this, but there may be tweaks thrown in. Who knows,
and AMD isn't likely to reveal any details.

There are indeed tweaks thrown in there, and actually AMD did reveal
some details in a presentation a while back. It's not the sort of
thing that they're likely to have on their website anywhere, but if
you search enough you might be able to come up with it. In terms of
buzzword compliance, the new Opteron adds SSE3 support, which
potentially helps in a few media encoding and gaming tests.
 
Tony said:
There are indeed tweaks thrown in there, and actually AMD did reveal
some details in a presentation a while back. It's not the sort of
thing that they're likely to have on their website anywhere, but if
you search enough you might be able to come up with it. In terms of
buzzword compliance, the new Opteron adds SSE3 support, which
potentially helps in a few media encoding and gaming tests.

http://techreport.com/onearticle.x/6363
http://www.geek.com/news/geeknews/2004Mar/bch20040303024101.htm

New features
o Power reductions
o Speed improvements
o Lower power in Halt, Stopclock states
o SSE3 (Prescott New Instructions)
o Enhanced data prefetch
  - Negative stride, improved page crossing
o On-Die Thermal Throttling
o Additional write combining buffers
o Convert LEA -> ADD
o DRAM controller improvements
  - DDR400, managing open pages, 2T
 