AMD vs Intel - GHz & performance question

  • Thread starter: Big McLargehuge

Big McLargehuge

This may seem like a stupid question, but please humor me.

I currently have a rig with a P4 2.8GHz CPU.

I'm thinking of getting a new machine and relegating my current one to
secondary duties and am looking at AMD instead of Intel.

I got my Intel CPU about 2 years ago, and the newer AMD CPUs I'm
looking at run around the same clock speed. I've heard that AMD runs
better than Intel at similar clock speeds, and I want to get a
relatively high end CPU since I'm a big gamer.

Is an $800 AMD CPU that clocks in at 2.8GHz actually that much faster
than my old P4 2.8? It seems like a lot of money for the same speed,
although I realize that there are other differences than just clock speed.

Can anyone explain to me what all the differences are that support the
claims (and price) of the AMD promoters? This is not a loaded
question, I'm seriously considering buying AMD (a dual core, actually)
and want to make an informed decision.

Thanks.
Big
 
This may seem like a stupid question, but please humor me.

I currently have a rig with a P4 2.8GHz CPU.

I'm thinking of getting a new machine and relegating my current one to
secondary duties and am looking at AMD instead of Intel.

I got my Intel CPU about 2 years ago, and the newer AMD CPUs I'm
looking at run around the same clock speed. I've heard that AMD runs
better than Intel at similar clock speeds,

This is very true.

and I want to get a
relatively high end CPU since I'm a big gamer.

In that case an AMD chip is DEFINITELY what you should be looking at.
Games are one area where AMD has Intel very solidly beat.

Is an $800 AMD CPU that clocks in at 2.8GHz actually that much faster
than my old P4 2.8?

Yes, it is quite substantially faster, despite the same clock speed.

It seems like a lot of money for the same speed,
although I realize that there are other differences than just clock speed.

Can anyone explain to me what all the differences are that support the
claims (and price) of the AMD promoters? This is not a loaded
question, I'm seriously considering buying AMD (a dual core, actually)
and want to make an informed decision.

There are MANY factors that determine the performance of a
processor, clock speed being only one of them. Other important
factors include the size and speed of cache (both L1 and L2), the
number of fetch+decode units and execution units in a chip, the length
of the pipelines and the ability of the chip to keep those pipelines
full. Another important factor is the bandwidth and (especially)
the latency of the memory controller. The integrated memory
controller of the Athlon64 helps a LOT considering the relative speed
of memory vs. CPUs (consider that on a 2.8GHz Athlon64 you're probably
looking at waiting about 130-150 clock cycles for memory, while on a
2.8GHz P4 you're waiting 200-250 clock cycles for the same data, i.e. a
long time in CPU terms).
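
To put rough numbers on that, here's a quick back-of-envelope sketch.
The latency figures (about 50 ns with an integrated controller, about
80 ns through a northbridge) are my own ballpark assumptions, not
measured values:

```python
# Rough sketch: cycles a CPU waits for memory = latency (ns) x clock (GHz),
# since a 2.8 GHz chip completes 2.8 cycles per nanosecond.

def stall_cycles(latency_ns: float, clock_ghz: float) -> float:
    """Clock cycles spent waiting on one memory access."""
    return latency_ns * clock_ghz

CLOCK_GHZ = 2.8  # the clock speed discussed above

# Assumed latencies (illustrative, not measured):
print(stall_cycles(50, CLOCK_GHZ))  # ~140 cycles: A64's on-die controller
print(stall_cycles(80, CLOCK_GHZ))  # ~224 cycles: P4 via the northbridge
```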

Generally speaking there are two main concepts of CPU design, the
"speed demon" design, a sort of narrow-and-fast approach, and the
"brainiac" design, a more slow-and-wide approach. Now all chips
incorporate some aspects of both, but the P4 definitely is more of the
former while the Athlon64 is more of the latter. The P4 clocks to higher
speeds but does a lot less per clock cycle. Athlon64 chips don't clock
as high but do much more per clock cycle.
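
A toy model of that tradeoff, with made-up IPC figures purely for
illustration (real sustained IPC varies wildly by workload):

```python
# Sustained throughput is roughly (average instructions per cycle) x
# (clock frequency). Two hypothetical chips, one from each camp:

def giga_instructions_per_sec(ipc: float, freq_ghz: float) -> float:
    return ipc * freq_ghz

speed_demon = giga_instructions_per_sec(ipc=0.9, freq_ghz=3.8)  # narrow-and-fast
brainiac    = giga_instructions_per_sec(ipc=1.6, freq_ghz=2.2)  # slow-and-wide

print(f"speed demon: {speed_demon:.2f} GIPS")  # 3.42
print(f"brainiac:    {brainiac:.2f} GIPS")     # 3.52 -- comparable despite
                                               # a much lower clock
```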

In the end though, a lot of this is rather academic. What really
matters is how it runs your applications, and for that you want to
check out the benchmarks. There are lots out there, but here are a
few comparative tests:

http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2668&p=8

http://www.extremetech.com/article2/0,1697,1909484,00.asp

http://www.legitreviews.com/article/289/10/

http://www.xbitlabs.com/articles/cpu/display/athlon64-fx60_11.html


As you can see in these reviews, the 2.6GHz Athlon64 FX-60 is able to
rather consistently outperform the 3.46GHz Pentium Extreme Edition.
In fact, in the last test they tried overclocking the Intel chip up to
4.26GHz and the AMD chip at its stock 2.6GHz was still faster on 3
out of 4 gaming tests.
 
Big said:
Is an $800 AMD CPU that clocks in at 2.8GHz actually that much faster
than my old P4 2.8? It seems like a lot of money for the same speed,
although I realize that there are other differences than just clock speed.

An $800 AMD chip? Then that must mean that you're looking at either an
A64 FX-57 single-core, or an A64 X2 4800+ dual-core. Since you mentioned
2.8GHz, that would indicate that you're looking at the FX rather than
the X2, which runs at 2.4GHz (but it runs two cores at that speed).

Can anyone explain to me what all the differences are that support the
claims (and price) of the AMD promoters? This is not a loaded
question, I'm seriously considering buying AMD (a dual core, actually)
and want to make an informed decision.

Whoops, I should've read further! You're looking at a 2.8GHz dual-core?
That would likely mean you'll be looking at an X2 5000+, which hasn't
even come out yet, and when it does it will likely be closer to the
$1000 mark.

As for why the AMDs do so much better at a given MHz? It's mostly
because the AMDs do more work per clock cycle than the P4s. That's why
Intel is phasing its P4s out, and eventually replacing them with
derivations of its Pentium-M mobile processor. The P-Ms are closer in
philosophy to the AMD processors, i.e. do more work per clock cycle.

Another factor in AMD's performance is something they call Direct
Connect Architecture. DCA is a quick way to describe both its integrated
memory controller and its Hypertransport bus. Intel chips use a single
connection called the Front-Side Bus (FSB) to connect all of the memory
and peripherals to the processor. It's simple, but it's also prone to
congestion, with so much data coming over one link. AMD replaced the
FSB with DCA. The peripherals connect through a point-to-point link
called Hypertransport, instead of a shared bus. The memory connects
through its own memory controller, instead of through a chipset which
then connects to the FSB. Basically, a lot of AMD's success is
attributable to DCA.
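
Here's a toy model of that congestion argument. All bandwidth and
traffic numbers below are illustrative assumptions, just to show the
shape of the problem:

```python
# Shared FSB: memory traffic and I/O traffic cross the same link.
# DCA: memory uses the on-die controller, I/O uses HyperTransport.

fsb_capacity = 6.4   # GB/s, e.g. an 800 MT/s x 8-byte bus (assumed)
mem_traffic  = 5.0   # GB/s of memory traffic (assumed workload)
io_traffic   = 2.0   # GB/s of I/O traffic (assumed workload)

# One shared link must carry everything:
print(f"FSB load: {(mem_traffic + io_traffic) / fsb_capacity:.0%}")  # 109% -> congested

# Separate paths each carry only their own traffic:
print(f"memory controller load: {mem_traffic / 6.4:.0%}")  # 78%
print(f"HT link load:           {io_traffic / 4.0:.0%}")   # 50%
```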

Yousuf Khan
 
As for why the AMDs do so much better at a given MHz? It's mostly
because the AMDs do more work per clock cycle than the P4s. That's why
Intel is phasing its P4s out, and eventually replacing them with
derivations of its Pentium-M mobile processor.

No it's not. Intel is phasing out the Pentium 4 because increasing the
clock rates increased the thermal and power requirements too much. The
fact that their replacement has more emphasis on IPC is a result of
that, not the cause.

The P-Ms are closer in
philosophy to the AMD processors, i.e. do more work per clock cycle.

It's funny how you try to make this situation sound like Intel is
following in AMD's footsteps when, ironically, Intel is simply
returning to prior successes (the P6 and its derivatives). Moreover,
the philosophy behind AMD's processors is 'balanced design', not
brainiac (although it is all relative).

The only CPUs that really focus on doing the most work per cycle are
Itaniums. One might also consider the POWER5 in that category, but it
does have a rather long pipeline.
Another factor in AMD's performance is something they call Direct
Connect Architecture. DCA is a quick way to describe both its integrated
memory controller and its Hypertransport bus. Intel chips use a single
connection called the Front-Side Bus (FSB) to connect all of the memory
and peripherals to the processor. It's simple, but it's also prone to
congestion, with so much data coming over one link. AMD replaced the
FSB with DCA. The peripherals connect through a point-to-point link
called Hypertransport, instead of a shared bus. The memory connects
through its own memory controller, instead of through a chipset which
then connects to the FSB. Basically, a lot of AMD's success is
attributable to DCA.

Hypertransport really is only worth about a 10% performance gain for
single socket systems...nobody has really offered proof otherwise.
Especially since for single socket systems the FSB is only used for
memory and I/O...just like HT.



To answer the OP's question:

1. Use an AMD system, they are much better for gaming.
2. The performance of a CPU is basically a result of how much work it
does per cycle (IPC) and how many cycles per second (frequency).

Basically, AMD's designs execute more instructions per cycle (IPC) than
Intel's desktop designs. The reasons why are mostly related to
pipeline depth and a number of other factors that aren't really
relevant to a purchasing decision. Intel chose a while back to pursue
higher frequencies, and therefore sacrificed IPC. This strategy relied
on being able to keep ahead of AMD in frequency by a substantial
amount. Unfortunately, problems with heat forced Intel to stop
increasing the speed/frequency of their chips...consequently, AMD has
the highest performance desktop chips.
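
One way to see what those benchmark results imply: since performance is
roughly IPC x frequency, the slower-clocked chip's per-clock advantage
has to exceed its clock deficit. A quick check using the clock speeds
quoted earlier in the thread:

```python
# Minimum IPC advantage a slower-clocked chip needs just to break even
# with a faster-clocked one (performance ~ IPC x frequency).

def required_ipc_ratio(fast_ghz: float, slow_ghz: float) -> float:
    return fast_ghz / slow_ghz

print(required_ipc_ratio(3.46, 2.6))  # ~1.33: FX-60 vs. Pentium EE
print(required_ipc_ratio(4.26, 2.6))  # ~1.64: vs. the overclocked P4
# The FX-60 winning those tests implies it does >33% (and in the
# overclocked case >64%) more work per clock on those workloads.
```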

David
 
David said:
No it's not. Intel is phasing out the Pentium 4 because increasing the
clock rates increased the thermal and power requirements too much. The
fact that their replacement has more emphasis on IPC is a result of
that, not the cause.

You may quibble if you wish.
It's funny how you try to make this situation sound like Intel is
following in AMD's footsteps when, ironically, Intel is simply
returning to prior successes (the P6 and its derivatives). Moreover,
the philosophy behind AMD's processors is 'balanced design', not
brainiac (although it is all relative).

It was a philosophy that Intel *used* to follow, and then it abandoned
in favour of the Pentium 4. It was forced to readopt the philosophy due
to its competition.
The only CPUs that really focus on doing the most work per cycle are
Itaniums. One might also consider the POWER5 in that category, but it
does have a rather long pipeline.

Wonderful. Anyways, "brainiac" was your interpretation. All I ever said
was "higher IPC".

Hypertransport really is only worth about a 10% performance gain for
single socket systems...nobody has really offered proof otherwise.
Especially since for single socket systems the FSB is only used for
memory and I/O...just like HT.

Hypertransport is not used for memory, just I/O; the integrated memory
controller is not part of the HT system. Well, in multi-socket systems,
the HT is kind of used for memory when two processors share cache
contents with each other, but that's really just part of interprocessor
communications, not memory per se.

However, there are some well-known areas where HT has helped even in
single processor systems. That would be the situation when you're using
Nvidia's SLI dual-graphics. It's been shown that you gain more
performance when going to SLI with AMD systems. I believe the percentage
increases are between 20-40% in Intel systems, whereas it's between
60-70% in AMD systems, comparing Nvidia's own Nforce chipsets against
each other.
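
For context, a second card can at best double performance, so those
gains translate into scaling efficiency like this (using the
percentages quoted above):

```python
# A reported SLI gain of G% is G/100 of the ideal +100% (2x) speedup.

def scaling_efficiency(gain_pct: float) -> float:
    return gain_pct / 100.0

for platform, gain in [("Intel, low", 20), ("Intel, high", 40),
                       ("AMD, low", 60), ("AMD, high", 70)]:
    print(f"{platform}: +{gain}% -> {scaling_efficiency(gain):.0%} of ideal")
```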

Yousuf Khan
 
It's been shown that you gain more
performance when going to SLI with AMD systems. I believe the percentage
increases are between 20-40% in Intel systems, whereas it's between
60-70% in AMD systems, comparing Nvidia's own Nforce chipsets against
each other.

Yousuf Khan

Why the big difference?... Is it the AMD64 on-die memory controller?...
and if AMD puts PCI-e on the CPU (a rumor?) like they did with the memory
controller, would an SLI setup get closer to 80%-90%, or would it just
free up the PCI-e bus so it runs more hardware more smoothly?

EdG
 
EdG said:
Why the big difference?... Is it the AMD64 on-die memory controller?...
and if AMD puts PCI-e on the CPU (a rumor?) like they did with the memory
controller, would an SLI setup get closer to 80%-90%, or would it just
free up the PCI-e bus so it runs more hardware more smoothly?

I don't know why the big difference. It's not because of the onboard
memory controller, as it's not involved here. All SLI GPU activities are
done directly on video memory itself. System memory is mostly only used
on low-end integrated video systems, certainly not high-end SLI systems.
If I had to guess, I would say it's because the two video cards in an
SLI system use the CPU as a sort of router between them, keeping the two
of them synchronized. There might be enough data here to be a
significant load on the FSB.

If AMD integrates PCI-e right into the CPU, then I would expect the
increase in performance to be much higher, as it would get rid of the
entire step of converting between PCI-e and HT and back.

Yousuf Khan
 
Yousuf said:
You may quibble if you wish.

Why thank you for that privilege.
It was a philosophy that Intel *used* to follow, and then it abandoned
in favour of the Pentium 4. It was forced to readopt the philosophy due
to its competition.

Part of it was competition, but another part was simply pragmatism.
Nobody doubts that 65W CPUs are easier and cheaper to deal with than
130W ones...
Wonderful. Anyways, "brainiac" was your interpretation. All I ever said
was "higher IPC".


Hypertransport is not used for memory, just I/O; the integrated memory
controller is not part of the HT system. Well, in multi-socket systems,
the HT is kind of used for memory when two processors share cache
contents with each other, but that's really just part of interprocessor
communications, not memory per se.

Sorry I meant the memory controller...brain malfunction there.
However, there are some well-known areas where HT has helped even in
single processor systems. That would be the situation when you're using
Nvidia's SLI dual-graphics. It's been shown that you gain more
performance when going to SLI with AMD systems. I believe the percentage
increases are between 20-40% in Intel systems, whereas it's between
60-70% in AMD systems, comparing Nvidia's own Nforce chipsets against
each other.

That's not due to HT. That performance gap is largely due to the fact
that NVIDIA only recently started making Intel chipsets, while they've
been doing AMD chipsets for around 5 years. You'll never be able to
figure out how much of a benefit you get from HT on its own. At least
with a memory controller, you have a chance, since you can guess the
latency without the controller.

DK
 
David Kanter wrote:

It was a philosophy that Intel *used* to follow, and then it abandoned
in favour of the Pentium 4. It was forced to readopt the philosophy due
to its competition.

It's been said that Barrett was to blame for P4 with his "They buy the
Megahertz" remark - not sure if that is an exact quote or not.
Wonderful. Anyways, "brainiac" was your interpretation. All I ever said
was "higher IPC".

Looks like David hit the wrong target - I knew brainiac had appeared in
this thread... but it was Tony's characterisation. :-)
Hypertransport is not used for memory, just i/o; the integrated memory
controller is not part of the HT system. Well, in multi-socket systems,
the HT is kind of used for memory when two processors share cache
contents with each other, but that's really just part of interprocessor
communications, not memory per se.

However, there are some well-known areas where HT has helped even in
single processor systems. That would be the situation when you're using
Nvidia's SLI dual-graphics. It's been shown that you gain more
performance when going to SLI with AMD systems. I believe the percentage
increases are between 20-40% in Intel systems, whereas it's between
60-70% in AMD systems, comparing Nvidia's own Nforce chipsets against
each other.

Hmm, that's an odd one. Has nVidia lost the ability to do a memory
controller/graphics interface?... or did they just adopt the same one they
had in nForce2?... which was not that bad, as far as I'd noticed, but
things do move along. Or could it be that the AMD64 CPU/memory/HT
cross-bar is so much better?
 
It's been said that Barrett was to blame for P4 with his "They buy the
Megahertz" remark - not sure if that is an exact quote or not.

Ultimately he is, however folks like Louis Burns were also in the
management chain that signed off on Prescott and Tejas. The P4P was a
fine core, Prescott ran into problems (and is a very different core) as
did Tejas. A bunch of folks said that Prescott and Tejas were
basically nuts...and the management chose to ignore them.
Looks like David hit the wrong target - I knew brainiac had appeared in
this thread... but it was Tony's characterisation. :-)

Brainiac is a well-established term for MPUs that attain high
performance through a low clock rate and high IPC. The two most prominent
examples are PA-RISC and IPF. The problem is that over time, speed
demon and brainiac shift around.

Once upon a time, the Alpha was considered a pure speed demon, with a 6
(or 7) stage pipeline. Ironically, almost every Alpha outclocked the
K7 except at the end of its life. However, with MPUs like the
POWER4/5/6 and the P4, the Alpha seems more balanced.

I just don't see the K7/8 as being brainiacs; they are very much
middle-of-the-road designs, as I said.
Hmm, that's an odd one. Has nVidia lost the ability to do a memory
controller/graphics interface?... or did they just adopt the same one they
had in nForce2?... which was not that bad, as far as I'd noticed, but
things do move along.

It was their first time working with Intel's vintage FSB. HT is a lot
prettier (I suspect), so they probably were in rather uncharted waters.
Or could it be that the AMD64 CPU/memory/HT cross-bar is so
much better?

Doubt it. For single socket systems, it really matters very little.

DK
 
David said:
Why thank you for that privilege.

And let's not be unsure about what we meant by the term:
http://tinyurl.com/7fqcf
Part of it was competition, but another part was simply pragmatism.
Nobody doubts that 65W CPUs are easier and cheaper to deal with than
130W ones...

If it weren't for trying to stay ahead of the competition, Intel
wouldn't even have been trying to flirt with 4 GHz so quickly. We'd
likely be around the 2.5 Ghz mark right now with the P4, and plenty of
Watts to go before heat became a problem.
Sorry I meant the memory controller...brain malfunction there.

Okay.


That's not due to HT. That performance gap is largely due to the fact
that NVIDIA only recently started making Intel chipsets, while they've
been doing AMD chipsets for around 5 years. You'll never be able to
figure out how much of a benefit you get from HT on its own. At least
with a memory controller, you have a chance, since you can guess the
latency without the controller.

It's been at least a year since its announcement:

http://www.nvidia.com/object/IO_17070.html

Since that time, there's been one major upgrade (AMD side was still
using Nforce 3, while Intel version got released as Nforce 4; AMD side
didn't get Nforce 4 till a bit later), and probably countless stepping
upgrades on both sides. Nvidia had also started doing AMD chipsets when
AMD was still using its own FSB. And let's not forget that Nforce was
derived from the Xbox, which used a Pentium 3. So it's not as if
Nvidia didn't know how to handle an FSB. Still, the gap exists between
the implementations of SLI on AMD vs. Intel platforms.

You could say that Nvidia is partisan to AMD. That can only be proved
if Intel came out with a competing SLI chipset. But so far there's no
Intel chipset capable of SLI yet, despite the fact that Intel and
Nvidia swapped patents in that announcement above. So one should think
that an Intel SLI chipset should've emerged by now. I can only guess
the reason there isn't one now is because Intel can't get any better
performance out of their chipset's SLI than Nvidia can.

Yousuf Khan
 
You could say that Nvidia is partisan to AMD. That can only be proved
if Intel came out with a competing SLI chipset.

No, actually Jen-Hsun Huang already said it, what's good for AMD is
good for Nvidia.
But so far there's no
Intel chipset capable of SLI yet, despite the fact that Intel and
Nvidia swapped patents in that announcement above. So one should think
that an Intel SLI chipset should've emerged by now. I can only guess
the reason there isn't one now is because Intel can't get any better
performance out of their chipset's SLI than Nvidia can.

LOL. Do you realize how long it takes to design a chipset? Besides,
Intel is strictly in the high volume business for x86 chipsets; they
prefer to leave ultra high performance to their partners (IBM for
servers, possibly nvidia for desktop).

DK
 
No, actually Jen-Hsun Huang already said it, what's good for AMD is
good for Nvidia.


LOL. Do you realize how long it takes to design a chipset?

LOL, not nearly as long as a processor. Without a chipset a processor is
less than useful.
Besides, Intel is strictly in the high volume business for x86 chipsets;

Seems Intel's strategy leaves something to be desired.
they prefer to leave ultra high performance to their partners (IBM for
servers, possibly nvidia for desktop).

...or perhaps IBM (and possibly nVidia) have no other choice?
 
David Kanter said:
No, actually Jen-Hsun Huang already said it, what's good for AMD is
good for Nvidia.


LOL. Do you realize how long it takes to design a chipset? Besides,
Intel is strictly in the high volume business for x86 chipsets; they
prefer to leave ultra high performance to their partners (IBM for
servers, possibly nvidia for desktop).

DK

yes I realize how long it takes to design a chipset. Although it varies
somewhat with how much overtime the team can stand.

But what is the point of this SLI thing? Play games faster? Is that
really that lucrative a niche for Intel? They are selling the processors
anyway.

del
 
Ultimately he is, however folks like Louis Burns were also in the
management chain that signed off on Prescott and Tejas. The P4P was a
fine core, Prescott ran into problems (and is a very different core) as
did Tejas. A bunch of folks said that Prescott and Tejas were
basically nuts...and the management chose to ignore them.

Depends what you mean by "fine core" and P4P [I've seen Prescott called a
P4P xxx] - the original P4 was conceived with DRDRAM in mind and it showed.
It had several interesting innovations which didn't seem to pay off as
well as might have been expected. I was never strongly tempted to go
there.
Brainiac is a well-established term for MPUs that attain high
performance through a low clock rate and high IPC. The two most prominent
examples are PA-RISC and IPF. The problem is that over time, speed
demon and brainiac shift around.

Once upon a time, the Alpha was considered a pure speed demon, with a 6
(or 7) stage pipeline. Ironically, almost every Alpha outclocked the
K7 except at the end of its life. However, with MPUs like the
POWER4/5/6 and the P4, the Alpha seems more balanced.

Yes I know what brainiac means, but your use seemed to hint rather
strongly at someone else's (Yousuf's) claim of such in relation to
K8... and Tony had used the word.

Classification of IPF as brainiac seems dubious to me in that it abdicates
scheduling and parallelism to the compiler... a strategy I'm still baffled
by on several counts, the main one being: how the hell did they get
hardware folks to swallow delegating this power to software "jockeys"? :-)
I just don't see the K7/8 as being branaics, they are very much middle
of the road designs as I said.

The hoss is dead.
It was their first time working with Intel's vintage FSB. HT is a lot
prettier (I suspect), so they probably were in rather uncharted waters.

I don't see how the FSB figures here at all. Any interface between a video
card and a system depends for its performance on the ability to transfer
large amounts of data directly from main memory to local video memory by
DMA; with video, since AGP, there's not even any snooping of cache.
Another *vague* possibility might be a poor GART implementation... maybe
insufficient or slow lookaside cache for page tables. Another might be to
do with buffering of memory bursts destined for the video memory.
Doubt it. For single socket systems, it really matters very little.

Of course it matters, for the same DMA mentioned above: the crossbar is
used to switch between CPU<->memory and HT<->memory transfers. I'm not
sure how the clocks are distributed in the "northbridge" section of the
AMD64 CPUs but it's possible that things happen much faster than in the
memory transaction arbitration of an Intel compatible MCH... or, as
previously suggested, nVidia didn't do enough new work there?
 
David said:
No, actually Jen-Hsun Huang already said it, what's good for AMD is
good for Nvidia.

Well, regardless, why put out a high-end showcase product if you're
only interested in doing a half-assed job with it? In the end, you'd
just cut off one of your revenue streams.
LOL. Do you realize how long it takes to design a chipset? Besides,
Intel is strictly in the high volume business for x86 chipsets; they
prefer to leave ultra high performance to their partners (IBM for
servers, possibly nvidia for desktop).

I think there's not enough bandwidth on the FSB to make SLI work
properly on the Intel side. If anything, the Nforce platform on the
Intel side looks more powerful on paper: Nforce for Intel has got 40
PCI-e lanes vs. 36 for the AMD edition. Even with a 1066MHz FSB, it's
barely able to keep up with dual-channel DDR2 memory, let alone all of
that SLI data thrown into the mix too. And that's probably why Intel
itself hasn't bothered to make an SLI chipset: it knows it's a
hopeless case.
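
The peak-bandwidth arithmetic behind that claim, as a quick sketch
(peak figures only; sustained numbers are lower):

```python
# A quad-pumped 1066 MT/s FSB is 8 bytes wide; dual-channel DDR2-667
# is two 8-byte channels. Peak rates in GB/s:

fsb_peak  = 1066e6 * 8 / 1e9      # ~8.5 GB/s, shared by memory AND I/O
ddr2_peak = 2 * 667e6 * 8 / 1e9   # ~10.7 GB/s from memory alone

print(f"FSB peak:          {fsb_peak:.1f} GB/s")
print(f"dual-channel DDR2: {ddr2_peak:.1f} GB/s")
# Memory alone can already saturate the FSB before any SLI traffic
# is added to the same link.
```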

Yousuf Khan
 
yes I realize how long it takes to design a chipset. Although it varies
somewhat with how much overtime the team can stand.

But what is the point of this SLI thing? Play games faster? Is that
really that lucrative a niche for Intel? They are selling the processors
anyway.

Well, that's the major attraction for consumers. For some servers and
workstations, I could see it being useful. Certainly for some
rendering related stuff, more GPUs could be helpful. That's assuming
you're willing to deal with relatively imprecise data types.

I suspect a lot of this is for PR bragging rights. If your 3DMark
score is mostly determined by the GPU, and is sped up by having two of
them, that means that previously AMD would have had a significant
benchmarketing advantage.

DK
 
Del said:
But what is the point of this SLI thing? Play games faster? Is that
really that lucrative a niche for Intel? They are selling the processors
anyway.

It's more or less the same competition as dual-cores for CPUs, except
in the GPU market.

As for is it lucrative for Intel? Both Intel & AMD are investing a lot
of effort into the gaming market. Whether it's important for revenue
generation or marketing, that remains unclear. This category sells all
of those expensive Pentium EE/Athlon FX processors.

Yousuf Khan
 
YKhan said:
It's more or less the same competition as dual-cores for CPUs, except
in the GPU market.

Except it's a MUCH smaller niche. Even gamers think twice before
buying TWO expensive (and soon to be obsolete) video cards... For
what fraction of the market is one $400 video card not sufficient?
As for is it lucrative for Intel? Both Intel & AMD are investing a lot
of effort into the gaming market.

But SLI will remain a small niche of even that market.
Whether it's important for revenue
generation or marketing, that remains unclear. This category sells all
of those expensive Pentium EE/Athlon FX processors.

And what fraction of the market are they?
 
Del Cecchi said:
But what is the point of this SLI thing? Play games faster?

Sell expensive, profitable mobos and _two_ expensive, profitable video
cards instead of just one expensive, profitable video card. Not a
dime for Intel, which just might be why they're not in that market.
 