Well I still have seen no real answer.
Yes, you have. You just can't recognize them as such. Because of that, I will
explain things in a "talking down" fashion, since treating you as someone
with a basic grasp of the concepts involved has not worked for anyone who
responded to you.
The basic 'wrong' in your question concerns what the CPU and the GPU are
designed to do. If you just look at the raw number of scalar computations a
CPU at 2-3 GHz and a GPU at ~500 MHz can do per second, the average GPU beats
the CPU every single time. GPUs are brutally efficient parallel signal
processors.
GPUs win for raw processing power, but they have limitations. First, GPUs
have a fixed feature set and are not general-purpose programmable processing
units like CPUs. This means there is no programmable flow control affecting
the Program Counter (PC), also known as the Instruction Pointer (IP). The
latest GPUs have dynamic branching and so on, but the input for each fragment
is still configured the same way, and the IP strictly just executes the same
series of instructions over and over (branching omitted). The fragment and
vertex processors actually deployed in the real world implement "branching"
by computing both paths and discarding the results of the path that was "not
taken". But these are just details; the important bit is that the GPU is
specialized in the kind of parallel computations general-purpose CPUs are
very poor at.
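To make the "compute both paths and discard one" idea concrete, here is a
rough sketch in plain C++ (the mask-and-select trick and all the names are
mine, purely for illustration; real shader hardware does not look like this
code):

// Sketch of "branching" by computing both paths and keeping one result.
// This is only an illustration of the idea, not real shader code.
#include <cstdio>

float shade(float x)
{
    // Both "branches" are evaluated for every fragment...
    float ifTaken    = x * 2.0f;        // path A
    float ifNotTaken = x * 0.5f + 1.0f; // path B

    // ...and a per-fragment mask selects which result survives.
    float mask = (x > 0.5f) ? 1.0f : 0.0f;
    return mask * ifTaken + (1.0f - mask) * ifNotTaken;
}

int main()
{
    printf("%f %f\n", shade(0.25f), shade(0.75f));
    return 0;
}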
CPUs do have SIMD instruction sets and the ability to handle multiple
elements of data simultaneously. Adding to the mix are details such as
pipelining, which means a number of instructions are "in flight"
simultaneously, and thus multiple elements of data are being processed at
once, but this is not the same thing as processing multiple elements of data
in parallel (the 'model' the software is written in is still serial).
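As a rough illustration of CPU SIMD (assuming an x86 compiler with
<xmmintrin.h>; the example is mine, not from any particular codebase): one
SSE instruction touches four floats at once, while the surrounding program is
still written and scheduled serially.

// One SSE add works on four floats in a single instruction,
// but the program around it is still a serial stream of instructions.
#include <xmmintrin.h>
#include <cstdio>

int main()
{
    float a[4] = { 1.0f, 2.0f, 3.0f, 4.0f };
    float b[4] = { 10.0f, 20.0f, 30.0f, 40.0f };
    float r[4];

    __m128 va = _mm_loadu_ps(a);     // load 4 floats
    __m128 vb = _mm_loadu_ps(b);     // load 4 floats
    __m128 vr = _mm_add_ps(va, vb);  // 4 additions in one instruction
    _mm_storeu_ps(r, vr);

    printf("%f %f %f %f\n", r[0], r[1], r[2], r[3]);
    return 0;
}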
Can I safely conclude it's non-deterministic... in other words people don't
know shit.
Oh, it is very deterministic and precise. It just so happens that there are
literally hundreds of different GPUs and CPUs out there, so giving you a
one-size-fits-all answer is practically impossible. But if we limit the
choices a little bit, given a budget of $500 for a given task, the GPU yields
superior bang for the buck for per-fragment computations and the CPU yields
superior bang for the buck for general-purpose programmability.
If you look only at raw computing power per cost unit, the GPU is an order of
magnitude 'superior' to the CPU. But since it can only do certain kinds of
jobs, CPUs are still going strong and remain a very important component of a
contemporary personal computer.
So the truth to be found out needs testing programs!
Nonsense.
Test it on P4
Test it on GPU
And then see who's faster.
A '****ing old' Voodoo2 graphics card will beat a P4 at rasterization for the
feature set the Voodoo2 supports. When we start doing work the Voodoo2 does
not support, the P4 will 'win' simply because the P4 can do things the
Voodoo2 was never designed to do in the first place.
If we simply talk about per-fragment computations, and with the latest
generation of GPUs geometry-related work as well, then the CPU has the chance
of a snowball in hell of beating the GPU. This is so obvious that graphics
programmers never even talk about the topic; there is NOTHING to talk about.
More and more work is being moved to the GPU as programmability steadily
increases. The nVidia GeForce 6800 generation hardware will be able to sample
from textures in the vertex shader, which will be a big feature for shader
programmers. But that is outside the scope of your question anyway.
Since I don't write games it's not interesting for me.
I do hope game writers will be smart enough to test it out
They don't really have to; it's Common Knowledge. And I don't mean that
virtually everyone takes it for granted because 'I've heard it from some
bloke who said so', but because it's such ****ing fundamental basics that it
is unavoidably established VERY early in anyone's incursion into GPU
programming TO BEGIN WITH.
Why?
I'll give a practical example. Take a very basic filter, let's say ONLY a
bilinear filter for sampling from textures. When you implement this on the
CPU, you basically have to write something like this:
color = color0 * weight0 + color1 * weight1 + color2 * weight2 + color3 * weight3;
Here color0 through color3 are the four color samples from a 2x2 block of the
texture, and weight0 through weight3 are computed from the fractional texture
coordinates in the horizontal and vertical directions. The same computation
can also be done with three linear interpolations, two in one dimension and
one in the other. The point is that the blending alone is four
multiplications and three additions PER COLOR COMPONENT, and we usually have
four components (red, green, blue and alpha). So for the color mixing alone,
a bilinear filter costs the CPU 28 arithmetic operations. That's not counting
how the texture coordinates are interpolated, how the weight factors are
computed from the fractional coordinates, and so on.
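For the curious, a rough CPU-side sketch of that bilinear fetch could look
like the following (the texture layout, struct and function names are made up
for illustration, and edge clamping is left out):

// Sketch of a CPU-side bilinear texture fetch. Not any particular API.
struct Color { float r, g, b, a; };

// Weighted blend of four samples: 4 muls + 3 adds per component,
// 28 arithmetic operations total for RGBA.
Color bilinear(const Color tex[], int width, float u, float v)
{
    int   x0 = (int)u,  y0 = (int)v;
    float fx = u - x0,  fy = v - y0;   // fractional coordinates

    const Color& c00 = tex[y0 * width + x0];            // 2x2 block
    const Color& c10 = tex[y0 * width + x0 + 1];        // (edge clamping
    const Color& c01 = tex[(y0 + 1) * width + x0];      //  omitted for
    const Color& c11 = tex[(y0 + 1) * width + x0 + 1];  //  brevity)

    float w00 = (1 - fx) * (1 - fy);   // weights from the fractions
    float w10 = fx       * (1 - fy);
    float w01 = (1 - fx) * fy;
    float w11 = fx       * fy;

    Color out;
    out.r = c00.r * w00 + c10.r * w10 + c01.r * w01 + c11.r * w11;
    out.g = c00.g * w00 + c10.g * w10 + c01.g * w01 + c11.g * w11;
    out.b = c00.b * w00 + c10.b * w10 + c01.b * w01 + c11.b * w11;
    out.a = c00.a * w00 + c10.a * w10 + c01.a * w01 + c11.a * w11;
    return out;
}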
The CPU is serial: it has to do all these computations one after another.
Superscalar execution and other factors bend this fact physically, but they
don't do magic. Now let's look at how the GPU hacks this problem.
The GPU is a cunning little ****er: it has its own dedicated transistors for
this work. When you want to sample from a texture, the hardware uses the
transistors allocated for the job and does the FILTERED lookup virtually for
free. Now, there is LATENCY in memory.. the results don't come out
immediately, but the GPU is processing pixels in parallel and knows which
pixel will be processed next, and so on. If the GPU pixel processor is
pipelined (you can bet your ****ing ass it is), it can do the FILTERED lookup
while other parts of the hardware are still processing the PREVIOUS pixel. By
the time the part of the hardware that wants the filtered color value needs
it, the value has arrived where it is needed. The key idea here is that the
FILTER "unit" in the chip does nothing but look up filtered color values for
the other parts of the chip.
This means the computations are DELAYED.. but it does not matter, because it
is not critical if the results arrive a little bit later. No one is any worse
off because of this arrangement, thanks to the nature of the work: the job is
to fill pixels with a certain color and to do a LOT of that computation in a
given time. The CPU, on the other hand, must do every single step of the job
as quickly as possible, because the next instructions rely on the previous
ones being completed (if the results are needed; if not, instructions can of
course be executed in a different order, which is why it is called
out-of-order execution). The key point here is that the computations have no
DEPENDENCIES on earlier computations. Each fragment is a unique entity and
only intra-fragment computations matter, hence it is possible to push the
design to speeds the CPU can only dream of for this kind of computational
work. Example follows.
Now, if you want to fill a 100-pixel triangle and each pixel takes
approximately 40 clock cycles to complete, we need 4000 clock cycles to fill
the triangle. This is very optimistic because memory latency will make the
situation MUCH worse, but let's give the CPU as much advantage as we can.
Now, let's look at how the GPU does shit. Let's assume it takes 40 clock
cycles per pixel for the GPU as well. Hell, let's give the GPU 200 clock
cycles per pixel (5x slower!!!). The GPU will still beat the CPU hands down,
even when it is 5 times slower. You know why?
This is MAGIC! Look closely:
The first 200 clock cycles are spent on the first pixel, then its color is
ready. But every clock cycle we can start work on the next pixel.. so we end
up with up to 200 pixels "in flight" (being processed) at once. Now, at cycle
200, we still have 99 pixels left to finish.. so the total time for our work
is 299 clock cycles. That is more than 13 times shorter than what the CPU
needed for the job, even though the cost per pixel was 5 times higher!
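If you want to check that arithmetic yourself, here is a trivial sketch using
the same numbers as above (this is only the back-of-the-envelope model from
this post, not a real GPU simulation):

// A serial processor pays the full per-pixel cost for every pixel;
// a pipeline pays the latency once and then retires one pixel per cycle.
#include <cstdio>

int main()
{
    const int pixels          = 100;
    const int cpuCyclesPerPix = 40;
    const int gpuLatency      = 200; // cycles until the first pixel is done

    int cpuTotal = pixels * cpuCyclesPerPix;  // 100 * 40 = 4000 cycles
    int gpuTotal = gpuLatency + (pixels - 1); // 200 + 99 =  299 cycles

    printf("CPU: %d cycles, GPU: %d cycles (%.1fx faster)\n",
           cpuTotal, gpuTotal, (float)cpuTotal / gpuTotal);
    return 0;
}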
OK, having 200 pixels execute simultaneously would be a GPU design that does
not really exist.. it would require 200 stages in the fragment pipeline,
which is nowhere near the truth.. but I used it as an example to demonstrate
how the GPU has an "unfair" advantage over the CPU for pixel work. The real
situation is closer to tens of stages than hundreds.. but each stage is very
fast and "free" because each stage has its own transistors doing the
computation. The CPU has only so many adders, multipliers, shifters and so on
that it can use simultaneously. The key principle of efficient CPU design is
to keep as many of those units busy simultaneously as possible. This is why
the Pentium PRO and later Intel processors break IA32 instructions into
internal micro-ops (PPRO - P3), and the Pentium4 goes even further: IA32 code
is translated dynamically on the fly into its own internal code. That
translation is very expensive, so the Pentium4 design team added an area of
the chip where the translated code is stored; it is called the "trace cache",
which you might have heard of.
This oversimplifies the situation A LOT, and I could write all day long
filling in the gaps to be more precise, for the sake of vanity and to avoid
persecution by my peers and colleagues, but since those who know who I am
know what I know, I don't quite see the point.
Now is your question satisfied?