Is there any point to manipulating AGP voltage or bus speed?

  • Thread starter: pigdos

I've always wondered if the AGP voltage or AGP bus speed BIOS settings made
any difference whatsoever in gaming performance.
 
pigdos said:
I've always wondered if the AGP voltage or AGP bus speed BIOS settings made any difference whatsoever in gaming performance.

Raising AGP voltage did help video card stability in some cases when these high-powered video cards first started appearing, a year or two ago. Nowadays it's rare to hear of it mattering. I routinely leave mine a notch above default, though; it won't hurt anything.

rms
 
rms, is the AGP voltage the signalling voltage (for data) or the voltage used to power the card?
 
First of One, could you take a look at my post on "A question about modern GPU's"? I'd really appreciate your input.

BTW, what do you think of the Xbox 360's video graphics hardware? It would seem to me that, regardless of the high bandwidth to/from main memory in the Xbox 360 (700 MHz), the 500 MHz GPU would be a bottleneck. Does the Xbox 360 have things like geometry instancing or adaptive anti-aliasing? I've noticed the specs on the Xbox 360 are really vague. If it's based on the PowerPC architecture (not IBM's Power 5 architecture), I'll bet it's probably slower than any of the dual-core AMD Opterons or Athlon 64 X2's.
 
pigdos said:
First of One, could you take a look at my post on "A question about modern GPU's"? I'd really appreciate your input.

This is regarding the "dual-ported" memory interface? Real-world tests obviously haven't shown any performance gains going from AGP 4x to 8x, in this article that I reference often: http://www.sudhian.com/showdocs.cfm?aid=554
Today's video cards simply don't move enough data across the AGP bus to make a difference.
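
To put rough numbers on that (the per-frame figure below is a hypothetical illustration, not a measurement), here's a quick back-of-the-envelope sketch:

# Back-of-the-envelope: AGP moves 32 bits per transfer on a 66 MHz base clock,
# "pumped" 4x or 8x. The per-frame traffic number is made up for illustration.
base_clock_hz = 66e6
bus_bytes = 4                                  # 32-bit AGP data path

agp4x = base_clock_hz * 4 * bus_bytes / 1e9    # ~1.06 GB/s theoretical peak
agp8x = base_clock_hz * 8 * bus_bytes / 1e9    # ~2.11 GB/s theoretical peak

per_frame_mb = 5                               # hypothetical AGP traffic per frame
fps = 60
needed = per_frame_mb * fps / 1024             # ~0.29 GB/s actually required

print(f"AGP 4x: {agp4x:.2f} GB/s, AGP 8x: {agp8x:.2f} GB/s, needed: {needed:.2f} GB/s")

Either way the bus is loafing, so doubling its peak changes nothing.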

The GPU doesn't have to be "synchronized" to the AGP bus or to video RAM or
to anything, especially when the AGP bus is quad- or eight-pumped, while the
video RAM can be DDR, GDDR3, etc. Modern GPUs also have small internal
caches. On the 7800GTX, different portions of the GPU run at different clock
speeds. See: http://www.anandtech.com/video/showdoc.aspx?i=2479&p=3 Throw in
things like a crossbar- or "ring bus" memory interface, and it becomes very
difficult to look at "bottlenecking" from clock speed alone.
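
If you want to see why staring at raw clock numbers is misleading, here's a rough sketch (illustrative, published-spec-style figures, not measurements) of the effective transfer rate on a couple of differently-pumped interfaces:

# Effective data rate = clock x transfers-per-clock x bus width.
# Figures below are assumptions, roughly 7800GTX-512-class.
def gb_per_s(clock_hz, transfers_per_clock, bus_bits):
    return clock_hz * transfers_per_clock * (bus_bits / 8) / 1e9

agp_8x = gb_per_s(66e6, 8, 32)      # ~2.1 GB/s across the AGP bus
vram   = gb_per_s(850e6, 2, 256)    # ~54 GB/s from local GDDR3

print(f"AGP 8x: {agp_8x:.1f} GB/s, local video RAM: {vram:.1f} GB/s")
# The core clock doesn't "match" either of these; asynchronous FIFOs and the
# internal caches bridge the clock domains.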
BTW, what do you think of the Xbox 360's video graphics hardware?

I don't think the 360's hardware needs to be as powerful as PC hardware. According to http://www.anandtech.com/systems/showdoc.aspx?i=2610&p=8 , the 360 can output at 720p or 1080i, which means the max effective resolutions are 1280x720 or 1920x540: not that many pixels. Do they even test PC video cards below 1280x1024 any more?
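
The pixel math, just so it's concrete:

# Pixels per frame at each output mode (1080i is interlaced, so ~540 lines per field):
for name, (w, h) in {"720p": (1280, 720),
                     "1080i field": (1920, 540),
                     "1280x1024 PC": (1280, 1024)}.items():
    print(f"{name:>12}: {w * h:,} pixels")
# 720p ~921,600; a 1080i field ~1,036,800; 1280x1024 ~1,310,720.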
It would seem to me that, regardless of the high bandwidth to/from main memory in the Xbox 360 (700 MHz), the 500 MHz GPU would be a bottleneck.

There's 10 MB of embedded DRAM in a separate die on the GPU package. Suffice it to say it will act like a cache. Any performance loss arising from "mismatched" clocks (which I'm not convinced is an issue) will probably disappear anyway.
Does the Xbox 360 have things like geometry instancing or adaptive
anti-aliasing?

No idea, though seeing that both features are already functional on Radeon 9x00 hardware, I would expect them to be present in the 360. The GPU already integrates the system northbridge / memory controller and the TV encoder. Is geometry instancing even meaningful? There's 512 MB of unified RAM, and the CPU can only access the RAM *through* the GPU.
I've noticed the specs on the XBox 360 are really vague. If it's based on
the PowerPC architecture (not IBM's Power 5 architecture) I'll bet it's
probably slower than any of the dual core AMD opterons or Athlon 64 X2's.

We know it's based on PowerPC. If I remember correctly, the Xbox 360 demos shown at E3 last year were all running off Macs hidden under the table. Something like an Intel Core Duo probably would yield better performance, while downsizing that massive power brick. :-)

The choice of CPU was a business decision, because IBM was likely the only
one willing to sell the design to Microsoft with no strings attached. Same
story with ATi. Microsoft can get the chips produced in Taiwan for the
lowest cost possible, and do a die-shrink any time for continued savings
year after year.
 
That Sudhian article was testing on a relatively old, slow GPU (450 MHz). It shows how ignorant the authors were that they didn't realize this basic fact: registers are implemented as clocked D flip-flops, and you can't clock in data faster than the clock rate fed to the flip-flop, so feeding data faster than a GPU's registers can read it in is a waste. This is basic digital logic and design. I haven't read anything, anywhere, that contradicts that. So, of course, their tests would indicate there is no difference between AGP 4x and 8x. Sure, you can buffer data in some sort of higher-clocked FIFO, but if the GPU isn't reading that data out fast enough, that FIFO will fill up.
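
If you want to see what I mean, here's a toy sketch (made-up rates and FIFO depth, purely illustrative) of a producer pushing into a FIFO faster than the consumer drains it:

# Toy illustration: if the producer (the bus) pushes words in faster than the
# consumer (the GPU registers) clocks them out, the FIFO fills and the producer stalls.
from collections import deque

fifo, depth = deque(), 64              # hypothetical 64-entry FIFO
push_per_cycle, pop_per_cycle = 2, 1   # producer twice as fast as consumer
stalled_pushes = 0

for cycle in range(1000):
    for _ in range(push_per_cycle):
        if len(fifo) < depth:
            fifo.append(cycle)
        else:
            stalled_pushes += 1        # producer blocked: upstream bandwidth wasted
    for _ in range(pop_per_cycle):
        if fifo:
            fifo.popleft()

print(f"FIFO occupancy: {len(fifo)}/{depth}, stalled pushes: {stalled_pushes}")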

If you don't think mismatched clock rates are a problem, you have a lot to learn about digital logic and design. A lot of signalling lines (in PCI, ISA and AGP buses, as well as in memory buses, to say nothing of CPUs) are dedicated to data rate management. If it weren't a problem, these lines wouldn't exist.
 
pigdos said:
That Sudhian article was testing on a relatively old, slow GPU (450 MHz). It shows how ignorant the authors were that they didn't realize this basic fact: registers are implemented as clocked D flip-flops, and you can't clock in data faster than the clock rate fed to the flip-flop, so feeding data faster than a GPU's registers can read it in is a waste. This is basic digital logic and design. I haven't read anything, anywhere, that contradicts that. So, of course, their tests would indicate there is no difference between AGP 4x and 8x. Sure, you can buffer data in some sort of higher-clocked FIFO, but if the GPU isn't reading that data out fast enough, that FIFO will fill up.

You missed the point. AGP8x apparently offers no gains over AGP4x on the
FX5900, not because the GPU is the bottleneck, but because there simply
isn't much data moving across AGP to begin with, in most games.

On the flip side, even assuming the GPU can saturate the AGP interface to its absolute theoretical limit, the available bandwidth is still well over an order of magnitude less than that afforded by the local video RAM. If an application/game needs to move a lot of data across the AGP bus, it will be slow, period.
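
A rough sketch of why, using assumed peaks (AGP 8x theoretical limit and a 7800GTX-512-class memory bus) and a hypothetical working set:

# Assumed peaks, not benchmarks: AGP 8x vs a 256-bit GDDR3 bus at 1.7 GT/s.
agp8x_gb_s = 2.1
vram_gb_s = 54.4
working_set_mb = 256        # hypothetical data touched per second of gameplay

agp_ms  = working_set_mb / 1024 / agp8x_gb_s * 1000
vram_ms = working_set_mb / 1024 / vram_gb_s * 1000
print(f"moving {working_set_mb} MB: {agp_ms:.0f} ms over AGP 8x vs {vram_ms:.1f} ms from VRAM")

That gap is the reason games keep their working set resident in local memory in the first place.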
If you don't think mismatched clock rates are a problem, you have a lot to learn about digital logic and design. A lot of signalling lines (in PCI, ISA and AGP buses, as well as in memory buses, to say nothing of CPUs) are dedicated to data rate management. If it weren't a problem, these lines wouldn't exist.

Perhaps, but I think it's safe to assume the ATi and nVidia designers
know significantly more than you or me. :-) The GPU internal registers do
not interface with video RAM directly, but rather through a data cache.

If what you were saying were true, the ideal GPU design would only have 4-8 pipelines, built on a tiny die with a small transistor count, allowing it to be clocked much higher. It would also mean overclocking the memory on a FX5900 has no impact on performance. All the 6800NU owners who unmasked the 4 extra "pipelines" would see no gains, because the GPU still runs at the same clock speed. Obviously none of these scenarios are true.
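
To put a number on the 6800NU example (rough sketch; I'm assuming a stock core clock of around 325 MHz):

# Peak pixel throughput scales with parallelism as well as clock.
def fill_rate_gpix_s(pipelines, clock_mhz):
    return pipelines * clock_mhz * 1e6 / 1e9

print(f"12 pipes @ 325 MHz: {fill_rate_gpix_s(12, 325):.1f} Gpixel/s")
print(f"16 pipes @ 325 MHz: {fill_rate_gpix_s(16, 325):.1f} Gpixel/s")
# Same clock, ~33% more theoretical throughput after unmasking the extra quad.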

According to
http://www.anandtech.com/video/showdoc.aspx?i=2044&p=4 ,
the number of internal registers is highly variable from one GPU
architecture to the next. In fact, according to
http://www.beyond3d.com/reviews/ati/r420_x800/index.php?p=7
it scales with the number of quads. You cannot just assume the rate at which data is fed to the GPU depends on clock speed alone.
 
No, you are missing the point: registers in fact ARE implemented as D flip-flops, and the clock rate fed to those flip-flops, along with the bus width, determines the data rate they can clock in. There's no magical way around this fact. Maybe you should take a look at some basic digital logic and design textbooks and try to understand the concept of registers and their relationship to flip-flops and clocks. A GPU clocked at 400 MHz cannot clock in data faster than this fundamental rate. Clock rate and bus width determine the rate at which any GPU, CPU or MCU can consume data (its maximum data transfer rate)...that's it. Neither of your articles disputes this fact. Have you ever actually designed or implemented any bus designs in your life?
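
The arithmetic I keep pointing at is trivial (the numbers here are just an example, assuming a 400 MHz clock and a 256-bit path):

# Maximum rate at which a register bank can clock in data = width x clock.
core_clock_hz = 400e6
path_bits = 256                                   # hypothetical internal data path
peak_gb_s = core_clock_hz * (path_bits / 8) / 1e9
print(f"peak input rate: {peak_gb_s:.1f} GB/s")   # 12.8 GB/s
# Anything delivered faster than this, on average, just piles up in a buffer.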
 
By your logic, it would make no sense whatsoever to have the GPU clocked slower than the memory. Well, let's see... The original GeForce256 had a 120 MHz core and 166 MHz SDR memory, yet the switch to DDR memory instantly brought huge performance gains. Today's GeForce 7800GTX 512 runs at 550 MHz core, but the memory is 1.7 GHz. The increased latency of GDDR3 doesn't explain a clock speed three times as high! nVidia didn't specify such a high memory clock just to make the cards more expensive to produce. nVidia did it to improve performance.
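
The memory bandwidth arithmetic behind those examples (rough figures; if I recall correctly, the GeForce256 DDR actually shipped at a slightly lower 150 MHz base memory clock):

# Memory bandwidth = clock x transfers-per-clock x bus width. Rough numbers.
def mem_gb_s(clock_mhz, transfers_per_clock, bus_bits):
    return clock_mhz * 1e6 * transfers_per_clock * (bus_bits / 8) / 1e9

print(f"GeForce256 SDR : {mem_gb_s(166, 1, 128):.1f} GB/s")   # ~2.7 GB/s
print(f"GeForce256 DDR : {mem_gb_s(150, 2, 128):.1f} GB/s")   # ~4.8 GB/s
print(f"7800GTX 512    : {mem_gb_s(850, 2, 256):.1f} GB/s")   # ~54.4 GB/s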

There are enough real-world examples to make your theory look silly. No, I
haven't designed or implemented any bus designs in my life, nor do I intend
to in the future. Though it's clear you haven't designed anything remotely
approaching the complexity of a graphics card, either. Please think your
theories through before preaching basic textbooks like the holy grail.
 