PCI-Express over Cat6

David Schwartz · May 13, 2004

I'm not at all sure what point you're trying to make here.
Forgive me if I flounder around a bit. The graphics card
_does_ access main memory. AFAIK, for both 2D & 3D after
rendering in system RAM the CPU programs the GPU to do BM
DMA to load the framebuffer vram.

Most current graphics cards render in ram on the graphics card.
Therefore there is no need to DMA the data into the framebuffer, it's as
simple as changing a pointer for where the framebuffer is located in the
graphics card's RAM. This is true for all but the very cheapest graphics
systems today.
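
A minimal sketch of that pointer-flip idea, in Python purely for illustration. The `Card` class, its two buffers, and `flip()` are hypothetical names, not any real driver API. The only point is that presenting a finished frame swaps an index and copies no pixel data.

```python
# Hypothetical model of on-card double buffering (not a real driver API).
class Card:
    def __init__(self, width, height):
        # Two frame-sized buffers that both live in the card's own RAM.
        self.buffers = [bytearray(width * height * 4) for _ in range(2)]
        self.front = 0  # index of the buffer currently being scanned out

    @property
    def back(self):
        # The buffer the GPU renders into while the front one is displayed.
        return 1 - self.front

    def render(self, value):
        # Stand-in for real rendering: fill the back buffer with one byte value.
        buf = self.buffers[self.back]
        buf[:] = bytes([value]) * len(buf)

    def flip(self):
        # "Changing a pointer": no pixel data is copied, only the index swaps.
        self.front = self.back


card = Card(4, 4)
card.render(0xFF)
card.flip()
print(card.buffers[card.front][0])  # first byte of the frame now on display
```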

No-one in their right mind tries to get the CPU to read
the framebuffer. It is dead slow because vram is very busy
being read to satisfy the refresh rate. It is hard enough for
the GPU to access synchronously and this is what the multiple
planes and the MBs of vram are used for.

Right. Typically the CPU doesn't read the texture memory either and the
textures only cross the system memory or AGP bus once, to get loaded into
the graphics card's RAM. From there they are applied and rendered wholly on
the graphics card's internal bus.

My understanding is that in 3-D the advanced functions in
the GPU (perspective & shading) can handle quite a number of
intermediate frames before requiring a reload from system ram.
But it does require a reload. How's the graphics card gonna
know what's behind Door Number Three?

That I don't know the answer to. Can the graphics card say, "this item
is visible, I need more details about it"? I don't think so. I think the
decision of what might be visible is made by the main processor and it must
tell the graphics card about every object or that object will not be
rendered.

DS

KR Williams · May 14, 2004

Most current graphics cards render in ram on the graphics card.
Therefore there is no need to DMA the data into the framebuffer, it's as
simple as changing a pointer for where the framebuffer is located in the
graphics card's RAM. This is true for all but the very cheapest graphics
systems today.

Exactly. AGP was an idea that was obsolete by the time it was
implemented. Memory is *cheap*.

KR Williams · May 14, 2004

roo@try- said:
I think you might be getting caught out by the wheel of
reincarnation. You can put lots of stuff into hardware or software,
the fashion has changed over the years, but usually the high-end
stuff has gone hardware. These days even the low-end stuff is
going hardware - silicon is cheap.

Now for a why you might want a two way pipe to a GFX card.

In interactive applications you'll want to do stuff like collision
detection which can make use of a lot of data that the 3D pipe
munges. It makes sense to use that data rather than replicate the
work done by that very fast silicon dedicated to the job, right ?

No. It's faster to do the work twice. Remember processor
bandwidth on the graphics card is "free".

In comp.arch the question of GFX hardware doing double precision
floating point keeps popping up. There are people who want that
hardware to be doing stuff other than blatting pixels around.

Yeah, you have RM, who wants to do doubles on a graphics card so
he doesn't have to pay for an expensive supercomputer. That's
not what we're talking about.

My advice is to read the first few chapters of the OpenGL spec,
then try and dig up some papers about 3D hardware. The opengl.org
site has a lot of useful and above all accessible info, it has a
fair bit of example code too.

My suggestion is to look into the architecture of a modern PC
graphics system (check the bus, if need be). The graphics
card has more than enough memory to do all the rendering and
storage. There is *NO* reason the CPU has to have low-latency
access to the graphics card. That was the idea of AGP, but that
need went away before AGP was implemented.

KR Williams · May 14, 2004

Okay, then you tell me why things aren't rendered in memory and then
DMA'd to the graphics card.

Are you slow? They're "rendered" IN THE GRAPHICS CARD'S MEMORY.
Sheesh!

KR Williams · May 14, 2004

roo@try- said:
KR Williams wrote:

[SNIP]

Why don't you tell us why it's necessary, rather than spewing
some irrelevant web sites. The fact is that the graphics channel

OpenGL.org is hardly irrelevant with respect to 3D apps and
hardware. :/

...and your point?

No, the fact is : It isn't. I've given you some broad reasons
and I've given you some hints on where to start finding some
specifics.

Try looking at the hardware, instead of the software. The
hardware on the graphics card holds the memory needed to do the
rendering.

[SNIP]

"They" are. ;-) Though you're still wrong about the graphics
pipe. It really isn't latency sensitive, any more than humans
are.

As long as you consider sites like opengl.org to be irrelevant
you will continue to think that way regardless of what the
reality is.

I don't care a crap about software. I care about *HARDWARE*.
It's easier to put the hardware on the graphics card, so that's
where it is. AGP was a good idea, though five years late.

Robert Redelmeier · May 14, 2004

In comp.sys.ibm.pc.hardware.chips KR Williams said:
Exactly. AGP was an idea that was obsolete by the time it was
implemented. Memory is *cheap*.

OK, so stick the graphics card on PCI and free up that AGP
for a gigabit adapter. They normally saturate PCI around
35 MByte/s. Limited burst length prevents achieving the
theoretical PCI 33/32 throughput of 133 MB/s. Gigabit needs
125 MB/s each way.
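
The figures above can be checked with a few lines of arithmetic. This is a rough sketch: it computes peak numbers only, and the 33.33 MHz clock value is the nominal PCI clock, an assumption not stated in the post. The ~35 MByte/s sustained figure is an observation from the post and is not derived here.

```python
# Peak (not sustained) PCI 33/32 throughput versus gigabit Ethernet line rate.
PCI_CLOCK_HZ = 33_333_333      # nominal 33 MHz PCI clock (assumed value)
PCI_WIDTH_BYTES = 4            # 32-bit bus
GIGABIT_BPS = 1_000_000_000    # 1 Gbit/s in each direction

pci_peak_mb_s = PCI_CLOCK_HZ * PCI_WIDTH_BYTES / 1e6
gbe_mb_s = GIGABIT_BPS / 8 / 1e6

print(f"PCI 33/32 peak: {pci_peak_mb_s:.0f} MB/s")
print(f"GbE one way:    {gbe_mb_s:.0f} MB/s")
print(f"GbE both ways:  {2 * gbe_mb_s:.0f} MB/s")
```

Even at its theoretical peak, a shared 32-bit/33 MHz bus cannot carry gigabit traffic in both directions at once.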

-- Robert

KR Williams · May 14, 2004

OK, so stick the graphics card on PCI and free up that AGP
for a gigabit adapter. They normally saturate PCI around
35 MByte/s. Limited burst length prevents achieving the
theoretical PCI 33/32 throughput of 133 MB/s. Gigabit needs
125 MB/s each way.

Two reasons. Marketing: AGP is a tick-box for graphics. PCI is
anti-tick-box.

Why even bother? Put the GBE on the HT link (other side of the
bridge)! PCI is just sooo, 90s! ;-)

The little lost angel · May 14, 2004

Okay, then you tell me why things aren't rendered in memory and then
DMA'd to the graphics card.

Erm, I'm no expert on graphics card... seeing that I have no need for
the latest & greatest. But reading the usual webzines/sites on new
stuff generally gives me the idea that the processor nowadays handles
setting up each scene as objects in a 3D space and then shoots these
to the GPU. The GPU then figures out how to put textures and other
effects on the objects and renders the scene in a local buffer. Then it
displays out.

Used to be the CPU had to do a lot of this stuff, but then along came
3D GPUs, which started with the basics, then went on to do
Transform & Lighting, then pixel shading and so on (the latest "in"
thing seems to be Pixel Shader 3.0).

Which I think makes much more sense than rendering the whole scene by
the CPU, then storing it in main memory before shooting a chunk of
some 24Mbits of data per frame, for some erm 720Mbps across the
AGP/PCI bus to maintain a half decent 30FPS at 1024x768x32? Or doesn't
it?
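
A quick check of those numbers, as a sketch. It assumes "Mbit" is meant in the binary sense (2^20 bits), which is the reading under which the 24 and 720 figures come out exact.

```python
# Bandwidth needed to ship fully rendered frames from main memory to the card.
WIDTH, HEIGHT = 1024, 768
BITS_PER_PIXEL = 32
FPS = 30
MBIT = 2 ** 20  # assumption: binary megabits

bits_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL
bits_per_second = bits_per_frame * FPS
bytes_per_second = bits_per_second / 8

print(bits_per_frame / MBIT, "Mbit per frame")
print(bits_per_second / MBIT, "Mbit/s at", FPS, "FPS")
print(round(bytes_per_second / 1e6, 1), "MB/s (decimal)")
```

That works out to roughly 94 MB/s of one-way traffic, before any textures or commands are sent.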

Of course, being the village idiot in CSIPHC, I could be talking about
the wrong stuff in the wrong places altogether

pPpPp

--
L.Angel: I'm looking for web design work.
If you need basic to med complexity webpages at affordable rates, email me

Standard HTML, SHTML, MySQL + PHP or ASP, Javascript.
If you really want, FrontPage & DreamWeaver too.
But keep in mind you pay extra bandwidth for their bloated code

Rupert Pigott · May 14, 2004

KR said:
My suggestion is to look into the architecture of a modern PC

My suggestion is that you look at some APIs, implementations and
some code. I know that all you're trying to do is grind your Intel
and AGP suck axe, but it doesn't really mean shit if you don't
actually look at the APIs, hardware and the code.

Cheers,
Rupert

Rupert Pigott · May 14, 2004

KR said:
roo@try- said:

KR Williams wrote:

[SNIP]

Why don't you tell us why it's necessary, rather than spewing
some irrelevant web sites. The fact is that the graphics channel

OpenGL.org is hardly irrelevant with respect to 3D apps and
hardware. :/

...and your point?

No, the fact is : It isn't. I've given you some broad reasons
and I've given you some hints on where to start finding some
specifics.

Try looking at the hardware, instead of the software. The
hardware on the graphics card holds the memory needed to do the
rendering.

Meanwhile the CPU + main memory holds the program that tells the
graphics card what to do. Unfortunately it isn't always a case of
build display list and fire it at the GFX card, just one of those
"devil is in the detail" things.

[SNIP]

I don't care a crap about software. I care about *HARDWARE*.

Hardware is just exotic boat anchor if it's not running software.

OpenGL is a 3D API that drives 3D hardware.

It's easier to put the hardware on the graphics card, so that's
where it is. AGP was a good idea, though five years late.

I don't really get why you're grinding an axe against AGP to be
honest, it's just a faster and fatter pipe than stock PCI. No
big deal, and it does appear to make a difference, ask folks
who have used identical spec cards in PCI and AGP flavours.

Cheers,
Rupert

David Schwartz · May 14, 2004

KR Williams said:
Are you slow? They're "rendered" IN THE GRAPHICS CARD'S MEMORY.
Sheesh!

Yes, but *WHY*? Do you have a reading comprehension problem?

Let's start over. I was answering the question "Why wouldn't things be
rendered in memory and then DMA'd to the graphics card?". My answer was
"Because then the rendering process would be eating system memory
bandwidth". You said "Nope. You're thinking of AGP." So I said, "Okay, then
you tell me why things aren't rendered in memory and then DMA'd to the
graphics card".

So, if the answer "because then the rendering process would be eating
system memory bandwidth" is wrong, then please tell me *WHY* are they
rendered in the graphics card's memory? Why even have memory on the graphics
card at all?

Could it be because then the rendering process would be eating system
memory bandwidth? Just like I've been saying all along?!

DS

KR Williams · May 16, 2004

roo@try- said:
KR said:

roo@try- said:

KR Williams wrote:

[SNIP]

Why don't you tell us why it's necessary, rather than spewing
some irrelevant web sites. The fact is that the graphics channel

OpenGL.org is hardly irrelevant with respect to 3D apps and
hardware. :/

...and your point?

is amazingly unidirectional. THe processor sends the commands to
the graphics card and it does its thing in its own memory. AGP

No, the fact is : It isn't. I've given you some broad reasons
and I've given you some hints on where to start finding some
specifics.

Try looking at the hardware, instead of the software. The
hardware on the graphics card holds the memory needed to do the
rendering.

Meanwhile the CPU + main memory holds the program that tells the
graphics card what to do. Unfortunately it isn't always a case of
build display list and fire it at the GFX card, just one of those
"devil is in the detail" things.

Please, tell us more...

Hardware is just exotic boat anchor if it's not running software.

Software won't even anchor the boat if there is no hardware. ;-)

OpenGL is a 3D API that drives 3D hardware.

Really? No shit!

I don't really get why you're grinding an axe against AGP to be
honest, it's just a faster and fatter pipe than stock PCI. No
big deal, and it does appear to make a difference, ask folks
who have used identical spec cards in PCI and AGP flavours.

Oh, my! I've gone and insulted Rupert's sensibilities again.

Your logic is impeccable. AGP is faster, and wider(?) than PCI,
so it's god's (or Intel, same thing I guess) gift to humanity.
Good grief, you compare a stripped point-to-point connection (PCI
cut to the bone, actually) to a cheap PCI 32/33 *BUS*
implementation and then proclaim how wonderful it is. Sure AGP
is faster than the cheapest PCI implementation. Was that your
whole point?

KR Williams · May 16, 2004

roo@try- said:
My suggestion is that you look at some APIs, implementations and
some code. I know that all you're trying to do is grind your Intel
and AGP suck axe,

Intel, like M$, makes decisions not based on what will push the
market forward, rather what will shore up their end of the box
profits. Hate Intel? No. Axe? For brain-dead technology, yes.
AGP was designed as a super-UMA. As such, yes it sucks.

but it doesn't really mean shit if you don't
actually look at the APIs, hardware and the code.

Instead of telling people how smart you are, why don't you tell
me what, in the graphics pipe, needs low-latency to the
processor. Or you could just say, "I'm right you're wrong, go
look for the needle in the hay-stack". Oh, you did.

Rupert Pigott · May 17, 2004

KR said:
Instead of telling people how smart you are, why don't you tell
me what, in the graphics pipe, needs low-latency to the
processor. Or you could just say, "I'm right you're wrong, go
look for the needle in the hay-stack". Oh, you did.

I gave you some examples, you ignored them. I gave you some
references to look at, you ignored them. I don't really see
the point of writing a 2000 word essay on OpenGL hardware,
API and a specific algorithm when you're saying that OpenGL
is irrelevant.

You can lead a horse to water, but you can't make it drink.

*shrug*

Rupert Pigott · May 17, 2004

KR said:
removing-this.darkboong.demon.co.uk says...
[SNIP]

I don't really get why you're grinding an axe against AGP to be
honest, it's just a faster and fatter pipe than stock PCI. No
big deal, and it does appear to make a difference, ask folks
who have used identical spec cards in PCI and AGP flavours.

Oh, my! I've gone and insulted Rupert's sensibilities again.

Your logic is impeccable. AGP is faster, and wider(?) than PCI,
so it's god's (or Intel, same thing I guess) gift to humanity.

Not really. For me upping the framerate by ~20% made the difference
between a game being playable and it being unplayable. Not a big
deal in the world of rocket science, but that kind of thing matters
to a lot of folks who play games.

Good grief, you compare a stripped point-to-point connection (PCI
cut to the bone, actually) to a cheap PCI 32/33 *BUS*
implementation and then proclaim how wonderful it is. Sure AGP
is faster than the cheapest PCI implementation. Was that your
whole point?

In that case, yes. Where were the alternatives to AGP that would
have provided the extra bandwidth, yet kept the characteristics
required to maintain backward compatibility AND do all that at
a minimal price point for both the vendor and customer ? I didn't
see PCI Express or PCI-X leaping into the chipsets at the time.

As unclever or ugly as AGP may be, it has been an effective and
inexpensive solution for its vendors and customers.

Cheers,
Rupert

KR Williams · May 17, 2004

roo@try- said:
I gave you some examples, you ignored them. I gave you some
references to look at, you ignored them. I don't really see
the point of writing a 2000 word essay on OpenGL hardware,
API and a specific algorithm when you're saying that OpenGL
is irrelevant.

No, you didn't. You keep referring to the APIs, yet don't point
to anything specific. You don't teach anything with respect to
how these things affect performance. You can be as smug as you
wish, but...

You can lead a horse to water, but you can't make it drink.

...you lie, Rupert.

*shrug*

Indeed.

KR Williams · May 17, 2004

roo@try- said:
KR said:

removing-this.darkboong.demon.co.uk says...
[SNIP]

I don't really get why you're grinding an axe against AGP to be
honest, it's just a faster and fatter pipe than stock PCI. No
big deal, and it does appear to make a difference, ask folks
who have used identical spec cards in PCI and AGP flavours.

Oh, my! I've gone and insulted Rupert's sensibilities again.

Your logic is impeccable. AGP is faster, and wider(?) than PCI,
so it's god's (or Intel, same thing I guess) gift to humanity.

Not really. For me upping the framerate by ~20% made the difference
between a game being playable and it being unplayable. Not a big
deal in the world of rocket science, but that kind of thing matters
to a lot of folks who play games.

How much is that due to the faster pipe? ...and how much to what
AGP brings to the table? AGP brings nothing other than a faster
pipe.

In that case, yes. Where were the alternatives to AGP that would
have provided the extra bandwidth, yet kept the characteristics
required to maintain backward compatibility AND do all that at
a minimal price point for both the vendor and customer ? I didn't
see PCI Express or PCI-X leaping into the chipsets at the time.

Backwards compatibility? AGP was compatible with exactly what?
AGP was *designed* to simply allow the textures to be put in
system memory. A *very* bad idea. Indeed, perhaps AGP put off
better solutions many years.

As unclever or ugly as AGP may be, it has been an effective and
inexpensive solution for its vendors and customers.

Bad ideas are often pushed on the consumer hard enough that there
is no choice. I can think of many such bad ideas (some even
worse than UMA and AGP). WinPrinters and WinModems come to mind.
Intel was right in there on these too.

I may have a tough spot in my soul for Intel, dreaming for what
might have been (and technically possible), but you're a lackey
for what is. I'm quite sure you don't treat M$ so kindly for
*WHAT IS*.

Jason Ozolins · May 17, 2004

KR said:
Nope. 3-D is no different. AGP wuz supposed to make the
graphics channel two-way so the graphics card could access main
memory. Do you know anyone that actually does this? Please!
With 32MB (or 128MB) on the graphics card, who cares?

Read some of the "optimising your game for a modern 3D card"
presentations on the NVidia or ATI developer web sites. You want to
decouple the CPU from the graphics card as much as possible, to
eliminate "dead time" when the CPU waits for the card to finish
something, or the card waits for more data. The card has lots of RAM on
it, but the textures, vertex data, etc have to get into that RAM
somehow... and some applications have more texture or vertex data than
can efficiently fit into the card RAM. A 32MB card running at 1024x768,
with 24-bit colour, 8-bit alpha, 24-bit Z, 8-bit stencil, double
buffered, needs about 10MB of video RAM. Some games have more than 22MB
of total textures these days, and some vertex data is dynamically
generated for each frame. You need an efficient way to push the data up
to the card without forcing either the CPU or the card to wait.
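
The memory budget above can be reproduced with a short calculation. This is a sketch under stated assumptions: 4 bytes per pixel for colour plus alpha, 4 bytes per pixel for Z plus stencil, two colour buffers, and a single shared depth/stencil buffer. Real cards may add padding or extra buffers, which would account for the gap between this result and the "about 10MB" in the post.

```python
# Rough video memory budget for a 32MB card at 1024x768.
WIDTH, HEIGHT = 1024, 768
COLOUR_BYTES = 4          # 24-bit colour + 8-bit alpha
DEPTH_STENCIL_BYTES = 4   # 24-bit Z + 8-bit stencil
COLOUR_BUFFERS = 2        # double buffered: front + back
CARD_RAM_MIB = 32
MIB = 2 ** 20

pixels = WIDTH * HEIGHT
colour_bytes = pixels * COLOUR_BYTES * COLOUR_BUFFERS
depth_bytes = pixels * DEPTH_STENCIL_BYTES  # assumption: one shared buffer

used_mib = (colour_bytes + depth_bytes) / MIB
left_mib = CARD_RAM_MIB - used_mib

print(f"frame buffers: {used_mib:.0f} MiB")
print(f"left for textures and vertex data: {left_mib:.0f} MiB")
```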

Having the card do bus mastering allows the CPU to set up a big DMA ring
buffer for commands, which the card slurps from in a decoupled way, and
the card can then also slurp texture and vertex data from other memory
areas which are set up in advance by the CPU. There are special
primitives which allow the CPU to coordinate this bus mastering activity
so that they don't step on each other's data, while maintaining as much
concurrency as possible.
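
As a rough illustration of the command ring described above, here is a toy model in Python. `CommandRing`, `cpu_submit`, and `card_fetch` are hypothetical names, and the command strings are made up; real hardware uses memory-mapped head/tail registers and fences rather than Python objects. The sketch only shows how producer and consumer advance independently.

```python
class CommandRing:
    """Fixed-size ring: the CPU advances `head`, the card advances `tail`."""

    def __init__(self, size):
        self.slots = [None] * size
        self.head = 0  # next slot the CPU (producer) will write
        self.tail = 0  # next slot the card (consumer) will read

    def cpu_submit(self, command):
        nxt = (self.head + 1) % len(self.slots)
        if nxt == self.tail:
            return False  # ring full: the CPU must wait or do other work
        self.slots[self.head] = command
        self.head = nxt
        return True

    def card_fetch(self):
        if self.tail == self.head:
            return None  # ring empty: the card has nothing queued
        command = self.slots[self.tail]
        self.tail = (self.tail + 1) % len(self.slots)
        return command


ring = CommandRing(4)
for cmd in ("set_texture", "draw_triangles", "flip"):
    ring.cpu_submit(cmd)
print(ring.card_fetch())  # the card drains commands in submission order
```

One slot is deliberately left unused so that a full ring and an empty ring can be told apart without a separate counter.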

So that's the motivation for the card doing bus mastering. AGP brings
two extra things to the picture: higher speed than commodity PCI, and a
simple IOMMU, which gives the graphics card a nice contiguous DMA
virtual address space that maps onto (potentially) scattered 4K blocks
of memory.
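
The remapping idea can be sketched as a simple page-table lookup. In AGP this table is known as the GART (Graphics Address Remapping Table); the `Gart` class below and the physical addresses in it are invented for illustration and do not reflect any real chipset's register layout.

```python
PAGE_SIZE = 4096  # 4K blocks, as in the post


class Gart:
    def __init__(self, physical_pages):
        # table[i] = physical base address backing page i of the aperture
        self.table = list(physical_pages)

    def translate(self, aperture_offset):
        # Split the contiguous aperture address into page index and offset.
        page, offset = divmod(aperture_offset, PAGE_SIZE)
        return self.table[page] + offset


# Three scattered physical pages (made-up addresses) seen as one 12K range.
gart = Gart([0x7A000, 0x13000, 0x52000])
for addr in (0x0010, 0x1FFF, 0x2000):
    print(hex(addr), "->", hex(gart.translate(addr)))
```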

Sure, so why does the 3-D card want to go back to main memory,
again? The graphics pipe is amazingly one-directional. ...and
thus not sensitive to latency, any more than in human terms.

Exactly. By using bus mastering, you let the CPU and card work in
parallel, at the expense of increased latency for certain operations.
Reading back the frame buffer contents in a straightforward way (i.e.
with core OpenGL calls) is a really great way to kill your frame rate in
3D games, because you cause all the rendering hardware to grind to a
halt while the frame buffer data is copied back. The graphics card
vendors really, really want you to use their decoupled "give us a lump
of memory and we'll DMA the frame buffer data back when it's finished
baking, meanwhile keep feeding me data!" OpenGL extensions to do this.

-Jason

Rupert Pigott · May 17, 2004

KR Williams wrote:

[SNIP]

Backwards compatibility? AGP was compatible with exactly what?

Compatibility with pre-AGP software.

AGP was *designed* to simply allow the textures to be put in
system memory. A *very* bad idea. Indeed, perhaps AGP put off
better solutions many years.

If you consider putting shitloads of RAM onto the graphics card
a solution I don't think it slowed that down at all. What it did
enable was low-cost solutions *at the time it came out*, the kind
of solutions that would suit kiddies who would break their piggy
bank to play a game.

[SNIP]

I may have a tough spot in my soul for Intel, dreaming for what
might have been (and technically possible), but you're a lackey

OK, I'll bite. What might have been when AGP was first mooted ?

for what is. I'm quite sure you don't treat M$ so kindly for

In the context of this discussion your assertion of being a "lackey
for what is" is wrong anyway. It flatly ignores my preference which
is render into main memory and DMA the framebuffer to the RAMDAC.
Nice and simple, lots of control for the programmer. However I do
recognise this is not a good solution right now because of the way
the hardware is structured and the design trade-offs.

*WHAT IS*.

I never have liked MS stuff to be honest. Never liked x86s either,
but on the other hand Intel contributed heavily to PCI and on
balance I think that has been a valuable contribution to the
industry as a whole.

Cheers,
Rupert

KR Williams · May 18, 2004

Read some of the "optimising your game for a modern 3D card"
presentations on the NVidia or ATI developer web sites. You want to
decouple the CPU from the graphics card as much as possible, to
eliminate "dead time" when the CPU waits for the card to finish
something, or the card waits for more data. The card has lots of RAM on
it, but the textures, vertex data, etc have to get into that RAM
somehow... and some applications have more texture or vertex data than
can efficiently fit into the card RAM. A 32MB card running at 1024x768,
with 24-bit colour, 8-bit alpha, 24-bit Z, 8-bit stencil, double
buffered, needs about 10MB of video RAM. Some games have more than 22MB
of total textures these days, and some vertex data is dynamically
generated for each frame. You need an efficient way to push the data up
to the card without forcing either the CPU or the card to wait.

Exactly! Indeed, you're making my point! Any piss-ant grunge
card has 32MB of memory these days. Hell, even my 2-D card has
32MB! Memory is *CHEAP*.

Having the card do bus mastering allows the CPU to set up a big DMA ring
buffer for commands, which the card slurps from in a decoupled way, and
the card can then also slurp texture and vertex data from other memory
areas which are set up in advance by the CPU. There are special
primitives which allow the CPU to coordinate this bus mastering activity
so that they don't step on each other's data, while maintaining as much
concurrency as possible.

Good grief! I'm not arguing against bus-mastering. I *certainly*
am not in favor of PIO to the disk drive, much less graphics
card. You assume a *lot*.

So that's the motivation for the card doing bus mastering. AGP brings
two extra things to the picture: higher speed than commodity PCI, and a
simple IOMMU, which gives the graphics card a nice contiguous DMA
virtual address space that maps onto (potentially) scattered 4K blocks
of memory.

Good grief! That is *NOT* the issue at hand. Perhaps you want
to read the thread again?

My issue is simply AGP's founding principle; use of processor
storage to store textures. An *incredibly* dumb idea, rather
like its younger sibling UMA (for 2D).

The second issue here was my statement that the graphics pipe is
amazingly unidirectional, which Rupert took umbrage on, yet never
supplied any information other than "look at the APIs", which is
less than useful (and so typically arrogant). Show me where data
goes the *OTHER* way! I never said DMA wasn't a good idea.
Stupid implementation, sure. ;-)

The third issue was about latency. No one has shown me where
latency matters to the graphics pipe. (see the paragraph above).

It's obvious to anyone with a functioning brain cell that AGP
(note 'P' == "PORT") is faster than the more general and poorly
optimized (cheap-ass 32/33) general purpose PCI (*BUS*) with
random crappy devices attached to it.

My argument is that AGP was a bad idea from day one. Because it
is what is and the best we've got, doesn't change my opinion.
...and is rather an insulting argument. I do understand the
hardware.

Exactly. By using bus mastering, you let the CPU and card work in
parallel, at the expense of increased latency for certain operations.

Tell that to Rupert.

Reading back the frame buffer contents in a straightforward way (i.e.
with core OpenGL calls) is a really great way to kill your frame rate in
3D games, because you cause all the rendering hardware to grind to a
halt while the frame buffer data is copied back.

Ok.

The graphics card
vendors really, really want you to use their decoupled "give us a lump
of memory and we'll DMA the frame buffer data back when it's finished
baking, meanwhile keep feeding me data!" OpenGL extensions to do this.

Ok, this is perhaps where I'm missing the boat. Why does the
processor care about the frame buffer?

In any case I don't see why the processor cares about the latency
of the graphics subsystem (above human levels; msec numbers).
