What core speed were/are north bridges and/or MCH's clocked at?

  • Thread starter: pigdos
I've always been curious about this because these devices have to bridge
multiple types of buses.
 
I've always been curious about this because these devices have to bridge
multiple types of buses.

I'm pretty sure, back in the 440BX days, they used to run at the FSB base
clock speed internally, but now... it seems like it's a secret. :-) Judging by
what I'm hearing about the nForce 5xx series temperatures, the clocks have
just been ramped up again.
 
Because of DDR wouldn't the internal clock speed have to be at least double
the FSB base clock speed?

In the case of north bridges w/8x AGP interfaces it would seem to me that
the core clock speed would have to be at least 533MHz to be able to keep up
w/a 2.133GB/s data xfer rate.
 
Because of DDR wouldn't the internal clock speed have to be at least double
the FSB base clock speed?

I'd think so but remember the Intel FSB is quad clocked on data.
In the case of north bridges w/8x AGP interfaces it would seem to me that
the core clock speed would have to be at least 533MHz to be able to keep up
w/a 2.133GB/s data xfer rate.

With e.g. an Intel MCH, assuming an internal width of 64-bits, same as the
FSB, it'd have to run at 1066MHz to keep up with the latest FSB rates to
avoid adding latency... which would also match a dual channel DDR2 memory
controller at 533MT/s. Since the FSB interface and memory controller are
allowed to run non-clock locked, and considering strategies like read
around write etc, I'm not sure how that works internally... buffering?
Maybe one of the hardware guys can comment further on chips which handle
multiple time domains.
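For concreteness, the figures above can be sanity-checked with some back-of-the-envelope
arithmetic. A rough sketch in Python (purely illustrative; the 64-bit FSB/memory data
paths and the 32-bit AGP bus are the usual widths, assumed here rather than stated in
the thread):

    # Peak-bandwidth check (1 GB = 10^9 bytes, matching the 2.133GB/s convention)
    def bandwidth_gb_s(transfers_per_sec, bytes_per_transfer):
        return transfers_per_sec * bytes_per_transfer / 1e9

    # Quad-pumped FSB: 266MHz base clock x 4 = 1066 MT/s on a 64-bit (8-byte) bus
    fsb = bandwidth_gb_s(1066e6, 8)            # ~8.5 GB/s

    # Dual-channel DDR2 at 533 MT/s: two 64-bit channels
    ddr2_dual = bandwidth_gb_s(533e6, 8) * 2   # ~8.5 GB/s, matching the FSB

    # AGP 8x: 32-bit bus, ~66.67MHz base clock, 8 transfers per clock
    agp8x = bandwidth_gb_s(66.67e6 * 8, 4)     # ~2.13 GB/s, the figure quoted above

    print(f"FSB:       {fsb:.2f} GB/s")
    print(f"DDR2 dual: {ddr2_dual:.2f} GB/s")
    print(f"AGP 8x:    {agp8x:.2f} GB/s")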
 
George said:
With e.g. an Intel MCH, assuming an internal width of 64-bits, same as the
FSB, it'd have to run at 1066MHz to keep up with the latest FSB rates to
avoid adding latency... which would also match a dual channel DDR2 memory
controller at 533MT/s.

Not really, if you think about what 'quad pumping' or 'double pumping'
is, then it should become clear that the issue is not the data bus, but
the addressing bus.

You're right that running the core clock at a non-integer multiple of
an I/O will increase latency due to strange gearing ratios (i.e. it's
simple to run at 100MHz and support 2.5GHz, 2GHz, 2.3GHz...running at
115MHz and supporting those would be ugly).
Since the FSB interface and memory controller are
allowed to run non-clock locked, and considering strategies like read
around write etc, I'm not sure how that works internally... buffering?

Obviously there's quite a bit of buffering going on, and it grows as
the bandwidth of the IO grows.
Maybe one of the hardware guys can comment further on chips which handle
multiple time domains.

Certain parts of the chip run asynchronously, and you hope to hell that
the frequencies line up nicely as I said above.

If you think about the size of current chipsets, I/O controllers, etc.
etc. you will realize that a die that size at 2GHz would dissipate
vastly more heat than is reasonable. That alone should tell you that
the frequency is substantially lower than what you are guessing so far.

DK
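One rough way to put the "gearing ratio" point into numbers (a toy calculation only,
not how any real chipset schedules its clocks): if two related clocks start aligned,
their edges coincide again every 1/gcd of the two frequencies, so an awkward base like
115MHz stretches that interval out compared with 100MHz.

    from math import gcd

    def realignment_period_ns(freq_a_mhz, freq_b_mhz):
        # Time until the two clocks' rising edges line up again, assuming both
        # started aligned and the frequencies are exact.
        a, b = int(freq_a_mhz * 1000), int(freq_b_mhz * 1000)   # work in kHz
        return 1e6 / gcd(a, b)                                  # period of the gcd frequency, in ns

    print(realignment_period_ns(100, 2500))   # 100MHz vs 2.5GHz: realigns every 10ns
    print(realignment_period_ns(115, 2500))   # 115MHz vs 2.5GHz: only every 200ns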
 
For every given address generated (for a fetch) something like 4x (probably
more) that amount of data is fetched (at least from memory, to fill a cache
line) on a L1/L2 miss, so I don't see how this could be true. Since the
address bus is unidirectional there is no bus turnaround time either and I
don't think other devices share these address lines anymore (unlike say
ISA).
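The "data per address" ratio can be made concrete with typical numbers; a small sketch
(the 64-byte cache line and 64-bit FSB data bus are assumed common values for the era,
not figures taken from the thread):

    cache_line_bytes = 64            # typical L2 line size
    fsb_width_bytes  = 8             # 64-bit FSB data bus

    transfers_per_line = cache_line_bytes // fsb_width_bytes
    print(transfers_per_line)        # 8 data transfers for every line-fill address

    # On a quad-pumped bus (4 data transfers per base clock), one line fill keeps
    # the data bus busy for 8 / 4 = 2 base-clock cycles.
    print(transfers_per_line / 4)    # 2.0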
 
pigdos said:
For every given address generated (for a fetch) something like 4x (probably
more) that amount of data is fetched (at least from memory, to fill a cache
line) on a L1/L2 miss, so I don't see how this could be true.

Try thinking about the relationship between transfer rate (as measured
in MT/s or MHz) and bandwidth (GB/s).
Since the
address bus is unidirectional there is no bus turnaround time either and I
don't think other devices share these address lines anymore (unlike say
ISA).

Are you sure the address bus is unidirectional? How would you do cache
coherency with a unidirectional bus?

DK
 
David Kanter said:
Not really, if you think about what 'quad pumping' or 'double pumping'
is, then it should become clear that the issue is not the data bus, but
the addressing bus.

You're right that running the core clock at a non-integer multiple of
an I/O will increase latency due to strange gearing ratios (i.e. it's
simple to run at 100MHz and support 2.5GHz, 2GHz, 2.3GHz...running at
115MHz and supporting those would be ugly).


Obviously there's quite a bit of buffering going on, and it grows as
the bandwidth of the IO grows.


Certain parts of the chip run asynchronously, and you hope to hell that
the frequencies line up nicely as I said above.

If you think about the size of current chipsets, I/O controllers, etc.
etc. you will realize that a die that size at 2GHz would dissipate
vastly more heat than is reasonable. That alone should tell you that
the frequency is substantially lower than what you are guessing so far.

DK
The frequencies don't have to line up. FIFOs and synchronizers in the
appropriate places take care of it. Multiple clock domains are quite
common these days. If they can all be driven off a common refclk like
62.5MHz, that is nice but multiple oscillators and PLLs are no big deal.

The issue of power vs. frequency is not so clear cut as you might think.
You have to handle the data rates in any case. So the datapath for the
low frequency version has to be much wider, so more circuits, more fan
out, etc.

del
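A toy behavioral model of the decoupling Del describes (Python, purely conceptual; real
clock-domain-crossing FIFOs use gray-coded pointers and multi-flop synchronizers, none
of which is modeled here):

    from collections import deque

    class ClockDomainFifo:
        def __init__(self, depth):
            self.depth = depth
            self.q = deque()

        def write(self, word):       # called at the producer domain's rate
            if len(self.q) < self.depth:
                self.q.append(word)
                return True
            return False             # full: producer must stall (back-pressure)

        def read(self):              # called at the consumer domain's rate
            return self.q.popleft() if self.q else None   # empty: consumer waits

    # e.g. the FSB side pushes 4 data beats per base clock while the memory side
    # drains them at its own rate; the FIFO absorbs the short-term mismatch.
    fifo = ClockDomainFifo(depth=16)
    for beat in range(4):
        fifo.write(f"beat{beat}")
    print(fifo.read(), fifo.read())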
 
Del said:
The frequencies don't have to line up. FIFOs and synchronizers in the
appropriate places take care of it. Multiple clock domains are quite
common these days. If they can all be driven off a common refclk like
62.5MHz, that is nice but multiple oscillators and PLLs are no big deal.

Aw....Del, you're no fun! I was hoping to use this as a thought
exercise to try and get him to figure out how it worked.
The issue of power vs. frequency is not so clear cut as you might think.
You have to handle the data rates in any case. So the datapath for the
low frequency version has to be much wider, so more circuits, more fan
out, etc.

Sure, but power is quadratic WRT frequency and only linear WRT
capacitance.

DK
 
Aw....Del, you're no fun! I was hoping to use this as a thought
exercise to try and get him to figure out how it worked.


Sure, but power is quadratic WRT frequency and only linear WRT
capacitance.

Oh, good grief! You want to try again?!!!
 
krw said:
Oh, good grief! You want to try again?!!!

Hmm. If you have to run at higher voltage to make it fast enough? If frequency
is proportional to voltage, and power is proportional to voltage squared times
frequency, then power is proportional to frequency squared times frequency,
i.e. frequency cubed? How's that?

del
 
krw said:
Oh, good grief! You want to try again?!!!

Whoops! You're right. Linear WRT frequency, and quadratic WRT
voltage, which is correlated with frequency. Either way, the cost of
high frequency is usually (not always) much higher than having a wider
implementation.

DK
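The corrected relationship (dynamic power roughly proportional to C * V^2 * f, with
voltage having to rise as frequency rises) can be put into rough numbers. An
illustrative sketch only, using made-up unit constants and the deliberately extreme
assumption that voltage scales one-for-one with frequency:

    def dynamic_power(c, v, f):
        return c * v**2 * f          # switched capacitance, supply voltage, clock

    base   = dynamic_power(c=1.0, v=1.0, f=1.0)
    wider  = dynamic_power(c=2.0, v=1.0, f=1.0)   # 2x datapath width, same clock
    faster = dynamic_power(c=1.0, v=2.0, f=2.0)   # 2x clock, voltage tracks it

    print(wider / base)    # 2.0  -- doubling width roughly doubles power
    print(faster / base)   # 8.0  -- doubling frequency (and voltage) costs ~8x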
 
Not really, if you think about what 'quad pumping' or 'double pumping'
is, then it should become clear that the issue is not the data bus, but
the addressing bus.

<sigh> I already mentioned "quad clocked", which is what quad-pumping
really is -- two double speed differential clocks -- and you conveniently
managed to snip... along with the attributions, as usual... no shame at
all!!

What is being discussed here is the internal core logic of the chipset,
specifically the clocking, and routing, of *data* through the chip, which
AFAIK has not been err, quad-pumped to date... so yes: REALLY...
You're right that running the core clock at a non-integer multiple of
an I/O will increase latency due to strange gearing ratios (i.e. it's
simple to run at 100MHz and support 2.5GHz, 2GHz, 2.3GHz...running at
115MHz and supporting those would be ugly).

.... if it doesn't run at the speed I stated, it's a throttle.
Obviously there's quite a bit of buffering going on, and it grows as
the bandwidth of the IO grows.

Depends... if the chip has a mode which locks the clocks when they match.

I meant a "Hardware guy" David - ya know, somebody who does it for a
living!
Certain parts of the chip run asynchronously, and you hope to hell that
the frequencies line up nicely as I said above.

Stating the obvious does not umm, contribute.
If you think about the size of current chipsets, I/O controllers, etc.
etc. you will realize that a die that size at 2GHz would dissipate
vastly more heat than is reasonable. That alone should tell you that
the frequency is substantially lower than what you are guessing so far.

2GHz?? Where'd you pull that from?
 
I've always been curious about this because these devices have to bridge
multiple types of buses.

Intel's Blackford MCH chipset, for example, has a "core clock" (aka "BCLK")
of either 250, 266, or 333MHz, depending on FSB "speed" (1000, 1066 or
1333MHz, respectively).

This MCH requires three different reference clock inputs (BCLK - same as
processors receive, PCIE clock, and FBDIMM) which are typically generated from
a common reference clock. There are strict phase relationships between each of
these clocks if you want to end up with something that actually works, because
xfers from clock domain to clock domain are based on clock "phasors", not
through the use of classic "synchronizers".

And, fwiw, the path from FSB through to FBDIMM or PCI-E (or ESI, for that
matter) can be made deterministic. Believe it or not ;-)

Cheers

/daytripper
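Worth noting that the BCLK figures listed above are just the quad-pumped FSB transfer
rate divided by four, i.e. the FSB base clock. A trivial check (values taken from the
post, rounded as marketed):

    fsb_to_bclk = {1000: 250, 1066: 266, 1333: 333}    # FSB MT/s -> BCLK MHz

    for fsb_mt_s, bclk_mhz in fsb_to_bclk.items():
        print(fsb_mt_s, fsb_mt_s / 4, bclk_mhz)        # 1066/4 = 266.5, quoted as 266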
 
In a northbridge design the address lines would run one way -- from the CPU
to the N. Bridge.
 
In a northbridge design the address lines would run one way -- from the CPU
to the N. Bridge.

Perhaps in a cache-less system one could have processors existing blissfully
unaware of the rest of the universe...

/daytripper
 
Um, nothing writes to memory unless the CPU initiates it. DMA xfers are not
initiated without CPU intervention (at least to set up the starting and
ending addresses). I'm referring here to a single CPU situation w/a N.
Bridge.
 
Um, nothing writes to memory unless the CPU initiates it. DMA xfers are not
initiated without CPU intervention (at least to set up the starting and
ending addresses). I'm referring here to a single CPU situation w/a N.
Bridge.

Ummmmmmm....You really need to do some homework.
A LOT of homework.

There's nothing else to be said until then...

/daytripper
 