Athlon64 Spanks P4 in 90nm Power Consumption tests

  • Thread starter: rms
If you've got a few minutes,

Sure, that's why we're here, eh? ;-)
there's a couple things I'd like clarification
on ... My original graph was drawn on the basis that in modern CPUs, nothing
is ever so nice as to go bad in only a linear way :)

The more things stay the same... ;-)
The only transistor
physics I have done is for low frequency and theoretical transistors (and
from a physics as opposed to engineering point of view) in which case the
power usage is proportional to the switching frequency, all other things
remaining equal. Assuming also that modern CPUs are FET-like rather than
bipolar.

CMOS, certainly. There are few "passive" devices (like bipolar or
N/PFET). ...at least not on purpose. The problem is in the details.
What actually happens in the *real* world? Assuming voltage remains
constant, how non-linear (with respect to frequency) is a transistor in
the range that it's typically being pushed in a modern CPU? And what is
the main contributor to that non-linearity?

The "real world" five years ago could be modeled quite like you propose.
Power was more-or-less governed by the *active* CMOS power
equation, roughly P ~ k*C*f*V**2. Forgetting the "static" (or leakage)
power was easy, since it wasn't a big issue (perhaps 10%). The world
changed at 130nm and is getting worse exponentially as the structures
shrink. (If you don't believe me look at the ratio of standby/active
power of a PII vs. PIV at the same voltage.)
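
That active-power relation is easy to sketch numerically. A minimal sketch of P ~ k*C*f*V**2; the activity factor, switched capacitance, and operating points below are invented for illustration, not measurements of any real CPU:

```python
# Active (dynamic) CMOS power: P ~ alpha * C * f * V**2.
# Every constant below is invented for illustration.

def dynamic_power(alpha, c_farads, f_hz, v_volts):
    """Classic switching-power estimate for CMOS logic."""
    return alpha * c_farads * f_hz * v_volts ** 2

# Hypothetical chip: 100 nF effective switched capacitance, 20% activity.
p_15 = dynamic_power(0.2, 100e-9, 2.0e9, 1.5)   # 2 GHz at 1.5 V -> ~90 W
p_12 = dynamic_power(0.2, 100e-9, 2.0e9, 1.2)   # 2 GHz at 1.2 V -> ~58 W
print(p_15, p_12)  # power falls with the *square* of the voltage
```

Note the quadratic voltage term: a 20% voltage drop at the same clock buys roughly a 36% power reduction in this model.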

Speed is still proportional to the voltage, but the power is proportional
to (at least) the square of the voltage. What's changed is that the static
(leakage) power is now a very significant part of the power budget. Since
leakage isn't a resistive effect (current rises faster than linearly with
voltage), the power dissipated is an even higher-order function. Leakage
sux! ;-)

There are two major contributors to this power, sub-threshold leakage
(essentially current through the ever-shortening channel when the device
is "off"), and gate tunneling (current tunneling across the
few-atom-thick gate oxide). Both of these currents are a strong function of
voltage. Both can be mitigated by a smart choice of devices and operating
conditions.
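
The exponential character of the first contributor can be shown with the textbook sub-threshold model I_off ~ I0 * exp(-Vt / (n*kT/q)). A sketch only; the reference current and ideality factor n are made-up illustrative values:

```python
import math

# Sub-threshold leakage grows exponentially as the threshold voltage Vt
# drops. All numbers here are textbook-style illustrations, not data
# for any real process.

KT_Q = 0.026  # thermal voltage kT/q at room temperature, volts

def subthreshold_leakage(i0_amps, vt_volts, n=1.5):
    """Off-state channel current for a device with threshold vt_volts."""
    return i0_amps * math.exp(-vt_volts / (n * KT_Q))

# Lowering Vt from 0.40 V to 0.30 V (a "fast but leaky" server device):
fast = subthreshold_leakage(1e-6, 0.30)
slow = subthreshold_leakage(1e-6, 0.40)
print(fast / slow)  # roughly 13x more leakage for a 100 mV threshold drop
```

That factor-of-~13 for a mere 100 mV is why the device-choice trade-off in the next paragraph is so stark.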

A processor designed for a server may use lower threshold
devices (that leak like hell) and very thin gate oxide (likewise). ...and
pay for it in power dissipation. A laptop may make the opposite choice.
Indeed within a single system one can control the voltage (the only
independent variable[*]) depending on the processing needs.

[*] Gating the clocks doesn't change the energy for the work done,
since power and performance are both linear in 'f'.
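
The footnote's point follows directly from the active-power equation: stretching the clock changes power, but not the energy spent per unit of work. A toy check, with the same kind of invented constants as before:

```python
# Energy per operation: E = P/f = alpha * C * V**2, independent of f.
# Halving the clock halves power but doubles the run time, so the
# energy per completed operation is unchanged. Constants are invented.

def power(alpha, c_farads, f_hz, v_volts):
    return alpha * c_farads * f_hz * v_volts ** 2

e_2ghz = power(0.2, 100e-9, 2.0e9, 1.2) / 2.0e9   # joules per cycle at 2 GHz
e_1ghz = power(0.2, 100e-9, 1.0e9, 1.2) / 1.0e9   # joules per cycle at 1 GHz
print(abs(e_2ghz - e_1ghz) < 1e-18)  # True: only a *voltage* drop
                                     # saves energy per operation
```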

My issue here is that voltage is *not* a constant. Even the Pentiums had
different voltage ratings across the product line. The PII made it a
function of the processor module (but was still static). TMTA (Transmeta, I
believe) introduced the concept of varying the voltage dependent on the
processing needs. This is now a requirement.

A single graph that shows power vs. frequency for a processor
family doesn't show anything close to the whole picture.
 
In comp.sys.ibm.pc.hardware.chips keith said:
My issue here is that voltage is *not* a constant. Even the Pentiums had
different voltage ratings across the product line. The PII made it a
function of the processor module (but was still static). TMTA (Transmeta, I
believe) introduced the concept of varying the voltage dependent on the
processing needs. This is now a requirement.

A few more things.

IIRC Tom Burd's PhD thesis was on dynamic voltage scaling.
(The ex-Berkeley InfoPad CPU maintainer)

AMD/Intel had to be more conservative in doing dynamic voltage
supply because of more extensive use of custom/dynamic logic.
Some of the non standard/non static CMOS stuff doesn't like to have
the supply voltage change on them very much, but the pressure
to reduce power is leading everyone down similar paths.

Foxton is advertised as the enabling technology for Montecito.
(dynamic frequency/voltage adjustment for power-performance
optimization @ 90nm node)
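
Foxton-style dynamic adjustment can be caricatured as picking the fastest frequency/voltage operating point that fits a power budget. Everything below (the operating-point table, the lumped power constant, the budgets) is invented for illustration, not how any shipping controller works:

```python
# Toy DVFS selection: choose the highest (f, V) operating point whose
# estimated power fits the budget. The table and constants are invented.

OPERATING_POINTS = [      # (frequency GHz, core voltage V), slowest first
    (1.0, 0.9), (1.4, 1.0), (1.8, 1.1), (2.0, 1.2),
]

def est_power(f_ghz, v_volts, k=25.0):
    """Same alpha*C*f*V**2 shape, with alpha*C lumped into one constant."""
    return k * f_ghz * v_volts ** 2

def pick_point(budget_watts):
    """Highest operating point under budget (falls back to the slowest)."""
    best = OPERATING_POINTS[0]
    for f, v in OPERATING_POINTS:
        if est_power(f, v) <= budget_watts:
            best = (f, v)
    return best

print(pick_point(60.0))   # a mid-range point: (1.8, 1.1)
print(pick_point(100.0))  # full speed: (2.0, 1.2)
```

A real controller closes the loop against measured current and temperature, but the budget-driven point selection is the essence.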

Without Foxton, Montecito would have ended up eating
3X the power of Madison. (So saith the "whitepaper")
 
keith said:
What actually happens in the *real* world? Assuming voltage remains
constant, how non-linear (with respect to frequency) is a transistor in
the range that it's typically being pushed in a modern CPU? And what is
the main contributor to that non-linearity?

The "real world" five years ago could be modeled quite like you propose.
Power was more-or-less governed by the *active* CMOS power
equation, roughly P ~ k*C*f*V**2. Forgetting the "static" (or leakage)
power was easy, since it wasn't a big issue (perhaps 10%). The world
changed at 130nm and is getting worse exponentially as the structures
shrink. (If you don't believe me look at the ratio of standby/active
power of a PII vs. PIV at the same voltage.)

Speed is still proportional to the voltage, but the power is proportional
to (at least) the square of the voltage. What's changed is that the static
(leakage) power is now a very significant part of the power budget. Since
leakage isn't a resistive effect (current rises faster than linearly with
voltage), the power dissipated is an even higher-order function. Leakage
sux! ;-)

There are two major contributors to this power, sub-threshold leakage
(essentially current through the ever-shortening channel when the device
is "off"), and gate tunneling (current tunneling across the
few-atom-thick gate oxide). Both of these currents are a strong function of
voltage. Both can be mitigated by a smart choice of devices and operating
conditions.

A processor designed for a server may use lower threshold
devices (that leak like hell) and very thin gate oxide (likewise). ...and
pay for it in power dissipation. A laptop may make the opposite choice.
Indeed within a single system one can control the voltage (the only
independent variable[*]) depending on the processing needs.

[*] Gating the clocks doesn't change the energy for the work done,
since power and performance are both linear in 'f'.

My issue here is that voltage is *not* a constant. Even the Pentiums had
different voltage ratings across the product line. The PII made it a
function of the processor module (but was still static). TMTA (Transmeta, I
believe) introduced the concept of varying the voltage dependent on the
processing needs. This is now a requirement.

A single graph that shows power vs. frequency for a processor
family doesn't show anything close to the whole picture.

You are confusing a bunch of different concepts.

Your first confusion is about "voltage" and "product line".
Voltage is something engineers use at will, to make
the product work at its targeted frequency while staying
within reasonable device reliability and the total power
envelope. The voltage is a variable that you can control
during your experiments with processor clocking and plotting
charts of processor power versus core clock. If you
want to compare two processors, one at 130nm, and another
at 90nm, to see how their power grows with frequency,
you run them at their corresponding _constant_ voltages
(and temperatures), and you get the picture I charted before.
Even if you are correct that "Pentiums had different voltage ratings
across the product line", you can still run each processor
at the same voltage while lowering the core frequency to
get data points. Or by increasing the frequency, as all overclockers do
(and they do change the voltage at will).
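
That fixed-voltage experiment is simple to sketch: at constant V, active power should be nearly linear in clock, P ~ (k*C*V**2)*f plus a static floor. The data points below are invented (and exactly linear), just to show how a least-squares fit separates the dynamic slope from the static offset:

```python
# At a fixed voltage, power vs. core clock should be close to a line:
#   P ~ (alpha*C*V**2) * f + P_static.
# The "measurements" below are invented for illustration.

def linear_fit(xs, ys):
    """Ordinary least-squares fit y = a*x + b; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx

freqs = [1.0, 1.5, 2.0, 2.5]        # GHz, fixed-voltage data points
powers = [28.0, 38.0, 48.0, 58.0]   # watts (made up, exactly linear here)
slope, static = linear_fit(freqs, powers)
print(slope, static)  # 20 W/GHz dynamic slope, 8 W static floor
```

On a real leaky part the fit's intercept is exactly the standby/active ratio argued about earlier in the thread.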

If you are trying to compare the same (or similar, if you can find one!)
processors on the 0.35um, 0.25um, 0.18um, etc. processes along the
frequency axis, then this is a different story, the story called
"CMOS scalability". Then yes, the classic CMOS scalability
is severely broken starting from 0.25um, so the voltages
between process generations are not going _down_ as expected
from proportional geometry shrinks (because you can't
shrink transistors proportionally any more because of various
atomic-level limitations). I guess one can call it "non-linear"
shrinks :-(
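
The "classic CMOS scalability" being mourned here is constant-field (Dennard) scaling: shrink dimensions and voltage together by a factor s, and power density stays flat. A sketch with a hypothetical device, just to show the first-order arithmetic that the paragraph says no longer holds:

```python
# Constant-field (Dennard) scaling by s < 1, to first order:
#   C -> s*C,  V -> s*V,  f -> f/s,  area -> s**2 * area,
# so per-device power alpha*C*f*V**2 scales by s**2, matching the area
# shrink: power *density* stays constant. Numbers are hypothetical.

def scaled_power(c_farads, f_hz, v_volts, s, alpha=1.0):
    return alpha * (s * c_farads) * (f_hz / s) * (s * v_volts) ** 2

base = 1.0 * 1e-15 * 1e9 * 1.0 ** 2        # a 1 fF device switching at 1 GHz
half = scaled_power(1e-15, 1e9, 1.0, 0.5)  # same device, scaled by s = 0.5
print(half / base)  # 0.25: power drops 4x, area drops 4x, density flat
```

The breakdown the post describes is precisely that V stopped tracking s below ~0.25um, so the V**2 term no longer shrinks on schedule.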

The original question was, as I read it, "what is non-linear"
with "transistors" as designs are pushed to higher frequencies
(as the original chart clearly implies). The correct answer is that
there is nothing really "non-linear" with CMOS transistor _GATES_.
The whole known methodology of designing processors is still
based on the "flip-stay-flop" concept, therefore even today's
processors still follow the basic power formula. A quick
summary of CMOS power consumption basics can be found, e.g. here:

http://www.cse.psu.edu/~vijay/iscatutorial/tutorial-sources.pdf

Some (not very nice) discussion about common misconceptions
about leakage, dynamic power, and heat sinks can be found at RWT:

http://www.realworldtech.com/forums/index.cfm?action=detail&PostNum=1224&Thread=1&entryID=14502&roomID=11

There you can find an example of how 0.13um Pentium-4 power scales
with frequency, all based on published specification data for Icc.
From these documents it is clear that I am the last person
who should be accused of "forgetting leakage".

What happens in the "real world"? In the high-performance processor world,
it's very simple - you push your design to the limits dictated by
throat-cutting competition, by tweaking geometry, process
corners, voltage, leakage, making thermal slabs and heat-pipe
based heat sinks, all for one reason - to maintain a foothold
in market share. When you hit a wall, everything is very
strongly non-linear - a brick wall :-)

- aap
 