Pretty impressive, isn't it? And just to be clear, I had practically
nothing to do with this - the author of that Vail presentation, and a host
of others who actually implemented it, deserve all the credit.
I agree that there are some pretty impressive advances in there, in
particular the fact that it isn't monitoring temperature but actual
power usage -- and does so in real time, rather than leaving it to the
OS or BIOS, which would make changes happen much more slowly.
However, I'm unclear on how much of the savings in getting to the 100w TDP
specced for Montecito came merely from the "guardband removal", i.e.
defining TDP using some sort of "average case" power usage, measured
using SPEC2000 or similar test software. Since McKinley's listed TDP of
130w is measured using a different method, the two really aren't
comparable. What would the TDP of McKinley be if measured the way
Montecito's is? Suddenly its 130w might be 90w or something,
making the improvements in Montecito somewhat less impressive.
I'm thinking more here in terms of actual power usage and heat production,
i.e., what datacenter people care about, rather than the problem
that board designers face in ensuring there is sufficient power
to the CPU socket for worst case power usage. Up until now, CPU vendors have
solved that by speccing the measured worst case (like AMD does, referred
to as a "power virus" in this slideset) or taking 90% of theoretical max
power (like Intel does, at least for x86; for IA64 I believe TDP was
specced as 100% of theoretical max power).
When you look at the power usage of McKinley on SPEC2000 on page 9, you
could quite reasonably define TDP as 97.5w based on the max power usage
of 75% within that suite. If you were willing to give up a little bit
of performance in exchange for power (as you might if you wanted to cram
two cores onto a die) you might define it as 65% and your TDP is now 84.5w.
Add the 90nm shrink and some Pentium M-like tweaks with lower power and
less leaky transistors on non-critical paths, etc. and 100w with two cores
is suddenly well within reach. But most of that has been reached by defining
TDP differently and by applying existing techniques to IA-64 for the first
time, not by some amazing leap that cut power to a third of where it
would have been with a simple shrink, as the slides wish to imply.
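To make the arithmetic above concrete, here is a quick sketch in Python. It
assumes, as I did above, that the percentages are fractions of McKinley's
listed 130w TDP; the function name is just mine for illustration, and nothing
here comes from the slides beyond the 130w, 75% and 65% figures.

```python
# McKinley's listed TDP in watts, as quoted above.
MCKINLEY_TDP_W = 130.0


def redefined_tdp(listed_tdp_w: float, fraction: float) -> float:
    """Return a TDP redefined as a fraction of the listed worst-case figure.

    Hypothetical helper: 'fraction' is the share of the listed TDP that the
    chosen workload (e.g. the SPEC2000 maximum) actually draws.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    return listed_tdp_w * fraction


# 100% is the listed figure, 75% is the SPEC2000 maximum,
# 65% trades a little performance for power.
for frac in (1.00, 0.75, 0.65):
    tdp = redefined_tdp(MCKINLEY_TDP_W, frac)
    print(f"{frac:.0%} of {MCKINLEY_TDP_W:.0f}w -> {tdp:.1f}w")
```

The point is only that the same silicon gets a 130w, 97.5w or 84.5w label
depending on which definition you pick, before any real power reduction
has happened.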
Given that you can define the TDP as almost anything you like, using the
ability of power control, you could use the same part as a 100w TDP normal
Montecito, and as a 50w LV version, 25w ULV version, etc. Given that
board designers for the kind of high end systems running Montecito are
likely to over-spec their designs, seeing boards capable of delivering
130w+ to Montecito despite the 100w TDP seems reasonable. So I wouldn't
be surprised if, in addition to being able to toggle between normal mode,
LV mode and ULV mode, you might also be able to toggle to a turbo mode
that undoes that "guardband removal" that gets used for SPEC.
Might be worth checking the kernel parameters/firmware settings section on
the SPEC disclosures for Montecito quite carefully!