Could you be more specific, or at least point me to a link (or two) that can
sort of lay out the nuts & bolts of it? Thx.
Most of this was beaten to death when the NetBurst architecture first
came out. NetBurst could reach higher clock rates because it did less
work in each clock. One way that tradeoff is obvious even from a
high-altitude view of the architecture is that NetBurst had a much
longer pipeline than contemporary designs, including Intel's own
Pentium III. Each stage in the pipeline had less to do, but there were
more stages.
That made pipeline stalls on NetBurst much more expensive than on the
PIII or on competing AMD designs, because it took many more clocks to
flush the wrongly fetched instructions and refill the pipeline.
NetBurst would have done better
if code were less branchy and more predictable in memory fetches,
because it is branch mispredictions and cache misses that most
frequently stall the pipeline.
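To make the branchy-code point concrete, here is a rough C sketch of my
own (nothing Intel-specific; the array size and the 128 threshold are
arbitrary). The same loop runs over random data twice, once unsorted and
once sorted. The branch inside the loop is nearly random on the unsorted
copy and almost perfectly predictable on the sorted one, so on a deeply
pipelined core the unsorted run usually takes noticeably longer even
though it does exactly the same arithmetic.

/* Hypothetical demo: same work, different branch predictability. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define N (1 << 20)   /* ~1M elements, values 0..255 */

/* Sums the elements >= 128; the if() is the branch of interest. */
static long branchy_sum(const int *data, int n)
{
    long sum = 0;
    for (int i = 0; i < n; i++)
        if (data[i] >= 128)
            sum += data[i];
    return sum;
}

static int cmp_int(const void *a, const void *b)
{
    return *(const int *)a - *(const int *)b;
}

int main(void)
{
    int *unsorted = malloc(N * sizeof *unsorted);
    int *sorted   = malloc(N * sizeof *sorted);
    if (!unsorted || !sorted)
        return 1;

    srand(42);
    for (int i = 0; i < N; i++)
        unsorted[i] = sorted[i] = rand() % 256;
    qsort(sorted, N, sizeof *sorted, cmp_int);

    clock_t t0 = clock();
    long s1 = branchy_sum(unsorted, N);   /* branch ~50% taken, badly predicted */
    clock_t t1 = clock();
    long s2 = branchy_sum(sorted, N);     /* branch almost always predicted */
    clock_t t2 = clock();

    printf("unsorted: sum=%ld  %.3f s\n", s1, (double)(t1 - t0) / CLOCKS_PER_SEC);
    printf("sorted:   sum=%ld  %.3f s\n", s2, (double)(t2 - t1) / CLOCKS_PER_SEC);

    free(unsorted);
    free(sorted);
    return 0;
}

Compile it without aggressive optimization (plain cc demo.c will do),
since an optimizing compiler may turn the if() into a conditional move
and remove the branch entirely. The gap between the two times is roughly
the misprediction penalty, and the longer the pipeline, the bigger that
gap.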
Intel made two heavy bets that software could be made predictable:
IA-64 and NetBurst. Andy Glew, who works for Intel, has said that
Intel doesn't understand software, and he may be right. Given Intel's
acquisition of Multiflow's compiler technology and its heavy investment
in compiler research, I think it more likely that Intel bet incorrectly
that it
could bend software to its architecture (and that, in fact, it
understood software better than anyone else). Much has been learned
in the process but, at the moment, it looks like Intel lost heavily on
both bets. The processors that are most popular and successful today
are successful precisely because they cope well with unpredictable
code.
Robert.