Intel strikes back with a parallel x86 design

  • Thread starter: Jim Brooks
No. When MS finally arrived at a good OS, it was designed for MIPS
with ports to x86 and Alpha and others possible. Just look at what
the address map looks like.

Have a VMS manual at hand when you do...

--
Paul Repacholi 1 Crescent Rd.,
+61 (08) 9257-1001 Kalamunda.
West Australia 6076
comp.os.vms,- The Older, Grumpier Slashdot
Raw, Cooked or Well-done, it's all half baked.
EPIC, The Architecture of the future, always has been, always will be.
 
Nathan said:
LOL. "data structures can remain 32-bit".
Khan was either momentarily confused or is perpetually ignorant
of computer architecture. Instruction operands and data structures are
different concepts.

In x86-64 long mode, a REX prefix byte is required to access
a full 64-bit register. One of many cases: C++ uses pointers
pervasively, which means loading a 64-bit address into a 64-bit
register happens a zillion times.
Designing additional code bloat into the ISA is foolish,
as it wastes memory and impedes code prefetching/decoding.
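The one-byte cost is easy to see in the raw encodings. A quick Python sketch (the machine-code bytes are hand-assembled standard x86-64 encodings, but worth double-checking against a real assembler):

```python
# Hand-assembled x86-64 machine code for equivalent 32-bit and 64-bit moves.
# In long mode, the 0x48 REX.W prefix is what widens an operation to 64 bits.
mov_eax_mem = bytes([0x8B, 0x03])        # mov eax, [rbx]  -> 2 bytes
mov_rax_mem = bytes([0x48, 0x8B, 0x03])  # mov rax, [rbx]  -> 3 bytes (REX.W)
mov_edi_esi = bytes([0x89, 0xF7])        # mov edi, esi    -> 2 bytes
mov_rdi_rsi = bytes([0x48, 0x89, 0xF7])  # mov rdi, rsi    -> 3 bytes (REX.W)

# Every 64-bit pointer-sized move pays exactly one extra prefix byte:
assert len(mov_rax_mem) - len(mov_eax_mem) == 1
assert len(mov_rdi_rsi) - len(mov_edi_esi) == 1
```

Whether that byte matters in practice depends on how much of the instruction mix really needs 64-bit operands, which is exactly the point under dispute here.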

You do realize that any time you work with the pointer registers (e.g.
RIP, RSP, RDI, etc.), they automatically default to 64-bit
registers without requiring a REX prefix? Even if you're transferring
data between a pointer register and a general register, the general
register will automatically go to its 64-bit mode. It's only when
general registers are accessed by themselves that they'll default to
32-bit without the REX byte.

Regardless, you're quibbling over one extra byte for the REX. Code
bloat is never about instruction re-encoding; it's always about adding
more functions and features.

We'll see if AMD can implement anything truly challenging such as
HyperThreading or a novel revolutionary design such as Itanium 2.
Intel implemented EM64T and dual-cores without even sweating.

Silly Bates, (those) Trix are for children. Let's hope AMD never tries
to implement Hyperthreading in the future, that would be the first
indication that their future core is in real trouble. Hyperthreading is
something Intel implemented to fill up the pipeline of a woefully
inefficient core that was highly susceptible to pipeline stalls. Even
Intel is heading for the hills away from Hyperthreading as fast as
possible, now that their new Pentium-M-based architecture can actually
keep instructions flying inside it without dumping it all out at the
first bump on the road. Now, they don't have all of those idle cycles
in their new cores, so they can't implement Hyperthreading anymore. Oh
wait, were you suggesting that Hyperthreading was the same as SMT, and
that Intel now knows how to implement SMT? Nah, Intel doesn't have any
experience with SMT yet; that Hyperthreading wasn't it.

Intel implemented EM64T without a sweat -- absolutely, but I hope you
don't really think that's what makes AMD's architecture superior? AMD
has said in the past that the 64-bit instructions only added 5% to
their core size. If AMD had stopped at just developing a 64-bit ISA,
then yes, AMD's entire advantage would've quickly disappeared. Hell,
they could've probably grafted 64-bit onto their last generation K7
cores, if it was just an ISA upgrade. It's not the instruction set
that's challenging these days. But AMD spent two years developing
Direct Connect Architecture (i.e. Hypertransport + integrated memory
controller). DCA is Intel's real challenge, and why Intel can't catch
up now.

Then you just have to look at the difference between AMD and Intel's
virtualization technologies (Pacifica vs. VT). The AMD technology is
already heavily designed to do virtualization across multiple
processors, since multiprocessing is what AMD is designing for these
days. Intel is already talking about coming out with a newer VT 2.0 in
order to add multiprocessor virtualization features.

As for that innovative Itanium 2 design: if it's so innovative, why
does it still have an FSB and an external memory controller? Why do you
have to design special NUMA chipsets for it, just to allow it to scale
beyond 2 processors?

Yousuf Khan
 
Nick said:
Sigh. For the Nth time, that was true by the time that IBM actually
delivered, but was NOT true when they COULD have delivered (about 2
years before). For the first 5 years of its life, the 80386/486
design languished in the desktop and el cheapo (where reliability and
security were no target) commercial markets. Masses of sales but not
much margin. Intel was sweating blood to get out of that and break into
the high-end desktop and medium-to-large server markets.

You really don't know anything about PC history, do you? Applying the
terms "low-margin" and "el-cheapo" to the 386 or 486 takes a leap that
you couldn't even make on the Moon. PCs cost between $2000 and $3500. At
that time, Intel still didn't have competition in the form of AMD and
then later Cyrix, so prices were kept high. AMD introduced the first
clones of the 386 towards the end of the 386 era, and then later introduced
clone 486s. It's only after that point that prices started to plummet.
Until then, Intel was rolling in both volume *and* high margins. We
never arrived at the era of sub-$1000 PCs until about five years ago.

Yousuf Khan
 
No, I am not missing that. One of the objectives of the PowerPC
project was to bring the prices of such 'quality' software down
to something comparable to the x86-based stuff.

I like the quotes. You may be surprised to learn that PSpice appeared
to be noticeably better (easier to use, faster) than the version of
SPICE Motorola was using on workstations at the time (MSpice IIRC,
whoever that was from).
IBM had plans to twist the arms of the third-party people, but never got
their systems out, and never proceeded to that stage.

How do you twist the arms of random small software companies? You can
"bribe" them to some extent, aka, pay them to port s/w - but it's far from
clear that they will be interested. Also that's a LOT of money to get
anywhere close to providing equivalent s/w to x86.
 
YKhan said:
You do realize that any time you work with the pointer registers (e.g.
RIP, RSP, RDI, etc.), they automatically default to 64-bit
registers without requiring a REX prefix?

There are no pointer registers in the AMD64 architecture, or in the
IA-32 architecture. RIP, RSP, and RDI are general-purpose registers.

In AMD64 long mode, if a register (any of the 16 general-purpose
registers) is used in a pointer context (e.g., as the destination of the
LEA instruction), it is used at 64-bit width. If the same register is
used in an integer context (e.g., as the destination of an ADD
instruction), it is used as a 32-bit register by default, or as a 64-bit
register with the appropriate REX prefix.
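The asymmetry is clearest in the raw encodings: in long mode the address size defaults to 64 bits while the operand size defaults to 32 bits, and registers r8-r15 always need a REX byte. A small sketch (hand-assembled byte sequences; verify with a real assembler if in doubt):

```python
# Default sizes in x86-64 long mode, shown as hand-assembled machine code.
mov_eax_mem = bytes([0x8B, 0x03])        # mov eax, [rbx]: 64-bit address, no REX needed
add_eax_ecx = bytes([0x01, 0xC8])        # add eax, ecx:   operands are 32-bit by default
add_rax_rcx = bytes([0x48, 0x01, 0xC8])  # add rax, rcx:   REX.W (0x48) promotes to 64-bit
push_rax    = bytes([0x50])              # push rax:       stack ops default to 64-bit, no REX
mov_r8d_eax = bytes([0x41, 0x89, 0xC0])  # mov r8d, eax:   r8-r15 need a REX byte even at 32 bits

assert len(add_rax_rcx) == len(add_eax_ecx) + 1  # REX.W costs one byte
assert len(push_rax) == 1                        # a 64-bit push needs no prefix at all
```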

As for the reasons for making 32-bit the default non-pointer size, I
have wondered about that myself. Some of the answers can be found in
<[email protected]>.

Followups set to comp.arch.

- anton
 
I like the quotes. You may be surprised to learn that PSpice appeared
to be noticeably better (easier to use, faster) than the version of
SPICE Motorola was using on workstations at the time (MSpice IIRC,
whoever that was from).

Not at all - I knew that. Spice was one of the first 'performance'
applications to make good use of the 80386. But it was and is
completely atypical of commercial software - e.g. it was run on
dedicated systems and could (and did) run without any security or
even an operating system (on some systems). And the occasional
crash wasn't a major issue, either.
How do you twist the arms of random small software companies? You can
"bribe" them to some extent, aka, pay them to port s/w - but it's far from
clear that they will be interested. Also that's a LOT of money to get
anywhere close to providing equivalent s/w to x86.

At that time, you made a decent development environment available at
a reasonable price, and they beat a path to your door. With the
PowerPC project, they did - they were hammering on IBM's door for the
whole period I am referring to, demanding that IBM release the thing!

Remember that the Microsoft environment was absolutely crap for
development, and a lot of people were having to reboot every time
their application failed (and then try to guess WHY it crashed, with
an arbitrary amount of lost output). I did some investigations, and
found that most compiler/debuggers couldn't survive even the cleanest
real SIGSEGV, and an unclean one often crashed the system. That is
why many/most such people were prepared to pay 3-5 times as much for
a Sun/Apollo/HP/etc. workstation for development. The PowerPC project
would have cut that by a factor of 2.

Seriously. I was then closely associated with a lot of such people
and companies, and was one myself.


Regards,
Nick Maclaren.
 
We'll see if AMD can implement anything truly challenging
such as HyperThreading or a novel revolutionary design such
as Itanium 2.

The Itanium may be a revolutionary design - but this design isn't
able to revolutionize the market because of its ridiculous integer
performance and its even more ridiculous price/performance ratio.
It's good for number-crunching, because that can usually be
parallelized by the compiler; but nothing else.
And I expect future CPUs to go a step back, completely drop
out-of-order and speculative execution, and implement massive SMT
like that of Niagara and multi-core designs.
Intel implemented EM64T and dual-cores without even sweating.

I think you don't have access to the details of what it took
to make the P4 architecture AMD64-compatible,
so you can't say that definitively.
 
The minicomputer market.

IBM didn't own the minicomputer market in the early '80s. DEC did. IBM
captured a good chunk with the AS/400, but that was because the software
was better. There was no OS/400 running on x86.
The rack-mount servers of the mid-90s were the
R20 and R30, which did not compete well against the 386/486 in terms of
compute power. They were used because they had a large address space and
lots of I/O bandwidth, and because there was a server OS (AIX) for them.

....and IBM still sold *tons* of '386/'486 boxen. Go figure.
By 2000 IBM was offering rack mount Intel systems, ostensibly for
Windows server, but many running that "toy OS" Linux.

2000? Gee, I thought the '386 was out a tad before that.
 
They came from `Entry Level Systems' remember, and were considered a
^^^^^^^^^^^^^^^^^^^

Entry Systems Division
slightly brain-dead terminal by almost all of IBM. IBM *HAD* to be very
open, and touted as open, as IBM knew that there was a large number of
early adopters on the watch for `lock-in' and would scream about
it.

....and did when IBM had the better idea (MCA).
 
I don't think you should lecture Keith about 615. I wouldn't be
surprised to find out that he worked on it.

No, 615 was before my PPC stint. I managed to skirt the event-horizon of
that one. I was in the gravitational field (required to interview),
but had other offers (x86).
And the 615 chip guys got
everything they wanted so far as I know... People, hardware, money.
Other folks got squeezed to feed the 615.

They had everything they wanted! ...right down to dinner served,
literally. It was a sink hole from which little came, other than
promotions for the guilty and blame for the non-participants.
When you say things like this, Nick, it makes me wonder about the other
things you say.

Nothing to be gained here.
 
That's entirely possible. Lots of midrange offerings got canceled or
crippled because they competed with the low end of the mainframe lines.
The RT (early powerpc) systems were rumored to be crippled to make them
fit into IBM's product line price/performance curve. IBM couldn't have
cheaper machines outperforming more expensive machines.

Calling ROMP/RIOS an "early PPC" is a bit of a stretch. That's not
much different than saying a 4004 was an early P4 or a Model-T was an
early Jaguar.
 
You really don't know anything about PC history do you? Applying the
term "low-margin" and "el-cheapo" to the 386 or 486 takes a leap that
you couldn't even make on the Moon. PCs cost between $2000 to $3500. At
that time, Intel still didn't have competition in the form of AMD and

Remember, AMD made even the lowly 8088 (the D8088, IIRC). They *were* a
second source for even the first generation x86 processors.
 
What makes you think there wasn't going to be a software solution?
There were several operating systems, including "Workplace", Pink,
Taligent, etc.

....and when management found out how miserable they were (as the
key developers had been telling them) the whole thing swirled
counter-clockwise.
Software can be ported. And at the time there wasn't as much.

....and much of what was, was x86 assembler. There wasn't room for
code-bloat on an 8088.
Sony and Microsoft arguably are doing it for game consoles. What market
would you target this here wonder cpu at?

I'm wondering what the difference is between a "game console" of tomorrow
and today's "PC". Hardware flexibility, sure, but we're all just hardware
gamers. ;-)
Microsoft and IBM did it. Apple did it. Hell, IBM did it with 360 and
OS. It could happen again.

IBM did it with the /360 and couldn't afford to ever do it again.
"Remember the FS."
The most likely candidate that I see is the
game console, although the cell phone is a secondary possibility.

Nah, everyone's already declared that a failure. ;-)
 
Joe Seigh said:
That's entirely possible. Lots of midrange offerings got canceled or
crippled because they competed with the low end of the mainframe lines.
The RT (early powerpc) systems were rumored to be crippled to make them
fit into IBM's product line price/performance curve. IBM couldn't have
cheaper machines outperforming more expensive machines.

fort knox was going to convert all the large number of different
corporate microprocessors & microengines to 801s. the 4341 follow-on
was going to be an 801-based microprocessor; there was a brand new
building built in endicott just for the group.

an issue was that the vertical microcode engines were getting about
ten microcode instructions per 370 instruction ... and every microcode
engine tended to be unique/different. fort knox was converting this
plethora of microcode engines all to an 801 base. however, by the 4341
follow-on era, the technology was starting to be available to
implement some amount of the 370 instruction set directly in
silicon. the result was that direct silicon-implemented 370 was going
to be faster than continuing the microcoded emulation paradigm
... even using an 801 risc microprocessor. i helped with some of the
sections of the document that killed the 801 follow-on to 4341 ... in
support of direct silicon implementation.

801/ROMP was a joint research & office products division (OPD) to do a
follow-on to the OPD displaywriter. ROMP had the original, traditional
801 hardware/software design trade-offs ... with single integrated
domain (no hardware protection features) with close, proprietary CP.r
operating system with everything written in PL.8. I believe the
business analysis eventually was that while ROMP price/performance was
great ... the entry level configuration was still more expensive than
the top of the displaywriter market price-point. In any case, the
displaywriter follow-on project got killed. As a strategy to save the
group in Austin, it was decided to retarget the hardware platform to
the unix workstation market. The company that had done the AT&T unix
for PC/IX was hired to do a similar port to ROMP. However, you still
had all these PL.8 programmers ... so the scenario was to create a
project called VRM ... which would be an abstract virtual machine
implementation in PL.8 ... and the UNIX port would be done to the
abstract virtual machine interface ... rather than to the bare
hardware.

hardware protection had to be added to 801 in order to support the
unix execution model (as opposed to the closed, proprietary cp.r/pl.8
model). also the claim for doing the VRM+unix hybrid was that it could
be done significantly faster, cheaper, and fewer resources than a
direct unix port to the bare iron. this was subsequently disproved
when the BSD port was done to the bare iron for the "AOS" offering on the
pc/rt.

some number of collected 801, romp, rios, power/pc, etc postings
http://www.garlic.com/~lynn/subtopic.html#801

now there was some crippling of the subsequent RIOS/RS6000 systems
... not particularly the processor ... but the overall systems.
RS6000 moved to microchannel, and the 6000 group were mandated to use
PS/2 microchannel cards ... not so much to cripple them vis-a-vis
mainframe midrange ... but to help with the PS2 card volumes. The
following has a drawn-out discussion of the microchannel 16mbit t/r
card for the RS/6000 ... which was actually slower than the 16-bit ISA
4mbit t/r card that had been developed for the PC/RT (there were
similar issues with many of the other PS2 microchannel cards)
http://www.garlic.com/~lynn/2005q.html#20 Ethernet, Aloha and CSMA/CD

now there is rumored to have been some flavor of attempting
to protect mainframe operation in the wake of the following
http://www.garlic.com/~lynn/95.html#13

when my wife and I were told that we couldn't work on configurations
with more than four processors (somewhat numerical intensive wasn't
considered much of a threat ... but moving into hardcore commercial
database processing became much more of an issue). when we were doing
ha/cmp, we had also coined the terms disaster survivability (as
opposed to disaster/recovery) and geographic survivability ... and I
got asked to write a section in the corporate continuous availability
strategy document. Both Rochester and POK complained about the section
(they couldn't meet) and it was removed. some collected ha/cmp
postings:
http://www.garlic.com/~lynn/subtopic.html#hacmp

Possibly the most flagrant scenario of such was POK trying to protect
their mainframe machines from Endicott's mid-range 4341
mainframe. the 4341 had higher thruput than POK's 3031 and was cheaper. A
cluster of six 4341s also had higher thruput than a POK 3033 and was
cheaper. a recent posting discussing some of the 4341/3033 issues ...
that also includes a large number of past postings on the subject
http://www.garlic.com/~lynn/2005q.html#30 HASP/ASP JES/JES2/JES3
 
Oliver S. said:
The Itanium may be a revolutionary design - but this design isn't
able to revolutionize the market because of its ridiculous integer
performance and its even more ridiculous price/performance ratio.

Actually, integer performance is pretty good on benchmarks that compiler
writers have had opportunity to game.
And I expect future CPUs to go a step back, completely drop
out-of-order and speculative execution, and implement massive SMT
like that of Niagara and multi-core designs.

That will sure do wonders for integer performance, right?
 
|> Hypertransport is an act that Intel still hasn't been able to
|> follow; the FSB bottleneck kills you at two CPUs.

My guess is that the New Microarchitecture will drop the FSB and
move to the same sort of design as AMD and most other vendors.
Does anyone know for certain?

They do indeed plan to follow AMD's lead in this area as well. Intel
calls their implementation the "Common System Interface" (CSI), and it
ties in with Intel's plan to use a common system interface for both
Itanium and Xeon chips:

http://www.xbitlabs.com/news/cpu/display/20050615232538.html

Unfortunately we're going to have to wait until about the 2007/08
timeframe before this actually happens.
 
Sigh. For the Nth time, that was true by the time that IBM actually
delivered but was NOT true when they COULD have delivered (about 2
years before).

By 1989 (2 years before IBM delivered PowerPC), x86 was FIRMLY
entrenched in the software market.
For the first 5 years of its life, the 80386/486
design languished in the desktop and el cheapo (where reliability and
security were no target) commercial markets. Masses of sales but not
much margin. Intel was sweating blood to get out of that and break into
the high-end desktop and medium-to-large server markets.

For languishing in the desktop and el cheapo markets, Intel was doing
pretty damn good for themselves. The earliest earnings I could dig up
were from 1990, which shows that Intel's net income for the year was
$650M on $3,921M worth of revenue. Any way you slice it that is a
VERY successful company.
Those of us who knew something about both technologies and, more
importantly, those markets felt that IBM's original PowerPC design
(which was NOT just a CPU, but a complete system) would have quickly
dominated the high-end desktop and commercial server markets - as I
said, every company bar Intel had signed up, and most were actively
planning products. But the 2 year delay was long enough for Intel
to get its act together, and the rest is history.

It was history long before then, it's just that IBM didn't realize it.
Think about IBM's PS/2 design. Better design than all the other PCs
out there, and it damn near killed the company's PC division. Why?
Because everyone else could do a good-enough job for a whole lot less.
If IBM had really pushed PowerPC at the end of the '80s, it almost
certainly would have suffered a similar fate.
Sigh. Remember who you are responding to.

Back in 1996, I pointed out that the IA64 software was predicated on
solving at least three problems that had defeated the best computer
scientists and vendors for 25 years, and were possibly insoluble.
HP persuaded Intel that they could be solved "to order" - I said
that they couldn't be. I was right and HP/Intel were wrong.

Fine, we can agree on that much at least.
$30 million is MASSES for a half-sane system if managed competently.

I think you would be VERY surprised at just how quickly a single large
project could eat up $30M. Competent management is hard to come by
and often expensive (perhaps not as expensive as incompetent
management, but expensive none the less).
Both BSD and Linux distributions have done it for a tiny proportion
of the cost.

Sure, just look at the millions upon millions of PCs out there using
the Linux ports to MIPS, PA-RISC, SPARC, S/390, etc. etc. Sure, Linux
is ported to TONS of architectures, but other than x86 the support for
such architectures is VERY poor. Drivers are virtually non-existent,
commercial application support is worse, and even most open source
applications end up being YEARS behind their x86 counterparts (if they
are ported at all).

The only non-x86 architectures where Linux really succeeds are in the
embedded space (using the term "embedded" somewhat loosely to
encompass things like set-top boxes).

Going forward, I really only see 3 ISAs with any sort of future: ARM
for the low-cost/low-end embedded space; PowerPC for the higher end
of the embedded space, particularly the set-top box and console
market, as well as IBM's high-end POWER servers; and x86 for
everything else. All other architectures, past, current and future,
are pretty much doomed to obscurity. There just aren't enough
improvements that can be made from the ISA side of things to come
remotely close to offsetting the HUGE inertia required to get off the
ground.
 
They had everything they wanted! ...right down to dinner served,
literally. It was a sink hole from which little came, other than
promotions for the guilty and blame for the non-participants.

Yup. It was an extreme example of where IBM surveyed its customers,
completely ignored the geeks, and regarded the technical responses
of the stratospheric suits as gospel. In the technical user groups,
IBM reps were amazed at the negative reaction when the project was
announced.

There was a later example of this when the PowerPC was FINALLY
released. It was downgraded from SCSI to IDE, and I rang the head
of UK marketing. He said that customers had demanded it because
they said that they needed to reuse their old disks in new machines.
I told him that nobody had done that in commerce, and it would
cut the machines off from the high-quality peripherals. He didn't
believe me and said "well, we will see".

Well, we did. Guess who was right?


Regards,
Nick Maclaren.
 
Calling ROMP/RIOS an "early PPC" is a bit of a stretch. That's not
much different than saying a 4004 was an early P4 or a Model-T was an
early Jaguar.

I like the analogy, but I feel that both the 4004 and the Model-T
had an engineering coherence that the ROMP/RIOS seriously lacked!


Regards,
Nick Maclaren.
 
They do indeed plan to follow AMD's lead in this area as well. Intel
calls their implementation the "Common System Interface" (CSI), and it
ties in with Intel's plan to use a common system interface for both
Itanium and Xeon chips:

http://www.xbitlabs.com/news/cpu/display/20050615232538.html

Unfortunately we're going to have to wait until about the 2007/08
timeframe before this actually happens.

Yes, I know about that one. However, the New Microarchitecture is
slated for late 2006, so predates that. I am somewhat puzzled at
exactly where each change is likely to be introduced, and how.
It is, of course, possible that Intel's engineers are, too, and
are running round in circles attempting to make reality out of
executive decisions.


Regards,
Nick Maclaren.
 