details about dual-core Yonah emerge

YKhan · Jun 7, 2005

keith said:
It was *not* in any way "tiny". It was a significant performance hit, so
much so that the P6 was unusable in Win. To be fair, it was supposed to
be a server chip and 32bit only.

It was around a 5% drop as I recall. I remember reading it at the time,
and thinking it was much ado about nothing.

I thought they renamed the segment registers, rather than cache them, per
se. Felg has the skinny here.

If they used register renaming then that's a form of caching the
registers anyways.

Yousuf Khan

keith · Jun 8, 2005

It was pretty good with NT 4.0

....and NT4 was available when PPro shipped? It really wasn't that great
with NT4. The PII would be a more contemporary comparison. ...and NT4
wasn't a screamer.

keith · Jun 8, 2005

It was around a 5% drop as I recall. I remember reading it at the time,
and thinking it was much ado about nothing.

It was far worse than that. The PPro was slower than the P5. Segment
register reloads were *expensive*, and they're all over Win9x.

If they used register renaming then that's a form of caching the
registers anyways.

A maggot is a form of a fly too.

keith · Jun 8, 2005

The Pentium Pro was lacking one small feature that hurt performance in
16-bit code by about 10-15%. This feature was added back to the
architecture with the PII.

I don't think this is quite accurate. The PPro was not *supposed* to
execute 16bit code, thus the architects didn't see any harm in making
segment register reloads expensive. They *added* the segment register
renaming (caching) to ameliorate this problem, in the PII. Anyway, if
you can wake up Felg, he has the real deal. I looked through my email
archives and couldn't find the info.

Note that referring to the Pentium-M and the PentiumPro as being of the
same architecture is really stretching things. While the Pentium-M
might be able to trace its roots back to the PentiumPro, the two chips
are VERY different in virtually every aspect.

Gee, I thought all Intel's been doing for a couple of decades is
cranking up the clock! ;-)

Felger Carbon · Jun 8, 2005

keith said:
Anyway, if
you can wake up Felg, he has the real deal. I looked through my email
archives and couldn't find the info.

I _had_ the real deal. I did some housecleaning and tossed the really
old stuff, like the above. No longer interested in the PPro. Sorry.

Tony Hill · Jun 8, 2005

It was *not* in any way "tiny". It was a significant performance hit, so
much so that the P6 was unusable in Win. To be fair, it was supposed to
be a server chip and 32bit only.

Come now Keith, that's a HUGE exaggeration. The drop in performance
really was TINY, and yes I did use some PPro systems in Win95.
Honestly a PPro 200MHz with 256KB of cache was about 10-15% slower
than a Pentium 200 in most 16-bit Win95 apps, and it was FASTER any
time you started running any 32-bit Win95 apps (of which there were
quite a number when these chips were common) and MUCH faster as soon
as you hit any heavy 32-bit FP code.

Tony Hill · Jun 8, 2005

...and NT4 was available when PPro shipped? It really wasn't that great
with NT4. The PII would be a more contemporary comparison. ...and NT4
wasn't a screamer.

The PPro was first shipped late 1995, NT4.0 was first shipped mid 1996
and the PII was first shipped mid 1997. So in a sense the PPro and NT
4.0 were out in the same basic timeframe.

That being said though, NT 4.0 wasn't really useable until Service
Pack 4 was released, and that didn't happen until sometime in '99.

David Wang · Jun 8, 2005

keith said:
Well the 486 was much more efficient at running 32-bit code than the
386. The Pentium was better than the 486. And the PPro was better than
the Pentium.

But yes, there was a tiny drop off in performance at 16-bit code when
using the PPro compared to the previous Pentium. That was due to the
fact that there wasn't a segment register cache in the PPro.

It was *not* in any way "tiny". It was a significant performance hit, so
much so that the P6 was unusable in Win. To be fair, it was supposed to
be a server chip and 32bit only.

It wasn't that bad.

Intel was able to crank out 150 MHz PPros on the same process as 120 MHz
Pentiums (0.6um BiCMOS), and if you compared them that way, the 150 MHz
PPros were ever so slightly faster on Win 3.x. If you compared them at
equal MHz, it was about a 15~20% hit.

For reference, take a look at Linley Gwennap's article in
Microprocessor Report, July 31, 1995 (Volume 9, Number 10),
"P6 Underperforms on 16-bit code".

Comparing a P6-150 (0.6um BiCMOS) against a P5-133 (0.35um BiCMOS), a
figure was drawn, and I'm roughly transcribing the relative performance
numbers from the figure. (performance relative to P5-100)

                     P5-133   P6-150
Win 3.1 (Sysmark)    ~1.2     ~1.05
Win95 (per Intel)    ~1.1     ~1.4
WinNT (Sysmark NT)   ~1.2     ~1.75
Unix (SPECint92)     ~1.3     ~2.1
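
To put those figures on an equal-MHz footing, here is a rough sketch that divides each score by the clock speed and compares the two chips per clock. It assumes performance scales linearly with MHz, which is a crude simplification (memory and bus speeds do not scale with the core), and the inputs are only the eyeballed numbers above. The names SCORES and per_clock_ratio are made up for illustration.

```python
# Relative performance vs. a P5-100, as roughly read off the figure above.
# Each entry is (P5-133 score, P6-150 score).
SCORES = {
    "Win 3.1 (Sysmark)": (1.2, 1.05),
    "Win95 (per Intel)": (1.1, 1.4),
    "WinNT (Sysmark NT)": (1.2, 1.75),
    "Unix (SPECint92)": (1.3, 2.1),
}

P5_MHZ = 133
P6_MHZ = 150


def per_clock_ratio(p5_score, p6_score):
    """P6 performance per MHz divided by P5 performance per MHz.

    Assumption: performance scales linearly with clock speed.
    A result below 1.0 means the P6 does less work per clock than the P5.
    """
    return (p6_score / P6_MHZ) / (p5_score / P5_MHZ)


for workload, (p5, p6) in SCORES.items():
    ratio = per_clock_ratio(p5, p6)
    print(f"{workload:20s} P6/P5 per clock: {ratio:.2f}")
```

On the Win 3.1 row this gives a per-clock ratio of about 0.78, i.e. a hit of roughly 22%, in the same ballpark as the 15~20% quoted above given how approximate the inputs are. The other three rows come out above 1.0.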

David Wang · Jun 8, 2005

I don't think this is quite accurate. The PPro was not *supposed* to
execute 16bit code, thus the architects didn't see any harm in making
segment register reloads expensive. They *added* the segment register
renaming (caching) to ameliorate this problem, in the PII. Anyway, if
you can wake up Felg, he has the real deal. I looked through my email
archives and couldn't find the info.

You're off base in this case.

The PPro was supposed to execute everything, there wasn't a
conscious effort to optimize "32 bit code" and leave out "16 bit". It
was just that the architects didn't realize how important some of
these things such as partial register usage were. The architects
came from a non-x86 background, and they were thinking about high
performance. There were talks about the segment registers during
the architecture phase of the processor, but the ball was dropped,
and there was some miscommunication about how important segment
register rename was, and partial register usage too. I remember
Andy Glew talking quite a bit about the partial register usage,
and he took the blame for that as well IIRC.

The basic idea of the PPro was to make the common case fast, and
the not-so-common case, not-so-fast. Unfortunately, some of the
cases thought to be not-so-common turned out to be more common
than believed. This whole thing about "16 bit software" is just
a cover all term to mean "legacy software that contained a bunch
of weird hand coded stuff that the architects of P6 didn't think
would be common."

Nate Edel · Jun 9, 2005

keith said:
...and NT4 was available when PPro shipped? It really wasn't that great
with NT4. The PII would be a more contemporary comparison. ...and NT4
wasn't a screamer.

As long as you had enough memory, NT4 ran pretty well on machines as slow as
a Pentium (classic) 75 MHz ... for native Win32 applications, generally better
than Win 95.

The key thing was that you really needed a good chunk of memory to run it
comfortably. Of course, a good chunk of memory by today's standards is
tiny, but back in mid-96 when NT 4.0 came out...

nobody · Jun 10, 2005

As long as you had enough memory, NT4 ran pretty well on machines as slow as
a Pentium (classic) 75 MHz ... for native Win32 applications, generally better
than Win 95.

The key thing was that you really needed a good chunk of memory to run it
comfortably. Of course, a good chunk of memory by today's standards is
tiny, but back in mid-96 when NT 4.0 came out...

Heck, even a 486 (actually, AMD 586-133) with 32 MB RAM was good
enough to run NT4 until it came to SP4 (or was it even SP3?) that
slowed things down quite noticeably. As for P75 with 48MB RAM running
NT4 - that's exactly the config Chase provided to me in late 1998 to
do some consulting work for them. Well, they paid by the hour, and I
could always go get a cup of coffee while that clunker was crunching
the data at 100% CPU load...
