65nm news from Intel

  • Thread starter: Yousuf Khan
ODEs, to a great extent.

Parallelize over initial conditions?
A great deal of transaction processing.

Parallelize over transactions? OK, the commit phase needs to be
serialized.
A great deal of I/O.

Why that?
Event handling in GUIs.

Is that really limiting performance in any way, nowadays?

Jan
 
|> What are some examples of important and performance-limited
|> computation tasks that aren't run in parallel?

ODEs, to a great extent.

Can you be more specific? What sort of jobs using ODEs? Why can't they
be parallelized?
A great deal of transaction processing.

Any references? Everyone I've heard of with heavy transaction
processing workloads is buying SMP servers; I haven't heard of anyone
saying "well we could afford a 32-way box, and our workload sure as
hell needs it, but we're just sticking with a 1-way box because the
software can't handle more".

Nor have I seen anyone advertise "our server has only 1 processor
because your software probably can't use more, but it has the storage,
reliability etc for heavy-duty transaction processing"; there should
be a big market for such if your statement is correct.
A great deal of I/O.

I was under the impression I/O was I/O limited, not CPU limited?
Event handling in GUIs.

That's not CPU-limited either; it runs plenty fast enough on a single
processor.
 
Yousuf said:


"This is evidence that Moore's Law continues," said Mark Bohr( Intel's
director of process architecture and integration).


I remember that I read some months ago an interesting study of two intel
researchers who had shown the end of Moore Law: they said that we have
now a real wall that we cannot cross.

Can a company contradict itself like this (within one year) ?

S
 
|>
|> > ODEs, to a great extent.
|>
|> Parallelize over initial conditions?

If that is what you are doing. If they are a component of a more
complex application, you will find that unhelpful :-)
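
[Aside: to make the "parallelize over initial conditions" case concrete, a
minimal Python sketch of an embarrassingly parallel sweep; the ODE, the Euler
solver and the step size are invented for illustration. The time stepping
inside each trajectory stays serial, which is the point about ODEs that sit
inside a larger computation.]

    import math
    from multiprocessing import Pool

    def integrate(y0, t_end=10.0, h=1e-3):
        # Forward Euler on the toy ODE dy/dt = -y + sin(t).
        # Each trajectory is inherently serial: step n+1 needs step n.
        y, t = y0, 0.0
        while t < t_end:
            y += h * (-y + math.sin(t))
            t += h
        return y

    if __name__ == "__main__":
        initial_conditions = [0.1 * i for i in range(64)]
        with Pool() as pool:
            # The parallelism is across independent initial conditions,
            # not within any single integration.
            finals = pool.map(integrate, initial_conditions)
        print(finals[:4])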

|> > A great deal of transaction processing.
|>
|> Parallelize over transactions? OK, the commit phase needs to be
|> serialized.

Think multi-component and parallelising WITHIN transactions. In
theory, it can often be done. In practice, doing it and maintaining
consistency is hard enough that it isn't. Why do you think that so
many electronic transactions are so slow, and often getting slower?

Note that this is not a CPU limitation as such, but is a different
level of parallelism. But it is the same class of problem.
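
[Aside: a toy sketch of parallelising WITHIN a transaction, assuming the
transaction decomposes into independent checks; the step names and timings
are hypothetical. The commit stays serial, and the consistency problem
mentioned above is exactly what this toy sidesteps.]

    import time
    from concurrent.futures import ThreadPoolExecutor

    # Hypothetical, mutually independent steps of one transaction; real
    # steps are rarely this independent, which is where the pain lives.
    def check_inventory(order):
        time.sleep(0.20)        # stand-in for a remote call
        return True

    def check_fraud(order):
        time.sleep(0.30)
        return True

    def check_credit(order):
        time.sleep(0.25)
        return True

    def commit(order):
        time.sleep(0.10)        # the commit phase stays serialised
        return "committed"

    def process(order):
        with ThreadPoolExecutor() as pool:
            checks = [pool.submit(f, order)
                      for f in (check_inventory, check_fraud, check_credit)]
            if all(c.result() for c in checks):
                return commit(order)
        return "aborted"

    if __name__ == "__main__":
        start = time.time()
        result = process({"id": 42})
        print(result, "in %.2f s" % (time.time() - start))
        # Roughly 0.4 s, against about 0.85 s for the same steps run back to back.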

|> > A great deal of I/O.
|>
|> Why that?

Incompetence and historical, unparallelisable specifications.

|> > Event handling in GUIs.
|>
|> Is that really limiting performance in any way, nowadays?

Yes. I am sent gibbering up the wall by it, and am not alone in
that. The reason is that I am using some fairly ancient machines
with more modern software. Answers:

Never upgrade software, and don't connect to parts of the net
that need newer versions.

Upgrade your system. Oops. A few years down the line, you
will have the same problem. And remember that Not-Moore's Law
has reached the end of the line - so, while I can upgrade by a
healthy factor and remain serial, people with the latest and
greatest systems can't.


Regards,
Nick Maclaren.
 
|> >
|> >|> What are some examples of important and performance-limited
|> >|> computation tasks that aren't run in parallel?
|> >
|> >ODEs, to a great extent.
|>
|> Can you be more specific? What sort of jobs using ODEs? Why can't they
|> be parallelized?

You need to ask an ODE expert. I am not one, and am relying largely
on information provided by one. A Web search will help (I checked
my memory that way).

|> >A great deal of transaction processing.
|>
|> Any references? ...

See my other posting. I am talking about the latency of a single
transaction.

|> >A great deal of I/O.
|>
|> I was under the impression I/O was I/O limited, not CPU limited?

(a) Not if it is Ethernet and TCP/IP, it isn't.

(b) Parallelism is parallelism. The same issues arise and similar
approaches work.

(c) People (well, Microsoft, at least) are starting to put that
level of I/O into hardware - God help us all.

|> >Event handling in GUIs.
|>
|> That's not CPU-limited either; it runs plenty fast enough on a single
|> processor.

(a) See (b) above.

(b) Don't bet on it and, no, it doesn't. Every system I have used
is slow enough that it misses and mishandles events, and that has
included the latest and greatest workstations. Yes, my reactions
are unusually fast for an old fogey.


Regards,
Nick Maclaren.
 
Stefan Monnier said:
Getting back to the issue of multiprocessors for "desktops" or even
laptops: I agree that parallelizing Emacs is going to be
excruciatingly painful, so I don't see it happening any time soon.
But that's not really the question.

In fact, Emacs IS a good candidate: very little of its context is not
buffer- or window/frame-local. Going into that swamp is another issue!

--
Paul Repacholi 1 Crescent Rd.,
+61 (08) 9257-1001 Kalamunda.
West Australia 6076
comp.os.vms,- The Older, Grumpier Slashdot
Raw, Cooked or Well-done, it's all half baked.
EPIC, The Architecture of the future, always has been, always will be.
 
Think multi-component and parallelising WITHIN transactions. In
theory, it can often be done. In practice, doing it and maintaining
consistency is hard enough that it isn't.

What kind of transaction - by itself - would take long enough to warrant
that?
Why do you think that so
many electronic transactions are so slow, and often getting slower?

I wonder myself. I put it down to general incompetence - in particular,
because so much data is unnecessarily slung around over none-too-fast
networks. Of course, anything XML-based will only make things worse.
|> > Event handling in GUIs.
|>
|> Is that really limiting performance in any way, nowadays?

Yes. I am sent gibbering up the wall by it, and am not alone in
that. The reason is that I am using some fairly ancient machines
with more modern software.

Ancient as in a 30 MHz (IIRC) 68040 running NeXtStep - which has one of
the most responsive UIs I've ever seen? That is to say: any performance
problem with UIs is a problem of design and/or implementation, not of
UIs as such. Not that that helps you any if the application you are
using is programmed on such a UI... cue WIN32 woes...

Jan
 
In comp.arch Nick Maclaren said:
A good question. But note that "by '08" includes "in 2005".


By whom is it expected? And how is it expected to appear? Yes,
someone will wave a chip at IDF and claim that it is a Montecito,
but are you expecting it to be available for internal testing,
to all OEMS, to special customers, or on the open market?

Is any kind of Itanium actually available on the open market (and
I mean the open market for new chips, not resale of systems)?
 
In comp.arch Nick Maclaren said:
Event handling in GUIs.

GUIs and event processing, and the inability to trivially allow for
at least top-window-level total parallelism, are just a complete screwup.
It's made worse by middleware (like, say, Java and Swing) exporting
such braindeadness to the application level.

So instead of "write to tolerate or take advantage of parallelism
if present", everybody writes for "serial everything in the GUI is the
one and only true way".
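
[Aside: a minimal sketch of the "write to tolerate parallelism" alternative,
with the event loop simulated by a plain queue rather than a real toolkit;
the event names are made up. Slow handlers go to a worker pool so the loop
keeps draining events instead of serialising everything.]

    import queue
    import time
    from concurrent.futures import ThreadPoolExecutor

    events = queue.Queue()
    workers = ThreadPoolExecutor(max_workers=4)

    def slow_handler(name):
        time.sleep(0.5)                          # stand-in for a slow redraw/load
        print("finished", name)

    def event_loop():
        # The loop itself stays serial, but it never blocks on a slow handler.
        while True:
            ev = events.get()
            if ev is None:                       # shutdown sentinel
                break
            if ev.startswith("slow"):
                workers.submit(slow_handler, ev) # farm it out, keep draining
            else:
                print("handled", ev)             # cheap events handled inline

    if __name__ == "__main__":
        for ev in ("click", "slow-redraw", "keypress", "slow-load", "click"):
            events.put(ev)
        events.put(None)
        event_loop()
        workers.shutdown(wait=True)
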
 
|> > Think multi-component and parallelising WITHIN transactions. In
|> > theory, it can often be done. In practice, doing it and maintaining
|> > consistency is hard enough that it isn't.
|>
|> What kind of transaction - by itself - would take long enough to warrant
|> that?

Anything that is built up of a couple of dozen steps, with the
various components scattered from here to New Zealand!

In practice, the cumulative latency issue bites earlier, but that
one is imposed by physical limits. Again, I am not denying the
overriding cause of incompetence.

|> > Why do you think that so
|> > many electronic transactions are so slow, and often getting slower?
|>
|> I wonder myself. I put it down to general incompetence - in particular,
|> because so much data is unnecessarily slung around over none-too-fast
|> networks. Of course, anything XML-based will only make things worse.

There is no doubt that General Incompetence is in overall command,
but the question is what form the incompetence takes :-)

|> Ancient as in a 30 MHz (IIRC) 68040 running NeXtStep - which has one of
|> the most responsive UIs I've ever seen? That is to say: any performance
|> problem with UIs is a problem of design and/or implementation, not of
|> UIs as such. Not that that helps you any if the application you are
|> using is programmed on such a UI... cue WIN32 woes...

No :-(

Ancient as in a 250 MHz processor with lashings of memory, and the
need to run Netscape 6 or beyond, because of the ghastly Web pages
I need to access.

Look, I was asked

What are some examples of important and performance-limited
computation tasks that aren't run in parallel?

not WHY are they not run in parallel, nor WHY they are performance-
limited, nor WHETHER that is unavoidable. As you point out, it
is due to misdesigns at various levels. But it IS an example of
what I was asked for.


Regards,
Nick Maclaren.
 
|>
|> GUIs and event processing, and the inability to trivially allow for
|> at least top-window-level total parallelism, are just a complete screwup.
|> It's made worse by middleware (like, say, Java and Swing) exporting
|> such braindeadness to the application level.
|>
|> So instead of "write to tolerate or take advantage of parallelism
|> if present", everybody writes for "serial everything in the GUI is the
|> one and only true way".

I should like to be able to disagree, but regret that I am unable
to. The one niggle that I have is that a FEW applications do
allow for parallelism at the top level which is, as you say,
trivial.

There is no reason why most of the underlying morass ("layers"
implies a degree of structure that it does not possess) should
not be fully asynchronous and parallel. Well, no good reason.
But it isn't in most modern designs.


Regards,
Nick Maclaren.
 
Rupert said:
Robert Myers wrote:

[SNIP]
The smallest unit that anyone will ever program for non-embedded
applications will support I hesitate to guess how many execution
pipes, but certainly more than one. Single-pipe programming, using
tools appropriate for single-pipe programming, will come to seem just
as natural as doing physics without vectors and tensors.

The fact that this reality is finally percolating into the lowly but
ubiquitous PC is what I'm counting on for magic.


I really wouldn't hold your breath. Look how long it took for SMP to
become ubiquitous with major-league UNIXen ... Has it had much of an
impact on the code base at large? IMO: it hasn't.

UNIX had three stumbling blocks:

1) UNIX does let you make use of multiple CPUs at a coarse-grained level
with stuff like pipes (i.e., good enough).

2) The predominance of single-threaded languages that promote
single-threaded thinking.

3) Libraries designed for single-threaded, non-reentrant usage.

I wouldn't have the slightest clue, were it not for Gnu. As it is, I
have a clue, but just barely. What I see happening is that, if there is
a better way to do business, people want to find a way to get there.
Given the millions of lines of code that are written in a language and
with an OS descended from ones written for the PDP-11, fundamental
change is very hard. To make change, though, the first thing is that
you have to want to make change, and I'm optimistic enough to believe
that the will is there.
By all accounts Windows NT suffers from the same, but to be fair it
has supported threading for a very long time and MS has been pushing
it very hard too. The codebase is positively riddled with threads by
comparison to UNIX, but I haven't seen much that is genuinely scalable.

Why should Microsoft make the necessary investment? The truly obscene
margin they are making on an OS they have foisted on the world by
illegal means keeps the empire running. Because they need that margin
to keep the empire running, it is never going to be invested in the
kinds of radical rework that would be needed to fix the supposedly
already fixed Windows NT/2000/XP stuff (as opposed to the Windows
95/98/ME stuff that even Microsoft effectively admits is hopelessly broken).

The fact that Microsoft has Tony Hoare and Leslie Lamport on staff and
_still_ manages to produce such horrifying stuff argues for your point
of view, and I think Microsoft is a dead waste to the world in terms of
making any kind of fundamental contribution to software.

I'm prematurely gloating over the fact that Microsoft isn't going to do
an IBM redux. IBM has software still in use so ancient it should be in
a Museum of Natural History. IBM got that software situated at a time
when the style of business that made such hopelessly proprietary and
hermetic software possible dominated the industry. IBM also understood
that, no matter what it took to avoid them, surprises were unacceptable.
Microsoft hasn't understood that or practically anything other than
the way that IBM's hermetic, proprietary software has, in the end, been
its passport to survival.
I don't believe that some kid will have a stunning insight as a result
of having a 2 or a 4P NT/Linux box sat on their desk either. Such boxes
have been around a *long* time and in the hands of some very clever
people who have already cleaned out the low-hanging fruit and are about
1/3rd the way up the tree at the moment.

Stunning insights are hard to predict.

The lesson of the fundamental disciplines where I should have some
capacity to judge progress is that it is the kids who make the
breakthroughs. Nothing that I have learned about the world of software,
mostly as an outsider, would suggest to me that it works any differently
in that respect from physics or mathematics.
I think hard graft is needed; perhaps having more boxes in more hands
will help increase the volume of hard graft, and in turn that might get
us a result.

The respectable argument at the core of what I have said is somewhat
akin to the arguments that free market theoreticians make. A modest
number of people, no matter how smart, are unlikely to come up with the
best solution to a problem like planning a national economy. Turn a
large number of even modestly endowed free agents loose, though, and
amazing things will happen.

RM
 
Robert Myers said:
I sometimes think: no one experienced the microprocessor revolution.

Indeed. One thing we noticed in the RISC revolution (may it rest in
peace) was that a dual processor workstation did not get an application
done any faster, but it made the person interacting with the application
a lot happier!

One of the big benefits of a dual processor that is difficult to measure
is the improvement in hand-eye coordination with the application. Let's
say a heavy CAD application is using 10% of a CPU for keyboard and mouse
activity, and 100% of the other CPU for application processing. This
dual-processor arrangement gives much better hand->(KB->app->graphics)->eye
coordination than a single CPU with 110% of the processing power.
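
[Aside: a rough way to see this effect on Linux (it relies on
os.sched_setaffinity and at least two CPUs); the numbers are illustrative
only. One process measures how late its 5 ms "UI tick" fires while another
spins flat out, first sharing the same CPU and then pinned to a different
one.]

    import os
    import time
    from multiprocessing import Process

    def spin(cpu):
        os.sched_setaffinity(0, {cpu})            # Linux-specific call
        while True:
            pass                                  # stand-in for the CAD compute job

    def ui_tick(cpu, n=200):
        os.sched_setaffinity(0, {cpu})
        worst = 0.0
        for _ in range(n):
            t0 = time.perf_counter()
            time.sleep(0.005)                     # a 5 ms "event handling" pause
            worst = max(worst, time.perf_counter() - t0 - 0.005)
        print("worst overshoot: %.1f ms" % (worst * 1000.0))

    if __name__ == "__main__":
        for ui_cpu, spin_cpu in ((0, 0), (0, 1)): # share a CPU, then split them
            p = Process(target=spin, args=(spin_cpu,), daemon=True)
            p.start()
            ui_tick(ui_cpu)
            p.terminate()
            p.join()
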
 
snip
Look, I was asked

What are some examples of important and performance-limited
computation tasks that aren't run in parallel?

not WHY are they not run in parallel, nor WHY they are performance-
limited, nor WHETHER that is unavoidable. As you point out, it
is due to misdesigns at various levels. But it IS an example of
what I was asked for.

OK, let me rephrase the original question to reflect more closely what I
think the OP was asking.

What are some examples of important, CPU-bound applications that are limited
by not being parallelized?

I mean this to eliminate answers that depend on improving the latency
between the UK and New Zealand, which is a different sort of research
program. :-) I also mean to eliminate transaction processing, at least as
most commercial systems use it, since it is already highly parallel between
transactions and very few individual transactions use enough CPU to benefit
much from within-transaction CPU parallelism. I also mean to eliminate I/O,
as that has been parallelized for decades (as you well know).

So from your original list we still have ODEs and perhaps UIs, though there
the benefit may be limited to relatively simple things like what was
mentioned earlier - dedicating a CPU to user interactions to assure
responsiveness. Are there others?
 
TOUATI said:
"This is evidence that Moore's Law continues," said Mark Bohr( Intel's
director of process architecture and integration).


I remember that I read some months ago an interesting study of two
intel researchers who had shown the end of Moore Law: they said that
we have now a real wall that we cannot cross.

Can a company contradict itself like this (within one year) ?

Only if one of the company reps is an executive. :-)

Yousuf Khan
 
Stephen Fuld wrote:

So from your original list we still have ODEs and perhaps UIs, though there
the benefit may be limited to relatively simple things like what was
mentioned earlier - dedicating a CPU to user interactions to assure
responsiveness. Are there others?

I take the question to be: how many applications have been created for
which appropriate hardware doesn't yet exist?

In the broad class of applications that will spring into existence when
appropriate resources become available, I would place those that depend
on brute force search.

RM
 
|>
|> OK, let me rephrase the original question to reflect more closely what I
|> think the OP was asking.
|>
|> What are some examples of important, CPU-bound applications that are limited
|> by not being parallelized?
|>
|> I mean this to eliminate answers that depend on improving the latency
|> between the UK and New Zealand, which is a different sort of research
|> program. :-) I also mean to eliminate transaction processing, at least as
|> most commercial systems use it, since it is already highly parallel between
|> transactions and very few individual transactions use enough CPU to benefit
|> much from within-transaction CPU parallelism. I also mean to eliminate I/O,
|> as that has been parallelized for decades (as you well know).

Actually, no, it doesn't eliminate it. I am not an expert on what
is normally known as transaction processing, but most of the things
that I have seen that fall under that have various steps. Now, in
many cases, many of those steps could be done in parallel, but aren't
(for the reasons I gave). Locking is all very well for some problems,
but not for others; Alpha-style load-locked/store-conditional (LDx_L/STx_C)
designs can be applied more generally;
and so on.

Also, some I/O has been parallelised for decades, but modern forms
typically aren't. TCP/IP over Ethernet is usually dire, and that is
today's de facto standard.

If, however, you are referring to problem areas where there is no
known way of parallelising them, and yet they are bottlenecks, I
should have to think harder. I am certain that there are some, but
(as I said) a lot of people will have abandoned them as intractable.
So I should have to think about currently untackled requirements.

|> So from your original list we still have ODEs and perhaps UIs, though there
|> the benefit may be limited to relatively simple things like what was
|> mentioned earlier - dedicating a CPU to user interactions to assure
|> responsiveness. Are there others?

Protein folding comes close. It is parallelisable in space, but
not easily in time. There are quite a lot of problems like that.
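
[Aside: a skeletal illustration of "parallelisable in space, but not in
time", using a generic toy particle system rather than real molecular
dynamics; the force law and constants are made up. Each step's force
evaluation can be farmed out across particles, but step n+1 cannot start
until step n has finished.]

    from multiprocessing import Pool

    def force_on(args):
        # Parallel in space: particle i's force needs only the current positions.
        i, positions = args
        return sum((positions[j] - positions[i]) * 0.01
                   for j in range(len(positions)) if j != i)

    if __name__ == "__main__":
        positions = [float(i) for i in range(32)]     # toy 1-D "particles"
        dt = 0.1
        with Pool() as pool:
            for step in range(100):                   # serial in time: step n+1
                work = [(i, positions) for i in range(len(positions))]
                forces = pool.map(force_on, work)     # needs the result of step n
                positions = [x + dt * f for x, f in zip(positions, forces)]
        print(positions[:4])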



Regards,
Nick Maclaren.
 
|>
|> OK, let me rephrase the original question to reflect more closely what I
|> think the OP was asking.
|>
|> What are some examples of important, CPU-bound applications that are limited
|> by not being parallelized?

Yes, that would be a better way of phrasing it.
|> I mean this to eliminate answers that depend on improving the latency
|> between the UK and New Zealand, which is a different sort of research
|> program. :-)

Right :) I'll agree it's an answer to the question I asked, but it's
not the sort of problem I'm interested in here.
If, however, you are referring to problem areas where there is no
known way of parallelising them, and yet they are bottlenecks, I
should have to think harder. I am certain that there are some, but
(as I said) a lot of people will have abandoned them as intractable.
So I should have to think about currently untackled requirements.
Okay.

Protein folding comes close. It is parallelisable in space, but
not easily in time. There are quite a lot of problems like that.

Speaking of which: It seems to me that a big problem with protein
folding and similar jobs (e.g. simulating galaxy collisions) is:

- If you want N digits of accuracy in the numerical calculations, you
just need to use N digits of numerical precision, for O(N^2)
computational effort.

- However, quantizing time produces errors; if you want to reduce
these to N digits of accuracy, you need to use exp(N) time steps.

Is this right? Or is there any way to put a bound on the total error
introduced by time quantization over many time steps?

(Fluid dynamics simulation has this problem too, but in both the space
and time dimensions; I suppose there's definitely no way of solving it
for the space dimension, at least, other than by brute force.)
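
[Aside: a quick numerical check of the "exp(N) time steps" worry, using
forward Euler on dy/dt = y (exact answer e) rather than anything as hard as
protein folding. For a fixed-order method the global error shrinks only
polynomially in the step size, so each extra digit of accuracy does cost a
multiplicative factor more steps.]

    import math

    def euler_exp(steps):
        # Forward Euler for dy/dt = y on [0, 1]; the exact answer is e.
        h, y = 1.0 / steps, 1.0
        for _ in range(steps):
            y += h * y
        return y

    if __name__ == "__main__":
        for steps in (10, 100, 1000, 10000, 100000):
            print("%7d steps   error %.2e" % (steps, abs(euler_exp(steps) - math.e)))
        # The error shrinks about 10x for every 10x more steps (first-order
        # method), i.e. each extra correct digit multiplies the step count.
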
 
Sander said:
Is any kind of Itanium actually available on the open market (and
I mean the open market for new chips, not resale of systems)?

Searching PriceWatch I got 11 offers for boxed Itanium 2 CPUs.
 
Speaking of which: It seems to me that a big problem with protein
folding and similar jobs (e.g. simulating galaxy collisions) is:

- If you want N digits of accuracy in the numerical calculations, you
just need to use N digits of numerical precision, for O(N^2)
computational effort.

More or less.
- However, quantizing time produces errors; if you want to reduce
these to N digits of accuracy, you need to use exp(N) time steps.

Is this right? Or is there any way to put a bound on the total error
introduced by time quantization over many time steps?

There are ways, but they aren't very reliable. The worse problem
is that many such analyses are numerically unstable (a.k.a. chaotic),
and that the number of digits you need in your calculations is
exponential in the number of time steps. Also, reducing the size
of steps reduces one cause of error and increases this one.

You don't usually have to mince time as finely as you said, but
the problem remains. This is alleviated by the fact that most
numerical errors merely change one possible solution into another,
which is harmless. Unfortunately, there is (in general) no way of
telling whether that is happening or whether they are changing a
possible solution into an impossible one.
(Fluid dynamics simulation has this problem too, but in both the space
and time dimensions; I suppose there's definitely no way of solving it
for the space dimension, at least, other than by brute force.)

The same applies to the other problems. The formulae are different,
but the problems have a similar structure.

All this is why doing such things is a bit of a black art. I know
enough to know the problems in principle, but can't even start to
tackle serious problems in practice.
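
[Aside: the instability point above is easy to see even without a serious
simulation; the logistic map below is a standard toy, not Nick's example.
Two starting values that agree to twelve decimal places diverge completely
within a few dozen steps, which is why the number of digits you must carry
grows with the length of the run.]

    def orbit(x, steps, r=4.0):
        # The logistic map at r = 4 is chaotic: nearby orbits separate
        # roughly exponentially fast.
        for _ in range(steps):
            x = r * x * (1.0 - x)
        return x

    if __name__ == "__main__":
        x0 = 0.123456789012
        for steps in (10, 20, 40, 60):
            gap = abs(orbit(x0, steps) - orbit(x0 + 1e-12, steps))
            print("%3d steps   separation %.3e" % (steps, gap))
        # The separation grows until it saturates at order 1, so the number
        # of digits you must carry grows with the number of time steps.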


Regards,
Nick Maclaren.
 