Interesting interview with Myerson on Cell, scaling etc.

  • Thread starter: George Macdonald
Robert Myers said:
Regardless of what Myerson thinks, people will be looking at Cell's
~28 gigaflop double precision floating point performance, and
generally not for image processing.

I could point out that the Cell is simply the second iteration of
Sony's PS1 "Emotion Engine", which was predicted when announced to
take over the world. But I won't, because that's irrelevant.

What's relevant is that there is no software to make use of Cell, and
there never will be, except for PS2 games. Thus, the question of
whether there are other architectural deficiencies is irrelevant.

A beast like the Cell is veddy hard to program. Not something to
program in C. Or Cobol. Or Fortran. Or Algol. Hundreds or
thousands of megabytes of assembly code, anyone? ;-)
 
I could point out that the Cell is simply the second iteration of
Sony's PS1 "Emotion Engine", which was predicted when announced to
take over the world. But I won't, because that's irrelevant.

PS1 is a MIPS-processor-based machine. The "Emotion Engine" went into
the PS2 (a separate MIPS processor normally handles I/O for the PS2);
the "Emotion Engine" was the part that was supposed to take over the
world. I was at Hot Chips when Sony came out and did the demo there.
Very cool to see the duck swimming around the bathtub, rendered in real
time with the level of detail that it had (very cool for that era).

CELL will go into PS3.
What's relevant is that there is no software to make use of Cell, and
there never will be, except for PS2 games. Thus, the question of
whether there are other architectural deficiencies is irrelevant.

I'm not sure about the "there'll never be" part. The question is,
"If you build it, will they come?"

The CELL architecture is intriguing enough that I'm sure it'll
get some looks. Whether those looks turn into serious development
work some 3~5 years down the road is yet unknown, since STI is
still mum on the programming model and software stack that they
are willing to provide to developers.
A beast like the Cell is veddy hard to program. Not something to
program in C. Or Cobol. Or Fortran. Or Algol. Hundreds or
thousands of megabytes of assembly code, anyone? ;-)

I think Sony learned quite a bit from the experience of the EE.
Developers complained quite loudly about moving data in and out
of EE's tiny on-chip memory, and the fact that they had to hand
code a lot of stuff to get the promised performance was unattractive
also. However, as the PS2 platform matured, developers got better
at extracting the performance out of the same platform, and games
got better in terms of details. CELL architecture's learning curve
will likely be slightly less steep because of EE's learning
experience. It's not going to be nearly as easy as just writing C code
and throwing it to the compiler, but we'll see where that goes.

(And the SPE's "local memory" is considerably larger than EE's)
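
To make that concrete, here is a minimal sketch in plain C (not actual Cell or EE SDK code) of the kind of explicitly managed, double-buffered data movement this programming model implies: pull a chunk into the small local store, start fetching the next one, compute, write the result back. The dma_get()/dma_put() names are hypothetical stand-ins for asynchronous local-store DMA; here they are just memcpy so the sketch compiles and runs anywhere.

    /* Double-buffered "local store" processing sketch.
     * dma_get()/dma_put() are hypothetical stand-ins (plain memcpy),
     * not real Cell/EE DMA calls. */
    #include <stdio.h>
    #include <string.h>

    #define CHUNK 4096                /* elements per local-store buffer   */
    #define N     (CHUNK * 16)        /* total elements in "main memory"   */

    static float main_mem[N];         /* stand-in for system memory        */
    static float local[2][CHUNK];     /* stand-in for the tiny local store */

    /* A real implementation would issue tagged asynchronous DMAs and wait
     * on the tag before touching the buffer; memcpy keeps this runnable. */
    static void dma_get(float *ls, const float *ea, size_t n)
    {
        memcpy(ls, ea, n * sizeof *ls);
    }

    static void dma_put(const float *ls, float *ea, size_t n)
    {
        memcpy(ea, ls, n * sizeof *ea);
    }

    static void compute(float *buf, size_t n)
    {
        for (size_t i = 0; i < n; i++)
            buf[i] = buf[i] * 2.0f + 1.0f;    /* whatever the kernel is */
    }

    int main(void)
    {
        for (size_t i = 0; i < N; i++)
            main_mem[i] = (float)i;

        int cur = 0;
        dma_get(local[cur], &main_mem[0], CHUNK);          /* prime buffer 0 */

        for (size_t blk = 0; blk < N / CHUNK; blk++) {
            int nxt = cur ^ 1;
            if ((blk + 1) * CHUNK < N)                     /* prefetch next chunk */
                dma_get(local[nxt], &main_mem[(blk + 1) * CHUNK], CHUNK);

            compute(local[cur], CHUNK);                    /* work on current chunk */
            dma_put(local[cur], &main_mem[blk * CHUNK], CHUNK);
            cur = nxt;
        }

        printf("main_mem[10] = %g\n", main_mem[10]);       /* expect 21 */
        return 0;
    }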
 
What's relevant is that there is no software to make use of Cell, and
there never will be, except for PS2 games. Thus, the question of
whether there are other architectural deficiencies is irrelevant.

A beast like the Cell is veddy hard to program. Not something to
program in C. Or Cobol. Or Fortran. Or Algol. Hundreds or
thousands of megabytes of assembly code, anyone? ;-)

You do know there are software stacks for GPUs?

As the acting student was heard to say, "What's my motivation?"

Can you *imagine* how hard computers were to program when Dijkstra was
doing it just for fun?

People wrote Fortran compilers before Chomsky had a chance to utter
"context free grammar."

If the motivation is strong enough, absolutely nothing is impossible.
So the question here is, not whether the problem is hard (it is), but
whether the motivation is strong enough to justify new ways of doing
business.

As I have already rambled on about elsewhere, people are manifestly
doing stuff with GPUs, and I think they'll do stuff with Cell, and
I'll be real surprised if we don't get Linux on the PlayStation again.

But will anything really interesting happen?

What would interesting be?

Molecular dynamics simulations sufficiently faster than what we've got
now to change the rules. Cell is fast, but probably not _that_ fast.
That is not to say that a future generation won't be.

An AI or robotic application that otherwise wouldn't be interesting.
I'm not sufficiently clued in to make that judgment. Probably a
robotic application, because I don't offhand see how Cell (as
configured) would be useful for AI. Embedded is where the action is,
and Cell has some serious competition there.

As to your prediction that there never will be software for anything
but PS2 games, I think it's a safe bet that you're wrong about that.
Will the software we see be a revolution? Unlikely, but that doesn't
mean we're not seeing the start of a revolution.

RM
 
Robert Myers said:
A bird's-eye view, with the bird in geosynchronous orbit, I'm afraid.

Regardless of what Myerson thinks, people will be looking at Cell's
~28 gigaflop double precision floating point performance, and
generally not for image processing.

Yup .. now how can I interface it to an x86?? I mean it can't be that
hard, we did it with an x87, and even with those whacky 3rd party FPUs,
back in the dark ages ....
 
Oh geez, I spelt the guy's name wrong.:-P
A bird's-eye view, with the bird in geosynchronous orbit, I'm afraid.

Just broad strokes. It was only an interview.
Regardless of what Myerson thinks, people will be looking at Cell's
~28 gigaflop double precision floating point performance, and
generally not for image processing.

And so... on to the bandwidth "problem".:-) I haven't paid much attention
to the details of Cell but outside its intended solution, that kind of FP
performance usually implies huge memory - i.e. is Cell flexible enough to
go there or do you have to do a mod on the basic customized design? Of
course with the volume of PSx you can afford that initial customization -
not clear that you can extrapolate to other problems.
 
I haven't paid much attention
to the details of Cell but outside its intended solution, that kind of FP
performance usually implies huge memory - i.e. is Cell flexible enough to
go there or do you have to do a mod on the basic customized design?

The "Memory capacity problem".

It's addressed in my followup article.

http://www.realworldtech.com/page.cfm?ArticleID=RWT022805234129

Bottom line, IBM didn't want to confirm that the CELL processor will
definitely support up to 72 XDR devices, but it shouldn't be a real
problem. Not to say that the engineering challenges are trivial,
but the processor should be able to handle it. It's likely they're
still working on the sub column command support, so they're not
disclosing the x4, x2 and x1 device configuration support, even
though the XDR devices support the sub column command modes.

The problem would be to figure out if it's worth it (economics) to
put a bunch of XDR devices on memory modules (saves board space),
and build the memory system that way. (lots of power too)

As it stands, XDR devices are available in 512 Mbit densities,
72 of these suckers gets you 32 GB of ECC (and chipkill too!)
supported memory. That ought to be enough for HPC applications.
Hopefully 1 Gbit XDR devices will arrive on the scene before
someone complains that 32 GB of memory (per processor) isn't
enough.

BTW, I link to Peter Hofstee's paper and presentations given @
HPCA in the reference section. Peter's slides show the Prototype
Sony/IBM CELL rack that's been powered up.
 
The "Memory capacity problem".

It's addressed in my followup article.

http://www.realworldtech.com/page.cfm?ArticleID=RWT022805234129

Bottom line, IBM didn't want to confirm that the CELL processor will
definitely support up to 72 XDR devices, but it shouldn't be a real
problem. Not to say that the engineering challenges are trivial,
but the processor should be able to handle it. It's likely they're
still working on the sub column command support, so they're not
disclosing the x4, x2 and x1 device configuration support, even
though the XDR devices support the sub column command modes.

The problem would be to figure out if it's worth it (economics) to
put a bunch of XDR devices on memory modules (saves board space),
and build the memory system that way. (lots of power too)

As it stands, XDR devices are available in 512 Mbit densities,
72 of these suckers gets you 32 GB of ECC (and chipkill too!)
supported memory. That ought to be enough for HPC applications.
Hopefully 1 Gbit XDR devices will arrive on the scene before
someone complains that 32 GB of memory (per processor) isn't
enough.

64x512Mb == 4GB??
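
Spelling that arithmetic out (a quick back-of-the-envelope in plain C; the 64-data-device plus 8-ECC-device split is my assumption, not a confirmed configuration):

    /* Capacity check for 72 XDR devices at 512 Mbit each,
     * assuming 64 devices carry data and 8 carry ECC. */
    #include <stdio.h>

    int main(void)
    {
        const double mbit_per_device = 512.0;   /* density per XDR device     */
        const int    total_devices   = 72;
        const int    data_devices    = 64;      /* assumed: 8 devices for ECC */

        double raw_gb  = total_devices * mbit_per_device / 8.0 / 1024.0;
        double data_gb = data_devices  * mbit_per_device / 8.0 / 1024.0;

        printf("raw: %.1f GB, usable: %.1f GB\n", raw_gb, data_gb);
        /* prints raw: 4.5 GB, usable: 4.0 GB -- so 32 GB of data would
         * need 4 Gbit (512 MByte) parts rather than 512 Mbit ones. */
        return 0;
    }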
BTW, I link to Peter Hofstee's paper and presentations given @
HPCA in the reference section. Peter's slides show the Prototype
Sony/IBM CELL rack that's been powered up.

Tnx
 