Robert said:
Rupert said:
Robert said:
Rupert Pigott wrote:
[SNIP]
I did. Blue Gene was the best contrast I could think of to a single
image Itanium machine in terms of cost, energy efficiency, and
scalability. There is no fundamental reason why BlueGene couldn't
become widely used and accepted, but it probably won't be because it
won't show up in the workspace of your average graduate student or
postdoc.
Cluster style systems should be fairly easy to come by at that level.
They are, indeed, and they are widely used.
Apps written for clusters should port to another cluster system more
easily than apps written for a shared memory system port to a cluster.
You are apparently arguing for the desirability of folding the
artificial computational boundaries of clusters into software. If
That happens with SSI systems too. There is a load of information that
has been published about scaling on SGI's Origin machines over the
years. IIRC Altix is based on the same Origin 3000 design. You may
remember that I quizzed Rob Warnock on this; he said that there were
in practice little gotchas that tend to crop up at particular #'s of
procs. He even noted that the gotcha processor counts tended to change
with the particular generation of Origin.
that's a necessity of life, I can learn to live with it, but I'm having
a hard time seeing it as desirable. We are so fortunate as to live in a
universe that presents itself to us in midtower-sized chunks? I'm
worried. ;-).
In my mind it's a question of fitting our computing effort to reality
as opposed to living in an Ivory Tower. Some goals, while worthy,
desirable, or even partially achievable, are basically impossible to
achieve in reality. A genuinely *flat* address space is impossible
right here and now. That SSI Altix box will *not* have *flat* address
space in terms of time. It is a NUMA machine.
Can you give an example of something that you think would happen?
Depends on the app. Stuff like memory mapping one large file for read
and occasional write could cause some fantastic locking + latency
issues when it comes to porting.
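To make the memory-mapping case concrete, here is a minimal Python sketch, assuming a single-image machine where one process maps a file and occasionally writes to it. The helper names (`touch_mapped_file`, `demo`), the 4096-byte scratch file, and the offset are all made up for illustration; nothing here comes from a real application.

```python
import mmap
import os
import tempfile


def touch_mapped_file(path, offset, new_byte):
    """Map a whole file, read one byte, overwrite it in place.

    Returns (old_value, value_read_back) as integers.
    """
    with open(path, "r+b") as f:
        # Length 0 maps the entire file into this process's address space.
        with mmap.mmap(f.fileno(), 0) as m:
            old = m[offset]       # the read is a plain memory access
            m[offset] = new_byte  # the occasional write is a plain store
            m.flush()             # push the dirty page back to the file
            return old, m[offset]


def demo():
    # Hypothetical scratch file standing in for the "one large file".
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(b"\x00" * 4096)
        return touch_mapped_file(path, 100, 7)
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

On a shared memory box the write is an ordinary store. A cluster gives you no equivalent for free: each node would need its own copy of the data, or a message to whichever node owns that region, and that is where the locking and latency trouble shows up when porting.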
[SNIP]
Even more depressing, if your goal is to crank out papers and Ph.D.
theses, you may do pretty well with beige boxes and cheap labor and have
very little impact on applied science and technology, because people
trying to solve real world problems can't wait for a grad student and a
post doc to spend a semester getting the cluster shaken down, and even
if they could it wouldn't make any economic sense because the labor
costs are too high.
Shaking down large + fast machines has traditionally been a costly
and risky business. Look at all those machines that spent hours
with grads all over them and didn't really make an impact. I'm thinking
of stuff like the bigger ETAs; the TM-5s didn't seem to do much either.
Shaking down Crays took some time too, although to be fair they do
have a good rep for reliability once set up. However Crays are toys
by comparison to contemporary big systems (component count etc)...
In terms of sorting out clusters and stuff there is obviously a
niche there, from what I read it appears to be getting filled too.
I really do think, now that PCI Express is here, that the day of
InfiniBand, at least for this particular space, is finally at hand.
Yeah, interconnect is catching up at bloody last. You will always
have latency problems while we're communicating at < c m/s though,
regardless of whether you present your network to the application
as a single address space or not.
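A back-of-envelope sketch of that speed-of-light floor, in Python. The function names, the 10 m cable run, and the 2 GHz clock are illustrative assumptions, not figures for any particular machine; `velocity_factor` is a hypothetical knob for signals that travel slower than c in real cable or fibre.

```python
C_VACUUM_M_PER_S = 299_792_458.0  # speed of light in vacuum


def min_round_trip_ns(distance_m, velocity_factor=1.0):
    """Lower bound on round-trip time, in nanoseconds, over distance_m.

    velocity_factor is the signal speed as a fraction of c (1.0 = vacuum).
    Switching, serialisation and software overheads are ignored, so real
    latencies will be higher than this.
    """
    one_way_s = distance_m / (C_VACUUM_M_PER_S * velocity_factor)
    return 2.0 * one_way_s * 1e9


def cycles_lost(distance_m, clock_hz, velocity_factor=1.0):
    """Clock cycles that elapse during the minimum round trip."""
    return min_round_trip_ns(distance_m, velocity_factor) * 1e-9 * clock_hz


if __name__ == "__main__":
    # Assumed example: nodes 10 m apart, 2 GHz processor clock.
    print(f"{min_round_trip_ns(10.0):.1f} ns round trip at best")
    print(f"{cycles_lost(10.0, 2e9):.0f} cycles at 2 GHz")
```

With those assumed numbers the floor is roughly 67 ns, or about 133 cycles, before any hardware or software overhead is counted. Presenting the network as a single address space does not change that bound; it only changes where the wait shows up.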
I was actually imagining that there is really nothing to keep the
prerequisites for a single image box from becoming more of a commodity.
I mentioned Opteron, if HT really does suffer from crash+burn on
comms failure then it is holding itself back. If that ain't the
case I'd have figured that a tiny form factor Opteron + DRAM +
router cards would be a reasonable component for high-density
clusters and beige SSI machines. You'd need some facility for
driving some links for longer distances than HT currently allows
too ($$$). The next thing holding you back is tuning the OS + Apps
to a myriad of possible configurations...
[SNIP]
The optimistic view is that the chaos we currently see is the HPC
equivalent of the Cambrian explosion and that natural selection will
eventually give us a mature and widely-adopted architecture. My purpose
in starting this discussion was simply to opine that single image
architectures have some features that make them seem promising as a
survivor--not a widely-held view, I think.
I'm sure they'll have their place. But in the long run I think that
PetaFLOP pressure will tend to push people towards message passing
style machines. Consider this, though: the Internet is becoming more and
more prominent in daily life. The Spooks must have a fair old time
keeping up with the sheer volume of data flowing around the globe.
Distributed processing is a natural fit here; SSI machines just would
not make sense. More and more governments and their civil servants
will want to make use of this surveillance resource too; check out
the rate at which legislation is legitimising their intrusion on the
individual's privacy. The War on Terror has added more fuel to that
growth market too.
Geez, Rupert, they couldn't possibly be as bad as IBM used to be.
Probably not because they are a niche player beholden to a few very
powerful customers.
I can live with clusters. It may be that living with clusters is an
inevitable necessity. I'm not yet ready to give up on a single address
space, though.
Fair enough. Just don't hold your breath waiting for a kilonode SSI
machine to fall into your lap.
Cheers,
Rupert