Felger Carbon
Red Storm is not scheduled to be fully operative until August, and the
Super Bowl kickoff is some hours away. Let's kill some time here.
Robert, Yousuf and I have been discussing Red Storm's characteristics.
This is a problem since the details of the mesh connections have not
been revealed yet. I believe that Robert and Yousuf started as
programmers who picked up some hardware knowledge. I was a nuts &
bolts hardware engineer (now retired) who had to make things work. In
the latter part of my career, the things were computers or computer
subsystems. Along the way I picked up some knowledge of programming.
I'm gonna apply my hardware point of view to some things we actually
know about Red Storm.
10,368 CPUs connected in a 27 by 16 by 24 mesh (X, Y, Z).
Topologically, this is a cube. But the map is not the territory, as
someone once said. Let's discuss the territory - the physical
implementation of this cubic mesh.
There are 4 CPUs per board.
There are 8 boards per card cage.
There are 3 card cages per cabinet.
There are 108 cabinets. If you've been keeping track, that's 10,368
CPUs.
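For anyone who wasn't keeping track, here is a minimal sketch of that
arithmetic. It uses only the figures quoted above; the variable names
are mine, purely for illustration.

```python
# Packaging figures as quoted in the post.
CPUS_PER_BOARD = 4
BOARDS_PER_CAGE = 8
CAGES_PER_CABINET = 3
CABINETS = 108

# Mesh dimensions (X, Y, Z) as quoted in the post.
MESH = (27, 16, 24)

cpus_per_cabinet = CPUS_PER_BOARD * BOARDS_PER_CAGE * CAGES_PER_CABINET
total_by_packaging = cpus_per_cabinet * CABINETS
total_by_mesh = MESH[0] * MESH[1] * MESH[2]

# The packaging count and the mesh count should agree.
print(cpus_per_cabinet, total_by_packaging, total_by_mesh)
```

Both routes give the same CPU count, with 96 CPUs in each cabinet.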
The cabinet measures 2 ft wide, 4 ft long (in the direction of air
movement), and about 7 or 8 feet high.
The physical arrangement of the cabinets is 27 cabinets wide by 4
cabinets deep.
Question: what's the simplest wiring arrangement for the X Y Z mesh?
Since the cabinets are arranged 27 wide, it's obvious how the X mesh
(dimension 27) is wired up: each value of X gets a column one cabinet
wide by 4 cabinets deep. So the wire for X = 3 (for example) must
connect to just 4 cabinets, but to every CPU in those cabinets.
Hmm. If one axis is limited to 4 cabinets per value, I got a hunch
another axis can be limited to 27 cabinets per value, and the final
axis has to connect to every cabinet for every value. Let's identify
the 27 cabinets per value:
Let's conceptually divide each card cage in half. Each half-cage
contains 16 CPUs, and there are 6 half-cages per cabinet. Each
half-cage can contain all 16 possible values of Y for a given pair of
X and Z values. Six half-cages per cabinet times 4 cabinets deep equals 24,
which is the mesh's Z dimension. So:
Each of 27 X values must connect to 4 cabinets.
Each of 16 Y values must connect to all 108 cabinets.
Each of 24 Z values must connect to 27 cabinets.
Within each cabinet, just one X value connects to all 96 CPUs.
Within each cabinet, a given Y value connects to one CPU in each of
six half-cages.
Within each cabinet, a given Z value connects to all 16 CPUs in a
given half-cage.
Only one X value per cabinet.
All 16 Y values per cabinet.
Six Z values per cabinet.
The above information is what it takes to convert the Red Storm map
(mesh) to the Red Storm territory (physical implementation).
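The map-to-territory conversion above can be sketched in a few lines.
This is a guess at one consistent numbering, not the real wiring: the
function locate() and the choice of which cabinet row and half-cage
carries which Z value are my own assumptions, since only the counts
are known.

```python
from collections import defaultdict
from itertools import product

X_DIM, Y_DIM, Z_DIM = 27, 16, 24   # mesh size quoted in the post
HALF_CAGES_PER_CABINET = 6          # 3 card cages, each split in two


def locate(x, y, z):
    """Map mesh coordinate (x, y, z) to (column, row, half_cage, slot).

    Hypothetical numbering: the post fixes the counts only, not which
    row or half-cage carries which Z value.
    """
    if not (0 <= x < X_DIM and 0 <= y < Y_DIM and 0 <= z < Z_DIM):
        raise ValueError("coordinate outside the 27 x 16 x 24 mesh")
    row, half_cage = divmod(z, HALF_CAGES_PER_CABINET)
    return x, row, half_cage, y


# For each value on each axis, collect the cabinets it has to reach.
cabinets_for = {"X": defaultdict(set), "Y": defaultdict(set), "Z": defaultdict(set)}
for x, y, z in product(range(X_DIM), range(Y_DIM), range(Z_DIM)):
    column, row, _half_cage, _slot = locate(x, y, z)
    cabinet = (column, row)
    cabinets_for["X"][x].add(cabinet)
    cabinets_for["Y"][y].add(cabinet)
    cabinets_for["Z"][z].add(cabinet)

# Set of distinct "cabinets reached" counts seen on each axis.
reach = {axis: {len(cabs) for cabs in per_value.values()}
         for axis, per_value in cabinets_for.items()}
print(reach)
```

Under this numbering every X value reaches 4 cabinets, every Y value
reaches all 108, and every Z value reaches 27, matching the list above.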
The message you should get out of this is that if a given 3D mesh-type
MPU doesn't conform in some logical way to the 2-D physical cabinet
layout, there's big trouble.
It's easier to see this if you consider the physical layout to be 27
cabinets wide by 24 half-cages deep, with 16 CPUs per half-cage.
Voila, a simple 27 by 16 by 24 mesh. ;-)
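To put a rough number on why conforming to the cabinet layout matters,
here is a sketch that sorts every nearest-neighbor link by how far it
has to travel physically. Two assumptions of mine, neither confirmed:
the mesh has no wraparound links, and Z values are numbered so that
six consecutive values share a cabinet. link_kind() is a hypothetical
helper.

```python
from collections import Counter

X_DIM, Y_DIM, Z_DIM = 27, 16, 24
HALF_CAGES_PER_CABINET = 6


def link_kind(axis, z):
    """Classify the link from a node to its +1 neighbour along `axis`."""
    if axis == "Y":
        return "inside a half-cage"
    if axis == "Z":
        # Z steps stay in the cabinet unless they leave its last half-cage.
        if z % HALF_CAGES_PER_CABINET == HALF_CAGES_PER_CABINET - 1:
            return "between cabinets"
        return "inside a cabinet"
    return "between cabinets"  # every X step moves to the next cabinet column


links = Counter()
for x in range(X_DIM):
    for y in range(Y_DIM):
        for z in range(Z_DIM):
            # Count each link once, from the lower-numbered node.
            if x + 1 < X_DIM:
                links[link_kind("X", z)] += 1
            if y + 1 < Y_DIM:
                links[link_kind("Y", z)] += 1
            if z + 1 < Z_DIM:
                links[link_kind("Z", z)] += 1

print(dict(links))
```

With these assumptions, well over half the links never leave a
cabinet, and the ones that do only go to a neighboring cabinet.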
<http://www.lanl.gov/orgs/ccn/salishan2003/pdf/camp.pdf>
The above link is for a 77-slide presentation by the customers, not by
Cray. ~3.6 meg download. It covers (justifies) the design choices
made.
I have a bone to pick with slide 59, the one that describes a "full
crossbar" as some folks' Holy Grail. I call a 10,368 by 10,368
crossbar a "bottomless money pit". When you fill the pit with money,
you have a really well-connected MPU. But the pit is bottomless...
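A back-of-the-envelope sketch of the pit's depth, for what it's worth.
It compares the crosspoint count of a single-stage full crossbar (the
textbook n-squared figure) with the number of nearest-neighbor links
in the mesh. Crosspoints and links are not the same kind of hardware,
so this shows scaling, not cost. I'm also assuming no wraparound
links, which hasn't been confirmed. Function names are mine.

```python
def crossbar_crosspoints(ports):
    """Crosspoints in a single-stage full crossbar with `ports` inputs and outputs."""
    return ports * ports


def mesh_links(dims):
    """Nearest-neighbour links in a mesh without wraparound (an assumption)."""
    total = 1
    for d in dims:
        total *= d
    # Along each axis of length d there are (d - 1) links per line of nodes.
    return sum(total // d * (d - 1) for d in dims)


CPUS = 10_368
DIMS = (27, 16, 24)

print("crossbar crosspoints:", crossbar_crosspoints(CPUS))
print("mesh links:          ", mesh_links(DIMS))
# Doubling the CPU count quadruples the crossbar.
print("crossbar at 2x CPUs: ", crossbar_crosspoints(2 * CPUS))
```

The crossbar figure comes out above a hundred million crosspoints,
against under thirty thousand mesh links, and it grows with the square
of the CPU count.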