Rambus aims for 1 TeraByte per second memory bandwidth by 2010

Ken Hagan · Dec 10, 2007

So, in short, you think the biggest problem confronting processor
design and performance isn't important because "it's hard"...

/daytripper (well, that's one way to go, I guess ;-)

I dunno if it's a fair summary of Robert's position, but it is a fair
piece of strategy. It is silly to try to solve an impossible problem.
It is almost as silly to try to solve an almost impossible problem.

already5chosen · Dec 10, 2007

Sure there is -- SRAM and other designs which take more xtors
per cell. With the continually decreasing marginal cost
of xtors and a shortage of useful things to do with them,
I expect this transition to happen at some point.

SRAM could shave off 15 ns in the case of a DRAM page miss, or 50-55 ns
in the case of a page conflict, but those are very rare. In the
supposedly most common case of a DRAM page hit, SRAM doesn't help at all.
Actually, you will have a hard time finding a commodity SRAM that is as
fast as the now-common DDR2-800 CL5 at page hit.
Another potential saving with SRAM comes from the fact that the memory
controller is simpler. I don't know how much that could bring. The likes
of Opteron and Power6 run their MCs at very high speed, so I'd guess it
would be hard to shave off more than 1-2 ns here.

Now look at the flip side:
1. Pins - the SRAM address bus is up to twice as wide as the DRAM one. You
can construct SRAM with pseudo-pages and a multiplexed address bus, but
then you give up part of the latency advantage.
2. Capacity. The big one. SRAM capacity lags behind DRAM by a factor of
5-10. It means that you will either need more channels (expensive
motherboard, expensive packaging of MPU/NB; not always possible due to
mechanical constraints) or more DIMMs per channel. The latter normally
means more buffering = higher latency. For example, for DDR2-667 one
can put on one channel 2 unbuffered DIMMs (lowest latency), 4
registered DIMMs (medium latency) or up to 8 fully-buffered DIMMs (the
highest latency).
3. Power consumption. I'm not an expert in this area, but according to
my understanding, under heavy load SRAM consumes 2-3 times more power
than the equivalent DRAM. That's partly compensated by lower idle
power consumption (no need for refresh).
4. Cost. That's the other unfortunate effect of lower capacity.

John Ahlstrom · Dec 10, 2007

Ken said:
I dunno if it's a fair summary of Robert's position, but it is a fair
piece of strategy. It is silly to try to solve an impossible problem.
It is almost as silly to try to solve an almost impossible problem.

How about
It's not important because it is not cost-effective?

Robert Redelmeier · Dec 11, 2007

In comp.sys.ibm.pc.hardware.chips (e-mail address removed) wrote in part:

SRAM could shave off 15 ns in the case of a DRAM page miss, or
50-55 ns in the case of a page conflict, but those are very
rare. In the supposedly most common case of a DRAM page hit,
SRAM doesn't help at all. Actually, you will have a hard
time finding a commodity SRAM that is as fast as the now-common
DDR2-800 CL5 at page hit.

You are talking device response times, and I appreciate your
information. However, I am interested in system response (software
performance), and my measurements are far less encouraging:

Latency  CPU@MHz    mem.ctl      RAM
(ns)

   88    k8@2000    NForce3      DDR400
  144    P3@1000    laptop       SO-PC133?
  148    2*P3@860   Serverworks  ??
  178    P4@1800    i850         RDRAM
  184    K7@1667    SiS735       PC133
  185    P3@600     440BX        PC100
  217    2*Cel@500  440BX        PC90
  234    P2@350     440BX        PC100?
  288    P2@333     440BX        PC66

I do need to find & test some more modern systems, but I'm
underwhelmed by the slowness of latency improvement. CPU clock has
increased at least 4x, latency at best 2.5x. Run this program
with a set small enough to fit in L2 and it comes back around 10 ns.

compile: $ gcc -O2 lat10m.c
run: $ time ./a.out [multiply user time by 100 to give ns]

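/* lat10m.c - Measure latency of 10 million fresh memory reads
(C) Copyright 2005 Robert Redelmeier - GPL v2.0 licence granted */
int p[ 1<<21 ] ; /* 2^21 ints; each entry holds the index of the next read */
int main (void) {
int i, j ;
/* link each entry to the one 5000 ints earlier, wrapping at 2^21 */
for ( i=0 ; i < 1<<21 ; i++ ) p[i] = 0x1FFFFF & (i-5000) ;
/* chase the chain: every read depends on the result of the previous one */
for ( j=i=0 ; i < 9600000 ; i++ ) j = p[j] ;
return j ; }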

-- Robert
