Intel's FB-DIMM, any kind of RAM will work for your controller?


Yousuf Khan

Intel is introducing a new type of memory module called the FB-DIMM (Fully
Buffered DIMM). Apparently the idea is to be able to put any kind of DRAM
technology (e.g. DDR1 vs. DDR2) behind a buffer without having to worry about
redesigning your memory controller. Of course, this intermediate step will add
some latency to the DRAM's performance.
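As a rough software analogy (purely illustrative; all class names below are invented, not part of any Intel design), the idea resembles the adapter pattern: the controller talks to one stable buffer interface, and the buffer hides which DRAM generation sits behind it:

```python
class DDR1:
    def read(self, addr):
        return f"ddr1[{addr}]"

class DDR2:
    def read(self, addr):
        return f"ddr2[{addr}]"

class FBBuffer:
    """Presents one fixed interface to the controller, whatever DRAM is behind it."""
    def __init__(self, dram):
        self.dram = dram

    def read(self, addr):
        # The extra hop through the buffer is where the added latency comes from.
        return self.dram.read(addr)

class MemoryController:
    """Never needs redesigning when the DRAM generation changes."""
    def __init__(self, channel):
        self.channel = channel

    def load(self, addr):
        return self.channel.read(addr)

# Swapping DRAM generations requires no controller changes:
print(MemoryController(FBBuffer(DDR1())).load(0x10))
print(MemoryController(FBBuffer(DDR2())).load(0x10))
```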

It is assumed that this is Intel's way of finally acknowledging that it has
to start integrating DRAM controllers onto its CPUs, as AMD already does.
Of course, adding latency to the interface is exactly the opposite of the
main advantage of integrating the DRAM controller in the first place.

http://arstechnica.com/news/posts/1082164553.html

Yousuf Khan
 
A buffer is meant to reduce overall latency, not to increase it AFAIK.

Not necessarily, a buffer is also meant to increase overall bandwidth, which
may be done at the expense of latency.

Yousuf Khan
 
Not necessarily, a buffer is also meant to increase overall bandwidth, which
may be done at the expense of latency.

Cache on a CPU is not meant to increase bandwidth but to decrease the overall
latency of retrieving data from slower RAM. More cache-like buffers in the path
through the memory controller can only improve latency, unless there are some
serious design flaws. I've never seen a CPU that gets slower at accessing data
when it can cache and has a good hit/miss ratio.
 
Cache on a CPU is not meant to increase bandwidth but to decrease the overall
latency of retrieving data from slower RAM. More cache-like buffers in the path
through the memory controller can only improve latency, unless there are some
serious design flaws. I've never seen a CPU that gets slower at accessing data
when it can cache and has a good hit/miss ratio.

You're using "buffer" interchangeably with "cache" - a mistake our Yousuf
would never, ever make. Caches and their effects aren't pertinent to a
discussion of the buffering technique found on Fully Buffered DIMMs and their
effects on latency and bandwidth...

/daytripper (hth ;-)
 
You're using "buffer" interchangeably with "cache" - a mistake our Yousuf
would never, ever make. Caches and their effects aren't pertinent to a
discussion of the buffering technique found on Fully Buffered DIMMs and their
effects on latency and bandwidth...

FB-DIMMs are supposed to work with an added cheap CPU or DSP plus some fast
RAM. I doubt they'll use embedded DRAM on-chip, simply due to higher costs, but
you never know how cheap they could make a product if they really wanted to,
and no expensive DSP or CPU is needed for the FB-DIMM to work anyway.
I know how both caches and buffers work (circular buffering, FIFO buffering and
so on), and because they're sometimes used to achieve similar results (as on
DSP architectures, where buffering is key to performance with proper assembly
code), it's not that wrong to refer to a cache as a buffer: even if the
mechanism is quite different, the goal is almost the same. The truth is that
both ways of making data faster to retrieve are useful, and a proper
combination of these techniques can achieve higher performance in both
bandwidth and latency.
 
Cache on a CPU is not meant to increase bandwidth but to decrease the overall
latency of retrieving data from slower RAM.

Yes, but not by making the RAM any faster, but by avoiding RAM accesses.
We add cache to the CPU because we admit our RAM is slow.
More cache-like buffers in the path through the memory controller can only
improve latency, unless there are some serious design flaws.

That makes no sense. Everything between the CPU and the memory will
increase latency. Even caches increase worst-case latency, because some time
is spent searching the cache before we start the memory access. I think
you're confused.
I've never seen a CPU that gets slower at accessing data when it can cache
and has a good hit/miss ratio.

Except that we're talking about memory latency due to buffers. And by
memory latency we mean the maximum time between when we ask the CPU to read a
byte of memory and when we get that byte.
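DS's point can be put in numbers. A minimal sketch, with hypothetical latencies (the values are invented for illustration, not from any datasheet): a cache improves the *average* access time, but the worst case, a miss, pays the cache lookup plus the full memory access, so worst-case latency goes up:

```python
CACHE_LOOKUP_NS = 2     # hypothetical time to search the cache
MEMORY_ACCESS_NS = 60   # hypothetical DRAM access time

def average_latency(hit_rate):
    """Hits cost just the lookup; misses cost the lookup *plus* the DRAM access."""
    hit_cost = CACHE_LOOKUP_NS
    miss_cost = CACHE_LOOKUP_NS + MEMORY_ACCESS_NS
    return hit_rate * hit_cost + (1.0 - hit_rate) * miss_cost

worst_no_cache = MEMORY_ACCESS_NS                      # 60 ns
worst_with_cache = CACHE_LOOKUP_NS + MEMORY_ACCESS_NS  # 62 ns: strictly worse

print(f"average at 95% hit rate:  {average_latency(0.95):.1f} ns")
print(f"worst case without cache: {worst_no_cache} ns")
print(f"worst case with cache:    {worst_with_cache} ns")
```

With a good hit ratio the average wins big, which is the other poster's point; but the worst case is strictly worse, which is DS's.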

DS
 
FB-DIMMs are supposed to work with an added cheap CPU or DSP plus some fast
RAM. I doubt they'll use embedded DRAM on-chip, simply due to higher costs, but
you never know how cheap they could make a product if they really wanted to,
and no expensive DSP or CPU is needed for the FB-DIMM to work anyway.
I know how both caches and buffers work (circular buffering, FIFO buffering and
so on), and because they're sometimes used to achieve similar results (as on
DSP architectures, where buffering is key to performance with proper assembly
code), it's not that wrong to refer to a cache as a buffer: even if the
mechanism is quite different, the goal is almost the same. The truth is that
both ways of making data faster to retrieve are useful, and a proper
combination of these techniques can achieve higher performance in both
bandwidth and latency.

Ummm.....no. You're still missing the gist of the discussion, and confusing
various forms of caching with the up and down-sides of using buffers in a
point-to-point interconnect.

Maybe going back and starting over might help...

/daytripper
 
A buffer is meant to reduce overall latency, not to increase it AFAIK.

Not necessarily, a buffer is also meant to increase overall bandwidth, which
may be done at the expense of latency.

This particular buffer reduces the DRAM interface pinout by a factor
of 3 for CPU chips having the memory interface on-chip (such as
Opteron, the late and unlamented Timna, and future Intel CPUs). This
reduces the cost of the CPU chip while increasing the cost of the DIMM
(because of the added buffer chip).

And yes, the presence of the buffer does increase the latency.

There are other tradeoffs, the main one being the ability to add lots
more DRAM into a server. Not important for desktops. YMMV.
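The pinout claim can be sanity-checked with ballpark figures (the numbers below are rough approximations for illustration, not from a specific datasheet): an FB-DIMM channel uses 10 southbound and 14 northbound differential serial lanes, versus on the order of a hundred-plus controller signals for a parallel DDR2 channel:

```python
# Rough back-of-envelope; all figures are approximate.
south_lanes = 10                     # host -> DIMMs (commands / write data)
north_lanes = 14                     # DIMMs -> host (read data)
fbdimm_signals = 2 * (south_lanes + north_lanes)   # differential pairs: 48 pins

ddr2_signals = 140                   # ~72 data/ECC + addr/cmd/control/clocks

ratio = ddr2_signals / fbdimm_signals
print(f"FB-DIMM channel:       ~{fbdimm_signals} signal pins (plus clocks)")
print(f"parallel DDR2 channel: ~{ddr2_signals} signal pins")
print(f"reduction: ~{ratio:.1f}x")   # in the ballpark of the 'factor of 3' above
```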
 
Do you ever get it right, Geno? I don't think I've seen it...


-------

http://www.faqs.org/docs/artu/ch12s04.html

Caching Operation Results
Sometimes you can get the best of both worlds (low latency and good throughput) by computing
expensive results as needed and caching them for later use. Earlier we mentioned that named reduces
latency by batching; it also reduces latency by caching the results of previous network transactions
with other DNS servers.

------
 
Not necessarily, a buffer is also meant to increase overall bandwidth, which
may be done at the expense of latency.

Yousuf Khan

http://www.analog.com/UploadedFiles/Application_Notes/144361534EE157.pdf


As you can see, this Analog Devices DSP uses a mixed buffering/caching
technique to improve latency in the best-case scenario. Obviously, if the
caching doesn't work and the data isn't locally available, then the latency has
to be higher because you have to get the data from slower memory; but when the
data is locally available, the latency can be reduced to approximately zero in
some cases.
 
FB-DIMMs are supposed to work with an added cheap CPU or DSP plus some
fast RAM. I doubt they'll use embedded DRAM on-chip, simply due to
higher costs, but you never know how cheap they could make a product
if they really wanted to, and no expensive DSP or CPU is needed for
the FB-DIMM to work anyway. I know how both caches and buffers work
(circular buffering, FIFO buffering and so on), and because they're
sometimes used to achieve similar results (as on DSP architectures,
where buffering is key to performance with proper assembly code),
it's not that wrong to refer to a cache as a buffer: even if the
mechanism is quite different, the goal is almost the same. The truth
is that both ways of making data faster to retrieve are useful, and a
proper combination of these techniques can achieve higher performance
in both bandwidth and latency.

A cache is a form of buffer, but a buffer is not necessarily a cache.
Imagine a one-byte buffer: would you call it a cache?

Regards.
 
You're using "buffer" interchangeably with "cache" - a mistake our Yousuf
would never, ever make. Caches and their effects aren't pertinent to a
discussion of the buffering technique found on Fully Buffered DIMMs and their
effects on latency and bandwidth...

Ah! I was getting quite confused by his statement about the buffer &
cache until you said this. Makes it perfectly clear now! :PppP

--
L.Angel: I'm looking for web design work.
If you need basic to med complexity webpages at affordable rates, email me :)
Standard HTML, SHTML, MySQL + PHP or ASP, Javascript.
If you really want, FrontPage & DreamWeaver too.
But keep in mind you pay extra bandwidth for their bloated code
 
RusH said:
A cache is a form of buffer, but a buffer is not necessarily a cache.
Imagine a one-byte buffer: would you call it a cache?

Sure; you can think of it as a *really* small cache, which will
therefore have a terrible hit ratio, thus (most likely) increasing latency.
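The distinction being argued can be made concrete with a toy sketch (names and the one-entry "buffer" below are invented for illustration): a buffer stages data in flight but still forwards every request to the backing store, while a cache can satisfy repeat requests locally:

```python
from collections import deque

fetches = 0

def slow_fetch(addr):
    """Stand-in for a DRAM access; counts how often memory is actually touched."""
    global fetches
    fetches += 1
    return addr * 2

fifo = deque(maxlen=1)               # a "one-byte" (one-entry) FIFO buffer

def via_buffer(addr):
    fifo.append(slow_fetch(addr))    # always fetches; merely stages the data
    return fifo.popleft()

cache = {}

def via_cache(addr):
    if addr not in cache:            # miss: go to memory once
        cache[addr] = slow_fetch(addr)
    return cache[addr]               # hit: served locally, no memory access

via_buffer(7); via_buffer(7)         # two requests -> two memory accesses
buffered_fetches = fetches
via_cache(7); via_cache(7)           # two requests -> one memory access
cached_fetches = fetches - buffered_fetches
print(buffered_fetches, cached_fetches)
```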
 
As you can see, this Analog Devices DSP uses a mixed buffering/caching
technique to improve latency in the best-case scenario. Obviously, if the
caching doesn't work and the data isn't locally available, then the latency has
to be higher because you have to get the data from slower memory; but when the
data is locally available, the latency can be reduced to approximately zero in
some cases.

In this case the buffer is used to eliminate DRAM interface differences when
going from one technology to a new one.

Yousuf Khan
 
when the data is locally available, the latency can be reduced to
approximately zero in some cases.

In this case the buffer is used to eliminate DRAM interface differences when
going from one technology to a new one.

"But wait! There's more!"

The "FB" buffer on an FBdimm is also a bus repeater (aka "buffer") for the
"next" FBdimm in the chain of FBdimms that comprise a channel. The presence of
this buffer feature allows the channel to run at the advertised frequencies in
the face of LOTS of FBdimms on a single channel - frequencies that could not
be achieved if all those dimms were on the typical multi drop memory
interconnect (ala most multi-dimm SDR/DDR/DDR2 implementations).

Anyway...

I thought I knew the answer to this, but I haven't found it documented either
way: is the FB bus repeater simply a stateless signal buffer, thus adding its
lane-to-lane skew to the next device in the chain (which would imply some huge
de-skewing tasks for the nth FBdimm in - say - an 8 FBdimm implementation). Or
does the buffer de-skew lanes before passing the transaction on to the next
node?
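Whatever the de-skew answer turns out to be, the daisy-chain cost can be sketched with a back-of-envelope model (the delay values below are invented for illustration, not measured): if each on-DIMM buffer adds a fixed pass-through delay in each direction, latency grows linearly with a DIMM's position in the chain:

```python
HOP_DELAY_NS = 5      # hypothetical per-buffer repeat delay, each direction
DRAM_ACCESS_NS = 50   # hypothetical access time at the target DIMM

def round_trip_ns(position):
    """Command travels 'position' hops southbound; read data returns the same
    number of hops northbound, so the hop cost is paid twice."""
    return 2 * position * HOP_DELAY_NS + DRAM_ACCESS_NS

for n in (1, 4, 8):
    print(f"DIMM {n}: {round_trip_ns(n)} ns round trip")
```

On this model the 8th DIMM in a channel pays noticeably more than the 1st, which is why the buffers' pass-through (and any per-hop de-skew) latency matters.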

/daytripper
 